1
00:00:00,040 --> 00:00:05,240
Prompting is the secret skill 
that taps into AI's real

2
00:00:05,240 --> 00:00:09,480
capabilities, transforming large
language models from flashy 

3
00:00:09,480 --> 00:00:12,880
demos into engines of real-world
productivity. 

4
00:00:13,280 --> 00:00:19,440
Today on CXO Talk number 883, we
unpack how prompting works, what

5
00:00:19,440 --> 00:00:23,000
it is, why it matters, and how 
to get it right. 

6
00:00:23,680 --> 00:00:27,840
I'm Michael Krigsman, and with 
me is Nate B Jones, a widely 

7
00:00:27,840 --> 00:00:33,160
respected AI expert whose sharp 
insights and no-nonsense advice

8
00:00:33,160 --> 00:00:36,800
have earned him nearly 300,000 
TikTok followers. 

9
00:00:36,880 --> 00:00:40,720
We need to talk a lot about 
prompting for two reasons. 

10
00:00:41,600 --> 00:00:45,360
One, human language is fairly 
vague. 

11
00:00:45,480 --> 00:00:48,120
That's why we invented computer 
language back when we were 

12
00:00:48,120 --> 00:00:50,840
programming computers in the first
place, because it's much more 

13
00:00:50,840 --> 00:00:53,600
precise. 
And now we're using effectively 

14
00:00:53,600 --> 00:00:57,000
natural language to program 
computers again, and that's 

15
00:00:57,000 --> 00:01:00,160
challenging. 
The second reason is that even 

16
00:01:00,160 --> 00:01:03,360
though these models are very 
intelligent in certain respects,

17
00:01:03,800 --> 00:01:09,640
they are not incredibly reliable
yet at inferring your intent. 

18
00:01:10,120 --> 00:01:13,280
If you are not precise about 
what you mean or want, they 

19
00:01:13,280 --> 00:01:16,560
don't do that reliably. 
They guess, and they might guess

20
00:01:16,560 --> 00:01:17,760
right, and they might guess 
wrong. 

21
00:01:18,280 --> 00:01:23,760
And so both because we have to 
get clear with our language and 

22
00:01:23,760 --> 00:01:28,000
because models don't yet infer 
with tremendous precision, 

23
00:01:28,680 --> 00:01:30,160
prompting is what bridges that 
gap. 

24
00:01:30,640 --> 00:01:35,600
So when we are prompting,
we're programming the AI.

25
00:01:36,000 --> 00:01:38,920
This is really going to take you
back, but like in the old days 

26
00:01:39,120 --> 00:01:42,360
in the 60s with punch card 
computing, you would literally 

27
00:01:42,360 --> 00:01:45,360
bring your little
punch card and put it into the

28
00:01:45,360 --> 00:01:48,880
computer and you would run it 
and you would see in 20 or 30 

29
00:01:48,880 --> 00:01:51,640
minutes whether you got that 
right or not, or maybe longer 

30
00:01:51,640 --> 00:01:53,040
than that if it was a big 
program. 

31
00:01:54,080 --> 00:01:56,680
We're doing exactly the same 
thing with natural language now.

32
00:01:56,680 --> 00:02:00,400
We're handing the prompt to an 
inference model, maybe O3 Pro. 

33
00:02:00,400 --> 00:02:02,400
It does take that long, 20 or 30
minutes. 

34
00:02:02,400 --> 00:02:04,600
And we're going to come back and
we're going to see if our little

35
00:02:04,640 --> 00:02:06,400
natural language program did 
anything. 

36
00:02:07,000 --> 00:02:10,080
It's fascinating how time is a 
circle in that regard, where 

37
00:02:10,080 --> 00:02:15,280
we're back to where we started. 
So the logic of prompting is 

38
00:02:15,440 --> 00:02:18,480
effectively the logic of 
software development. 

39
00:02:18,480 --> 00:02:21,360
Is that a correct way to say it?
You could say it as like the 

40
00:02:21,360 --> 00:02:24,640
marriage of software development
and business intent. 

41
00:02:25,240 --> 00:02:28,560
So in a sense, software 
development has been predicated 

42
00:02:28,560 --> 00:02:34,120
primarily on building interfaces
that allow business operations 

43
00:02:34,120 --> 00:02:36,960
to be conducted, business logic 
to be encoded, etcetera. 

44
00:02:37,520 --> 00:02:40,920
But now, because these models 
have the ability to sort of 

45
00:02:40,960 --> 00:02:44,880
bring intelligence to bear, 
you're not just asking it to do 

46
00:02:44,880 --> 00:02:47,600
one specific thing. You're not
spending your time engineering a

47
00:02:47,600 --> 00:02:51,000
specific interface. 
Instead, you're asking the model

48
00:02:51,000 --> 00:02:53,920
to think with you. 
And so it's this weird mix of 

49
00:02:54,240 --> 00:02:59,240
the principles of engineering 
with the business clarity of 

50
00:02:59,240 --> 00:03:03,080
intent that has always 
characterized a very strong 

51
00:03:03,080 --> 00:03:09,400
executive brief, for example. 
As the models get better, what 

52
00:03:09,400 --> 00:03:13,920
does that do to prompting? 
Does it make prompting easier or

53
00:03:13,920 --> 00:03:17,880
more difficult? 
On the one hand, you don't have 

54
00:03:17,880 --> 00:03:23,080
to do some of the stage 
management that you had to do in

55
00:03:23,080 --> 00:03:27,520
2022 and 2023 anymore. 
You'll recall when ChatGPT first

56
00:03:27,520 --> 00:03:30,560
came out, the prompting guides 
were like, OK, tell it to 

57
00:03:30,560 --> 00:03:33,080
pretend it's the best editor in 
the world. 

58
00:03:33,480 --> 00:03:36,760
Tell it to pretend this or that.
And then it began to sort of 

59
00:03:36,760 --> 00:03:38,560
turn into chain of thought 
prompting. 

60
00:03:38,560 --> 00:03:41,520
Tell it to think step by step, 
do this and do this and do this 

61
00:03:41,520 --> 00:03:44,680
and do this. 
That stage management is 

62
00:03:44,680 --> 00:03:47,280
thankfully no longer really 
necessary. 

63
00:03:48,160 --> 00:03:52,680
You can instantiate the model by
saying you are in a particular 

64
00:03:52,680 --> 00:03:55,840
space, like you're in a 
consultative strategist space, 

65
00:03:55,840 --> 00:03:58,840
you're in a CFO space, whatever.
You can say that, but you don't 

66
00:03:58,840 --> 00:04:01,680
have to like put in the 
adjectives and hope and pray 

67
00:04:01,680 --> 00:04:03,160
that the model understands what 
you mean. 

68
00:04:03,520 --> 00:04:04,960
You can just say this is where 
we are. 

69
00:04:04,960 --> 00:04:07,040
You don't have to specify chain 
of thought anymore. 

70
00:04:07,320 --> 00:04:10,040
The frontier models know to use
chain of thought when they need 

71
00:04:10,040 --> 00:04:12,680
to. 
And so in that regard, prompting

72
00:04:12,680 --> 00:04:16,160
has gotten simpler. 
On the other hand, the 

73
00:04:16,160 --> 00:04:19,839
importance of specifying what 
you're looking for, what success

74
00:04:19,839 --> 00:04:23,240
criteria look like, what the
constraints are, that's only 

75
00:04:23,240 --> 00:04:27,280
gotten more important because 
these models are much more 

76
00:04:27,280 --> 00:04:31,160
powerful. 
And so before, like if it was a 

77
00:04:31,160 --> 00:04:35,240
very simple ask that you had for
a smaller model, you could go 

78
00:04:35,240 --> 00:04:37,560
back and forth a few times and 
figure out what you wanted and 

79
00:04:37,560 --> 00:04:39,280
it was fine. 
But if you give something to a 

80
00:04:39,280 --> 00:04:43,040
frontier model and it's
running for 6 minutes, 8 

81
00:04:43,040 --> 00:04:47,680
minutes, 10 minutes, 20 minutes,
it comes back and you just did 

82
00:04:47,680 --> 00:04:50,960
not clearly specify the scope, 
you're going to be frustrated 

83
00:04:52,000 --> 00:04:53,760
because you wasted all of that 
compute. 

84
00:04:54,160 --> 00:05:00,080
And so in that sense, some of
the stage

85
00:05:00,080 --> 00:05:02,440
management and scripting that 
you're used to, you don't have 

86
00:05:02,440 --> 00:05:04,720
to do anymore, but the 
importance of specifying the 

87
00:05:04,720 --> 00:05:08,840
work very clearly has grown. 
Like you have to really take 

88
00:05:08,840 --> 00:05:12,440
that seriously now. 
So you really need to match the 

89
00:05:12,440 --> 00:05:16,480
prompt to the model. 
A lot of the art of it is in 

90
00:05:16,480 --> 00:05:20,880
figuring out what is this 
subject, what is my intent, what

91
00:05:20,880 --> 00:05:23,320
is the right model for that? 
And once I have all of that 

92
00:05:23,320 --> 00:05:26,680
figured out, now how do I craft 
a prompt and then bring in the 

93
00:05:26,680 --> 00:05:29,880
context the model needs so it 
can do a good job for me? 

94
00:05:30,200 --> 00:05:37,200
For example, with OpenAI, a
number of the models allow you

95
00:05:37,200 --> 00:05:42,480
to either include deep reasoning
or research or not. 

96
00:05:42,560 --> 00:05:46,960
And actually, other
companies as well, other LLMs, same

97
00:05:46,960 --> 00:05:48,800
thing. 
So give us some examples of 

98
00:05:48,800 --> 00:05:51,280
this. 
This is one of the things where 

99
00:05:51,800 --> 00:05:55,080
model makers have not done a 
great job at the product surface

100
00:05:55,080 --> 00:05:56,760
of explaining what their models 
do. 

101
00:05:57,960 --> 00:06:01,640
For example, deep research is 
really a very narrow web agent 

102
00:06:02,080 --> 00:06:07,160
that is trained as a research 
assistant to go out to look 

103
00:06:07,160 --> 00:06:11,040
across the entire browsable web
(it doesn't yet look behind

104
00:06:11,040 --> 00:06:15,360
paywalls) and to come back with a
consolidated view. 

105
00:06:15,560 --> 00:06:18,160
And they train it specifically 
on citation. 

106
00:06:18,160 --> 00:06:21,320
So it's good at citations. 
It lists what it knows and why. 

107
00:06:22,560 --> 00:06:25,920
OpenAI pioneered this with Deep
Research, but Deep Research is 

108
00:06:26,240 --> 00:06:29,240
now available on Perplexity. 
It's available on Claude. 

109
00:06:29,240 --> 00:06:32,160
It's available with Google. 
Lots of others have picked this 

110
00:06:32,160 --> 00:06:34,400
up because it turns out that 
reasoning across the web is a 

111
00:06:34,400 --> 00:06:37,400
lot of what we do. 
And so there's just inherent 

112
00:06:37,400 --> 00:06:43,520
value in report generation. 
But people don't realize that 

113
00:06:43,560 --> 00:06:48,280
all you're getting with Deep 
Research is the O3 model, if it's

114
00:06:48,280 --> 00:06:51,720
ChatGPT, specifically tuned to
web search. 

115
00:06:52,880 --> 00:06:57,200
And that is different from 
whatever else you've been 

116
00:06:57,200 --> 00:07:00,080
talking about with whatever 
model you've been talking about 

117
00:07:00,080 --> 00:07:02,080
in ChatGPT previously. 
So if you've been having a 

118
00:07:02,080 --> 00:07:05,200
conversation with 4o for a
bit and then you turn on deep 

119
00:07:05,200 --> 00:07:09,640
research, it's not that 4o
suddenly picks up a cape and

120
00:07:09,640 --> 00:07:11,920
becomes a superhero and turns 
into deep research. 

121
00:07:12,200 --> 00:07:16,040
It's that you are invoking a 
separate agentic tool, getting a

122
00:07:16,040 --> 00:07:21,160
separate prompt in, starting a 
new flow, and then that report 

123
00:07:21,160 --> 00:07:23,200
is going to come back and you're
going to be able to continue the

124
00:07:23,200 --> 00:07:25,320
chat. 
And I think that a lot of people

125
00:07:25,320 --> 00:07:28,680
don't think about it that way. 
And it's become even more 

126
00:07:28,680 --> 00:07:32,280
confusing in the last week 
because O3 Pro on the surface 

127
00:07:32,280 --> 00:07:36,160
looks very, very similar. 
It's got a long thinking time. 

128
00:07:36,680 --> 00:07:39,160
You give it a prompt, it goes 
away, it comes back. 

129
00:07:39,160 --> 00:07:42,720
And so people have asked me, did
they just release a clone of 

130
00:07:42,720 --> 00:07:46,120
Deep Research and rename it? 
And the answer is no. 

131
00:07:46,600 --> 00:07:50,040
The answer is that O3 Pro is a 
generalizable model with a lot 

132
00:07:50,040 --> 00:07:52,040
of different tool calls under 
the surface. 

133
00:07:52,920 --> 00:07:56,560
But precisely because it's under
the surface, it's difficult to 

134
00:07:56,560 --> 00:07:59,680
know that staring at the chat 
window when it takes a similar 

135
00:07:59,680 --> 00:08:04,400
amount of time and comes back. 
And so I think that some of what

136
00:08:04,400 --> 00:08:08,720
I do is just try and convey the 
nuances of these models and how 

137
00:08:08,720 --> 00:08:11,360
understanding them with a little
bit of a fingertip feel can 

138
00:08:11,360 --> 00:08:16,280
shape the way we prompt. 
Again, there's a level of 

139
00:08:16,320 --> 00:08:21,120
confusion here. 
I mean, I use so many different 

140
00:08:21,120 --> 00:08:28,800
models every single day and I am
on an ongoing basis having to 

141
00:08:29,440 --> 00:08:32,640
kind of experiment. 
You know, it's like this whole 

142
00:08:32,720 --> 00:08:37,679
domain is very immature because 
the models are changing and the 

143
00:08:37,679 --> 00:08:40,440
models give indeterminate 
results in any case. 

144
00:08:40,720 --> 00:08:43,880
And that means you keep having 
to adjust your prompts on an 

145
00:08:43,880 --> 00:08:46,240
ongoing basis. 
It's really a waste of

146
00:08:46,240 --> 00:08:48,720
time. 
I think if it was a waste of 

147
00:08:48,720 --> 00:08:53,400
time, we wouldn't be seeing the 
kind of tremendous uptake we see

148
00:08:53,400 --> 00:08:57,720
in groundswell usage with
these models. 

149
00:08:57,960 --> 00:09:01,720
One of the biggest challenges 
with IT and security this year 

150
00:09:01,720 --> 00:09:06,120
is shadow IT where people are 
finding these models so useful 

151
00:09:06,120 --> 00:09:10,000
for the work that they do that 
they are using them even outside

152
00:09:10,000 --> 00:09:12,000
traditional IT security 
practices. 

153
00:09:12,520 --> 00:09:16,600
And in that sense I share your 
frustration. 

154
00:09:17,280 --> 00:09:20,800
I find that like when I am not 
getting what I want,

155
00:09:20,800 --> 00:09:23,200
there's nothing more frustrating
than sort of pounding my head on

156
00:09:23,200 --> 00:09:25,960
the wall and trying to figure 
out what the model needs to hear

157
00:09:25,960 --> 00:09:29,040
from me so that it can give me 
what I want. 

158
00:09:30,040 --> 00:09:34,920
But net-net, if I look across
my overall productivity for the 

159
00:09:34,920 --> 00:09:40,080
day, for the week, I am so much 
more productive now, even with 

160
00:09:40,080 --> 00:09:42,640
all of that factored in, than I
was two years ago. 

161
00:09:42,640 --> 00:09:46,360
And it's because I'm learning 
enough about how to work with 

162
00:09:46,360 --> 00:09:49,680
these models that I'm able to 
get a tremendous amount of value

163
00:09:49,680 --> 00:09:51,320
back. 
And I think a lot of people are 

164
00:09:51,320 --> 00:09:54,160
having that experience. 
And maybe I shouldn't say it's a

165
00:09:54,160 --> 00:09:56,680
waste of time, although I do 
think it's a waste of time, but 

166
00:09:56,680 --> 00:10:03,360
let's just say that there's a 
lot of overhead that seems like 

167
00:10:03,360 --> 00:10:06,000
it shouldn't be there. 
That's a really fair call. 

168
00:10:06,360 --> 00:10:09,600
That's basically a complaint and
not that it makes any difference

169
00:10:09,640 --> 00:10:14,080
at all, because that's the 
nature of the maturity of these 

170
00:10:14,080 --> 00:10:17,840
models as products at this point
in time. 

171
00:10:18,240 --> 00:10:22,400
What I am curious to see 
answered by the model makers in 

172
00:10:22,400 --> 00:10:27,560
the next probably 18 months is 
the extent to which prompting 

173
00:10:28,080 --> 00:10:33,320
remains a durable skill set that
provides tremendous alpha to 

174
00:10:33,320 --> 00:10:36,600
people who know how to use it 
well versus the extent to which 

175
00:10:36,600 --> 00:10:39,480
it commoditizes. 
Not necessarily because everyone

176
00:10:39,480 --> 00:10:42,880
learns the same amount, but 
because models get very, very 

177
00:10:42,880 --> 00:10:46,560
good at inferring intent across 
a range of prompts for the same 

178
00:10:46,560 --> 00:10:51,160
subject, and people are widely 
divergent on what they think 

179
00:10:51,160 --> 00:10:54,200
will happen. 
My own view is I'm trying to 

180
00:10:54,200 --> 00:10:58,320
take seriously the fact that I 
expected initially prompting to 

181
00:10:58,320 --> 00:11:03,880
be a very one-off 2022-2023
edge, and that's not been the 

182
00:11:03,880 --> 00:11:05,720
case. 
It's been stronger and stronger 

183
00:11:05,720 --> 00:11:08,280
over time instead. 
So I tend to lean toward the 

184
00:11:08,280 --> 00:11:11,400
idea that at least for the 
intermediate term, prompting is 

185
00:11:11,400 --> 00:11:13,480
going to continue to have a 
tremendous amount of value 

186
00:11:13,760 --> 00:11:15,800
because that's what we've seen 
as a trend so far. 

187
00:11:16,040 --> 00:11:19,120
There are people who think that 
if we can get to a level of 

188
00:11:19,120 --> 00:11:22,440
generalizability with these 
models, we will suddenly unlock 

189
00:11:22,440 --> 00:11:26,480
a tipping point and we will find
a way to infer very reliably 

190
00:11:26,480 --> 00:11:28,760
where we haven't before. 
And that might be. 

191
00:11:29,040 --> 00:11:31,760
And if that's the case, then 
suddenly prompting will become 

192
00:11:32,400 --> 00:11:36,280
less painful and less needed 
somewhere in the next 18 months.

193
00:11:36,720 --> 00:11:40,960
Subscribe to the CXO Talk 
newsletter so you can be part of

194
00:11:41,120 --> 00:11:43,880
our community. 
We have amazing shows coming up.

195
00:11:44,280 --> 00:11:49,120
What makes a good prompt? 
Do you have any practical 

196
00:11:49,320 --> 00:11:52,600
advice? 
Number one, be really clear 

197
00:11:52,960 --> 00:11:56,920
about the outcome that you are 
looking for and about how the 

198
00:11:56,920 --> 00:12:01,640
model can know that it's done. 
I think a lot of people will be 

199
00:12:01,640 --> 00:12:05,040
fairly loose about specifying 
the outcome or they'll be loose 

200
00:12:05,040 --> 00:12:06,960
about the goal. 
They'll be very, very loose or 

201
00:12:06,960 --> 00:12:09,640
nonexistent about how the model
can know that it's finished 

202
00:12:09,640 --> 00:12:12,280
adequately. 
And the more you can specify and

203
00:12:12,280 --> 00:12:14,680
be clear about what you're 
looking for and what good looks 

204
00:12:14,680 --> 00:12:17,000
like, the better off you're 
going to be for the rest of the 

205
00:12:17,000 --> 00:12:23,880
prompt. Number two, you want the model to
have all the context that it 

206
00:12:23,880 --> 00:12:28,920
needs to do that job, and you 
would prefer it to not have any 

207
00:12:28,920 --> 00:12:31,080
extra context that it doesn't 
need. 

208
00:12:32,120 --> 00:12:36,000
A lot of what we call 
hallucinations are effectively 

209
00:12:36,600 --> 00:12:40,880
models reasoning outside your 
desired context window. 

210
00:12:41,360 --> 00:12:45,200
And so if you can be more clean 
and clear about this is what I 

211
00:12:45,200 --> 00:12:48,440
want you to focus on in a web 
search or here's some documents 

212
00:12:48,440 --> 00:12:50,600
I want you to review. 
I want you to keep your thinking

213
00:12:50,600 --> 00:12:53,000
focused around this particular, 
you know, set of meeting 

214
00:12:53,000 --> 00:12:56,960
transcripts or whatever it is. 
It will really help the model to

215
00:12:56,960 --> 00:13:02,000
be confident that it's doing the
right job and able to deliver a 

216
00:13:02,960 --> 00:13:08,000
reasoned result that closely
matches the kind of work you 

217
00:13:08,000 --> 00:13:09,840
were looking for. 
And so the context piece is 

218
00:13:09,840 --> 00:13:11,800
another one. 
And then the third is really 

219
00:13:11,800 --> 00:13:18,120
making sure that you understand 
the constraints and

220
00:13:18,120 --> 00:13:19,840
guardrails that you want to put 
around. 

221
00:13:19,840 --> 00:13:22,800
So if you have an
outcome or goal, if you have 

222
00:13:22,800 --> 00:13:25,560
context, you feed it. 
You then need to make sure that 

223
00:13:25,560 --> 00:13:30,120
the model knows don't do this. 
Where do I not go? 

224
00:13:30,720 --> 00:13:35,000
And I find that that is often 
one that people either barely 

225
00:13:35,000 --> 00:13:37,880
put in or tend to avoid because 
we tend to be thinking in a 

226
00:13:37,880 --> 00:13:40,960
positive stance of like, hey, 
this is what I want done. 

227
00:13:40,960 --> 00:13:42,720
Let me just give the task and 
go. 

228
00:13:43,200 --> 00:13:45,960
And maybe this is because we're 
anthropomorphizing models. 

229
00:13:46,280 --> 00:13:49,960
Anthropomorphizing models. 
We don't tend to regard a senior

230
00:13:49,960 --> 00:13:54,320
colleague as someone who needs a
tremendous number of warnings 

231
00:13:54,320 --> 00:13:57,080
and constraints for a task. 
We just say, hey, go tackle 

232
00:13:57,080 --> 00:13:58,720
this. 
I'm sure you'll do a great job. 

233
00:13:58,720 --> 00:14:01,080
Come back and let me
think about what you get.

234
00:14:01,360 --> 00:14:03,400
These models need those 
constraints still. 

235
00:14:03,400 --> 00:14:07,000
Even if they in many ways are 
very senior in their thinking, 

236
00:14:07,520 --> 00:14:11,400
they still need helpful 
constraints so that they know 

237
00:14:11,400 --> 00:14:13,160
where the guardrails are in the 
space. 

238
00:14:13,160 --> 00:14:17,960
And they don't start to reason 
off the rails into a direction 

239
00:14:17,960 --> 00:14:20,080
that isn't helpful.
Because at the end of the day, 

240
00:14:20,080 --> 00:14:22,920
what they're really trying to do
is just infer from your 

241
00:14:22,920 --> 00:14:24,400
utterance what they think you 
mean. 

242
00:14:24,800 --> 00:14:29,480
Figure out where in latent space
they can go and get a reasonable

243
00:14:29,480 --> 00:14:31,600
pattern match. 
Do some searching across the 

244
00:14:31,600 --> 00:14:33,520
web. 
In the case of an inference 

245
00:14:33,520 --> 00:14:36,200
model, do a lot of that 
iteratively so they can figure 

246
00:14:36,200 --> 00:14:38,160
out what's best and then put 
together something. 

247
00:14:38,440 --> 00:14:40,680
And so they do need those 
guardrails to constrain. 

248
00:14:40,960 --> 00:14:44,640
Models are people too.

249
00:14:44,640 --> 00:14:52,160
Just as you can't expect your 
friend or your spouse to read 

250
00:14:52,160 --> 00:14:58,080
your mind, how can we expect 
models to anticipate every 

251
00:14:58,080 --> 00:15:02,440
possibility that's out there and
map it to what happens to be in 

252
00:15:02,440 --> 00:15:05,520
your mind at this given time, 
what you want when you write 

253
00:15:05,520 --> 00:15:07,760
this prompt? 
And that's the need to be 

254
00:15:07,760 --> 00:15:10,160
explicit. 
And that's why I say models are 

255
00:15:10,160 --> 00:15:13,600
people too. 
We as humans are very, very good

256
00:15:13,600 --> 00:15:18,280
at retaining long context from 
multiple conversations with our 

257
00:15:18,280 --> 00:15:22,280
colleagues and extracting what's
really important out of that and

258
00:15:22,280 --> 00:15:25,360
getting to clear points of
discussion. 

259
00:15:25,360 --> 00:15:27,960
Like I can talk with a software 
development manager about a 

260
00:15:27,960 --> 00:15:29,840
project that's been going for 
six months. 

261
00:15:30,120 --> 00:15:32,320
We can have a really meaningful 
discussion about the sticking 

262
00:15:32,320 --> 00:15:34,240
points, decisions we've made in
the past. 

263
00:15:34,240 --> 00:15:37,080
What we need to change. 
That is how humans have done 

264
00:15:37,080 --> 00:15:40,040
work for a long time. 
We iterate over time.

265
00:15:40,040 --> 00:15:43,040
Effectively, the prompt evolves
through conversation over time. 

266
00:15:43,280 --> 00:15:46,160
It's shared work together. With
the model,

267
00:15:46,680 --> 00:15:49,680
we can't have the same iterative
conversation. 

268
00:15:49,680 --> 00:15:53,440
We actually have to front load 
all of that thinking and give it

269
00:15:53,440 --> 00:15:56,200
to it in a really clean prompt 
so we can get a really clean 

270
00:15:56,200 --> 00:15:57,600
answer. 
And I think part of what's hard 

271
00:15:57,600 --> 00:16:00,640
for us about prompting is we're 
conversational people. 

272
00:16:00,760 --> 00:16:02,760
We like to chat just like you 
and I are chatting. 

273
00:16:03,000 --> 00:16:06,800
We make meaning that way, but 
the model needs us to sort of 

274
00:16:06,800 --> 00:16:10,160
compress that semantic meaning 
into a really clean

275
00:16:10,160 --> 00:16:13,440
initial prompt that will help it
to work effectively. 

276
00:16:13,840 --> 00:16:18,200
I think this is a very important
point that you're making is 

277
00:16:18,560 --> 00:16:24,200
there is the need to, as you 
said, to compress the context 

278
00:16:25,160 --> 00:16:32,560
into a digestible set of words 
and chunks that the model can 

279
00:16:32,560 --> 00:16:36,200
then use to execute the explicit
task. 

280
00:16:36,480 --> 00:16:39,160
Going back to an earlier 
comment you made, that in effect

281
00:16:39,160 --> 00:16:43,320
you are programming the
model and driving the

282
00:16:43,320 --> 00:16:45,680
conversation through that 
programming essentially. 

283
00:16:46,120 --> 00:16:50,200
Right, because we humans 
effectively, collectively derive

284
00:16:50,200 --> 00:16:53,200
intent and collectively reach 
decisions through conversation. 

285
00:16:53,200 --> 00:16:55,560
But the model needs you to be 
the one that provides the 

286
00:16:55,560 --> 00:16:57,360
intent, that provides the 
driving force. 

287
00:16:57,360 --> 00:17:01,240
There's a higher expectation of 
human agency in prompting. 

288
00:17:01,680 --> 00:17:06,480
Let's jump to some questions. 
If you're watching on Twitter, 

289
00:17:06,800 --> 00:17:10,640
pop your questions into Twitter 
using the hashtag CXO Talk. 

290
00:17:11,079 --> 00:17:14,640
If you're watching on LinkedIn, 
pop your questions into the 

291
00:17:14,640 --> 00:17:18,640
LinkedIn chat. 
And so this first question on 

292
00:17:18,640 --> 00:17:25,000
Twitter X goes to Arsalan Khan, 
who says it seems like asking 

293
00:17:25,000 --> 00:17:30,280
specific questions in your 
industry to the AI would deter 

294
00:17:30,280 --> 00:17:34,280
fake experts. 
But how would an end client know

295
00:17:34,280 --> 00:17:39,120
the difference between real 
experts using AI versus just 

296
00:17:39,120 --> 00:17:43,720
good or fake experts or 
salespeople using AI? 

297
00:17:44,080 --> 00:17:49,080
And if I can restate that 
question in terms of prompting, 

298
00:17:49,880 --> 00:17:53,240
if somebody is really, really 
good at prompting, can't they 

299
00:17:53,240 --> 00:17:57,760
present the appearance of being 
an expert where it's almost 

300
00:17:57,760 --> 00:18:02,160
impossible to tell them apart 
from somebody that has the PhD 

301
00:18:02,160 --> 00:18:03,720
in whatever the subject might 
be? 

302
00:18:04,120 --> 00:18:08,720
That is more true than many of 
us would like to admit. 

303
00:18:08,720 --> 00:18:13,760
I think it's part of why there 
are so many, many consultants 

304
00:18:13,760 --> 00:18:16,000
springing up, so many tools 
springing up. 

305
00:18:16,000 --> 00:18:20,720
The industry has a need for 
authenticity, but AI by its 

306
00:18:20,720 --> 00:18:24,000
nature is enabling many people 
to claim expertise that they 

307
00:18:24,000 --> 00:18:28,840
don't genuinely have. 
And just like there's

308
00:18:28,840 --> 00:18:34,400
not a silver bullet solution for
detecting text in student essays

309
00:18:34,400 --> 00:18:37,040
and saying who wrote which
bit of text,

310
00:18:38,080 --> 00:18:42,120
there's also not a silver bullet
solution for detecting 

311
00:18:42,120 --> 00:18:46,280
expertise. 
I find in practice, what tends 

312
00:18:46,280 --> 00:18:49,600
to be most helpful at 
distinguishing a true expert 

313
00:18:49,600 --> 00:18:56,680
from the sort of AI generated 
straw man expert is to acknowledge

314
00:18:56,680 --> 00:18:58,600
the source material. 
It's probably going to be very 

315
00:18:58,600 --> 00:19:01,040
good because AI helped prepare 
it, it's very thorough, 

316
00:19:01,040 --> 00:19:04,320
etcetera. 
Make sure you understand it and 

317
00:19:04,320 --> 00:19:09,080
then ask a question that's 
designed to push off balance, 

318
00:19:09,760 --> 00:19:13,120
push out of the comfort zone, 
and a true expert will be able 

319
00:19:13,120 --> 00:19:15,280
to adjust and have an 
interesting and thoughtful 

320
00:19:15,280 --> 00:19:18,960
perspective and not get too 
frustrated or flustered. 

321
00:19:19,440 --> 00:19:23,080
And someone who's depending 
heavily on the prompting is 

322
00:19:23,080 --> 00:19:26,080
often going to struggle because 
they won't be able to actually 

323
00:19:26,080 --> 00:19:28,920
have that flexible intelligence 
across the domain that 

324
00:19:28,920 --> 00:19:31,960
characterizes true expertise. 
So you're saying that there is a

325
00:19:31,960 --> 00:19:37,160
level of depth that the model 
doesn't, or that a person who's 

326
00:19:37,160 --> 00:19:39,160
simply relying on the model 
doesn't have. 

327
00:19:39,160 --> 00:19:41,560
Is that another way of saying it?
I'll give you an example. 

328
00:19:41,560 --> 00:19:47,240
So I was doing an article on 
prompting O3 Pro versus O3,

329
00:19:47,240 --> 00:19:51,360
right, because O3 Pro came out
this week and I asked it to 

330
00:19:51,360 --> 00:19:54,120
prepare a road map because I'm 
very familiar with road maps. I

331
00:19:54,120 --> 00:19:55,480
came up through product 
management. 

332
00:19:55,480 --> 00:19:57,400
I've seen more of them than I 
would care to admit. 

333
00:19:58,000 --> 00:20:02,800
And I asked for that
because I knew I could judge it.

334
00:20:02,800 --> 00:20:05,040
I knew I had the expertise to 
assess it. 

335
00:20:05,680 --> 00:20:09,720
And I was talking with someone 
afterward and I was saying O3 

336
00:20:09,720 --> 00:20:11,600
Pro did a much better job on the
road map. 

337
00:20:12,160 --> 00:20:14,400
And they were like, well, how 
did you know? 

338
00:20:14,400 --> 00:20:18,400
Aren't road maps subjective?
And I immediately pulled up 3 or

339
00:20:18,400 --> 00:20:20,520
4 reasons why road maps are not
subjective,

340
00:20:20,520 --> 00:20:22,040
why
it's actually a craft you can

341
00:20:22,040 --> 00:20:24,800
understand, and 9 out of 10
experts will agree with you that

342
00:20:24,800 --> 00:20:27,040
a particular road map is better 
than another because it's a 

343
00:20:27,040 --> 00:20:30,600
proactive stance. 
It takes into account all of the

344
00:20:31,040 --> 00:20:32,920
strategic advantages the company
has. 

345
00:20:33,200 --> 00:20:34,760
It thoroughly understands the 
marketplace. 

346
00:20:34,760 --> 00:20:36,840
I could just go on and on. 
It's all at the top of my head. 

347
00:20:37,120 --> 00:20:40,480
And so having that expertise 
helps you to assess the true 

348
00:20:40,480 --> 00:20:43,400
quality of model response. 
And in a sense, what we're 

349
00:20:43,400 --> 00:20:46,760
seeing here is that these models
are getting to a level of 

350
00:20:46,760 --> 00:20:51,280
intelligence where their very 
best work takes an expert to 

351
00:20:51,280 --> 00:20:54,920
truly understand and appreciate.
We have some really interesting 

352
00:20:54,920 --> 00:20:57,680
questions that are coming in on 
LinkedIn right now. 

353
00:20:58,120 --> 00:21:03,040
And Greg Walters is responding 
to the point you made 

354
00:21:03,520 --> 00:21:06,200
earlier, 
Nate, where you were 

355
00:21:06,200 --> 00:21:09,280
describing the need for a 
compressed highly efficient 

356
00:21:09,280 --> 00:21:12,480
prompt. 
And Greg says, isn't the 

357
00:21:12,480 --> 00:21:17,760
magic in prompt iteration? 
Instead of having one 

358
00:21:17,760 --> 00:21:22,960
compressed, highly efficient and
explicit task or 

359
00:21:22,960 --> 00:21:28,000
prompt, 
shouldn't we be collectively 

360
00:21:28,440 --> 00:21:32,400
prompting? 
It depends on the kind of task 

361
00:21:32,400 --> 00:21:36,880
that you're looking for. 
So this gets back to the 

362
00:21:36,880 --> 00:21:39,840
relationship between prompting 
and model selection. 

363
00:21:40,440 --> 00:21:45,520
For certain kinds of models, 
they're more suitable to 

364
00:21:46,120 --> 00:21:48,960
iterative thinking and iterative
brainstorming. 

365
00:21:48,960 --> 00:21:51,840
We haven't really talked about 
the relationship between model 

366
00:21:51,840 --> 00:21:57,600
and interface, but I find if I'm
using advanced voice mode, it's 

367
00:21:57,600 --> 00:22:01,360
just a very different experience
for my brain because I'm talking

368
00:22:01,360 --> 00:22:04,880
instead of typing and I am much 
looser and it's much more 

369
00:22:04,880 --> 00:22:07,120
conversational. 
It is in a sense much more 

370
00:22:07,120 --> 00:22:09,280
iterative and I keep it that way
on purpose. 

371
00:22:09,960 --> 00:22:16,560
But if I'm working with a long 
inference time model, and it's 

372
00:22:16,560 --> 00:22:19,040
not that ChatGPT has a 
monopoly on those. 

373
00:22:19,040 --> 00:22:21,160
Opus 4 is a great example from 
Claude. 

374
00:22:22,520 --> 00:22:26,520
I want to be clear in what I'm 
looking for because frankly, it 

375
00:22:26,520 --> 00:22:30,480
is expensive to iterate when the
cycle times take that long. 

376
00:22:30,720 --> 00:22:33,840
And so I pick the problem and I 
pick the model and that guides 

377
00:22:33,840 --> 00:22:37,680
me to a prompting style. 
We have a really interesting 

378
00:22:37,680 --> 00:22:43,160
question from Wayne Anderson on 
LinkedIn and he says this. 

379
00:22:43,600 --> 00:22:47,360
How would you address the fear 
that leaders using large 

380
00:22:47,360 --> 00:22:53,040
language models could inhibit 
and erode decision making and 

381
00:22:53,560 --> 00:22:57,080
critical thinking? 
When does effective prompting 

382
00:22:57,080 --> 00:23:01,280
help and when do you think 
leaders should avoid using AI? 

383
00:23:01,640 --> 00:23:05,200
It's kind of like asking, do you
want your doctor to avoid using 

384
00:23:05,320 --> 00:23:07,560
AI? 
If we have studies that show 

385
00:23:07,560 --> 00:23:10,000
that medical reasoning is 
something that these models are 

386
00:23:10,000 --> 00:23:14,440
very good at, I would love my 
doctor to use AI as long as my 

387
00:23:14,440 --> 00:23:16,360
doctor understands how to use it
well. 

388
00:23:16,800 --> 00:23:20,800
And so in that sense, my 
response is I want AI leaders to

389
00:23:20,800 --> 00:23:24,400
be using AI all the time. 
I just want them to understand 

390
00:23:24,560 --> 00:23:27,360
the limitations of these models 
and where they need to think 

391
00:23:27,360 --> 00:23:30,720
beyond the edges. 
And so really, I think it's more

392
00:23:30,720 --> 00:23:34,280
precise to say these are 
extraordinary models. 

393
00:23:34,280 --> 00:23:38,600
In some places, they are 
advancing the far edges of human

394
00:23:38,600 --> 00:23:40,920
thought and research. 
We have AI-developed drugs in 

395
00:23:40,920 --> 00:23:46,560
the pipeline, but they're narrow. 
They have like particular ways 

396
00:23:46,560 --> 00:23:49,080
in which we can prompt them that
generate extraordinarily 

397
00:23:49,080 --> 00:23:51,400
effective results. 
And the strength of a good 

398
00:23:51,400 --> 00:23:55,000
leader is not being only narrow,
it's being that T-shaped leader 

399
00:23:55,000 --> 00:23:57,160
where you have that breadth of 
experience as well. 

400
00:23:57,600 --> 00:24:00,520
And so what I would look for a 
great leader to do with AI is to

401
00:24:00,520 --> 00:24:06,840
know when he or she needs to go 
to AI for a deep, precise, 

402
00:24:06,840 --> 00:24:09,640
thoughtful perspective on 
something and then to bring that

403
00:24:09,640 --> 00:24:13,160
generalized experience of the 
business to bear to say this is 

404
00:24:13,160 --> 00:24:18,720
how I would contextualize that 
and understand it for my broader

405
00:24:18,720 --> 00:24:21,240
problem set. 
But let me just go back to the 

406
00:24:21,240 --> 00:24:24,520
comment that Wayne Anderson 
made. 

407
00:24:24,760 --> 00:24:29,360
When I write certain things, 
I'll write something and I'll 

408
00:24:29,560 --> 00:24:33,640
ask ChatGPT or whatever the 
model is. 

409
00:24:33,640 --> 00:24:37,600
What do you think? 
And it will make suggestions and

410
00:24:37,600 --> 00:24:41,920
this new canvas feature (I guess
it's not so new anymore) of 

411
00:24:41,920 --> 00:24:45,840
ChatGPT makes it really easy to 
like drill down to very small 

412
00:24:45,840 --> 00:24:48,920
segments. 
It produces good results. 

413
00:24:48,920 --> 00:24:54,160
But in the back of my mind, I'm 
thinking to myself, it's giving 

414
00:24:54,160 --> 00:24:59,800
me kind of the least common 
denominator, mass market 

415
00:25:00,480 --> 00:25:03,600
generalized 
solution. 

416
00:25:03,880 --> 00:25:06,240
Not necessarily. 
That might be your prompting. 

417
00:25:06,600 --> 00:25:08,960
And I think that's what's 
interesting about these models 

418
00:25:08,960 --> 00:25:14,000
is that you, you are correct 
that if you're not intentional 

419
00:25:14,000 --> 00:25:20,200
about how you frame the model's 
position in latent space, it 

420
00:25:20,200 --> 00:25:23,920
will default towards something 
that's more highly probable, 

421
00:25:23,920 --> 00:25:26,840
which we often translate as the 
least common denominator. 

422
00:25:29,600 --> 00:25:32,880
If you are intentional though, 
and you want to lean in and say 

423
00:25:32,880 --> 00:25:36,080
I don't want a mid answer, I 
don't want a common answer. 

424
00:25:36,080 --> 00:25:38,560
I want a really creative answer.
I want a really thoughtful 

425
00:25:38,560 --> 00:25:40,880
answer. 
I want an answer that you 

426
00:25:40,880 --> 00:25:44,920
haven't heard or seen elsewhere.
Models are perfectly capable of 

427
00:25:44,920 --> 00:25:48,600
going that far and thinking more
creatively, thinking more 

428
00:25:48,600 --> 00:25:53,400
substantively, but they don't do
it by default because the way 

429
00:25:53,400 --> 00:25:57,360
they're trained is to be helpful
for as much of the population as

430
00:25:57,360 --> 00:25:58,880
possible. 
And so in a sense, our own 

431
00:25:58,880 --> 00:26:02,920
population distribution shapes 
the way the model makers are 

432
00:26:02,920 --> 00:26:05,160
tuning these models for general 
helpfulness. 

433
00:26:05,480 --> 00:26:08,320
And so it's up to us if we want 
something more on the far side 

434
00:26:08,320 --> 00:26:10,400
of the distribution to push for 
it. 

435
00:26:10,720 --> 00:26:12,880
We're drifting from prompting 
here. 

436
00:26:13,120 --> 00:26:14,400
Oh, you 
do that with prompting. 

437
00:26:14,960 --> 00:26:20,680
Is prompting going to create 
the next, you know, set of Bach 

438
00:26:20,680 --> 00:26:23,280
inventions. 
No, I don't think so. 

439
00:26:23,280 --> 00:26:26,080
And I think especially in the 
creative arts, like I would say 

440
00:26:26,080 --> 00:26:30,000
that like humans have tried, 
I'm actually a huge fan of Bach

441
00:26:30,000 --> 00:26:31,400
and the cello suites. 
I love them. 

442
00:26:31,400 --> 00:26:33,040
I listen to them almost every 
weekend. 

443
00:26:34,240 --> 00:26:42,680
And people have tried to expand 
on to invent after Bach even 

444
00:26:42,680 --> 00:26:46,480
through the 20th century. 
And in my view, no one has done 

445
00:26:46,480 --> 00:26:48,640
for the cello what Bach has done
for the cello. 

446
00:26:49,160 --> 00:26:53,560
And so no, I don't believe that 
we are in any danger of a 

447
00:26:53,560 --> 00:26:56,640
machine coming along and doing a
better job than Bach at cello 

448
00:26:56,640 --> 00:26:59,640
suites. 
Let's jump over to Twitter from 

449
00:26:59,640 --> 00:27:05,240
Chris Peterson, who says tokens 
and time measures are all very 

450
00:27:05,240 --> 00:27:09,720
well, but doesn't every round 
trip of prompting eat up more 

451
00:27:09,720 --> 00:27:13,360
electricity and water for 
cooling, thus making some of the

452
00:27:13,360 --> 00:27:16,920
numbers from OpenAI and others 
highly misleading? 

453
00:27:17,320 --> 00:27:22,000
No. Or yes, it does, and yes, it 
matters in aggregate, and yes, we

454
00:27:22,000 --> 00:27:23,960
should talk about power use in 
aggregate. 

455
00:27:23,960 --> 00:27:26,000
I think it's an appropriate 
conversation to have. 

456
00:27:26,000 --> 00:27:31,840
But individual 
prompt usage by people doesn't 

457
00:27:31,960 --> 00:27:35,640
compare to some of the other 
things we do day-to-day that use

458
00:27:35,800 --> 00:27:38,360
energy and water. 
So taking a hot bath is much, 

459
00:27:38,360 --> 00:27:42,120
much more expensive in water 
than any kind of ChatGPT prompt 

460
00:27:42,120 --> 00:27:44,800
you're going to run. 
Watching an hour of football on 

461
00:27:44,800 --> 00:27:47,960
the big screen is much, much 
more expensive in electricity 

462
00:27:48,320 --> 00:27:52,480
than, I think it runs up to, a 
couple of hundred ChatGPT prompts. 

463
00:27:52,480 --> 00:27:55,360
And so does it matter in 
aggregate? 

464
00:27:55,360 --> 00:27:57,920
Yes, because suddenly a billion 
of us are using this. 

465
00:27:57,920 --> 00:27:59,640
It's important, we should talk 
about it. 

466
00:27:59,960 --> 00:28:02,600
Not saying that we don't have 
relevant conversations to have, 

467
00:28:02,880 --> 00:28:07,160
but I think the idea that an 
individual prompt is 

468
00:28:07,160 --> 00:28:12,360
fantastically expensive is 
incorrect when we actually 

469
00:28:12,360 --> 00:28:15,800
factor in the energy usage of 
day-to-day life. 

470
00:28:16,200 --> 00:28:18,320
Let's jump over to another 
question. 

471
00:28:18,320 --> 00:28:23,240
This is from Chris Chablonsky on
LinkedIn who says Do you have 

472
00:28:23,240 --> 00:28:28,280
any tips for using Gen 
AI for a data analyst to process

473
00:28:28,280 --> 00:28:32,960
a large data set and generate 
visualizations? 

474
00:28:33,320 --> 00:28:35,640
I'm not sure if it's the right 
tool for the job. 

475
00:28:35,640 --> 00:28:39,120
I've sort of talked about this a
little bit with folks who are 

476
00:28:39,120 --> 00:28:45,560
managing large data sets. 
And what I find AI 

477
00:28:45,560 --> 00:28:51,480
extraordinarily good at is 
handling data sets that don't 

478
00:28:51,480 --> 00:28:54,160
have clean numeric 
data, right? 

479
00:28:54,160 --> 00:28:56,960
If you have clean numeric data, 
we have fantastic tools for 

480
00:28:56,960 --> 00:29:00,200
that, and they may include 
machine learning or they may be 

481
00:29:00,200 --> 00:29:04,040
just traditional SQL, but 
we're very, very good at 

482
00:29:04,040 --> 00:29:05,920
handling that efficiently with 
compute. 

483
00:29:07,680 --> 00:29:12,240
I don't know why we would switch
that out and ask a large 

484
00:29:12,240 --> 00:29:15,240
language model to do that when 
the language model wasn't even 

485
00:29:15,480 --> 00:29:20,280
designed primarily to be numbers
driven. 

486
00:29:20,280 --> 00:29:25,040
They use Python and other tools 
to handle numbers now, and 

487
00:29:25,040 --> 00:29:26,720
that's great. 
But if you're talking about a 

488
00:29:26,720 --> 00:29:32,000
truly large data set, we have 
tools that handle those data 

489
00:29:32,000 --> 00:29:33,960
sets and visualizations really 
effectively. 

490
00:29:33,960 --> 00:29:36,920
And what I find people using in 
practice when they're looking at

491
00:29:36,920 --> 00:29:40,400
large data sets and AI is 
they're using AI to help them 

492
00:29:40,400 --> 00:29:44,000
craft SQL statements. 
They're using AI to help them 

493
00:29:44,000 --> 00:29:47,920
think through the data schema 
that they want to set up. 

494
00:29:47,920 --> 00:29:52,160
Sometimes they're using AI to 
help them prototype 

495
00:29:52,160 --> 00:29:54,440
visualizations that they will 
want to get to quickly. 

496
00:29:54,440 --> 00:29:59,160
Claude is great for that. 
And all of those are sort of, by

497
00:29:59,160 --> 00:30:01,600
the way, uses of AI that help 
you to use that data more 

498
00:30:01,600 --> 00:30:03,720
effectively. 
But that's different from the 

499
00:30:03,720 --> 00:30:06,080
traditional assumption that you 
can just sort of type the query 

500
00:30:06,080 --> 00:30:09,680
in and you will magically get a 
better answer than you would get

501
00:30:09,680 --> 00:30:14,720
with really efficient SQL. 
It's a really, really good point

502
00:30:15,120 --> 00:30:19,480
that you've got to have an 
understanding of the particular 

503
00:30:19,480 --> 00:30:24,440
tool that you're using and what 
will be the most effective use 

504
00:30:24,440 --> 00:30:28,040
of that tool. 
And as you said, prompts are 

505
00:30:28,120 --> 00:30:32,440
great if you have a body of data
and you're trying to figure out 

506
00:30:32,440 --> 00:30:36,840
what have I got here and how can
I present it? 

507
00:30:36,840 --> 00:30:38,880
And is there something that I'm 
missing? 

508
00:30:39,040 --> 00:30:41,560
Prototyping, as you say. 
But there are tools out there 

509
00:30:41,560 --> 00:30:46,440
that are designed for, you know,
millions of records and that do 

510
00:30:46,440 --> 00:30:48,400
it really well. 
I don't think you'd want to put 

511
00:30:48,400 --> 00:30:50,480
millions of records into 
ChatGPT. 

512
00:30:50,840 --> 00:30:53,680
It's not really designed, if you
think about what we mean when we

513
00:30:53,680 --> 00:30:58,400
talk about context and prompts. 
It's designed to look across the

514
00:30:58,400 --> 00:31:00,960
overall picture. 
And oftentimes with data, we 

515
00:31:00,960 --> 00:31:03,040
don't just want an overall 
picture. 

516
00:31:03,040 --> 00:31:05,920
We want precision. 
And that's something that we 

517
00:31:05,920 --> 00:31:08,160
have existing tools that do 
very, very well. 

518
00:31:08,560 --> 00:31:13,080
Let's get into the structure, 
the nature of large language 

519
00:31:13,080 --> 00:31:18,960
models, how they think, "think" in
quotes, and operate, and what 

520
00:31:18,960 --> 00:31:22,800
that means for prompts. 
Maybe take us down that path a 

521
00:31:22,800 --> 00:31:25,640
little bit. 
It's probably worth calling out 

522
00:31:25,640 --> 00:31:29,040
that a lot of the difference in 
how prompting has evolved is 

523
00:31:29,040 --> 00:31:33,920
being driven by this movement 
from large language models that 

524
00:31:33,920 --> 00:31:37,160
are what I would call vanilla. 
So that it's just coming back 

525
00:31:37,160 --> 00:31:41,000
with a response based on weights
and vector space developed 

526
00:31:41,640 --> 00:31:45,640
through pre-training data. 
Which is what we had into 2024. 

527
00:31:46,280 --> 00:31:49,800
And then the newer version, 
which is inference time, 

528
00:31:49,800 --> 00:31:53,200
compute models where they have 
that same underlying 

529
00:31:53,200 --> 00:31:57,360
architecture, but at the time 
you press enter and send in your

530
00:31:57,360 --> 00:32:01,760
query, they are running threads 
in the background trying to 

531
00:32:01,760 --> 00:32:03,160
figure out what the correct 
response is. 

532
00:32:03,160 --> 00:32:04,600
And there's different ways of 
doing that. 

533
00:32:04,600 --> 00:32:08,160
Sometimes it's a combination of 
expert models in the background 

534
00:32:08,160 --> 00:32:10,720
that are sort of coming up with 
answers and deciding amongst 

535
00:32:10,720 --> 00:32:12,800
themselves. 
Sometimes it's running the same 

536
00:32:12,800 --> 00:32:15,920
query multiple times in parallel
in the background trying to find

537
00:32:16,120 --> 00:32:19,760
the most common answer. 
Regardless of the underlying 

538
00:32:19,760 --> 00:32:25,480
architecture, the effect of 
having more time to run cycles 

539
00:32:25,640 --> 00:32:29,560
on your query is tremendous. 
It's a night and day 

540
00:32:29,560 --> 00:32:31,720
difference in terms of the 
intelligence that the model is 

541
00:32:31,720 --> 00:32:36,120
able to respond with. 
And so that is a lot of what has

542
00:32:36,120 --> 00:32:39,040
shaped different prompting. 
A lot of the reason we don't 

543
00:32:39,040 --> 00:32:43,560
have to give chain of thought 
instructions anymore is because 

544
00:32:43,720 --> 00:32:49,560
the models already have a way of
deeply processing the queries we

545
00:32:49,560 --> 00:32:51,920
give them when they are 
inference models and they don't 

546
00:32:51,920 --> 00:32:55,560
need our help to do so anymore. 
And so when I say you don't need

547
00:32:55,560 --> 00:32:57,720
chain of thought, but you want 
to be clear on your goals and 

548
00:32:57,720 --> 00:33:03,200
I'm basically saying try and write a
prompt that understands that you

549
00:33:03,200 --> 00:33:07,720
are going to be running multiple
parallel streams of thought in 

550
00:33:07,720 --> 00:33:10,200
the background or multiple 
parallel streams of tokens in 

551
00:33:10,200 --> 00:33:13,120
the background. 
Constrain it. 

552
00:33:13,120 --> 00:33:15,440
Like if you have 10 that 
are going to run, you don't know

553
00:33:15,440 --> 00:33:16,800
it's 10. 
But let's pretend it's 10 for 

554
00:33:16,800 --> 00:33:21,200
simplicity's sake. 
Make sure all 10 are focused on 

555
00:33:21,200 --> 00:33:23,920
what you care about because you 
want to constrain the scope of 

556
00:33:23,920 --> 00:33:27,600
the query so that it's actually 
focused on where you want to go 

557
00:33:27,600 --> 00:33:30,040
with a conversation. 
And so that's why I emphasize so

558
00:33:30,040 --> 00:33:32,080
much. 
Set a goal, make sure the model 

559
00:33:32,080 --> 00:33:34,760
knows what good looks like. 
Make sure you set guardrails, 

560
00:33:34,760 --> 00:33:38,480
etcetera, etcetera. 
Describe to us what you mean by 

561
00:33:38,480 --> 00:33:42,080
a chain of thought prompt. 
It's where you said I want you 

562
00:33:42,080 --> 00:33:46,160
to answer my query to a 
traditional model using 

563
00:33:46,160 --> 00:33:49,240
pre-training data weights and it 
would come back and answer. 

564
00:33:49,240 --> 00:33:53,520
But you wanted the token stream 
to go through a particular 

565
00:33:53,520 --> 00:33:56,320
sequence. 
And so it's going to go through.

566
00:33:56,320 --> 00:33:58,000
And from a transformer 
perspective, like the 

567
00:33:58,000 --> 00:34:01,720
transformer's there, it's 
basically using your query, it's

568
00:34:01,720 --> 00:34:04,960
matching it in vector space. 
Once it vectorizes it with what 

569
00:34:04,960 --> 00:34:07,400
it has for weights and 
pre-training data, it's coming back.

570
00:34:07,880 --> 00:34:11,280
And you're basically saying, let
me give you a deeper query with 

571
00:34:11,280 --> 00:34:13,639
a lot of things I want you to 
think about and do. 

572
00:34:13,880 --> 00:34:16,840
So start with, this is who you 
are. 

573
00:34:16,840 --> 00:34:20,440
You're an expert on marketing. 
Second, I want you to think very

574
00:34:20,440 --> 00:34:24,960
deliberately about this campaign
that I want to launch. 

575
00:34:24,960 --> 00:34:27,480
Third, these are the steps of 
thinking I want you to go 

576
00:34:27,480 --> 00:34:29,080
through. 
First, develop a plan. 

577
00:34:29,320 --> 00:34:32,480
Second, critique your plan. 
Third, understand the 

578
00:34:32,480 --> 00:34:34,840
consequences of the plan in the 
market and you can kind of go 

579
00:34:34,840 --> 00:34:35,880
through. 
And that's like chain of 

580
00:34:35,880 --> 00:34:38,239
thought, right? 
When you do that, you're 

581
00:34:38,239 --> 00:34:43,360
basically being very particular 
about the places in vector space

582
00:34:43,360 --> 00:34:46,679
that you want the model to go 
and hit when it's generating the

583
00:34:46,679 --> 00:34:49,159
response. 
And because models read like 

584
00:34:49,159 --> 00:34:53,280
humans do, they read top down, 
when the model hits that point, 

585
00:34:53,560 --> 00:34:58,640
it's going to be effectively 
sequentially reasoning back to 

586
00:34:58,640 --> 00:35:01,640
you because of the way you 
programmed it. 

587
00:35:01,640 --> 00:35:04,040
And so this gets back to earlier in
our conversation, Michael, when 

588
00:35:04,040 --> 00:35:07,800
we talked about this idea of 
natural language programming, 

589
00:35:07,800 --> 00:35:10,640
that we are effectively 
programming the model, that was 

590
00:35:10,640 --> 00:35:14,000
sort of what we were doing. 
And all we're saying now is we 

591
00:35:14,000 --> 00:35:17,080
still have to program the model.
We don't have to program it 

592
00:35:17,080 --> 00:35:20,440
quite that way anymore. 
How do we program it today? 

593
00:35:20,840 --> 00:35:24,560
Today when we program the model,
we want to be focused more on 

594
00:35:24,560 --> 00:35:28,800
outcomes and goals. 
And in the past it was focused 

595
00:35:28,800 --> 00:35:34,240
more on process. 
And so today if I'm looking for 

596
00:35:34,360 --> 00:35:40,080
a report, like I digested a, you 
know, 130-, 140-page economic 

597
00:35:40,080 --> 00:35:43,240
report this morning from the, 
I think it was the World 

598
00:35:43,240 --> 00:35:44,640
Economic Forum, something like 
that. 

599
00:35:45,480 --> 00:35:49,760
I wanted the model to 
understand what I wanted out of 

600
00:35:49,760 --> 00:35:52,760
the report and the goal of the 
summary. 

601
00:35:52,760 --> 00:35:54,480
I didn't just want a vanilla 
summary. 

602
00:35:54,480 --> 00:35:57,680
And so my focus was on making 
sure it knew the angle I 

603
00:35:57,680 --> 00:36:01,080
wanted on the report. 
And I trusted it to know how to 

604
00:36:01,080 --> 00:36:04,600
read, digest, summarize, think 
through all the things I would 

605
00:36:04,600 --> 00:36:08,600
have had to specify earlier. 
In that case, again, the 

606
00:36:08,600 --> 00:36:14,360
context, giving it the 
background and the goals becomes

607
00:36:14,360 --> 00:36:20,120
the key focus of the prompt as 
opposed to telling the LLM how 

608
00:36:20,120 --> 00:36:23,280
to do its job. 
That's right. We have another 

609
00:36:23,280 --> 00:36:26,360
question from Arsalan Khan on 
Twitter. 

610
00:36:26,520 --> 00:36:29,040
Arsalan says to prompt or not to
prompt? 

611
00:36:29,040 --> 00:36:33,000
When is it appropriate and when 
is it just a rabbit hole for 

612
00:36:33,000 --> 00:36:36,160
your confirmation bias? 
I think one of the biggest 

613
00:36:36,160 --> 00:36:40,240
differences in the way people 
use models right now is people 

614
00:36:40,240 --> 00:36:44,240
who are focused with their 
models can use the model as a 

615
00:36:44,240 --> 00:36:46,760
mirror that focuses on a 
particular subject really 

616
00:36:46,760 --> 00:36:48,680
effectively. 
And there are people who are 

617
00:36:48,680 --> 00:36:51,520
less focused and the mirror 
becomes a scatterer for them. 

618
00:36:51,520 --> 00:36:53,880
It scatters their thinking. 
They become more confused as 

619
00:36:53,880 --> 00:36:56,280
they use it. 
And I've seen both. 

620
00:36:57,240 --> 00:37:00,160
What I find is interesting about
the critical thinking piece. 

621
00:37:00,960 --> 00:37:03,960
Imagine the mirror, and 
typically it faces you, right? 

622
00:37:03,960 --> 00:37:05,640
Then it becomes a reflection of 
yourself. 

623
00:37:05,640 --> 00:37:07,840
You're absolutely right. 
There's no critical thinking 

624
00:37:07,840 --> 00:37:09,000
there. 
It's just coming back with 

625
00:37:09,000 --> 00:37:11,600
confirmation. 
But if you're smart, you can 

626
00:37:11,600 --> 00:37:14,720
turn the mirror away from 
yourself and you can focus it on

627
00:37:14,720 --> 00:37:17,880
something else and you can come 
back with a disconfirming or 

628
00:37:17,880 --> 00:37:20,960
divergent opinion. 
And so I will frequently ask the

629
00:37:20,960 --> 00:37:24,040
model to fight with me. 
I will ask it to disagree. 

630
00:37:24,320 --> 00:37:27,440
I will ask it to come up with a 
steelman argument because I 

631
00:37:27,440 --> 00:37:31,520
think it's much more interesting
and my thinking gets sharpened 

632
00:37:31,520 --> 00:37:34,240
when I do that. 
Actually I do the same thing. 

633
00:37:34,240 --> 00:37:39,240
I very often will say to the 
model, be very critical. 

634
00:37:39,720 --> 00:37:43,040
Don't worry about hurting my 
feelings, be sharp. 

635
00:37:43,440 --> 00:37:46,000
That's right, like an iron 
sharpens iron vibe is what I 

636
00:37:46,000 --> 00:37:48,160
like to go for. 
Makes sense. 

637
00:37:48,160 --> 00:37:53,680
What's the best way to again 
craft that prompt? 

638
00:37:53,680 --> 00:37:55,960
We were trying to accomplish 
something. 

639
00:37:56,440 --> 00:37:59,360
Shall I show my screen? 
Would it be helpful to just kind

640
00:37:59,360 --> 00:38:01,160
of take a peek at a prompt I 
wrote? 

641
00:38:01,480 --> 00:38:03,800
Sure, let's do that. 
All right. 

642
00:38:03,800 --> 00:38:08,120
This is a real prompt that I 
wrote and this is an example of 

643
00:38:08,120 --> 00:38:10,880
me picking something where I 
feel pretty good about sort of 

644
00:38:10,880 --> 00:38:14,440
my overall ability to assess 
quality of response. 

645
00:38:14,440 --> 00:38:16,920
But I don't have a direct answer
to this question. 

646
00:38:18,040 --> 00:38:20,800
And this was part of a Substack
article that I was writing to 

647
00:38:20,800 --> 00:38:23,120
test O3 Pro. 
So this is for O3 Pro. 

648
00:38:23,120 --> 00:38:26,400
You can see it up there and I'm 
asking it to step through this 

649
00:38:26,400 --> 00:38:29,160
analysis with me. 
So think you're a senior product

650
00:38:29,160 --> 00:38:32,600
leader brought in to design a 12
month AI adoption road map for a

651
00:38:32,600 --> 00:38:37,400
real firm. 
First, I could have given the 

652
00:38:37,400 --> 00:38:40,520
model the choice of firm and I 
tried a separate prompt where I 

653
00:38:40,520 --> 00:38:42,520
gave it that option. 
That was very interesting. 

654
00:38:42,800 --> 00:38:45,360
In the end, I wanted something 
with a company that I was 

655
00:38:45,360 --> 00:38:47,480
familiar with since I was 
working on it for testing 

656
00:38:47,480 --> 00:38:49,080
purposes. 
So I used Datadog. 

657
00:38:49,640 --> 00:38:53,840
I ask it to do some very 
specific information gathering. 

658
00:38:54,000 --> 00:38:57,120
So build the source corpus. 
I want publicly available 

659
00:38:57,120 --> 00:39:00,480
information, I want 10-Ks, I 
want job postings, I want SEC 

660
00:39:00,480 --> 00:39:06,080
FINRA guidance. 
And then I want 3 responses. 

661
00:39:06,080 --> 00:39:09,880
And I actually specify the word 
count output and I specify what 

662
00:39:09,880 --> 00:39:12,160
I want there, right? 
There's a strategy memo first, 

663
00:39:12,440 --> 00:39:15,640
there's a tech stack overview, 
and there's a regulatory 

664
00:39:15,640 --> 00:39:19,360
constraints piece. 
And so what's interesting 

665
00:39:19,360 --> 00:39:23,920
is by using the word internal, I
am suggesting to the model that 

666
00:39:23,920 --> 00:39:27,480
the model can craft these inside
the chain of thought that it's 

667
00:39:27,480 --> 00:39:31,040
running behind the scenes 
without me having to see it. 

668
00:39:31,480 --> 00:39:34,720
And then Step 2, produce. 
Now I'm starting to ask for 

669
00:39:34,720 --> 00:39:37,160
output. 
I'm starting to ask the model to

670
00:39:37,160 --> 00:39:41,920
come back with one document with
an executive summary, month by 

671
00:39:41,920 --> 00:39:46,040
month road map, a KPI per 
quarter, anticipated failure 

672
00:39:46,040 --> 00:39:48,480
modes and mitigations, and an 
advisor briefing. 

673
00:39:49,160 --> 00:39:51,120
And then I'm giving it styling 
that I want. 

674
00:39:51,240 --> 00:39:53,160
So I want it to be really 
brutally honest. 

675
00:39:53,160 --> 00:39:55,480
This is an example of not 
looking for confirmatory 

676
00:39:55,480 --> 00:40:00,040
thinking. 
I do not want tables, I want 

677
00:40:00,320 --> 00:40:03,000
just bullets if need be and I 
would like 

678
00:40:03,800 --> 00:40:07,080
to get a sense of what shaped 
your recommendations, right? I 

679
00:40:07,080 --> 00:40:09,880
want to know where you got some 
of this thinking from. 

680
00:40:10,040 --> 00:40:12,040
And then I give it a limit at 
the top. 

681
00:40:12,040 --> 00:40:14,200
It can't be more than 7500 
words. 

682
00:40:14,560 --> 00:40:18,720
So it ran, it thought about it, 
it was a 6-7 minute, basically a

683
00:40:18,720 --> 00:40:22,880
7 minute run. 
So it chooses Datadog, which I 

684
00:40:22,880 --> 00:40:26,720
specified. It does a little bit 
of basic research on Datadog. 

685
00:40:27,320 --> 00:40:31,280
It builds the source corpus, so 
it gives you a sense of what's 

686
00:40:31,280 --> 00:40:35,640
in the box there. 
Market context, Datadog's edge.

687
00:40:36,760 --> 00:40:38,440
It's starting to adopt the 
persona. 

688
00:40:38,440 --> 00:40:42,560
So it's saying where we lag and 
talking about sort of other 

689
00:40:42,560 --> 00:40:46,080
updates in the competitive 
space, getting into growth 

690
00:40:46,080 --> 00:40:48,360
goals. 
And what I love here is that it 

691
00:40:48,360 --> 00:40:52,520
actually called out like the 
statement by the CEO, by 

692
00:40:52,520 --> 00:40:55,640
Olivier around what they're 
looking for and why. 

693
00:40:55,640 --> 00:40:58,880
And it's taking that into effect
and taking that into account the

694
00:40:58,880 --> 00:41:01,600
way, frankly, a good road map 
builder should. 

695
00:41:02,160 --> 00:41:07,360
It's looking at client mix. 
And this is a situation where 

696
00:41:07,360 --> 00:41:10,640
it's done its own research to 
come up with that assessment and

697
00:41:10,640 --> 00:41:12,760
given sort of a very rough 
assessment of that. 

698
00:41:13,480 --> 00:41:18,000
It's looking at the AI 
aspirations it can find from 

699
00:41:18,000 --> 00:41:20,360
each of the different C-Suite 
members. 

700
00:41:20,760 --> 00:41:22,720
It's looking at strategic gaps 
to close. 

701
00:41:22,960 --> 00:41:26,040
This is all just in preparation 
and hasn't even really started 

702
00:41:26,040 --> 00:41:27,840
the assignment yet. 
It's just kind of thinking it 

703
00:41:27,840 --> 00:41:30,720
through. 
It's now going into the current 

704
00:41:30,720 --> 00:41:35,080
stack in great detail, looking 
at vendor contracts, security 

705
00:41:35,080 --> 00:41:37,320
posture. 
You can see where we're going 

706
00:41:37,320 --> 00:41:40,080
here. 
Eventually it's actually going 

707
00:41:40,080 --> 00:41:45,800
to get to what it wants to say. 
And it's actually a very cogent 

708
00:41:45,800 --> 00:41:49,480
thesis. 
It talks about how you sort of 

709
00:41:49,480 --> 00:41:51,880
dominate the data exhaust space 
and what that means. 

710
00:41:51,880 --> 00:41:54,280
And then it starts to get into 
the road map piece. 

711
00:41:54,280 --> 00:41:56,400
But my point here, like we could
go through this, but we don't 

712
00:41:56,400 --> 00:41:58,680
have time. 
The point is basically 

713
00:41:58,680 --> 00:42:02,120
because I structured the prompt 
carefully, I got exactly what I 

714
00:42:02,120 --> 00:42:05,120
was looking for back. 
What if you're not trying to get

715
00:42:05,240 --> 00:42:09,600
a research report, but you're 
trying to do, say, small, small 

716
00:42:09,600 --> 00:42:12,640
research? 
Find out the answer to some set 

717
00:42:12,640 --> 00:42:15,240
of questions, for example. 
I did one. 

718
00:42:15,240 --> 00:42:17,000
I don't know if I have it handy 
or not. 

719
00:42:17,000 --> 00:42:21,720
I will see if it's there. Maybe 
emerging trends and investing. 

720
00:42:22,400 --> 00:42:25,400
Yeah, I think it's this one. 
This is a much shorter report. 

721
00:42:25,400 --> 00:42:26,640
See that? 
That's the whole thing right 

722
00:42:26,640 --> 00:42:29,600
there like that. 
It feels short comparatively and

723
00:42:29,600 --> 00:42:33,280
it was a very short ask. 
Please analyze this economic 

724
00:42:33,280 --> 00:42:36,480
report and I'm really interested
in again, I'm trying to push it.

725
00:42:36,480 --> 00:42:41,520
I want to understand emerging 
trends and I want to understand 

726
00:42:41,600 --> 00:42:43,840
areas not commonly discussed, 
right? 

727
00:42:43,840 --> 00:42:46,800
I'm looking for it to sort of 
push beyond, but it's not a very

728
00:42:46,800 --> 00:42:50,760
long prompt per se. 
And then it just jumps right in.

729
00:42:50,760 --> 00:42:53,640
It reads all 138 pages. 
It gives me a snapshot. 

730
00:42:53,640 --> 00:42:57,760
It talks about commodities and 
where they're at globally, 

731
00:42:57,760 --> 00:43:03,160
mispricing issues driven by LNG, 
and it basically goes through 

732
00:43:03,160 --> 00:43:05,560
themes it's seen. 
And then at the end, it nets it 

733
00:43:05,560 --> 00:43:08,000
out like this is the big picture
assessment for the next 6 

734
00:43:08,000 --> 00:43:12,000
months. 
This is the macro assessment and

735
00:43:12,000 --> 00:43:14,720
this is how you start to lean 
in. 

736
00:43:14,720 --> 00:43:17,760
And what's interesting is this 
is much more specific. 

737
00:43:17,760 --> 00:43:21,320
I ran the same prompt with o3 
and o3 Pro and it was 

738
00:43:21,320 --> 00:43:23,480
interesting to see the 
relationship between the two 

739
00:43:23,480 --> 00:43:28,160
because o3 focused on this 
sort of bifurcation piece and o3

740
00:43:28,160 --> 00:43:29,960
Pro had a slightly different 
perspective. 

741
00:43:29,960 --> 00:43:31,760
And I know we have like 8 
minutes so we probably don't 

742
00:43:31,760 --> 00:43:34,120
have time to get into it, but I 
thought it was fascinating to 

743
00:43:34,120 --> 00:43:36,600
run a short prompt on both and 
see the differences. 

744
00:43:36,960 --> 00:43:42,880
We have a question from LinkedIn,
and I was going to ask

745
00:43:42,880 --> 00:43:47,600
something very similar. 
This is from Laura Finlayson, and

746
00:43:47,600 --> 00:43:51,280
she says with a prompt like 
this, which model will do the 

747
00:43:51,280 --> 00:43:55,400
best at retaining the prompt 
information for future use? 

748
00:43:55,880 --> 00:44:00,960
She built and refined her job 
application prompt in Gemini, 

749
00:44:01,560 --> 00:44:05,360
but it seems to forget some of 
the deliverables each time she 

750
00:44:05,360 --> 00:44:07,600
goes back with a new job 
description. 

751
00:44:08,000 --> 00:44:12,320
Claude allows her to save a 
project, but she doesn't love 

752
00:44:12,720 --> 00:44:18,920
its writing. 
And so we need to talk about 

753
00:44:19,280 --> 00:44:23,600
which model is the better model 
to use and how do you choose. 

754
00:44:23,600 --> 00:44:26,680
And we only have a few minutes 
left, but this, you know, we 

755
00:44:26,680 --> 00:44:28,720
could go on forever here. 
So what do we do? 

756
00:44:29,160 --> 00:44:30,720
There's two things going on 
there. 

757
00:44:30,720 --> 00:44:36,000
The first is memory. 
And ChatGPT really has a killer 

758
00:44:36,000 --> 00:44:40,160
feature edge with memory right 
now because they do have, it's 

759
00:44:40,160 --> 00:44:43,720
not perfect, but they have a 
memory feature that enables the 

760
00:44:43,720 --> 00:44:47,320
models to start to actually have
a living context of information 

761
00:44:47,320 --> 00:44:51,440
about other chats you've had 
inside the same model surface, 

762
00:44:51,440 --> 00:44:53,440
right? 
So if it's in ChatGPT, it 

763
00:44:53,440 --> 00:44:56,320
doesn't matter if you're talking
with o3 or 4o or whatever. 

764
00:44:56,640 --> 00:44:58,960
There's going to be a loose 
understanding of recent 

765
00:44:58,960 --> 00:45:02,760
conversations you've had along 
with some specific facts that 

766
00:45:02,760 --> 00:45:05,280
the model has remembered about 
you that you can actually audit 

767
00:45:05,280 --> 00:45:07,040
and check in the settings 
section. 

768
00:45:07,920 --> 00:45:11,120
That turns out to be very useful
for problems like this where you

769
00:45:11,120 --> 00:45:14,400
want it to do a repetitive task 
and you want it to have a sense 

770
00:45:14,400 --> 00:45:20,640
that it's done the task before. 
Even so, I still find I want to 

771
00:45:20,640 --> 00:45:24,240
be precise about each of the 
assets I need it to process if I 

772
00:45:24,240 --> 00:45:26,040
need that. 
And that is one of the reasons 

773
00:45:26,040 --> 00:45:29,960
why I do tend to favor long 
prompts that I will keep in a 

774
00:45:29,960 --> 00:45:33,240
Notion page or keep elsewhere 
that I can just copy and paste 

775
00:45:33,240 --> 00:45:36,800
in as needed because I don't 
want it to forget anything. 

776
00:45:36,840 --> 00:45:39,320
I don't want to go to that 
trouble of writing out that 

777
00:45:39,320 --> 00:45:42,040
prompt again. 
I just want it to remember every 

778
00:45:42,040 --> 00:45:46,480
single thing and do it again. 
And I wish that I had an answer 

779
00:45:46,480 --> 00:45:50,800
for you, that these were going 
to be flexible deep memory 

780
00:45:51,040 --> 00:45:53,800
models that just would remember 
that you did it exactly like this 

781
00:45:53,800 --> 00:45:56,400
and never forget that step. 
We're just not there yet. 

782
00:45:57,000 --> 00:45:58,920
And so prompts are part of how 
we bridge that gap. 

783
00:45:59,320 --> 00:46:04,880
One of the problems that I have 
is I like to try prompts on 

784
00:46:04,880 --> 00:46:08,960
different models to see the 
results and compare the results.

785
00:46:08,960 --> 00:46:14,600
I think it leads to a lot 
of creative thinking and it 

786
00:46:14,600 --> 00:46:19,720
becomes a real burden and an 
obstacle because I've interacted

787
00:46:19,720 --> 00:46:26,760
for 20 minutes with Model A and 
now I want to go to Model B and 

788
00:46:26,760 --> 00:46:29,160
I've got to start all over 
again. 

789
00:46:29,480 --> 00:46:32,200
There is a way to make that 
slightly less painful. 

790
00:46:32,880 --> 00:46:36,400
So what I like to do is if I 
want to transition, I like to 

791
00:46:36,400 --> 00:46:38,480
ask the model I've been chatting
with. 

792
00:46:38,480 --> 00:46:43,320
Could you please give me a very 
detailed summary of our 

793
00:46:43,320 --> 00:46:48,000
conversation so far and make 
sure it's as clean and clear as 

794
00:46:48,000 --> 00:46:51,160
you possibly can make it? 
And then it will do that and it 

795
00:46:51,160 --> 00:46:52,960
will give me a great summary of 
the conversation. 

796
00:46:52,960 --> 00:46:56,560
I then pull that summary into a 
new model conversation I'm 

797
00:46:56,560 --> 00:46:59,160
starting and say, here's where 
we're at right now. 

798
00:46:59,960 --> 00:47:02,000
I would love to continue this 
conversation with you. 

799
00:47:02,000 --> 00:47:03,320
And this is what I'm looking 
for. 

800
00:47:03,320 --> 00:47:06,440
And it's still a little bit 
painful, but it's less 

801
00:47:06,440 --> 00:47:08,040
painful than it would be 
otherwise. 

802
00:47:08,440 --> 00:47:14,640
Which model or which company do 
you gravitate towards the most? 

803
00:47:14,640 --> 00:47:17,640
You have access to everything. 
What do you use the most? 

804
00:47:17,960 --> 00:47:24,320
The memory feature is one of the
most powerful product features I

805
00:47:24,320 --> 00:47:28,080
can remember on ChatGPT because 
I find that the fact that it 

806
00:47:28,080 --> 00:47:32,800
remembers something about me 
drives a recursive behavioral 

807
00:47:32,800 --> 00:47:34,840
loop for me. 
I'm very aware of it, right? 

808
00:47:34,840 --> 00:47:36,840
Like, I've worked in product
for a long time, I know what 

809
00:47:36,840 --> 00:47:40,480
they're doing, but it still 
works because I find that having

810
00:47:40,480 --> 00:47:43,120
a model that remembers me a bit 
is super, super helpful. 

811
00:47:43,120 --> 00:47:45,560
And so ChatGPT drives a lot of 
interest for me. 

812
00:47:45,560 --> 00:47:50,480
o3 is a daily driver for me. 
I pick it up by default, but 

813
00:47:50,480 --> 00:47:53,240
that doesn't stop me from going 
other places. 

814
00:47:53,240 --> 00:47:58,320
Like when I am working on a 
complex piece of writing, I will

815
00:47:58,320 --> 00:48:01,480
use Perplexity. 
I will use both of the new 

816
00:48:01,480 --> 00:48:05,840
Claude models, Opus 4 and Sonnet
4. I will sometimes go to Gemini 

817
00:48:05,840 --> 00:48:08,800
2.5 Pro. 
And so I almost look at those as

818
00:48:08,800 --> 00:48:11,680
like additional pieces that I 
want to go to for specific 

819
00:48:11,680 --> 00:48:15,000
things. 
Sonnet 4 is great for writing. 

820
00:48:15,000 --> 00:48:18,800
I love it. 
Opus 4, I love the way it does 

821
00:48:18,800 --> 00:48:22,960
like very thoughtfully 
considered reading and research.

822
00:48:22,960 --> 00:48:25,560
There's something qualitative 
about it that's very strong 

823
00:48:25,560 --> 00:48:29,160
there. 
And so even though I end up in 

824
00:48:29,160 --> 00:48:34,320
ChatGPT a lot, that doesn't stop
me from reaching my fingers into

825
00:48:34,320 --> 00:48:37,040
the rest of the ecosystem and 
and grabbing what's useful. 

826
00:48:37,280 --> 00:48:40,840
I also really like ChatGPT in 
general. 

827
00:48:40,840 --> 00:48:44,760
They make it pretty easy. 
And that memory feature, you 

828
00:48:44,760 --> 00:48:49,480
know, you start, you start with 
your prompt and you feel like it

829
00:48:49,480 --> 00:48:53,360
has, so bizarre to say this, 
it has like this intuition of 

830
00:48:53,880 --> 00:48:57,040
what makes sense for you. 
Right. 

831
00:48:57,040 --> 00:49:02,360
And that sense of being 
recognized, I think is very 

832
00:49:02,360 --> 00:49:04,800
powerful from a product 
experience perspective and 

833
00:49:04,800 --> 00:49:07,760
people respond to it. 
Marketers talk about 

834
00:49:07,760 --> 00:49:12,040
personalization and usually 
personalization is well, we've 

835
00:49:12,040 --> 00:49:16,280
watched their shopping cart and 
in the past they've bought XYZ 

836
00:49:16,920 --> 00:49:20,200
product and so we'll recommend 
the next product that they'll 

837
00:49:20,400 --> 00:49:22,600
really, really like or the next 
movie or whatever. 

838
00:49:22,960 --> 00:49:26,920
But we're talking here about a 
level of subtlety with 

839
00:49:26,920 --> 00:49:30,720
personalization that's like 
light years beyond the typical 

840
00:49:30,720 --> 00:49:32,960
marketing personalization that 
that we know of. 

841
00:49:33,320 --> 00:49:36,080
It really is and it's going to 
be super interesting in the next

842
00:49:36,080 --> 00:49:42,040
6 months or 18 months to see 
how the product platform evolves

843
00:49:42,040 --> 00:49:45,000
for ChatGPT as they build on 
this memory feature and they add

844
00:49:45,000 --> 00:49:46,960
more, you know, new models, 
etcetera. 

845
00:49:48,280 --> 00:49:52,840
My sense is, especially as 
they lean into the partnership 

846
00:49:52,840 --> 00:49:55,680
with Shopify, they're going to 
lean more into commerce. 

847
00:49:55,680 --> 00:49:57,040
There are going to be 
opportunities for 

848
00:49:57,040 --> 00:49:59,560
personalization with commerce 
that we've never seen before. 

849
00:49:59,560 --> 00:50:01,200
But we'll just have to see how 
that evolves. 

850
00:50:01,560 --> 00:50:07,920
Should I pay the $200 a month 
for ChatGPT Pro? 

851
00:50:07,920 --> 00:50:10,800
I know you do, but should I? 
Is it worth it or should I just 

852
00:50:10,800 --> 00:50:12,800
pay the 20 bucks a month that I 
pay right now? 

853
00:50:13,200 --> 00:50:17,080
That depends on the kind of user
that you are. 

854
00:50:17,080 --> 00:50:20,400
And so I have seen the article 
that came out, I think it was 

855
00:50:20,440 --> 00:50:24,360
this week that basically said 
ChatGPT has done such a 

856
00:50:24,360 --> 00:50:28,360
phenomenal job pushing value 
down the chain to the free tier.

857
00:50:28,360 --> 00:50:31,960
Why would we pay at all? 
Because it's so impressive.

858
00:50:31,960 --> 00:50:36,680
And I think for a lot of average
daily use, that is the correct 

859
00:50:36,680 --> 00:50:41,360
assessment. That is true for me. 
I want, it's not, it's not even 

860
00:50:41,360 --> 00:50:43,760
just that I want it so I can 
test it and show it to people. 

861
00:50:43,760 --> 00:50:49,960
It's that I want to have no 
token limits and no usage limits

862
00:50:49,960 --> 00:50:54,080
on the smartest models out there
because I find myself doing 

863
00:50:54,080 --> 00:50:57,400
better with my own brain if I 
have the smartest thinking 

864
00:50:57,400 --> 00:51:00,320
partner possible. 
And so we've talked a lot around

865
00:51:00,320 --> 00:51:02,520
the edges of what sort of 
thinking and intelligence means.

866
00:51:02,520 --> 00:51:07,360
That's probably a conversation 
for another day, but from an 

867
00:51:07,480 --> 00:51:11,440
economics perspective, if I have
a thinking partner like that and

868
00:51:11,440 --> 00:51:14,720
I push the edges like that and 
it helps me make one or two 

869
00:51:14,720 --> 00:51:17,880
better decisions in a given 
month, the ROI is off the 

870
00:51:17,880 --> 00:51:21,120
charts. 
At $200 a month, it's very, very

871
00:51:21,120 --> 00:51:23,840
easy to do that math. 
And so I think it depends on 

872
00:51:23,840 --> 00:51:25,200
what you're looking for it to 
achieve. 

873
00:51:25,680 --> 00:51:30,320
I want the best possible results
from the thinking because if I'm

874
00:51:30,800 --> 00:51:32,440
spending time. 
Pay the 200. 

875
00:51:32,480 --> 00:51:38,040
You'll get o3 Pro, and it's 
qualitatively better in a way 

876
00:51:38,040 --> 00:51:41,920
that you will notice. 
It has a resonance to it where 

877
00:51:41,920 --> 00:51:46,240
like the insights it has stick 
in my head and I'm like chewing 

878
00:51:46,240 --> 00:51:48,360
them over in a way that I 
haven't had with other models, 

879
00:51:48,360 --> 00:51:49,440
which is super interesting. 
Well, 

880
00:51:49,640 --> 00:51:52,120
I'll have to try it. 
Well, with that, we are out of 

881
00:51:52,120 --> 00:51:56,200
time. 
A huge thank you to Nate Jones. 

882
00:51:56,200 --> 00:51:59,840
Thank you so much for taking 
your time to be with us today. 

883
00:51:59,840 --> 00:52:01,720
It's so valuable for us when 
you're here. 

884
00:52:02,200 --> 00:52:04,160
It was such a delight. 
I enjoyed it. 

885
00:52:04,160 --> 00:52:06,080
A tremendous thank you for 
having me, Michael. 

886
00:52:06,520 --> 00:52:08,040
I'm glad we got to talk about 
prompting. 

887
00:52:08,040 --> 00:52:10,080
I felt like we could have gone 
on for hours because it's such 

888
00:52:10,080 --> 00:52:13,720
an interesting topic, but I 
think we really got to a lot of 

889
00:52:13,720 --> 00:52:16,000
cool stuff over the course of 
this 60 minutes together. 

890
00:52:16,440 --> 00:52:17,880
And thank you to everybody who 
watched. 

891
00:52:17,880 --> 00:52:23,920
Now, before you go, subscribe to
the CXO Talk newsletter so you 

892
00:52:23,920 --> 00:52:27,920
can be part of our community. 
We have amazing shows coming up.

893
00:52:27,920 --> 00:52:34,600
We have the chief technology 
officer of AMD coming up, and we

894
00:52:34,600 --> 00:52:36,320
have all kinds of amazing 
people. 

895
00:52:36,640 --> 00:52:42,040
So go to cxotalk.com, subscribe 
to the newsletter, and we'll see

896
00:52:42,040 --> 00:52:44,280
you again next time. 
Thanks so much everybody and 

897
00:52:44,280 --> 00:52:45,120
hope you have a great day.