1
00:00:00,080 --> 00:00:02,200
MLOps is hard.
They're always saying it's hard.

2
00:00:02,200 --> 00:00:05,800
It's not the beginner's job. 
It's very easy to deploy an AI 

3
00:00:05,800 --> 00:00:08,280
application if you ignore all 
the hard parts. 

4
00:00:08,400 --> 00:00:12,640
If you're using OpenAI or
GPT-5, we can't even guarantee that

5
00:00:12,720 --> 00:00:16,960
today, like at 9:00 and at 12:00,
it's exactly the same model.

6
00:00:17,040 --> 00:00:20,080
They may have changed something,
you have no idea and suddenly 

7
00:00:20,080 --> 00:00:21,760
your system doesn't behave 
anymore. 

8
00:00:22,000 --> 00:00:24,640
Step by step, you could trick it
into revealing something that 

9
00:00:24,640 --> 00:00:27,360
you're not supposed to know. 
How do you even put guardrails 

10
00:00:27,360 --> 00:00:28,720
in that? 
No one knows. 

11
00:00:28,720 --> 00:00:31,840
Do I really need this complexity?
Do I even need an agent? 

12
00:00:31,840 --> 00:00:34,040
Do I need an LLM? 
Those are the best questions. 

13
00:00:34,320 --> 00:00:35,480
If you don't need it, don't use 
it. 

14
00:00:35,720 --> 00:00:38,280
That's why I believe MLOps will
be

15
00:00:38,360 --> 00:00:40,200
very, very popular in the
upcoming years.

16
00:00:40,920 --> 00:00:43,440
If you're a software engineer 
and responsible for bringing AI 

17
00:00:43,440 --> 00:00:45,840
models to production, this 
episode is for you. 

18
00:00:46,000 --> 00:00:49,240
We go through the basics of MLOps
specifically for software

19
00:00:49,240 --> 00:00:52,320
engineers, how to build out your
proof of concept into a 

20
00:00:52,320 --> 00:00:56,440
production-worthy solution with
evals, observability, and much,

21
00:00:56,440 --> 00:00:58,440
much more. 
Joining me today is Maria 

22
00:00:58,440 --> 00:01:01,120
Vechtomova, cofounder of
Marvelous MLOps and

23
00:01:01,120 --> 00:01:03,400
cofounder of Cauchi. 
And she's been in the MLOps

24
00:01:03,400 --> 00:01:06,440
space for over a decade, which 
makes her the perfect person to 

25
00:01:06,440 --> 00:01:08,160
have this conversation with. 
So enjoy. 

26
00:01:11,920 --> 00:01:14,840
What excites me personally the 
most is when you deliver

27
00:01:14,840 --> 00:01:18,600
value, and that's the tricky 
part because everyone is just 

28
00:01:18,600 --> 00:01:21,760
overly excited about agentic
development and trying to apply 

29
00:01:21,760 --> 00:01:24,680
it everywhere, even where it
doesn't necessarily make sense.

30
00:01:25,160 --> 00:01:28,200
And that part doesn't 
necessarily excite me. 

31
00:01:29,880 --> 00:01:36,280
But yeah, I think the whole LLM 
hype did a lot of good for

32
00:01:36,280 --> 00:01:38,680
MLOps.
Finally, we have attention to 

33
00:01:38,680 --> 00:01:41,760
monitoring. 
That's a topic that, you know, 

34
00:01:41,760 --> 00:01:44,720
was always important, but no one
cared about it that much. 

35
00:01:44,720 --> 00:01:48,120
It's kind of, yeah, monitoring.
We know it's important.

36
00:01:48,120 --> 00:01:51,280
We don't have it. 
And now suddenly everyone 

37
00:01:51,280 --> 00:01:55,880
started paying attention to it 
also because software engineers 

38
00:01:55,960 --> 00:01:59,200
started doing it more, right?
Before, there were just data

39
00:01:59,200 --> 00:02:02,080
scientists. 
Now it's coming from a different

40
00:02:02,080 --> 00:02:04,800
angle. 
Yes, tangible and a product. 

41
00:02:05,080 --> 00:02:09,080
Yes, indeed. So that's one
of the things that I

42
00:02:09,120 --> 00:02:12,040
definitely like about it. 
And also finally, we pay 

43
00:02:12,040 --> 00:02:16,320
attention to serving.
Serving your agent becomes an

44
00:02:16,320 --> 00:02:20,680
important problem. And in the
past, all of these tools were,

45
00:02:21,080 --> 00:02:25,400
you know, kind of wacky. 
I don't like that you

46
00:02:25,400 --> 00:02:29,920
bump into limits very fast and 
now they realise that's a 

47
00:02:29,920 --> 00:02:31,640
limitation and they're working 
on it. 

48
00:02:31,760 --> 00:02:34,080
So that's what I also like about
this. 

49
00:02:34,240 --> 00:02:36,400
Yeah, I like that a lot. 
What are some of the 

50
00:02:36,720 --> 00:02:40,280
implementations you've seen that
really deliver value that 

51
00:02:40,280 --> 00:02:41,600
actually work? 
Yeah. 

52
00:02:42,360 --> 00:02:44,760
Well, I don't know how much I 
can talk really about the use 

53
00:02:44,760 --> 00:02:46,600
case. 
Yeah. 

54
00:02:47,520 --> 00:02:51,040
Well, I think for internal 
processes, I think it makes a 

55
00:02:51,040 --> 00:02:54,640
lot of sense when you have some 
kind of documents and you need 

56
00:02:54,640 --> 00:02:58,920
to do something and every week 
and you spend hours on it, you 

57
00:02:58,920 --> 00:03:02,480
can largely automate it. 
And parts of this decision 

58
00:03:02,480 --> 00:03:07,240
making, you know, can be
outsourced to an agent.

59
00:03:07,840 --> 00:03:12,320
So that's where I see a lot of
value: that time saving. Also

60
00:03:12,320 --> 00:03:17,200
NLP-heavy applications in the past.
I think in telecom you have all

61
00:03:17,200 --> 00:03:21,000
these customer interactions and you
need to forward to the right 

62
00:03:21,000 --> 00:03:23,240
agent or to the right problem 
solution. 

63
00:03:24,880 --> 00:03:28,440
And in the past, I mean I worked
for a telco before, so I know 

64
00:03:28,480 --> 00:03:31,440
this process is already pretty 
well defined. 

65
00:03:32,560 --> 00:03:36,120
It's just that someone had to go
through it, route it to the right

66
00:03:37,560 --> 00:03:39,880
place. 
And now all of that can be done 

67
00:03:39,880 --> 00:03:42,560
with agents. 
And I think it's, yeah, it's a 

68
00:03:42,560 --> 00:03:46,960
great opportunity and definitely
it saves a lot of money there, 

69
00:03:47,600 --> 00:03:49,680
yes. 
I've seen it specifically at 

70
00:03:49,680 --> 00:03:53,360
Booking, for example, applied
between the client, which is the

71
00:03:53,360 --> 00:03:56,640
user that just booked something,
and then the person that offers 

72
00:03:56,880 --> 00:03:59,000
the booking like a hotel or 
something. 

73
00:03:59,280 --> 00:04:01,520
And they have something in 
between that gets the context 

74
00:04:01,520 --> 00:04:04,200
from the dialogue that's going 
on and sees and checks for tone 

75
00:04:04,200 --> 00:04:06,360
of voice and sees if it's 
actually appropriate to send to 

76
00:04:06,360 --> 00:04:08,400
the other party.
I was like, that's very 

77
00:04:08,400 --> 00:04:10,080
interesting. 
And then it also quite quickly 

78
00:04:10,080 --> 00:04:13,520
gets into a level of scale and 
complexity because you also have

79
00:04:13,520 --> 00:04:16,160
the language factor in there.
Certain models are better with 

80
00:04:16,160 --> 00:04:19,279
different languages. I have
no clue, but it sounds like

81
00:04:19,279 --> 00:04:21,160
a fun challenge. 
Yeah, definitely. 

82
00:04:21,560 --> 00:04:25,320
And I think in general, teams
spend so much time on, you know,

83
00:04:25,800 --> 00:04:30,240
doing hands-on work where it's
not per se needed and can be

84
00:04:30,240 --> 00:04:33,960
outsourced to an agent: something
related to analytics work, that 

85
00:04:34,000 --> 00:04:36,040
business comes with the same 
questions all the time. 

86
00:04:36,040 --> 00:04:38,040
I guess you could make 
dashboards, but instead you 

87
00:04:38,040 --> 00:04:40,040
could have this kind of AI/BI
tools. 

88
00:04:41,080 --> 00:04:44,440
Of course, that must be applied 
with caution, but I think 

89
00:04:44,440 --> 00:04:46,840
business has pretty good 
understanding of what they're

90
00:04:46,840 --> 00:04:50,440
supposed to. 
And actually in fact analysts 

91
00:04:50,440 --> 00:04:52,880
when they come up with the 
numbers, they're also validated 

92
00:04:52,880 --> 00:04:55,800
with the business. 
So a lot of these steps can

93
00:04:55,800 --> 00:04:58,720
be just skipped by giving 
directly access to the data and 

94
00:04:58,720 --> 00:05:00,680
to this interface to the 
business users. 

95
00:05:01,120 --> 00:05:05,440
And that's also what I think is
very valuable these days.

96
00:05:05,480 --> 00:05:07,560
Yes. 
You've been involved in the

97
00:05:07,560 --> 00:05:10,640
MLOps community for a long time,
even before the AI hype.

98
00:05:10,760 --> 00:05:14,360
Yeah, and now software engineers
are really looking into fitting 

99
00:05:14,360 --> 00:05:16,560
in GenAI
solutions in their products.

100
00:05:17,040 --> 00:05:19,040
What will they have to know with
regards to MLOps from a

101
00:05:19,040 --> 00:05:20,800
basics perspective? 
Yeah. 

102
00:05:20,800 --> 00:05:24,800
Well, I think they need to learn
a little bit of data science, 

103
00:05:24,920 --> 00:05:28,400
the basics of data science, 
because, you know, it's very 

104
00:05:28,400 --> 00:05:33,600
easy to deploy an AI application
if you ignore all the hard 

105
00:05:33,600 --> 00:05:36,760
parts, OK. 
Because, you know, it's just 

106
00:05:37,360 --> 00:05:42,480
regular software if you look at 
it that way, but all the parts 

107
00:05:42,480 --> 00:05:45,720
around it are, you know, related
to evaluation. 

108
00:05:46,000 --> 00:05:48,800
Those are very close to data 
science. 

109
00:05:49,600 --> 00:05:51,800
The problem here is the data 
gathering. 

110
00:05:51,800 --> 00:05:55,520
You actually need humans to come
up with, you know, with 

111
00:05:55,520 --> 00:05:58,720
expectations, what is expected 
from an agent and what is the 

112
00:05:58,720 --> 00:06:02,440
correct answer. 
And that answer also can be 

113
00:06:02,440 --> 00:06:04,840
vague, right? 
So you don't really know. 

114
00:06:05,120 --> 00:06:10,320
So this clear evaluation of what
we're evaluating for and what 

115
00:06:10,320 --> 00:06:15,400
are the examples of these
proper responses.

116
00:06:15,800 --> 00:06:19,200
So that must be gathered. 
And that's something that a lot 

117
00:06:19,200 --> 00:06:21,960
of people just ignore like it 
doesn't exist. 

118
00:06:21,960 --> 00:06:27,480
It's not a problem at all. 
It's just also the fine

119
00:06:27,480 --> 00:06:31,840
tuning of the agent, like how 
the agent is built is not really

120
00:06:31,840 --> 00:06:34,240
done properly. 
It's just, you know, it kind of 

121
00:06:34,240 --> 00:06:37,000
works. 
It's kind of something working.

122
00:06:37,280 --> 00:06:42,480
It comes up with something that,
OK, yeah, it's a black box, but 

123
00:06:42,480 --> 00:06:45,440
it's not really a black box. 
It can be pretty, pretty well 

124
00:06:45,440 --> 00:06:49,880
evaluated if you come up with 
these metrics first, include

125
00:06:49,880 --> 00:06:53,440
business users or whoever is 
using this application from the 

126
00:06:53,440 --> 00:06:58,200
very beginning and have these 
kind of data sets available. 

127
00:06:58,680 --> 00:07:02,840
And there are nice tools like
MLflow Tracing, where you can get

128
00:07:02,840 --> 00:07:06,280
very deep insights into what's
happening within the agent like 

129
00:07:06,520 --> 00:07:11,840
LLM calling, tool calling, and,
well, reasoning steps.

130
00:07:11,840 --> 00:07:17,120
All of this can be logged and 
you can also give feedback on 

131
00:07:17,120 --> 00:07:19,200
these steps. 
So each of these steps can be 

132
00:07:19,520 --> 00:07:26,400
also evaluated by a human or by 
an LLM, and that can be used for

133
00:07:26,400 --> 00:07:30,560
monitoring purposes.
Also things like, I don't know, 

134
00:07:31,240 --> 00:07:36,000
on average we have five tool calls
for an agent, and now we have 30.

135
00:07:36,280 --> 00:07:38,000
So something is going wrong 
there. 

136
00:07:38,200 --> 00:07:41,280
You want to get alerted on these
kinds of anomalies.

137
00:07:41,800 --> 00:07:44,280
So this is also something you 
want to pay attention to. 

138
00:07:45,600 --> 00:07:50,680
From what I've seen, people 
mostly do go the easy route 

139
00:07:50,960 --> 00:07:55,800
ignoring all of these hard parts
and they also detach all the

140
00:07:55,800 --> 00:07:58,400
pieces, right? 
Typically you have some kind of 

141
00:07:58,400 --> 00:08:03,040
data processing, and if you have
PDFs, you are going to parse

142
00:08:03,040 --> 00:08:07,920
this PDF, the OCR, the chunking,
you use some vector search. 

143
00:08:08,240 --> 00:08:11,960
You also maybe extract some
metadata, store it in some SQL

144
00:08:11,960 --> 00:08:16,080
database. 
And then you define tools for 

145
00:08:16,080 --> 00:08:19,080
your agent; maybe you have
an MCP server as well.

146
00:08:19,480 --> 00:08:23,040
So this process is kind of, it's
something that needs to run 

147
00:08:23,040 --> 00:08:25,560
periodically once you get a new
batch of data. 

148
00:08:25,960 --> 00:08:29,200
So in that sense, it's very
similar to, you know, how data 

149
00:08:29,200 --> 00:08:31,560
scientists work. 
There is also data preprocessing

150
00:08:31,560 --> 00:08:36,480
work that is embedded into it. 
Then you have the agent 

151
00:08:36,480 --> 00:08:40,360
definition itself that consists 
of, you know, the logic of the 

152
00:08:40,360 --> 00:08:42,919
agent, the system prompts and 
all these components that are 

153
00:08:43,200 --> 00:08:48,120
also moving a lot. 
Like if the LLM changes, it may start

154
00:08:48,120 --> 00:08:51,440
behaving weirdly right? 
Like there are so many moving 

155
00:08:51,440 --> 00:08:55,760
pieces that you need to control 
and in that way, it's much 

156
00:08:55,760 --> 00:08:59,000
harder to control than an ML 
model, the standard ML model. 

157
00:09:00,040 --> 00:09:05,480
And then you have the piece where you
deploy the agent. And, well,

158
00:09:05,680 --> 00:09:08,120
I like Databricks a lot and I
use Databricks a lot.

159
00:09:08,520 --> 00:09:12,600
So there are processes that
MLflow allows for.

160
00:09:12,760 --> 00:09:15,440
So for example, when you 
register a new version of the 

161
00:09:15,440 --> 00:09:17,480
model, the deployment job will 
start. 

162
00:09:17,480 --> 00:09:19,720
And it's not just you're going 
to deploy right away. 

163
00:09:20,080 --> 00:09:24,000
No, it has an evaluation step
using, you know, this kind of a 

164
00:09:24,000 --> 00:09:26,040
project I was talking about 
before. 

165
00:09:26,400 --> 00:09:28,240
And then you have a manual 
approval step. 

166
00:09:28,520 --> 00:09:33,320
So actually a human is going to
look at it, maybe evaluate in 

167
00:09:33,320 --> 00:09:38,960
certain ways, look at traces 
maybe and then we can go and 

168
00:09:38,960 --> 00:09:41,840
deploy. 
You can also say, OK, maybe for 

169
00:09:41,840 --> 00:09:44,520
some processes we don't want
a human to approve, and maybe

170
00:09:44,520 --> 00:09:48,560
an agent. It's also possible
because it's an API call in 

171
00:09:48,560 --> 00:09:52,360
the end, right?
So there are these

172
00:09:52,360 --> 00:09:55,240
components that you need to 
think about, but also the 

173
00:09:55,240 --> 00:09:57,960
governance part of these
agents, right? 

174
00:09:58,160 --> 00:10:01,600
So if the
agent is deployed behind an

175
00:10:01,600 --> 00:10:04,640
endpoint, you have governance on
top of the agent, but also on 

176
00:10:04,640 --> 00:10:08,040
top of the LLM that it's using. 
So to ensure that there are 

177
00:10:08,040 --> 00:10:13,480
guardrails, that there is no PII
data involved or being 

178
00:10:13,480 --> 00:10:15,480
processed, things like that, 
right? 

179
00:10:15,640 --> 00:10:19,680
The funny thing is, like when I 
started with AI within a product

180
00:10:19,760 --> 00:10:22,960
and also I looked online, people
are saying for software 

181
00:10:22,960 --> 00:10:24,960
engineers, there's nothing new, 
right? 

182
00:10:24,960 --> 00:10:26,560
Because it's just an API call 
away. 

183
00:10:27,080 --> 00:10:31,800
And right now, if I hear you say
like evaluations, figuring out 

184
00:10:31,800 --> 00:10:36,400
your context, figuring out what 
actually is valid with regards 

185
00:10:36,400 --> 00:10:38,840
to using an agent or a model 
behind the scenes, then the

186
00:10:38,840 --> 00:10:41,960
governance and the guardrails, 
yeah, if that all of a sudden

187
00:10:41,960 --> 00:10:44,320
becomes also a responsibility of
a software engineer, it's a lot 

188
00:10:44,320 --> 00:10:46,440
on their plate because they 
don't just have to think of 

189
00:10:46,440 --> 00:10:49,360
implementation, but also these 
are all data science principles.

190
00:10:49,560 --> 00:10:51,200
Yeah, yeah. 
MLOps is hard.

191
00:10:51,200 --> 00:10:52,360
MLOps is hard.
What

192
00:10:52,400 --> 00:10:54,720
can I say, right?
I'm always saying it's hard. 

193
00:10:54,720 --> 00:10:56,080
It's not the beginner's job. 
Yeah. 

194
00:10:56,400 --> 00:10:59,880
And it's hard for pretty much 
anyone because data scientists, 

195
00:11:00,280 --> 00:11:02,160
they also just, you know, work 
with notebooks. 

196
00:11:02,320 --> 00:11:04,520
It's pretty much industry 
standard, which I'm trying to 

197
00:11:04,520 --> 00:11:08,200
fight. 
But yeah, that's the truth. 

198
00:11:08,200 --> 00:11:10,440
They are not great at
following software

199
00:11:10,440 --> 00:11:15,440
engineering practices. It now
gets better, but it's still not 

200
00:11:15,440 --> 00:11:18,040
quite there. 
And software engineers, they 

201
00:11:18,040 --> 00:11:20,760
often lack these data science 
principles. 

202
00:11:21,000 --> 00:11:24,680
They never learn them because it
was never their job.

203
00:11:25,120 --> 00:11:27,120
Now we need to kind of merge it 
together. 

204
00:11:27,120 --> 00:11:29,880
And I think that for the first 
time in history, we actually get

205
00:11:29,880 --> 00:11:31,680
this
opportunity to merge it.

206
00:11:32,000 --> 00:11:36,880
And so that's why I believe that
MLOps will be, yeah, will be

207
00:11:36,880 --> 00:11:38,840
very, very popular in the 
upcoming years. 

208
00:11:39,040 --> 00:11:40,600
That is a fun and exciting time 
though. 

209
00:11:40,600 --> 00:11:43,040
Yeah, definitely. 
Like I can get up and running 

210
00:11:43,040 --> 00:11:44,800
really quickly. 
But that's just the proof of 

211
00:11:44,800 --> 00:11:46,680
concept. 
And I've had many people on

212
00:11:46,680 --> 00:11:49,120
the podcast and they say the 
part from proof of concept to 

213
00:11:49,120 --> 00:11:50,680
production is incredibly 
difficult. 

214
00:11:51,560 --> 00:11:54,960
So from your perspective, what 
is the minimum set that I 

215
00:11:54,960 --> 00:11:58,080
actually need in production? 
Because I feel like a real 

216
00:11:58,080 --> 00:12:00,240
mature production solution is 
going to be very different from,

217
00:12:00,240 --> 00:12:02,120
let's say, my first version that
goes to production. 

218
00:12:02,280 --> 00:12:04,320
Well, all of the components that
I just mentioned.

219
00:12:04,320 --> 00:12:07,280
That's already a lot, yeah. 
Yes, I haven't mentioned all of 

220
00:12:07,280 --> 00:12:12,280
them even, so there are more.
So like you want to monitor 

221
00:12:13,080 --> 00:12:16,280
what's going in your systems, 
what's going out of your systems

222
00:12:16,280 --> 00:12:18,840
at,
like, each step. Really, you want

223
00:12:18,840 --> 00:12:22,600
to keep
track of it, and there are tools

224
00:12:22,600 --> 00:12:27,680
for it today, like MLflow
Tracing, and you can also dump

225
00:12:27,680 --> 00:12:30,920
all this information into
OpenTelemetry-supported tooling like

226
00:12:30,920 --> 00:12:34,560
Datadog, or you can put them in
Delta tables, sync it to Delta

227
00:12:34,560 --> 00:12:38,520
tables, but also observability 
on top like who was calling that

228
00:12:39,200 --> 00:12:40,040
API. 
Yeah. 

229
00:12:40,360 --> 00:12:44,720
And how long did it take to, you
know, to give the response back 

230
00:12:45,200 --> 00:12:49,720
and maybe put some limits on
certain users so that they can't

231
00:12:49,720 --> 00:12:52,440
use more than so many tokens
or things like that? 

232
00:12:52,880 --> 00:12:55,920
So there are these kinds of
guardrails that you want to 

233
00:12:55,920 --> 00:12:58,240
implement on top of that as 
well. 

234
00:12:58,600 --> 00:13:01,160
And I find that it's also part 
of MLOps.

235
00:13:02,000 --> 00:13:07,040
So yeah, you need to have these
data processing pipelines, you 

236
00:13:07,040 --> 00:13:09,240
need to have evaluation 
pipelines, you need to have 

237
00:13:09,240 --> 00:13:15,160
deployment pipelines, human-in-the-loop
pipelines, and all the 

238
00:13:15,160 --> 00:13:20,480
governance as well on top. 
And what I talk about as a 

239
00:13:20,480 --> 00:13:22,920
MLOps principle a lot is
traceability and 

240
00:13:22,920 --> 00:13:26,480
reproducibility. 
You need to know what data was 

241
00:13:26,480 --> 00:13:28,360
used. 
You need to know what code was 

242
00:13:28,360 --> 00:13:30,080
used, what environment was
used. 

243
00:13:30,800 --> 00:13:34,120
And with ML models, it's more
straightforward than,

244
00:13:34,600 --> 00:13:37,360
I guess with the agents, because
as I said, there are so many 

245
00:13:37,360 --> 00:13:40,800
more moving pieces. 
We can't even guarantee that 

246
00:13:40,800 --> 00:13:47,520
the LLM, I don't know, if you're using
OpenAI, I don't know, GPT-5, that today,

247
00:13:47,680 --> 00:13:51,200
like at 9:00 and at 12:00, it's
exactly the same model. 

248
00:13:51,400 --> 00:13:53,760
No, you can't. 
Probably it's not.

249
00:13:53,840 --> 00:13:56,640
They may have changed something,
you have no idea and suddenly 

250
00:13:56,640 --> 00:13:58,400
your system doesn't behave 
anymore. 

251
00:13:58,840 --> 00:14:02,360
I mean, it can happen. 
And even though you don't have a

252
00:14:02,360 --> 00:14:06,200
monitoring on top of that in 
place that I guess monitors in 

253
00:14:06,200 --> 00:14:11,400
real time, whether your answers 
get skewed somehow based on LLM

254
00:14:11,400 --> 00:14:15,080
evaluators, because I mean, 
there is no way you can let 

255
00:14:15,080 --> 00:14:18,120
humans label things real time, 
right? 

256
00:14:19,000 --> 00:14:22,760
No, no. 
But at least some kind of yeah, 

257
00:14:22,760 --> 00:14:24,520
it gives some kind of idea. 
OK. 

258
00:14:24,520 --> 00:14:27,480
And what you see, it's actually 
funny. 

259
00:14:27,480 --> 00:14:31,360
At SREcon, we had this
discussion about observability 

260
00:14:31,800 --> 00:14:35,560
of ML systems and there were
some awesome people that do it 

261
00:14:36,000 --> 00:14:39,720
like for a very long time. 
Todd Underwood, he I think he 

262
00:14:39,720 --> 00:14:42,520
retired now, but back then not 
yet. 

263
00:14:42,520 --> 00:14:48,480
He was at Anthropic, head of SRE, and
together with Niall Murphy, they

264
00:14:48,720 --> 00:14:53,920
wrote a book on MLOps and SRE.
So it's, I mean, all of the 

265
00:14:53,920 --> 00:14:56,360
things you talk about is still 
very much valid. 

266
00:14:58,000 --> 00:15:01,360
So at Azure, for example, I
think to evaluate their models,

267
00:15:01,800 --> 00:15:08,120
they use whether internal
employees give thumbs up to the 

268
00:15:08,120 --> 00:15:11,320
model and how often and whether 
it's going to change over time. 

269
00:15:11,640 --> 00:15:15,920
Because it's actually really 
hard to tie it to certain 

270
00:15:16,040 --> 00:15:18,920
standard evaluations. 
And the more generic your model is,

271
00:15:18,920 --> 00:15:22,000
the harder it is. 
So the more generic your agent 

272
00:15:22,000 --> 00:15:25,440
is, the harder it is as well.
So that's why I really believe 

273
00:15:25,440 --> 00:15:29,800
in specialized agents that only 
do very specific tasks, or maybe

274
00:15:29,800 --> 00:15:31,720
not even agents. 
Just LLM workflows.

275
00:15:31,880 --> 00:15:33,560
Now today, we could call 
everything an agent. 

276
00:15:35,520 --> 00:15:38,720
Let's put a pin on
agents for a second, because I

277
00:15:38,720 --> 00:15:40,960
want you to help me through kind
of a step by step. 

278
00:15:41,480 --> 00:15:44,000
A lot of this information is 
partially new for me and I'm 

279
00:15:44,000 --> 00:15:46,000
expecting for the audience as 
well since they're mainly 

280
00:15:46,000 --> 00:15:48,840
software engineers. 
But let's say I have an existing

281
00:15:48,840 --> 00:15:51,400
product and I want to embed some
type of

282
00:15:51,400 --> 00:15:54,000
GenAI feature, right?
Whether it's PDF parsing and 

283
00:15:54,000 --> 00:15:56,680
then getting context and 
prefilling some form or it's 

284
00:15:56,680 --> 00:15:59,520
indeed being between chat and 
checking for tone of voice with 

285
00:15:59,520 --> 00:16:01,040
regards to the communication 
that's there. 

286
00:16:01,600 --> 00:16:04,720
I can call an API and I have a 
model, so I can get stuff up and

287
00:16:04,720 --> 00:16:06,760
running quite quickly. 
Or what would be the next step? 

288
00:16:06,760 --> 00:16:10,000
Is that evals?
Is that more observability with 

289
00:16:10,000 --> 00:16:13,680
regards to some of the metrics,
time to first token, or measuring

290
00:16:13,680 --> 00:16:15,320
the performance? 
Or what would be the next step 

291
00:16:15,320 --> 00:16:17,040
to go actually towards 
production? 

292
00:16:18,040 --> 00:16:19,880
Well, what you're talking about 
is not

293
00:16:19,920 --> 00:16:22,480
an agent, really.
So if it's just an API call, 

294
00:16:22,480 --> 00:16:27,200
again, it's a very deterministic,
well, more or less, LLM

295
00:16:27,200 --> 00:16:29,800
workflow, right? 
Indeed, what you need to add is 

296
00:16:29,800 --> 00:16:33,960
evaluation. 
So you need to have labels, 

297
00:16:33,960 --> 00:16:37,200
actual labels done by humans.
So the tone of voice example, 

298
00:16:37,200 --> 00:16:39,480
right, it's actually a 
classification problem. 

299
00:16:39,760 --> 00:16:42,240
So the way you evaluate it is 
the same way as you would 

300
00:16:42,240 --> 00:16:44,400
evaluate a classifier: you have
your

301
00:16:44,400 --> 00:16:47,320
accuracy, F1 score, and the
cost of your false negatives,

302
00:16:47,440 --> 00:16:51,160
false positives, that's what you
need to account for and you 

303
00:16:51,160 --> 00:16:53,960
know, maximize for the
value. 

304
00:16:55,080 --> 00:16:58,560
So yeah, I think that's a very 
straightforward example. 

305
00:16:58,560 --> 00:17:02,040
It's pure data science, except 
that instead of a model from,

306
00:17:02,040 --> 00:17:05,040
say, scikit-learn, you're using
an LLM to do that.

307
00:17:05,240 --> 00:17:07,599
When you're starting to label 
things with regards to 

308
00:17:07,640 --> 00:17:11,040
evaluations, do you let 
something run in test acceptance

309
00:17:11,040 --> 00:17:14,200
or production where you have the
data that you can label, or how 

310
00:17:14,200 --> 00:17:19,760
do you typically approach that? 
Well, yeah, I think overall in 

311
00:17:19,760 --> 00:17:22,680
development and production, we 
need to have access to the same 

312
00:17:22,680 --> 00:17:25,720
data, just to start
with. It must all be production

313
00:17:25,720 --> 00:17:27,319
data. 
And of course you shouldn't be 

314
00:17:27,319 --> 00:17:30,360
able to write from development 
to production. 

315
00:17:30,600 --> 00:17:33,560
So it's just read access.
So the data is the same. 

316
00:17:33,560 --> 00:17:36,680
So basically, when we're
talking about labeling, it's a 

317
00:17:36,680 --> 00:17:39,520
separate process that is outside
of your development cycle. 

318
00:17:39,840 --> 00:17:43,720
I mean, it's tied together, but 
I think it must be viewed

319
00:17:43,720 --> 00:17:46,320
as a separate stream. 
But it's always production data 

320
00:17:46,320 --> 00:17:48,520
that you're... Yes. Interesting.
OK. 

321
00:17:48,720 --> 00:17:52,040
So then I will go live with 
something that is unlabeled 

322
00:17:52,040 --> 00:17:54,920
typically in production to get 
to a step where I can label, 

323
00:17:54,960 --> 00:17:56,440
Yes. 
OK, interesting. 

324
00:17:56,640 --> 00:17:58,080
Yeah. 
That's a very interesting part 

325
00:17:58,080 --> 00:18:01,960
of data science in
general, yeah.

326
00:18:01,960 --> 00:18:04,960
And then when I am on production
and I can label things, this is 

327
00:18:04,960 --> 00:18:08,600
where you pull in your business 
experts with regards to OK, what

328
00:18:08,600 --> 00:18:11,000
is your opinion on this or 
typically who does this? 

329
00:18:11,000 --> 00:18:12,520
Yeah. 
But I think before you

330
00:18:12,680 --> 00:18:16,720
deploy this kind of classifier
that uses an LLM, you already

331
00:18:16,720 --> 00:18:21,720
need to have examples and you 
need to already have someone, a 

332
00:18:21,720 --> 00:18:27,920
human, having labeled that
already before, ideally. You can

333
00:18:28,000 --> 00:18:31,920
do it half human, half LLM 
labeling like there are 

334
00:18:31,920 --> 00:18:34,400
different approaches. 
I think LLMs are pretty OK at

335
00:18:34,400 --> 00:18:37,880
labeling things, especially when
they're as straightforward as this

336
00:18:37,880 --> 00:18:41,440
example. 
But I mean, some examples

337
00:18:41,440 --> 00:18:42,760
are just not that 
straightforward. 

338
00:18:42,760 --> 00:18:46,560
And you need to align your human
judgement with your LLM 

339
00:18:46,560 --> 00:18:51,240
judgement. 
And to have this LLM judge you 

340
00:18:51,240 --> 00:18:54,680
know, to have a certain level of
accuracy, I guess.

341
00:18:56,000 --> 00:18:59,120
So that's one part.
So you already need to have this

342
00:18:59,120 --> 00:19:01,280
labeled data, and you already need
to have evaluated your agent and

343
00:19:01,280 --> 00:19:04,480
maybe fine tune your agent. 
So what can you fine-tune in this

344
00:19:04,480 --> 00:19:06,200
scenario? 
You can fine tune your prompt 

345
00:19:06,840 --> 00:19:09,320
and there's, like, MLflow
prompt fine-tuning.

346
00:19:09,320 --> 00:19:13,480
There is other flow that also 
does prompt fine tuning. 

347
00:19:14,080 --> 00:19:16,360
That's basically like in data 
science. 

348
00:19:16,520 --> 00:19:18,640
Instead of fine-tuning your
model, you fine tune your 

349
00:19:18,640 --> 00:19:20,800
prompt. 
It's kind of the same process, 

350
00:19:20,800 --> 00:19:22,960
right? 
And for classification, I would 

351
00:19:22,960 --> 00:19:26,360
probably do it, you know, 
periodically, just like you're

352
00:19:26,360 --> 00:19:30,000
retraining a model. Your actual
model is behind it, especially 

353
00:19:30,000 --> 00:19:34,080
if you don't host it yourself. 
It may change, right? 

354
00:19:34,080 --> 00:19:36,640
It may change. 
So you don't even know whether 

355
00:19:36,640 --> 00:19:39,400
it's going to perform, you know,
in the expected way.

356
00:19:39,720 --> 00:19:43,400
How do you figure out if your 
model, whether it's behind an 

357
00:19:43,400 --> 00:19:44,920
API? 
I mean self hosted doesn't 

358
00:19:44,920 --> 00:19:46,480
really change because you would 
know about it. 

359
00:19:46,480 --> 00:19:50,480
But typically those that are not
yours, not self hosted, they can

360
00:19:50,480 --> 00:19:51,440
change. 
You mentioned that. 

361
00:19:51,760 --> 00:19:53,440
How do you measure if they have 
changed? 

362
00:19:53,800 --> 00:19:59,120
Yeah, well you need to monitor 
real time whenever possible or 

363
00:19:59,120 --> 00:20:02,120
with a delay when your labels 
are delayed. 

364
00:20:03,720 --> 00:20:07,080
Yeah, so this LLM
evaluator that I was talking

365
00:20:07,080 --> 00:20:09,360
about. 
So let's say we take this 

366
00:20:09,360 --> 00:20:13,720
example of sentiment: what
the sentiment is, you know, whether

367
00:20:14,160 --> 00:20:18,120
it's good. 
So we can see what the LLM is

368
00:20:18,120 --> 00:20:23,200
evaluating, and that's
something where we can see

369
00:20:23,200 --> 00:20:28,160
skewness in real time, and
we can put some alerts on top of

370
00:20:28,160 --> 00:20:29,640
that. 
So there is some

371
00:20:29,720 --> 00:20:34,080
detection of drift, for
example that you can implement 

372
00:20:34,080 --> 00:20:34,920
there. 
Gotcha. 
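[Editor's sketch] One way to alert on the skew described above is to compare the distribution of the judge's sentiment labels in a recent window against a baseline window. The sketch below uses the population stability index (PSI) for that comparison. The function names, the label set, and the 0.2 alert threshold are illustrative assumptions, not something named in the conversation.

```python
import math
from collections import Counter


def label_distribution(labels, categories, eps=1e-6):
    """Share of each category in `labels`, floored at eps to avoid log(0)."""
    counts = Counter(labels)
    total = max(len(labels), 1)
    return {c: max(counts.get(c, 0) / total, eps) for c in categories}


def population_stability_index(baseline, current, categories):
    """PSI between two label samples; 0.0 means identical distributions."""
    p = label_distribution(baseline, categories)
    q = label_distribution(current, categories)
    return sum((q[c] - p[c]) * math.log(q[c] / p[c]) for c in categories)


def drift_alert(baseline, current, categories, threshold=0.2):
    """True when the PSI exceeds the (assumed) alert threshold."""
    return population_stability_index(baseline, current, categories) > threshold


if __name__ == "__main__":
    categories = ["positive", "negative"]
    last_month = ["positive"] * 80 + ["negative"] * 20
    today = ["positive"] * 30 + ["negative"] * 70
    print(drift_alert(last_month, today, categories))
```

The same check can be rerun on a fixed labeled dataset after a prompt change, which is the reuse the speakers mention next.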

373
00:20:35,160 --> 00:20:38,440
And is this the same type of
metrics and signals you would 

374
00:20:38,440 --> 00:20:40,920
use for, for example, adjusting 
your prompt? 

375
00:20:41,000 --> 00:20:43,920
Yeah, because then you can see 
the same changes with regards to

376
00:20:43,920 --> 00:20:45,040
the data set that you already 
have. 

377
00:20:45,640 --> 00:20:47,000
Exactly. 
Interesting. 

378
00:20:47,160 --> 00:20:49,080
OK. 
So I understand evals. 

379
00:20:49,400 --> 00:20:51,080
What tooling would you typically
use for that? 

380
00:20:51,080 --> 00:20:53,800
Do people build their own custom
things or what is kind of mature

381
00:20:53,800 --> 00:20:57,360
on the platform out there? 
Well, I think you always have to

382
00:20:57,360 --> 00:21:00,240
build something custom anyways, 
because your problems are never 

383
00:21:00,240 --> 00:21:03,120
that easy.
As we just discussed, right? They

384
00:21:03,120 --> 00:21:05,960
are pretty complicated and what 
you're trying to evaluate for is

385
00:21:05,960 --> 00:21:09,880
pretty custom usually, right? 
Like for example, I don't know 

386
00:21:09,880 --> 00:21:14,080
you want to process some 
documents and then in the end 

387
00:21:14,080 --> 00:21:17,080
these documents must be stored 
in a very specific way. 

388
00:21:17,320 --> 00:21:20,520
So you need to basically say is 
it correct? 

389
00:21:20,520 --> 00:21:23,640
Is that the way that I
would put it? Is it actually

390
00:21:23,640 --> 00:21:24,880
correct? 
Some of them are pretty 

391
00:21:24,880 --> 00:21:27,480
deterministic, and you can just
use Pydantic, maybe, even.

392
00:21:28,160 --> 00:21:30,640
But some of them are not 
necessarily that deterministic. 

393
00:21:32,120 --> 00:21:35,320
So that's not an evaluator
that you can find in any 

394
00:21:35,320 --> 00:21:37,640
library, right? 
So you have to create your own 

395
00:21:37,640 --> 00:21:40,200
custom evaluation. 
But there are tools that 

396
00:21:40,200 --> 00:21:43,480
facilitate this kind of custom 
evaluation, like MLflow.

397
00:21:44,880 --> 00:21:47,840
I mean, I don't know how many of
these tools survive anyways, but

398
00:21:47,840 --> 00:21:49,800
I'm pretty sure MLflow will
survive so. 

399
00:21:51,560 --> 00:21:53,400
Yeah. 
Now I like your thought process 

400
00:21:53,400 --> 00:21:56,000
in that you will need something 
custom, and then typically people

401
00:21:56,000 --> 00:21:58,280
build that around their own 
solution. 

402
00:21:58,440 --> 00:22:01,240
But monitoring in general in
ML is very custom.

403
00:22:01,440 --> 00:22:03,680
Like, I mean, who cares anyways 
about accuracy? 

404
00:22:03,680 --> 00:22:07,760
I mean, we kind of care.
But what we care about is the 

405
00:22:07,760 --> 00:22:11,320
business value, right? 
So and to monitor business 

406
00:22:11,320 --> 00:22:13,720
value, it's custom, it's always 
custom. 

407
00:22:14,120 --> 00:22:16,600
So that's why monitoring is 
hard. 

408
00:22:17,040 --> 00:22:19,280
That's why. 
I mean, monitoring accuracy is 

409
00:22:19,280 --> 00:22:21,160
easy. 
I feel like monitoring the

410
00:22:21,160 --> 00:22:23,760
business value is way more
difficult. 

411
00:22:23,840 --> 00:22:25,200
Exactly. 
Yeah, yeah. 

412
00:22:25,960 --> 00:22:28,800
How do you get to a point where 
you actually have proper metrics

413
00:22:28,960 --> 00:22:32,160
on the business value of your 
solution that you've built? 

414
00:22:32,280 --> 00:22:34,800
Yeah. 
Well, I don't really have

415
00:22:34,800 --> 00:22:38,200
an answer for that.
I think no one really has 

416
00:22:38,200 --> 00:22:41,360
because it depends.
It really depends on the problem

417
00:22:41,360 --> 00:22:43,240
you're trying to solve. 
For some problems it's very 

418
00:22:43,240 --> 00:22:44,800
straightforward. 
For some it's not. 

419
00:22:45,080 --> 00:22:47,080
For classification, it's 
typically easier. 

420
00:22:47,080 --> 00:22:50,040
You can also estimate the cost
of your false negatives and 

421
00:22:50,040 --> 00:22:52,120
false positives. 
And I think that's what you need

422
00:22:52,120 --> 00:22:56,560
to, you know, kind of steer 
towards minimizing the loss. 
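[Editor's sketch] The idea of pricing false negatives and false positives and then steering toward the lowest loss can be written down in a few lines. This is a minimal illustration under assumed costs; the function names, the candidate thresholds, and the example numbers are hypothetical.

```python
def expected_cost(y_true, y_score, threshold, cost_fp, cost_fn):
    """Total cost of the mistakes made at a given decision threshold."""
    cost = 0.0
    for truth, score in zip(y_true, y_score):
        predicted = score >= threshold
        if predicted and not truth:
            cost += cost_fp  # flagged, but should not have been
        elif truth and not predicted:
            cost += cost_fn  # missed a true case
    return cost


def best_threshold(y_true, y_score, cost_fp, cost_fn, candidates):
    """Candidate threshold with the lowest total cost (first one wins ties)."""
    return min(
        candidates,
        key=lambda t: expected_cost(y_true, y_score, t, cost_fp, cost_fn),
    )


if __name__ == "__main__":
    y_true = [1, 0, 1, 0]
    y_score = [0.9, 0.6, 0.4, 0.1]
    # Assumed business costs: a miss is ten times worse than a false alarm.
    print(best_threshold(y_true, y_score, cost_fp=1, cost_fn=10,
                         candidates=[0.3, 0.5, 0.7]))
```

When a missed case is the expensive error, the chosen threshold drops; when false alarms are the expensive error, it rises.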

423
00:22:56,880 --> 00:22:58,960
Yeah, yeah. 
I feel like this is where. 

424
00:22:59,000 --> 00:23:02,320
Maximizing profit, you
know. Yeah, I feel like this is

425
00:23:02,320 --> 00:23:04,360
where a lot of product 
understanding will also come to 

426
00:23:04,360 --> 00:23:06,560
the teams, yes. 
And even though those metrics 

427
00:23:06,560 --> 00:23:09,280
are very far apart from maybe 
the feature that you're 

428
00:23:09,280 --> 00:23:12,920
building, things like, OK, are 
users actually happy with 

429
00:23:12,920 --> 00:23:15,520
regards to the functionality on 
our platform, the functionality 

430
00:23:15,520 --> 00:23:17,760
that you're building that is 
contributing towards that? 

431
00:23:17,760 --> 00:23:20,800
And happiness, depending on what
your revenue model is of your 

432
00:23:20,800 --> 00:23:24,520
product, whether it's retention 
or actually conversion, those 

433
00:23:24,520 --> 00:23:26,800
things you will need to be aware
of to actually be able to track 

434
00:23:26,800 --> 00:23:28,520
business value. 
Yeah, exactly. 

435
00:23:28,920 --> 00:23:32,720
I don't know, Like there are so 
many different business problems

436
00:23:32,720 --> 00:23:35,880
you're trying to solve, right. 
So, for example, cross-sell on

437
00:23:35,880 --> 00:23:39,760
the web page.
So I guess we do care if we show

438
00:23:39,760 --> 00:23:42,640
a product that the person maybe
adds to the basket, but what we

439
00:23:42,640 --> 00:23:46,320
really care about is that the 
total basket value increases. 

440
00:23:46,440 --> 00:23:49,720
That's what we care about. 
If they do add what we suggest, 

441
00:23:49,720 --> 00:23:52,440
but the basket value is not 
increasing, like, why are we doing

442
00:23:52,440 --> 00:23:53,280
that anyways? 
It's. 

443
00:23:53,480 --> 00:23:58,480
Just different, yeah. 
So indeed, what we steer our

444
00:23:58,480 --> 00:24:01,400
algorithm towards is often 
different than what we actually

445
00:24:01,400 --> 00:24:04,560
care about. 
And yeah, it's very hard to 

446
00:24:04,560 --> 00:24:06,360
align it in general. 
Interesting. 

447
00:24:06,760 --> 00:24:10,960
Yeah, I understand how these 
concepts come together from 

448
00:24:11,440 --> 00:24:14,160
actually calling an API, let's 
say, that has an LLM

449
00:24:14,160 --> 00:24:17,480
behind it, adding evals and the 
right observations to catch 

450
00:24:17,480 --> 00:24:21,320
drift and also figure out if the
changes you make to a prompt are

451
00:24:21,320 --> 00:24:25,200
actually benefiting the product 
in the end, or the feature in the

452
00:24:25,200 --> 00:24:27,280
end. 
I've also seen people kind of 

453
00:24:27,280 --> 00:24:30,240
struggle with regards to picking
the right model, but if you have

454
00:24:30,240 --> 00:24:34,240
this in production, I feel like 
picking the right model for your

455
00:24:34,240 --> 00:24:37,280
feature becomes a lot easier. 
Now there's the question of if 

456
00:24:37,280 --> 00:24:40,880
your feature requires many 
models, if you have the factor 

457
00:24:40,880 --> 00:24:43,120
of language, for example, you 
will also have to hook into 

458
00:24:43,120 --> 00:24:45,200
within the same feature, 
different models. 

459
00:24:45,200 --> 00:24:47,040
That gets another layer of 
complexity. 

460
00:24:47,440 --> 00:24:49,280
But if you have
these fundamentals, you can 

461
00:24:49,280 --> 00:24:51,200
really build upon a solid 
foundation. 

462
00:24:51,520 --> 00:24:52,800
Yeah. 
Is there anything that's still 

463
00:24:52,800 --> 00:24:55,720
missing that you want to add on 
top of a setup like

464
00:24:55,720 --> 00:24:58,360
that? 
No, I think we covered

465
00:24:58,360 --> 00:25:01,600
pretty much the basics. 
Yeah, you can always make things

466
00:25:01,600 --> 00:25:05,920
more complicated, like by having
a multi-agent system.

467
00:25:05,920 --> 00:25:08,160
And each of the parts
must be evaluated. 

468
00:25:08,160 --> 00:25:12,280
So basically, observability
gets harder and harder with every

469
00:25:12,680 --> 00:25:15,800
extra level of complexity you 
add and then you need to start 

470
00:25:15,800 --> 00:25:18,800
thinking: do I really need this
complexity? Just to start with,

471
00:25:19,280 --> 00:25:21,120
maybe: do I even need an
agent? 

472
00:25:21,120 --> 00:25:24,320
Do I need an LLM? 
So these kinds of questions we

473
00:25:24,320 --> 00:25:26,840
need to really ask ourselves. 
Those are the best questions. 

474
00:25:27,160 --> 00:25:28,280
If you don't need it, don't use 
it. 

475
00:25:28,840 --> 00:25:32,720
Yeah, but I mean, it's
always like that: people want to

476
00:25:32,720 --> 00:25:35,160
try all these new, shiny, cool
things. 

477
00:25:35,520 --> 00:25:38,680
I don't know, it always has
been like that, right. 

478
00:25:39,800 --> 00:25:42,560
And it's often resume-driven
development.

479
00:25:42,560 --> 00:25:47,560
That's what I see.
Yeah, that's the thing. 

480
00:25:47,600 --> 00:25:50,400
So, yeah, people do it just 
because, you know, they want to 

481
00:25:50,400 --> 00:25:52,200
learn it. 
They think that it's useful for 

482
00:25:52,200 --> 00:25:54,520
their career, whether it's 
actually useful for the business

483
00:25:54,520 --> 00:25:55,960
and the problem they're solving?
Not

484
00:25:55,960 --> 00:25:58,760
always.
You touched on two topics: one is

485
00:25:58,760 --> 00:26:01,200
knowledge and one is more 
organizational maturity. 

486
00:26:01,200 --> 00:26:04,040
I want to cover knowledge first 
because you mentioned these are 

487
00:26:04,040 --> 00:26:07,240
very much data science 
principles in teams that you've 

488
00:26:07,240 --> 00:26:10,000
seen operate effectively in the 
features that they build, do 

489
00:26:10,000 --> 00:26:13,240
they usually have data science 
people embedded in that team? 

490
00:26:13,240 --> 00:26:16,080
Or how do software engineers 
familiarize themselves with 

491
00:26:16,080 --> 00:26:17,840
these concepts and with the rest of
MLOps?

492
00:26:18,360 --> 00:26:23,360
Yeah, well, I think it varies,
it depends on the company. 

493
00:26:23,360 --> 00:26:26,480
I think there are some really 
good examples of teams that 

494
00:26:26,480 --> 00:26:30,360
actually indeed think about that
from a data science perspective.

495
00:26:30,360 --> 00:26:34,160
And I think these are previously
mature organisations that were 

496
00:26:34,160 --> 00:26:36,160
doing MLOps well in the
past. 

497
00:26:36,400 --> 00:26:39,480
So they already have this kind 
of data science and software engineering

498
00:26:39,480 --> 00:26:42,400
way of thinking blended 
together. 

499
00:26:42,800 --> 00:26:45,920
So those do it well; they
know what they're doing. 

500
00:26:46,240 --> 00:26:48,720
But there are teams that, you 
know, they just had some 

501
00:26:48,720 --> 00:26:51,720
software engineering teams and 
they just rebranded them into an

502
00:26:51,720 --> 00:26:58,200
agentic AI whatever. 
Yeah, no, those don't do

503
00:26:58,200 --> 00:26:59,360
well. 
No. 

504
00:26:59,400 --> 00:27:04,000
And also data scientists that are
used to notebooks, and then they

505
00:27:04,000 --> 00:27:07,000
start also developing those 
systems, they also don't do 

506
00:27:07,000 --> 00:27:10,000
well. 
So yeah, I think maturity here 

507
00:27:10,000 --> 00:27:12,440
is the key. 
And we have a very, very long 

508
00:27:12,440 --> 00:27:14,120
way to go to get there. 
Yeah. 

509
00:27:14,280 --> 00:27:16,960
What's the best way for people 
to gain that knowledge? 

510
00:27:16,960 --> 00:27:19,960
Is it to go on YouTube and check
out some videos? 

511
00:27:19,960 --> 00:27:22,920
Is it to go to conferences, 
content, read books? 

512
00:27:23,080 --> 00:27:24,440
What is your advice? 
Yeah. 

513
00:27:27,320 --> 00:27:30,800
I mean, there is just so much
knowledge available already and 

514
00:27:30,800 --> 00:27:34,480
I think you just need to know 
what people to follow and who 

515
00:27:34,480 --> 00:27:36,880
says something that
actually makes sense.

516
00:27:38,280 --> 00:27:40,920
I think there are a lot of 
really nice courses on Maven

517
00:27:41,920 --> 00:27:46,560
that, well, I like a lot.
We also have a course on Maven,

518
00:27:47,160 --> 00:27:53,080
yes, LLMOps with Databricks
specifically. I think we talk

519
00:27:53,080 --> 00:27:58,640
about tools and principles 
first. 

520
00:27:58,640 --> 00:28:02,840
So like how do you apply the 
principles to build proper 

521
00:28:02,840 --> 00:28:05,680
systems? 
And yeah, we do use Databricks

522
00:28:05,680 --> 00:28:08,760
just because it's so
embedded, and there are too

523
00:28:08,760 --> 00:28:11,200
few examples of how to do it
well on Databricks.

524
00:28:11,200 --> 00:28:15,120
So that's why we teach it. 
But I think there are a lot of 

525
00:28:15,840 --> 00:28:19,800
other courses. Hugo
Bowne-Anderson also has a course

526
00:28:20,160 --> 00:28:23,600
on Maven. 
They focus on evaluations and 

527
00:28:23,600 --> 00:28:26,520
building agents. 
So it's not the same: we focus

528
00:28:26,520 --> 00:28:30,120
really on the ops side of things
more and they focus more on 

529
00:28:30,120 --> 00:28:34,120
building good agents, yeah. 
Yeah, what you mentioned in that

530
00:28:34,320 --> 00:28:37,040
an organization needs to be 
mature is really a foundation 

531
00:28:37,040 --> 00:28:39,520
for a team to thrive in the first
place already. 

532
00:28:39,520 --> 00:28:42,680
The fact that some organisations
that I've been in do not have

533
00:28:42,680 --> 00:28:46,040
production data available in the
other environments, and data

534
00:28:46,040 --> 00:28:48,200
availability has always been a 
challenge for them. 

535
00:28:48,520 --> 00:28:52,000
It means that actually getting 
whatever AI feature you're 

536
00:28:52,000 --> 00:28:54,920
building to a mature level is 
going to be very challenging 

537
00:28:54,920 --> 00:28:57,520
because you don't have a solid
foundation to build on top of, 

538
00:28:57,920 --> 00:28:59,720
and then you get all the 
complexity of doing that in the 

539
00:28:59,720 --> 00:29:01,800
first place. 
Yeah, yeah, no, for sure.

540
00:29:01,800 --> 00:29:04,760
I think, like, if you have these
foundations, you will be all 

541
00:29:04,760 --> 00:29:06,760
right. 
If you don't, yeah, then invest 

542
00:29:06,760 --> 00:29:10,240
in those first and don't try to 
do all the shiny things instead.

543
00:29:10,280 --> 00:29:11,760
Yeah. 
But I want to start with the 

544
00:29:11,760 --> 00:29:14,320
results from a business 
perspective. 

545
00:29:14,320 --> 00:29:17,000
Now I get that organizational 
maturity has always been 

546
00:29:17,000 --> 00:29:19,000
complex. 
Yeah, indeed. 

547
00:29:19,000 --> 00:29:24,640
And, like, I think now we've moved
more into consulting, and I

548
00:29:24,640 --> 00:29:27,800
guess before, you're exposed to a
certain set, and now the set

549
00:29:28,120 --> 00:29:30,440
of things you look at, it 
becomes bigger and you just 

550
00:29:30,440 --> 00:29:32,840
realize, you know, everyone is 
struggling. 

551
00:29:33,320 --> 00:29:36,120
Everyone is struggling. 
Yeah, Yeah, it's hard. 

552
00:29:36,280 --> 00:29:40,320
Yeah, I do like that a lot of 
people are looking into this and

553
00:29:40,320 --> 00:29:43,080
the fact that people are. 
That means when people figure 

554
00:29:43,080 --> 00:29:44,680
things out, knowledge is being 
shared. 

555
00:29:44,680 --> 00:29:47,600
Or when new things come out, 
people try it, they figure 

556
00:29:47,600 --> 00:29:49,680
things out and they talk to each
other and they share that 

557
00:29:49,680 --> 00:29:51,800
knowledge. 
That's always been the most fun 

558
00:29:51,800 --> 00:29:54,560
of being part of this industry 
specifically. 

559
00:29:54,960 --> 00:29:56,800
And I don't know any other 
industry that does that. 

560
00:29:56,840 --> 00:30:00,040
Especially now, I feel like it's
happening a lot, and I'm

561
00:30:00,040 --> 00:30:01,920
having a lot of fun with it 
specifically. 

562
00:30:02,000 --> 00:30:04,480
Yeah, no, I agree. 
I also really enjoy sharing my 

563
00:30:04,480 --> 00:30:06,400
knowledge. 
I write a lot on LinkedIn. 

564
00:30:06,400 --> 00:30:10,560
We also have a Substack and, yeah,
I'm also writing a

565
00:30:10,560 --> 00:30:13,320
book. 
So it's a lot of effort writing 

566
00:30:13,320 --> 00:30:16,240
a book. 
Yeah. 

567
00:30:16,240 --> 00:30:19,280
But it's fun because I
think it's needed. 

568
00:30:19,280 --> 00:30:22,520
And I think once you get some 
knowledge, some understanding, 

569
00:30:22,520 --> 00:30:24,040
you need to share it with other 
people. 

570
00:30:25,320 --> 00:30:29,200
So I think it must come from the
right motivations because some 

571
00:30:29,240 --> 00:30:30,720
just want followers. 
I don't know. 

572
00:30:31,200 --> 00:30:34,480
And they don't know what they're
talking about. 

573
00:30:34,760 --> 00:30:36,840
That's what I have also seen,
yeah. 

574
00:30:37,080 --> 00:30:40,800
Yeah, yeah, interesting. 
I like people that share 

575
00:30:40,800 --> 00:30:45,200
knowledge and I do feel like 
great content sometimes you also

576
00:30:45,200 --> 00:30:47,880
have to pay for, but a lot of 
great content is also out there 

577
00:30:47,880 --> 00:30:51,120
free on YouTube, for example, or
a lot of news articles. 

578
00:30:51,120 --> 00:30:54,760
So it's really whatever your 
preference is for consuming

579
00:30:54,760 --> 00:30:56,280
knowledge. 
I feel like that's out there. 

580
00:30:56,400 --> 00:30:58,440
No, definitely. 
This is just very, very fresh. 

581
00:30:58,600 --> 00:31:01,920
Yeah, no, we also have a free
MLOps course on YouTube.

582
00:31:01,920 --> 00:31:04,600
It's MLOps, so it's not about
agents. 

583
00:31:04,600 --> 00:31:08,080
But I mean, it's really the 
same, like a lot of these things

584
00:31:08,080 --> 00:31:11,400
are just the same.
And I think people don't realize

585
00:31:11,400 --> 00:31:14,080
how close it is to each other. 
Yeah, yeah. 

586
00:31:14,840 --> 00:31:17,560
Somehow it kind of feels like
it's a whole new world. 

587
00:31:17,560 --> 00:31:19,920
I mean, it's not. 
It's been there forever.

588
00:31:19,920 --> 00:31:21,720
For you, it's obvious.
Yeah, yeah, yeah. 

589
00:31:22,480 --> 00:31:25,040
We put a pin in agents
specifically because I feel like

590
00:31:25,040 --> 00:31:27,680
agents add another layer of
complexity because if you have 

591
00:31:27,680 --> 00:31:30,840
different autonomous agents, 
especially in production, things

592
00:31:30,840 --> 00:31:33,680
get even more complex. 
But let's say I have a mature 

593
00:31:33,680 --> 00:31:35,920
organization with good
data availability. 

594
00:31:36,200 --> 00:31:39,720
I have a setup where a model is
behind an API call and I have my

595
00:31:39,720 --> 00:31:42,640
evaluation pipeline and I have 
my observability in production 

596
00:31:43,280 --> 00:31:46,320
and potentially not even one model
but different models and now I 

597
00:31:46,320 --> 00:31:48,640
want to have some type of agent 
functionality. 

598
00:31:48,800 --> 00:31:52,400
I feel like the kind of most 
cookie cutter example that I can

599
00:31:52,400 --> 00:31:56,080
think of is something in 
customer service with regards to

600
00:31:56,280 --> 00:31:58,840
having an interaction with a 
person and then autonomously 

601
00:31:58,840 --> 00:32:02,040
fetching data from wherever it 
needs to about a certain ticket,

602
00:32:02,040 --> 00:32:05,240
about a certain order. 
How do I put that in production?

603
00:32:06,160 --> 00:32:07,920
Yeah, I don't think anyone 
knows, no. 

604
00:32:08,440 --> 00:32:11,040
That's really the freshest. 
How to do it well?

605
00:32:11,040 --> 00:32:15,560
Well, I think it's hard because,
I mean, now we need to think 

606
00:32:15,560 --> 00:32:19,680
about security and all the 
attack vectors that can happen 

607
00:32:19,680 --> 00:32:21,880
on your systems. 
Because if you're dealing with 

608
00:32:21,880 --> 00:32:25,080
actual customer data and some 
other customer can, I don't know,

609
00:32:25,080 --> 00:32:30,280
by mistake or by intent,
get data from other customers. 

610
00:32:30,280 --> 00:32:32,160
Or they try to get information
about,

611
00:32:32,640 --> 00:32:37,240
I don't know, how much sales there are,
things that they are not

612
00:32:37,240 --> 00:32:40,480
supposed to know, right? 
And you can trick the

613
00:32:40,480 --> 00:32:44,800
system into revealing this
information, too.

614
00:32:45,200 --> 00:32:47,360
Yeah, it's very hard. 
I think it's really

615
00:32:47,360 --> 00:32:50,440
impossible to completely prevent
it from happening. 

616
00:32:51,160 --> 00:32:53,640
And I think that's something 
that we must be aware of. 

617
00:32:53,880 --> 00:32:57,400
And, like, what level of
risk are you accepting? 

618
00:32:57,400 --> 00:32:59,680
What is acceptable for your
company? Because it's

619
00:32:59,680 --> 00:33:01,320
reputational risk in the end,
right? 

620
00:33:01,600 --> 00:33:07,840
Say, I don't know, we go
to an online store and we

621
00:33:07,840 --> 00:33:12,480
want to know about the status of
a return, or maybe something like

622
00:33:12,480 --> 00:33:15,160
that. 
And then you ask it, what about 

623
00:33:15,160 --> 00:33:19,960
how many pairs of jeans did I 
order in the last year? 

624
00:33:19,960 --> 00:33:22,080
I mean, that's all great 
information, but then you maybe 

625
00:33:22,080 --> 00:33:26,600
can trick it with: OK, and on
average, how am I doing 

626
00:33:26,920 --> 00:33:29,160
compared to others?
Oh, and by the way, like what 

627
00:33:29,160 --> 00:33:32,360
was the total amount of sales 
and total amount of sales of 

628
00:33:32,360 --> 00:33:34,200
pants? 
How does it compare to the

629
00:33:34,200 --> 00:33:36,360
jackets? 
Things like that, right. 

630
00:33:36,400 --> 00:33:39,320
I mean, step by step, you could 
trick it into revealing 

631
00:33:39,320 --> 00:33:41,080
something that you're not 
supposed to know. 

632
00:33:41,280 --> 00:33:43,680
Yeah. 
And how do you even put 

633
00:33:43,680 --> 00:33:46,120
guardrails in that?
I don't know yet. 

634
00:33:46,120 --> 00:33:47,240
No. 
No one knows. 

635
00:33:47,240 --> 00:33:48,600
OK. 
No one knows. 

636
00:33:49,080 --> 00:33:53,760
When ChatGPT came out, with
3.5 in its earliest versions, a 

637
00:33:53,760 --> 00:33:56,280
colleague of mine got really 
excited to really figure out, 

638
00:33:56,280 --> 00:33:58,520
OK, what are the boundaries of 
this and how things have 

639
00:33:58,520 --> 00:34:01,960
evolved. 
We have 4, we have 5, 5.1, 5.2, and

640
00:34:01,960 --> 00:34:03,640
other companies also have 
different models. 

641
00:34:03,640 --> 00:34:07,160
Gemini 3.0, Opus 1.5,
everything like that. 

642
00:34:07,320 --> 00:34:11,120
But he continued on this path, 
mainly figuring out if he can 

643
00:34:11,120 --> 00:34:14,120
still prompt inject whatever 
malicious intent from the 

644
00:34:14,120 --> 00:34:15,880
perspective of learning, but 
also from a security 

645
00:34:15,880 --> 00:34:17,800
perspective. 
And he agrees with you that 

646
00:34:18,120 --> 00:34:20,719
whatever model comes out, it 
will have flaws from a prompt 

647
00:34:20,719 --> 00:34:22,880
injection perspective. 
We haven't solved that yet, 

648
00:34:23,320 --> 00:34:27,120
which means that no, like we 
haven't, I don't know if we ever

649
00:34:27,120 --> 00:34:29,159
can. 
Like my knowledge, I feel like 

650
00:34:29,159 --> 00:34:33,639
is too shallow for that.
But that also means that agents 

651
00:34:33,880 --> 00:34:35,600
should be in very simple use 
cases. 

652
00:34:35,600 --> 00:34:39,639
They should not have any 
availability to data which can 

653
00:34:39,639 --> 00:34:42,440
cause reputational damage or you
will have to build the right 

654
00:34:42,440 --> 00:34:44,560
guardrails around them.
Human, human in the loop,

655
00:34:44,600 --> 00:34:46,840
Yeah. 
So what I actually like about, 

656
00:34:47,400 --> 00:34:51,719
the way I would like to see agents
is like a personal assistant.

657
00:34:51,719 --> 00:34:52,560
OK, Right. 
Yeah. 

658
00:34:52,719 --> 00:34:56,400
So for example, if you have kind
of customer service 

659
00:34:56,400 --> 00:35:01,160
communication, a person
types something and then 

660
00:35:01,200 --> 00:35:04,120
actually an agent generates a 
response. 

661
00:35:04,680 --> 00:35:08,120
And then maybe, in
certain scenarios, there is another

662
00:35:08,120 --> 00:35:10,520
agent that evaluates:
I mean, is it safe to say that? 

663
00:35:10,840 --> 00:35:14,720
But also, even if that is
certain, I guess, OK, but a human is

664
00:35:14,720 --> 00:35:17,040
actually clicking on yes every 
time. 

665
00:35:17,040 --> 00:35:22,560
So actually the human is doing 
the sending, but not typing 

666
00:35:22,560 --> 00:35:25,080
anymore, and also not the execution
of the tasks. 

667
00:35:25,640 --> 00:35:29,920
And I guess certain things
can go automatically, but just 

668
00:35:29,920 --> 00:35:33,720
some generic things like asking 
for certain information from the

669
00:35:33,720 --> 00:35:37,360
customer that's required to 
retrieve the information there. 

670
00:35:37,360 --> 00:35:40,160
You don't need an agent, like
a human agent, to be

671
00:35:40,160 --> 00:35:42,720
involved. 
But at a certain stage, when we get

672
00:35:42,720 --> 00:35:45,960
into certain questions, I mean, 
a human must always be actually

673
00:35:45,960 --> 00:35:49,160
clicking, yeah. 
And that's not the most exciting

674
00:35:49,160 --> 00:35:50,720
work. 
That's the work that typically 

675
00:35:50,720 --> 00:35:53,160
you would be like, let's 
automate that also. 

676
00:35:53,160 --> 00:35:55,720
I don't think we can. 
I don't think we will ever be 

677
00:35:55,760 --> 00:35:57,880
able to. 
Yeah, that's the challenge. 
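[Editor's sketch] The human-in-the-loop flow described above, where generic replies go out automatically and anything sensitive waits for a person to click yes, could look roughly like this. The `Draft` type, the `sensitive` flag, and the `approve` and `send` callbacks are hypothetical names for illustration; how a draft gets marked sensitive is left open, as it is in the conversation.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Draft:
    text: str
    sensitive: bool  # hypothetical flag, set by some upstream policy


def dispatch(
    draft: Draft,
    approve: Callable[[Draft], bool],
    send: Callable[[str], None],
) -> str:
    """Send generic drafts directly; route sensitive ones through a human."""
    if not draft.sensitive:
        send(draft.text)
        return "sent_automatically"
    if approve(draft):  # the human clicks "yes"
        send(draft.text)
        return "sent_after_approval"
    return "held_for_human"


if __name__ == "__main__":
    outbox = []
    question = Draft("Could you share your order number?", sensitive=False)
    print(dispatch(question, approve=lambda d: False, send=outbox.append))
```

The agent still writes every reply; the design choice is only about who presses send.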

678
00:35:58,280 --> 00:36:02,200
I've had a lot of fun working 
with agents specifically for 

679
00:36:02,200 --> 00:36:05,520
software engineering, for the
producing of my code in whatever

680
00:36:05,520 --> 00:36:07,360
context I need to develop
features. 

681
00:36:08,200 --> 00:36:11,080
And not everyone agrees with me.
Some people really like the 

682
00:36:11,080 --> 00:36:14,600
craft part of things, or writing
good code that they

683
00:36:14,600 --> 00:36:18,240
think is really elegant or
apt for the solution, and they take

684
00:36:18,240 --> 00:36:20,960
a lot of pride in that. 
So there's a lot of aspects with

685
00:36:20,960 --> 00:36:23,720
regards to the people I talk to,
either they're in one camp, I 

686
00:36:23,720 --> 00:36:25,760
don't like it. 
Code generation is kind of taking

687
00:36:25,760 --> 00:36:27,800
away from my craft and that's 
what I found joy in. 

688
00:36:28,040 --> 00:36:30,760
And then I say to really just 
keep doing that and see how else

689
00:36:30,760 --> 00:36:33,600
it can make you productive or 
people like me where they really

690
00:36:33,600 --> 00:36:35,960
enjoy kind of the productivity 
boost and they want to work 

691
00:36:35,960 --> 00:36:37,240
towards outcomes. 
Yeah. 

692
00:36:37,240 --> 00:36:39,440
I'm wondering from your sense, 
what have you found that works 

693
00:36:39,440 --> 00:36:42,080
well for your own producing, 
whatever that is? 

694
00:36:42,240 --> 00:36:45,320
Yeah, I'm opinionated in how
my code should look, but

695
00:36:45,320 --> 00:36:48,920
I already have enough examples 
for, you know, Claude to

696
00:36:48,920 --> 00:36:52,480
generate something that is 
similar to what I think is good.

697
00:36:52,480 --> 00:36:55,000
And I have pretty clear
instructions of how that should look

698
00:36:55,000 --> 00:36:59,000
like and I think the more 
examples you have, the more 

699
00:36:59,000 --> 00:37:02,240
clear the instructions are, the
better, and the closer the

700
00:37:02,240 --> 00:37:05,720
output is to what you need
and what you like. 

701
00:37:05,760 --> 00:37:08,200
And. 
What you find acceptable and 

702
00:37:08,200 --> 00:37:10,560
then it is a productivity boost 
for sure. 

703
00:37:10,800 --> 00:37:13,600
So I think you just need to find
the right balance between these 

704
00:37:13,600 --> 00:37:15,440
two things. 
And I mean, it's not easy. 

705
00:37:15,840 --> 00:37:18,200
It's not easy. 
Let's zoom in on that because I 

706
00:37:18,200 --> 00:37:20,480
feel like opinionated people, 
they have it easy because they 

707
00:37:20,480 --> 00:37:22,280
have an opinion. 
So they know kind of how to 

708
00:37:22,280 --> 00:37:26,760
define at least some kind of 
rules or guidance for something

709
00:37:26,760 --> 00:37:29,000
that generates code. 
Let's say if you don't have an 

710
00:37:29,000 --> 00:37:31,760
opinion, then you need to figure
out what your opinion is in

711
00:37:31,760 --> 00:37:33,720
the first place before you can
actually have an opinion on what

712
00:37:33,720 --> 00:37:35,800
the output is. 
But how do you then structure 

713
00:37:35,800 --> 00:37:37,360
your opinion? 
Is it in skills? 

714
00:37:37,360 --> 00:37:40,600
Is it in a CLAUDE.md file, to
take Claude as a specific

715
00:37:40,600 --> 00:37:41,720
example? 
Yeah. 

716
00:37:41,960 --> 00:37:45,000
How do you structure that? 
Well, I am yet to try skills.

717
00:37:45,240 --> 00:37:50,720
I am behind on this, but yeah, 
it is typically in the

718
00:37:50,880 --> 00:37:55,160
CLAUDE.md file and other .md files
that are pointed to from

719
00:37:55,160 --> 00:37:57,640
CLAUDE.md. I
mean, what I'm saying now is

720
00:37:57,640 --> 00:38:00,320
probably already not the best
practice if you're looking at 

721
00:38:00,320 --> 00:38:02,600
the skills and what's available
there. 

722
00:38:02,960 --> 00:38:06,760
And I have some MCPs configured,
so I know depending on the tasks

723
00:38:06,760 --> 00:38:11,920
that I'm doing. And, for
example, we also now work with 

724
00:38:11,920 --> 00:38:16,920
Linear, and we have some
brainstorm sessions and we write

725
00:38:16,920 --> 00:38:18,120
things down that should be 
there. 

726
00:38:18,480 --> 00:38:22,560
And based on this example, we 
also have a prompt, an

727
00:38:22,560 --> 00:38:24,880
instruction. 
And based on that, it will 

728
00:38:24,880 --> 00:38:28,720
generate tickets in Linear 
as expected. 

729
00:38:28,920 --> 00:38:32,400
You can even maybe when you're 
in the car, you can dictate 

730
00:38:32,400 --> 00:38:34,760
things on the phone and on 
Apple. 

731
00:38:35,360 --> 00:38:37,920
On the iPhone you can also get 
transcriptions right away. 

732
00:38:38,800 --> 00:38:42,560
You could also use to. 
You're always working, even when

733
00:38:42,560 --> 00:38:43,800
driving. 
Yeah, yeah, yeah. 

734
00:38:43,800 --> 00:38:46,320
So I mean, there are so many 
different tricks you could use. I 

735
00:38:46,320 --> 00:38:48,080
think these are really 
productivity boosts. 

736
00:38:48,080 --> 00:38:52,720
So like when you have a 
conversation, you can record it 

737
00:38:52,720 --> 00:38:56,720
and transcribe it and use it to 
create a summary and make a log 

738
00:38:56,720 --> 00:38:58,720
book of everything that's 
happening in your life. 

739
00:38:58,720 --> 00:39:00,800
And it will. 
You will be very thankful for 

740
00:39:01,160 --> 00:39:03,680
doing that half a year 
later, when you've already forgotten 

741
00:39:03,680 --> 00:39:07,360
what you were doing and having 
an interface to actually search 

742
00:39:07,400 --> 00:39:09,720
through that. 
And I think Claude, for example, 

743
00:39:09,720 --> 00:39:14,160
can connect to your Google Drive
and actually be able to retrieve

744
00:39:14,160 --> 00:39:18,680
this information for you. 
So I think for me, it goes way 

745
00:39:18,680 --> 00:39:20,760
beyond just code. 
Yeah. 

746
00:39:20,960 --> 00:39:23,480
I like that a lot. 
I wonder if we get to a future 

747
00:39:23,480 --> 00:39:27,760
where so when computers came out
we started typing and then when 

748
00:39:27,760 --> 00:39:29,520
phones came out we started 
typing on our phones. 

749
00:39:29,520 --> 00:39:32,600
But the way I can speak is way 
faster than my ability to type. 

750
00:39:32,600 --> 00:39:34,280
Exactly. 
And now if I want to be 

751
00:39:34,280 --> 00:39:37,760
productive, if there are 
more tools for me to actually 

752
00:39:37,760 --> 00:39:39,920
speak to. 
And that then becomes context 

753
00:39:39,920 --> 00:39:43,160
for something that executes. 
If we move to a society where 

754
00:39:43,160 --> 00:39:45,560
people speak a lot more, then 
they are also better at 

755
00:39:45,560 --> 00:39:49,200
communicating, better at kind of
explaining themselves instead of

756
00:39:49,200 --> 00:39:51,200
typing. 
Because I feel like the people 

757
00:39:51,200 --> 00:39:53,800
that type are not necessarily 
the best communicators if they 

758
00:39:53,800 --> 00:39:59,120
only type, and for me to be more
productive if I only have to 

759
00:39:59,120 --> 00:40:01,680
speak, that would be good, I 
think as a personal skill as 

760
00:40:01,680 --> 00:40:03,520
well as a life skill. 
Yeah, yeah, yeah. 

761
00:40:03,520 --> 00:40:05,040
No, there is something there for
sure. 

762
00:40:05,040 --> 00:40:07,600
But I think it's still different
talking to yourself versus 

763
00:40:07,600 --> 00:40:09,560
talking to another human, right?
Yeah, yeah. 

764
00:40:10,400 --> 00:40:12,680
But yeah, I agree. 
I think it's a good 

765
00:40:12,760 --> 00:40:14,480
exercise. 
It's interesting. 

766
00:40:14,480 --> 00:40:16,480
To have your thoughts more 
concise. 

767
00:40:16,480 --> 00:40:19,760
And yeah, I do like writing, 
though I don't know, I like 

768
00:40:19,760 --> 00:40:21,360
writing more than talking. 
Somehow, I 

769
00:40:21,880 --> 00:40:23,240
mean, it's... 
Do you like reading? 

770
00:40:25,120 --> 00:40:28,120
No, not personally. Interesting. 
Well, no, actually. 

771
00:40:29,080 --> 00:40:31,680
Well, I just don't have that 
much time for reading. 

772
00:40:31,680 --> 00:40:34,520
I think that's it really. 
So I'm listening more like 

773
00:40:34,520 --> 00:40:37,360
podcasts and stuff or books 
actually, I'm listening to the 

774
00:40:37,360 --> 00:40:39,760
books, but it's mostly because I
just don't have time. 

775
00:40:39,760 --> 00:40:42,720
And when I'm driving, I'm trying
to squeeze all the stuff that I 

776
00:40:42,720 --> 00:40:45,960
want to do into that time. 
So then I'm talking and 

777
00:40:45,960 --> 00:40:47,720
listening. 
But it's not my natural 

778
00:40:47,720 --> 00:40:52,480
preference. I prefer writing 
and reading because, I 

779
00:40:52,480 --> 00:40:55,440
don't know, if I'm searching for 
information, I wouldn't go to 

780
00:40:55,440 --> 00:40:58,920
a YouTube channel unless I really,
really have to just because I 

781
00:40:58,920 --> 00:41:02,240
don't know, maybe it's my 
personality, but I rarely like 

782
00:41:02,240 --> 00:41:06,160
the voice of other people 
talking or I don't know why 

783
00:41:06,640 --> 00:41:10,040
it's really, it's me
really, right? 

784
00:41:10,360 --> 00:41:15,560
But it's always been like that. 
It's not like... So I do like 

785
00:41:15,560 --> 00:41:18,880
reading more because it goes 
more through my lens 

786
00:41:18,880 --> 00:41:20,680
somehow, right? 
Yeah, you get 

787
00:41:20,760 --> 00:41:23,240
that, yeah. 
So it's easier for me to get 

788
00:41:23,240 --> 00:41:26,160
this information out of the text
and visualization rather than, 

789
00:41:26,960 --> 00:41:29,400
you know, hearing. 
I think it's really, really cool

790
00:41:29,400 --> 00:41:31,440
that you say, OK, this is my 
kind of natural preference, but 

791
00:41:31,440 --> 00:41:34,000
I just don't have time for that.
So I do this instead, which is 

792
00:41:34,000 --> 00:41:37,040
still, like, everything in line 
with the rest of productivity. 

793
00:41:37,160 --> 00:41:40,360
Yeah, I learnt something new 
about specifically Claude and 

794
00:41:40,360 --> 00:41:43,880
skills because previously I had 
a big Claude MD file with my 

795
00:41:43,880 --> 00:41:46,720
code conventions and a colleague
of mine said I had so many 

796
00:41:47,000 --> 00:41:49,240
that Claude started complaining 
because when you load in a 

797
00:41:49,240 --> 00:41:52,400
Claude MD file, that's too much.
He said it's about, I think, 1000 

798
00:41:52,400 --> 00:41:55,360
or 2000 lines. 
There are guidelines on this on 

799
00:41:55,360 --> 00:41:59,360
the Anthropic website. 
Then it might actually miss 

800
00:41:59,360 --> 00:42:01,040
parts of the context that it 
loads in. 

801
00:42:01,080 --> 00:42:02,840
And he says that's when I use 
skills. 

802
00:42:02,840 --> 00:42:06,800
So code conventions, styles with
regards to a certain controller 

803
00:42:06,800 --> 00:42:09,320
or yeah, that's what he's been 
using and he's really happy with

804
00:42:09,320 --> 00:42:10,560
that. 
Everyone is so happy with 

805
00:42:10,560 --> 00:42:12,400
skills, so now it's Christmas 
vacation. 

806
00:42:12,400 --> 00:42:13,400
Yeah, you're gonna. 
You're gonna. 

807
00:42:14,160 --> 00:42:16,720
I'm gonna do that, yes. 
I'm finally gonna do it, because I

808
00:42:16,720 --> 00:42:20,320
just have so much stuff I need to do.
Of course, just this little tiny

809
00:42:20,320 --> 00:42:22,200
thing to try already feels too 
much. 

810
00:42:23,240 --> 00:42:25,880
Yeah. 
Where I really use skills, my 

811
00:42:25,880 --> 00:42:27,880
own thing is mainly for content 
creation. 

812
00:42:27,880 --> 00:42:31,360
So episodes come out, I have a 
transcript of an episode, I need

813
00:42:31,360 --> 00:42:33,680
to figure out titles and 
thumbnails and descriptions and 

814
00:42:33,680 --> 00:42:35,720
timestamps. 
I made that into a skill. 

815
00:42:35,720 --> 00:42:37,680
I was like, let's try this thing
out, and it actually works quite

816
00:42:37,680 --> 00:42:38,560
well. 
I'm happy with it. 

817
00:42:38,680 --> 00:42:41,760
That's awesome, yeah. 
But everything with regards to 

818
00:42:41,880 --> 00:42:44,520
personal productivity and 
experimenting, I feel like 

819
00:42:45,040 --> 00:42:46,840
it's just really fun to do. 
To be honest. 

820
00:42:47,120 --> 00:42:52,080
But I don't really have, like, 
very large files like Claude MD 

821
00:42:52,080 --> 00:42:55,280
and other MD files. I mean, they're still
manageable, but they're per 

822
00:42:55,280 --> 00:42:57,520
repo. 
I have many repos and in those 

823
00:42:57,560 --> 00:43:01,600
repos you have this. 
But yeah, it's like, I do like 

824
00:43:01,600 --> 00:43:04,880
mono repos but not for personal 
projects somehow like. 

825
00:43:05,760 --> 00:43:08,080
How do you split those up? 
Personal projects, not a 

826
00:43:08,080 --> 00:43:10,160
mono repo, or? 
No, it's not mono repo. 

827
00:43:10,160 --> 00:43:12,920
It's all separate things. 
Separate repos? 

828
00:43:13,080 --> 00:43:15,480
Like even tiny things. 
Relatively tiny, I think so 

829
00:43:15,840 --> 00:43:17,400
different people. 
Why is that? 

830
00:43:17,400 --> 00:43:20,320
You just want them isolated. 
Yeah, I just want them isolated.

831
00:43:20,320 --> 00:43:22,560
Yeah, OK. 
Yeah, I know it's personal 

832
00:43:22,560 --> 00:43:23,920
preference. 
Yeah, yeah, interesting. 

833
00:43:23,920 --> 00:43:28,040
But when you talk about, I don't
know, actual production, mono 

834
00:43:28,040 --> 00:43:31,120
repo makes a lot of sense for 
different use cases. 

835
00:43:31,120 --> 00:43:33,440
Yeah, for sure. 
Yeah, I think sometimes that 

836
00:43:33,440 --> 00:43:36,560
might be controversial or kind 
of the thought on the 

837
00:43:36,560 --> 00:43:39,280
Internet has shifted around. 
First mono repos were evil and 

838
00:43:39,280 --> 00:43:41,280
now people are coming back on 
that and people are saying it's 

839
00:43:41,280 --> 00:43:43,200
actually quite stable. 
Yeah, right. 

840
00:43:43,240 --> 00:43:47,000
It's so funny how the world 
changes in that regard. 

841
00:43:47,000 --> 00:43:48,040
Yeah. 
Yeah, definitely. 

842
00:43:48,360 --> 00:43:51,600
We've gone through, I feel like 
more fundamentals of MLOps 

843
00:43:51,600 --> 00:43:55,320
specifically for software 
engineers going to production 

844
00:43:55,320 --> 00:43:58,800
with a first model and then also
the challenges of having agents 

845
00:43:58,800 --> 00:44:01,000
and a little bit of personal 
productivity in that 

846
00:44:01,000 --> 00:44:02,680
conversation. 
Is there anything that we missed 

847
00:44:02,680 --> 00:44:05,960
that you still want to share? 
No, I think we covered a lot. 

848
00:44:06,080 --> 00:44:07,760
Awesome. 
And thank you so much for coming

849
00:44:07,760 --> 00:44:09,400
on. 
I really, I really enjoyed this.

850
00:44:09,400 --> 00:44:10,640
Yeah. 
It was fun. 

851
00:44:10,800 --> 00:44:11,960
Cool. 
We're going to round it off 

852
00:44:11,960 --> 00:44:13,240
here. 
If you're still listening, let 

853
00:44:13,240 --> 00:44:15,400
me know in the comments section 
what you thought of this episode

854
00:44:15,640 --> 00:44:16,600
and we'll see you in the next 
one.

