1
00:00:00,000 --> 00:00:02,450
Once I started to learn about 
cognitive psychology, then it 

2
00:00:02,450 --> 00:00:05,028
kinda quickly occurred to me that 
it's a surprise that we're 

3
00:00:05,028 --> 00:00:07,698
capable of writing code at all. 
It just shouldn't be possible 

4
00:00:07,698 --> 00:00:09,690
because the human brain is way 
too limited. 

5
00:00:09,720 --> 00:00:13,293
Today's guest is Adam Tornhill, 
CTO and Founder of CodeScene, 

6
00:00:13,293 --> 00:00:15,750
and author of Your Code as a 
Crime Scene. 

7
00:00:15,870 --> 00:00:18,792
With over two decades studying 
the intersection of psychology 

8
00:00:18,792 --> 00:00:22,208
and software engineering, Adam 
has pioneered behavioral code 

9
00:00:22,208 --> 00:00:24,848
analysis, revealing how team 
dynamics and cognitive 

10
00:00:24,848 --> 00:00:27,987
bottlenecks shape code quality. 
Your concept of your code base 

11
00:00:27,987 --> 00:00:29,730
as a crime scene is quite 
interesting. 

12
00:00:30,000 --> 00:00:32,100
So tell us, what exactly is the 
first thing that you would do? 

13
00:00:32,220 --> 00:00:35,049
If you plot out the change 
frequency of every single file 

14
00:00:35,049 --> 00:00:38,376
in your code base, you have 
maybe 1-2% of your code base 

15
00:00:38,376 --> 00:00:41,190
that accounts for the majority 
of your development work. 

16
00:00:41,220 --> 00:00:44,438
The obvious implication is that
if you want to improve code 

17
00:00:44,438 --> 00:00:46,884
quality, then you should really 
start with the most frequently 

18
00:00:46,884 --> 00:00:49,404
worked on files. 
If you have healthy code, then

19
00:00:49,404 --> 00:00:52,452
your development work is going 
to be not only more than twice 

20
00:00:52,452 --> 00:00:54,459
as quick as someone working 
in unhealthy code. 

21
00:00:54,610 --> 00:00:57,640
It's also going to be 10x more 
predictable. 

22
00:00:57,819 --> 00:01:00,760
Do you think AI will help us in 
software quality? 

23
00:01:00,940 --> 00:01:03,121
Writing code was never the 
bottleneck, but we pretend it 

24
00:01:03,121 --> 00:01:05,248
was. 
The big bottleneck is instead in

25
00:01:05,248 --> 00:01:08,046
understanding existing code, 
where we spend roughly 70% of 

26
00:01:08,046 --> 00:01:10,941
our work week. 
I work with a company that had 

27
00:01:10,941 --> 00:01:13,934
made a massive, massive rollout 
of an AI coding tool. 

28
00:01:13,964 --> 00:01:16,536
And then decided, hey, let's 
scale this up to a couple of 

29
00:01:16,536 --> 00:01:18,239
thousand developers. 
Pretty soon they noticed an 

30
00:01:18,239 --> 00:01:20,580
impact on production that they 
were generating poor code 

31
00:01:20,580 --> 00:01:22,760
leading to bugs. 
They have thousands of lines of 

32
00:01:22,760 --> 00:01:25,125
code that you didn't write, your
teammates didn't write. 

33
00:01:25,214 --> 00:01:27,345
But now you need to understand 
them, and you need to understand

34
00:01:27,345 --> 00:01:45,320
them under time pressure. 
Hello, everyone. 

35
00:01:45,320 --> 00:01:48,318
Welcome back to another new 
episode of the Tech Lead Journal

36
00:01:48,318 --> 00:01:50,570
podcast. 
Today, I'm very excited to have 

37
00:01:50,570 --> 00:01:55,586
Adam Tornhill with me. 
He's the CTO and founder of a 

38
00:01:55,586 --> 00:01:58,967
software quality, software 
metrics tool called CodeScene. 

39
00:01:59,592 --> 00:02:03,078
And actually also represents his
famous book that he authored, 

40
00:02:03,078 --> 00:02:07,386
uh, many years ago titled Your 
Code as a Crime Scene. 

41
00:02:07,896 --> 00:02:10,338
I think it's a very interesting 
title that hopefully we get a 

42
00:02:10,338 --> 00:02:13,181
chance to talk about it, like, 
you know, investigating your 

43
00:02:13,181 --> 00:02:16,605
codebase, trying to figure out, 
uh, where the defects are, where

44
00:02:16,605 --> 00:02:20,155
the problem is. 
And I think Adam is also quite 

45
00:02:20,155 --> 00:02:23,964
experienced, or I would say also
an expert in software quality, and

46
00:02:23,964 --> 00:02:27,306
looking forward also to discussing 
software quality and AI 

47
00:02:27,306 --> 00:02:29,854
later on. 
So Adam, looking forward to 

48
00:02:29,854 --> 00:02:31,386
this conversation. 
Welcome to the show. 

49
00:02:32,238 --> 00:02:34,501
Thank you very much. 
I'm really happy that I could 

50
00:02:34,501 --> 00:02:35,568
join. 
I'm looking forward to this. 

51
00:02:36,759 --> 00:02:39,132
Right. 
Adam, uh, first of all, maybe I 

52
00:02:39,132 --> 00:02:42,180
would like to ask you, looking 
back at your career until now, 

53
00:02:42,180 --> 00:02:45,326
are there certain turning points
that you think are interesting 

54
00:02:45,326 --> 00:02:49,618
that we could learn from you? 
So I think I had multiple 

55
00:02:49,618 --> 00:02:53,368
turning points, but the most 
impactful was roughly 20 years 

56
00:02:53,368 --> 00:02:56,866
ago. 
So back then I had been working 

57
00:02:56,866 --> 00:03:00,508
as a professional developer for 
six, seven years, and I kind of 

58
00:03:00,508 --> 00:03:03,808
made this observation that many 
others have done too, that most 

59
00:03:03,808 --> 00:03:05,233
software projects, they tend to 
fail. 

60
00:03:05,968 --> 00:03:09,568
And they tend to fail miserably 
like, you know, way over budget.

61
00:03:10,258 --> 00:03:13,198
Doesn't live up to customer 
expectations and painful. 

62
00:03:13,978 --> 00:03:16,924
And I kind of wanted to 
understand why is that 

63
00:03:16,924 --> 00:03:19,582
happening? 
So I decided to really, really 

64
00:03:19,582 --> 00:03:23,443
get to the root of the problem 
and I decided to look outside of

65
00:03:23,443 --> 00:03:26,306
technology. 
So that's how I got involved in 

66
00:03:26,306 --> 00:03:28,334
psychology. 
I signed up at the university 

67
00:03:28,334 --> 00:03:31,850
for an introductory course to 
psychology and kind of went on 

68
00:03:31,850 --> 00:03:36,282
for roughly six years and took a
second degree in psychology. 

69
00:03:36,992 --> 00:03:40,250
And that has influenced, and 
still does, a lot in my career, uh, 

70
00:03:40,250 --> 00:03:43,675
going forward. 
So, uh, I think the main advice 

71
00:03:43,675 --> 00:03:47,062
is that there's so much to learn
from other disciplines too. 

72
00:03:47,062 --> 00:03:51,346
So even if we work within tech, 
there's a lot to learn from the 

73
00:03:51,346 --> 00:03:53,482
human sciences, from behavioral 
psychology. 

74
00:03:54,292 --> 00:03:58,192
And um, I think that being able 
to combine multiple disciplines,

75
00:03:58,222 --> 00:03:59,972
uh, gives us an edge as 
developers. 

76
00:04:00,747 --> 00:04:04,377
Before we continue, I want to 
tell you about our sponsor 

77
00:04:04,377 --> 00:04:08,231
Unleash, who helps engineering 
teams ship code faster with less

78
00:04:08,231 --> 00:04:11,440
risk using feature flags. 
Let's hear what they have to 

79
00:04:11,440 --> 00:04:17,783
say. 
I am Egil, CEO of Unleash. 

80
00:04:17,903 --> 00:04:21,147
We believe that AI is having a 
tremendous impact on how 

81
00:04:21,147 --> 00:04:23,703
software is being delivered. 
It's going faster than ever 

82
00:04:23,703 --> 00:04:25,923
before. 
And if you look at the latest 

83
00:04:25,923 --> 00:04:29,213
DORA reports, we already see 
that more than 90% of developers

84
00:04:29,213 --> 00:04:31,293
are using AI in their day-to-day 
work. 

85
00:04:31,473 --> 00:04:34,017
AI is getting software to 
production faster, but we're 

86
00:04:34,017 --> 00:04:37,136
also seeing that the AI is 
impacting the stability of the 

87
00:04:37,136 --> 00:04:39,444
software in production. 
By the end of the day, what you 

88
00:04:39,444 --> 00:04:42,248
can say is that AI is getting 
more code to production faster. 

89
00:04:42,488 --> 00:04:44,378
But at the same time more bugs 
are coming with it. 

90
00:04:44,558 --> 00:04:47,392
Google is a cautionary tale. 
They talk about how more than 

91
00:04:47,392 --> 00:04:49,328
30% of their software is written
with AI. 

92
00:04:49,848 --> 00:04:52,048
At the same time, you'll 
remember this big outage that 

93
00:04:52,048 --> 00:04:55,535
happened back in June this year 
where a minor update in their 

94
00:04:55,535 --> 00:04:58,208
backend services took down 
Google Mail, BigQuery, and 

95
00:04:58,208 --> 00:04:59,658
basically half of the internet, 
right? 

96
00:04:59,898 --> 00:05:02,598
So if you go and read their 
postmortem after this incident, 

97
00:05:02,598 --> 00:05:05,478
really, what it says is that if 
this minor change had been 

98
00:05:05,478 --> 00:05:08,572
flag-protected, the incident 
wouldn't have occurred in the first 

99
00:05:08,572 --> 00:05:10,836
place. 
So feature flags are the fastest

100
00:05:10,836 --> 00:05:13,188
and safest way to roll back code
in production. 

101
00:05:13,338 --> 00:05:15,258
It always made sense for 
human-written code. 

102
00:05:15,348 --> 00:05:17,503
And with AI-assisted 
development, it makes even more 

103
00:05:17,503 --> 00:05:19,476
sense. 
So this is what we call 

104
00:05:19,476 --> 00:05:21,492
FeatureOps. 
So where DevOps is all about 

105
00:05:21,492 --> 00:05:24,790
getting code into production, 
FeatureOps is all about how code

106
00:05:24,790 --> 00:05:26,778
operates in production in real 
time. 

107
00:05:27,618 --> 00:05:30,736
This is why teams at 
Prudential, Wayfair, Visa, and 

108
00:05:30,736 --> 00:05:34,274
many, many others are using 
Unleash as their FeatureOps 

109
00:05:34,274 --> 00:05:38,328
control plane to move fast, stay
safe, and be in control. 

110
00:05:38,808 --> 00:05:42,970
Learn more about how to ship code 
faster with less risk at 

111
00:05:42,970 --> 00:05:46,057
getunleash.io. 
And now let's get back to our 

112
00:05:46,057 --> 00:05:49,082
episode. 
Yeah, thank you for sharing your

113
00:05:49,082 --> 00:05:51,828
unique journey, right? 
I think studying psychology and 

114
00:05:51,828 --> 00:05:53,808
software engineering at the same
time. 

115
00:05:53,838 --> 00:05:57,258
Um, it's not quite often I hear 
people doing so. 

116
00:05:57,558 --> 00:06:00,138
But I think that gives you 
quite a unique perspective. 

117
00:06:00,318 --> 00:06:04,050
And I know that you have been 
applying these two disciplines 

118
00:06:04,050 --> 00:06:07,283
quite rigorously either in your 
research, in your current work 

119
00:06:07,283 --> 00:06:10,376
and also your product. 
So tell us maybe what's the 

120
00:06:10,376 --> 00:06:14,069
interesting fusion that you have
found out about engineering and 

121
00:06:14,069 --> 00:06:17,018
psychology. 
There's so much to it. 

122
00:06:17,168 --> 00:06:19,982
So learning psychology 
completely changed how I 

123
00:06:19,982 --> 00:06:23,798
approach, uh, software. 
So first of all, I got 

124
00:06:23,798 --> 00:06:27,548
explanations for many phenomena. 
So we all know about the 

125
00:06:27,548 --> 00:06:29,794
challenges of scaling 
development teams and getting a 

126
00:06:29,794 --> 00:06:32,198
large organization to kind of pull 
in the same direction. 

127
00:06:33,038 --> 00:06:36,008
And what's so interesting is 
that social psychologists, they 

128
00:06:36,008 --> 00:06:41,390
have studied this for decades. 
There is a lot of knowledge, and

129
00:06:41,390 --> 00:06:44,648
a lot of the problems we try to 
solve within the software 

130
00:06:44,648 --> 00:06:46,643
industry have already been solved
in psychology. 

131
00:06:47,043 --> 00:06:49,073
So that's like the obvious 
learnings. 

132
00:06:49,913 --> 00:06:53,204
The second one that might be a, 
that surprised me a little bit 

133
00:06:53,204 --> 00:06:56,188
more was that once I started to 
learn about cognitive 

134
00:06:56,188 --> 00:06:59,201
psychology, which is where we 
learn much about how people 

135
00:06:59,201 --> 00:07:02,555
think, reason, and solve 
problems, it kinda quickly occurred

136
00:07:02,555 --> 00:07:06,221
to me that it's a surprise that 
we're capable of writing code at

137
00:07:06,221 --> 00:07:08,023
all. 
It just shouldn't be possible 

138
00:07:08,023 --> 00:07:10,403
because the human brain is way 
too limited. 

139
00:07:10,793 --> 00:07:13,466
You, you know, in cognitive 
psychology you learn about all 

140
00:07:13,466 --> 00:07:15,953
these cognitive bottlenecks that
we have. 

141
00:07:16,763 --> 00:07:19,763
But surprisingly the human brain
is very good at workarounds. 

142
00:07:20,573 --> 00:07:23,554
So I thought that there's a lot 
of lessons to pull out of 

143
00:07:23,554 --> 00:07:26,411
cognitive psychology, because if
we are aware of what the 

144
00:07:26,411 --> 00:07:29,449
cognitive bottlenecks are, then 
we can start to design our 

145
00:07:29,449 --> 00:07:30,983
software around those 
bottlenecks, right? 

146
00:07:30,983 --> 00:07:33,458
In tandem with the bottlenecks 
rather than trying to fight them

147
00:07:33,458 --> 00:07:37,308
all the time, which like 90% of 
all code out there does. 

148
00:07:38,468 --> 00:07:42,018
So that's one area that I 
applied directly from psychology

149
00:07:42,018 --> 00:07:44,065
to software. 
Very interesting. 

150
00:07:44,065 --> 00:07:47,162
I hope we get to dive deeper 
into the cognitive aspect. 

151
00:07:47,342 --> 00:07:50,069
But interestingly when you 
mention this, right, many 

152
00:07:50,069 --> 00:07:52,916
software engineering teams, 
especially now where everyone is

153
00:07:52,916 --> 00:07:54,872
building technology within their
companies, right? 

154
00:07:55,273 --> 00:07:59,375
So many leaders I think still do
not understand the psychological

155
00:07:59,375 --> 00:08:02,255
aspect of software development 
or software development teams. 

156
00:08:02,585 --> 00:08:06,731
So would you say that every 
engineering leader these days 

157
00:08:06,731 --> 00:08:08,975
must understand a bit of this 
psychology? 

158
00:08:10,101 --> 00:08:12,865
It gets very important in order 
to build an efficient team. 

159
00:08:13,575 --> 00:08:17,745
There's so many wasteful 
ceremonies and procedures that 

160
00:08:17,745 --> 00:08:20,487
we kind of insist on carrying on
because we've gotten used to 

161
00:08:20,487 --> 00:08:22,497
'em. 
And, uh, I mean, to just give 

162
00:08:22,497 --> 00:08:26,505
you a quick example on something
I see almost on a daily basis. 

163
00:08:26,835 --> 00:08:30,000
Brainstorming, right? 
It's something that's been 

164
00:08:30,000 --> 00:08:32,100
around for a long, long time in 
the current format. 

165
00:08:32,130 --> 00:08:36,134
It's an idea from the 1950s. 
And if we look at the research done

166
00:08:36,134 --> 00:08:39,330
on it, 
then the research is, you know, 

167
00:08:39,330 --> 00:08:42,630
it's very clear that 
brainstorming just doesn't work.

168
00:08:43,559 --> 00:08:46,750
And the reason it doesn't work 
is because the whole social 

169
00:08:46,750 --> 00:08:51,189
situation is it's like an open 
invitation to lots and lots of, 

170
00:08:51,189 --> 00:08:54,927
uh, social biases. 
And, uh, being aware of that as 

171
00:08:54,927 --> 00:08:57,444
a technical leader immediately 
makes it possible to avoid that 

172
00:08:57,444 --> 00:09:00,300
type of waste and also to get 
the most outta your team. 

173
00:09:00,300 --> 00:09:02,250
So I think it's, uh, very 
useful. 

174
00:09:02,250 --> 00:09:06,270
I highly recommend any technical
leader or manager to dip their 

175
00:09:06,270 --> 00:09:09,158
toes into psychology. 
Right. 

176
00:09:09,578 --> 00:09:10,983
Yeah. 
Especially, I would say software

177
00:09:10,983 --> 00:09:13,783
engineering is more like a 
knowledge worker kind of type of

178
00:09:13,783 --> 00:09:16,688
work, right? 
Where you, you use more of your 

179
00:09:16,688 --> 00:09:20,289
brain power and definitely like 
cognitive, you know, cognitive 

180
00:09:20,289 --> 00:09:23,519
load, psychological aspect. 
And also we work in teams rather

181
00:09:23,519 --> 00:09:26,283
than mostly solo, right? 
So I think it's also important 

182
00:09:26,283 --> 00:09:29,175
to understand psychology from 
other human behaviors part. 

183
00:09:29,565 --> 00:09:32,960
So you mentioned in the very 
beginning that you took your 

184
00:09:32,960 --> 00:09:35,941
turning point simply because you
found so many software projects 

185
00:09:35,941 --> 00:09:38,228
failing. 
And this was way back many, many

186
00:09:38,228 --> 00:09:40,613
years ago. 
I still think statistically many

187
00:09:40,613 --> 00:09:42,748
software projects still kind of 
like fail. 

188
00:09:43,078 --> 00:09:45,838
Uh, it seems like we haven't 
really learned our lessons. 

189
00:09:46,018 --> 00:09:49,237
And it goes back mostly to 
software quality or code 

190
00:09:49,237 --> 00:09:52,824
quality. 
So tell us why this is still the

191
00:09:52,824 --> 00:09:56,340
case, why we have advanced so 
many technologies, uh, within so

192
00:09:56,340 --> 00:09:58,683
many years, but it seems like 
this problem cannot be solved. 

193
00:09:59,772 --> 00:10:02,282
Oh, uh, I'm afraid that you are 
correct. 

194
00:10:02,282 --> 00:10:04,782
It's actually quite depressing 
looking back. 

195
00:10:05,622 --> 00:10:08,855
But there are certain things we 
know today that we didn't know 

196
00:10:08,855 --> 00:10:13,089
10 or 20 years ago, and, uh, we 
can use that knowledge to our 

197
00:10:13,089 --> 00:10:16,127
advantage. 
So the reason I think so many 

198
00:10:16,127 --> 00:10:19,540
projects fail, I mean the, there
are many, many reasons, but a 

199
00:10:19,540 --> 00:10:23,164
very common root cause is that 
we have poor quality code. 

200
00:10:23,464 --> 00:10:27,321
And if we have poor quality 
code, then the main problem is 

201
00:10:27,321 --> 00:10:30,473
that a manager, they won't see 
the root cause, they will just 

202
00:10:30,473 --> 00:10:33,015
see the symptoms. 
So they will see a project 

203
00:10:33,015 --> 00:10:35,117
that's unpredictable. 
You think things are going to 

204
00:10:35,117 --> 00:10:37,589
take a day and they end up 
taking two months. 

205
00:10:37,889 --> 00:10:39,599
There are all these unknown 
unknowns. 

206
00:10:40,379 --> 00:10:44,300
And uh, you might also see that 
it becomes really, really hard. 

207
00:10:44,330 --> 00:10:46,616
So it becomes really hard to 
guess when something will be 

208
00:10:46,616 --> 00:10:49,190
done, which puts a lot of 
pressure not only on the 

209
00:10:49,190 --> 00:10:50,630
leadership team, but also on 
developers. 

210
00:10:51,290 --> 00:10:53,813
And then of course, once 
something is done, you have a 

211
00:10:53,813 --> 00:10:56,852
ton of rework that you cannot 
anticipate upfront due to poor 

212
00:10:56,852 --> 00:10:58,652
quality, right? 
So you have all these symptoms 

213
00:10:58,652 --> 00:11:03,056
that we have all seen. 
But the root cause, poor quality, 

214
00:11:03,056 --> 00:11:05,720
remains largely, it's largely a 
black box. 

215
00:11:06,330 --> 00:11:09,156
And that I think is the big 
tragedy of software that it's so

216
00:11:09,156 --> 00:11:11,738
hard to have a communication 
around something as technical as

217
00:11:11,738 --> 00:11:14,150
source code with a 
non-technical stakeholder. 

218
00:11:15,080 --> 00:11:17,672
Because even we as developers, 
we struggle with understanding 

219
00:11:17,672 --> 00:11:19,670
code we didn't write ourselves, 
right? 

220
00:11:20,405 --> 00:11:22,895
So how should we expect someone 
that doesn't write code 

221
00:11:22,895 --> 00:11:26,441
themselves to understand it? 
So code is very abstract, and 

222
00:11:26,441 --> 00:11:29,745
that's what I've been working on
for like the past 10 years, to 

223
00:11:29,745 --> 00:11:32,105
try to bring visibility to 
source code, to make it 

224
00:11:32,135 --> 00:11:36,275
accessible not only to technical
people but also to 

225
00:11:36,275 --> 00:11:39,990
non-technical, non-coders. 
Yeah, you mentioned the keyword 

226
00:11:39,990 --> 00:11:43,032
there, abstract. 
Because I think many, if we look

227
00:11:43,032 --> 00:11:45,903
at many other engineering 
disciplines, right, it seems like

228
00:11:45,903 --> 00:11:48,816
they are able to do a much better 
job in terms of, I dunno, like 

229
00:11:49,146 --> 00:11:51,486
chunking the work, making more 
estimates. 

230
00:11:51,846 --> 00:11:54,740
Is it because like code is 
abstract such that it's very 

231
00:11:54,740 --> 00:11:58,024
difficult to kind of like 
maintain the quality or even 

232
00:11:58,024 --> 00:12:01,476
like have a bar of quality that 
we all can aspire to hit? 

233
00:12:01,476 --> 00:12:03,276
Because like for example, in 
other engineering, they might 

234
00:12:03,276 --> 00:12:06,434
have some standards, right? 
And they just need to, you know,

235
00:12:06,434 --> 00:12:09,840
work within the standard. 
But for code, I guess it's very 

236
00:12:09,840 --> 00:12:12,772
difficult to say what kind of 
quality every software team 

237
00:12:12,772 --> 00:12:15,202
needs to aspire to. 
Is that part of the problem? 

238
00:12:15,202 --> 00:12:16,882
Why this is such a difficult 
problem? 

239
00:12:17,875 --> 00:12:20,665
It's part of the problem, but I 
kind of think that the key 

240
00:12:20,665 --> 00:12:23,993
problem is that we don't have 
any physics for source code, 

241
00:12:23,993 --> 00:12:25,766
right? 
We cannot weigh it. 

242
00:12:25,766 --> 00:12:28,406
We cannot take a software 
system, you know, pull it out 

243
00:12:28,406 --> 00:12:30,806
and turn it around and inspect 
it for technical debt. 

244
00:12:30,836 --> 00:12:34,770
It's just not doable. 
And that is what makes it so 

245
00:12:34,770 --> 00:12:37,005
hard. 
Uh, second problem, I think, is 

246
00:12:37,005 --> 00:12:41,306
that for a long, long time, we 
haven't really had any way of 

247
00:12:41,306 --> 00:12:43,346
measuring quality in a reliable 
way. 

248
00:12:43,871 --> 00:12:48,461
And that means that, you know, 
if you sit down and talk to 20 

249
00:12:48,461 --> 00:12:50,953
developers, you get thirty 
different answers on what good 

250
00:12:50,953 --> 00:12:53,999
source code is. 
So I think there are many 

251
00:12:53,999 --> 00:12:56,811
challenges like that, that we 
have been trying to tackle with 

252
00:12:56,811 --> 00:12:59,010
our research. 
Yeah. 

253
00:12:59,340 --> 00:13:02,393
I guess one part of the software
engineering teams whenever we 

254
00:13:02,393 --> 00:13:05,649
discuss code quality, 
right, is the so-called gap

255
00:13:05,649 --> 00:13:08,100
of understanding the code 
quality within the team itself. 

256
00:13:08,160 --> 00:13:10,572
Then the next thing is actually 
to explain it to the 

257
00:13:10,572 --> 00:13:13,320
stakeholders, the non-technical 
people about code quality. 

258
00:13:13,592 --> 00:13:15,542
We always use technical debt, 
technical debt. 

259
00:13:15,542 --> 00:13:18,512
First of all, is it the 
right term for us to communicate

260
00:13:18,512 --> 00:13:23,301
with other non-technical people?
Or if so, uh, what would be the 

261
00:13:23,301 --> 00:13:26,697
best way to convey this kind of 
understanding about the 

262
00:13:26,697 --> 00:13:30,610
importance of code quality? 
That's an interesting point. 

263
00:13:30,610 --> 00:13:34,462
I think technical debt as a 
term, I think the original usage

264
00:13:34,462 --> 00:13:38,457
of it back in the 1990s as Ward 
Cunningham kind of coined it. 

265
00:13:38,457 --> 00:13:42,516
I think that one is useful. 
But as you kind of indicate,

266
00:13:42,516 --> 00:13:45,844
the term technical debt has been
so diluted over the years and 

267
00:13:45,844 --> 00:13:49,190
now we use that term for 
anything that we don't like or 

268
00:13:49,190 --> 00:13:50,968
disagree with. 
And that's not useful. 

269
00:13:50,998 --> 00:13:54,100
That's not helpful. 
So I think the proper solution 

270
00:13:54,100 --> 00:13:57,878
is that we need to align 
whatever we call software 

271
00:13:57,878 --> 00:14:02,998
quality with a business outcome.
Whatever we mean with high 

272
00:14:02,998 --> 00:14:05,908
quality has to be something that
benefits the business. 

273
00:14:05,908 --> 00:14:09,727
Otherwise, it's really just a 
vanity metric and that's not 

274
00:14:09,727 --> 00:14:11,983
useful. 
That's going to hurt trust 

275
00:14:11,983 --> 00:14:15,040
rather than empower us as 
developers to do the right 

276
00:14:15,040 --> 00:14:16,651
thing. 
Yeah. 

277
00:14:16,681 --> 00:14:20,432
And you wrote this paper a long 
time back, the business impact 

278
00:14:20,432 --> 00:14:23,910
of code quality. 
So tell us, um, what kind of 

279
00:14:23,910 --> 00:14:27,402
findings that you found out when
writing that paper about the 

280
00:14:27,402 --> 00:14:29,080
benefits of good software 
quality? 

281
00:14:29,080 --> 00:14:31,825
And how did you define good 
software quality back then? 

282
00:14:33,093 --> 00:14:34,868
Yeah. 
Uh, let's start with the hard 

283
00:14:34,868 --> 00:14:36,513
problem. 
How do we define good quality? 

284
00:14:36,513 --> 00:14:41,730
So, one of my favorite books of 
all time is Zen and the Art of 

285
00:14:41,730 --> 00:14:43,882
Motorcycle Maintenance by Robert
Pirsig. 

286
00:14:43,902 --> 00:14:45,282
It's such a brilliant book, 
right? 

287
00:14:45,282 --> 00:14:48,252
It influenced a lot of my 
views on quality. 

288
00:14:49,032 --> 00:14:52,931
But if I do like a sloppy 
summary of that book, it's 

289
00:14:52,931 --> 00:14:56,231
basically about a really, really
intelligent guy that, uh, ends 

290
00:14:56,231 --> 00:15:00,074
up in a mental hospital, because
he kind of tries to define 

291
00:15:00,074 --> 00:15:03,882
quality and he fails. 
So it's a really, really, really

292
00:15:03,882 --> 00:15:07,318
hard problem. 
What we thought was that instead

293
00:15:07,318 --> 00:15:10,741
of trying to define what good 
quality is, let's try to get 

294
00:15:10,741 --> 00:15:14,070
developers to agree on what bad 
quality is, because that is a 

295
00:15:14,070 --> 00:15:17,284
much easier problem. 
So there are always these things

296
00:15:17,284 --> 00:15:21,208
in software like, you know, 
excessive copy-paste, deeply nested 

297
00:15:21,208 --> 00:15:23,859
control logic, excessively long 
functions. 

298
00:15:23,859 --> 00:15:26,809
It's the kind of stuff that we 
can agree upon that this is bad.

299
00:15:27,829 --> 00:15:32,034
So we defined a new metric 
called code health. This 

300
00:15:32,034 --> 00:15:35,388
goes back like almost 10 years, 
to when we started work on that code health

301
00:15:35,388 --> 00:15:37,713
metric. 
So we identified like 25 

302
00:15:37,713 --> 00:15:40,868
different factors that we can 
kind of agree that they, these 

303
00:15:40,868 --> 00:15:44,718
are bad practices in source 
code, then we started to measure 

304
00:15:44,718 --> 00:15:48,785
them and aggregate them. 
And that made it possible to 

305
00:15:48,785 --> 00:15:50,878
classify code as healthy or
unhealthy. 

306
00:15:52,078 --> 00:15:54,765
And then to actually connect it 
to some business impact, what we

307
00:15:54,765 --> 00:15:57,178
did was that we started to 
collect data. 

308
00:15:57,598 --> 00:16:01,253
And, uh, we went for, you know, 
real enterprises, real 

309
00:16:01,253 --> 00:16:03,328
production code, closed source 
development. 

310
00:16:03,568 --> 00:16:06,716
So we didn't have access to the 
source code, but working closely

311
00:16:06,716 --> 00:16:09,934
with those companies, they gave 
us access to their code health 

312
00:16:09,934 --> 00:16:12,362
scores as well as their, uh, 
JIRA data. 

313
00:16:13,022 --> 00:16:16,374
And that made it possible to 
calculate how long have you 

314
00:16:16,374 --> 00:16:19,750
spent working on a piece of code
and correlate that to its code 

315
00:16:19,750 --> 00:16:22,678
health category. 
And by doing that, what we could

316
00:16:22,678 --> 00:16:25,213
show is that if you have 
healthy code, then your 

317
00:16:25,213 --> 00:16:28,746
development work is going to be 
not only more than twice as 

318
00:16:28,746 --> 00:16:32,316
quick as someone working in 
unhealthy code, it's 

319
00:16:32,316 --> 00:16:35,448
also going to be 10x more 
predictable. 

320
00:16:36,258 --> 00:16:39,593
And what that means in practice 
is that if you have a healthy 

321
00:16:39,593 --> 00:16:43,805
codebase, then your development 
work is gonna be very 

322
00:16:43,805 --> 00:16:45,343
predictable. 
You know roughly how long 

323
00:16:45,343 --> 00:16:47,963
something is going to take to 
wrap up because there are no 

324
00:16:47,963 --> 00:16:51,707
nasty surprises down the road. 
If you have unhealthy code, then

325
00:16:51,707 --> 00:16:54,293
a task can take you up to 10 
times longer. 

326
00:16:55,658 --> 00:16:59,808
And that is what's causing stress,
confusion, and overtime. 

327
00:17:00,518 --> 00:17:03,070
So that was some of the 
contributions we made in the 

328
00:17:03,070 --> 00:17:05,288
"Code Red: The Business Impact 
of Code Quality" paper. 

329
00:17:06,584 --> 00:17:10,098
And yeah, one other thing is 
like, uh, you found 15 

330
00:17:10,098 --> 00:17:13,042
times fewer bugs, uh, for 
software that has higher 

331
00:17:13,042 --> 00:17:15,411
quality. 
So I think those are kind of 

332
00:17:15,411 --> 00:17:18,864
like remarkable, right? 
I think the, when you mentioned 

333
00:17:18,864 --> 00:17:22,666
predictability, I think many 
stakeholders still kind of don't

334
00:17:22,666 --> 00:17:27,115
treat it as equal to the number 
of bugs and speed of, you know, 

335
00:17:27,115 --> 00:17:29,438
development, right? 
Predictability I think is very, 

336
00:17:29,438 --> 00:17:31,931
very important, especially when 
you build software products that

337
00:17:31,931 --> 00:17:35,688
evolve over time. 
So why do you think, you know, 

338
00:17:35,688 --> 00:17:39,211
predictability is kind of like 
put on the backseat rather than,

339
00:17:39,211 --> 00:17:41,346
you know, being put as more 
focused? 

340
00:17:41,346 --> 00:17:44,166
Because people always think, 
okay, I need to deliver fast. 

341
00:17:44,166 --> 00:17:46,866
I need to deliver without bugs. 
And hence, maybe I add more 

342
00:17:46,866 --> 00:17:48,576
people, I add more tools and 
things like that. 

343
00:17:48,576 --> 00:17:51,189
But predictability is something 
that is kind of like taking a 

344
00:17:51,189 --> 00:17:53,646
backseat. 
Maybe some thoughts about this. 

345
00:17:54,586 --> 00:17:56,821
Yeah, that, that's an 
interesting observation. 

346
00:17:56,821 --> 00:17:59,656
So I do agree. 
I think that's correct. 

347
00:17:59,656 --> 00:18:02,493
So, uh, what I've seen is that 
organizations value different 

348
00:18:02,493 --> 00:18:04,622
things. 
So most organizations, they 

349
00:18:04,622 --> 00:18:07,771
value development speed, right? 
The quicker you can get the 

350
00:18:07,771 --> 00:18:10,111
feature out, the shorter your 
time to market, the better. 

351
00:18:10,831 --> 00:18:13,231
And then of course, the 
organizations that understand 

352
00:18:13,231 --> 00:18:16,895
the cost of rework and the, 
you know, hurting the 

353
00:18:16,895 --> 00:18:19,654
whole product maturity and 
customer relationship by putting

354
00:18:19,654 --> 00:18:22,020
a lot of bugs into production, 
right? 

355
00:18:22,020 --> 00:18:25,135
So they understand that. 
And many organizations even have

356
00:18:25,135 --> 00:18:27,720
a cost associated with defects, 
right? 

357
00:18:28,500 --> 00:18:30,210
But predictability, you might be
right. 

358
00:18:30,270 --> 00:18:32,430
It's not something that people 
measure. 

359
00:18:32,460 --> 00:18:35,922
But my experience is that I 
think predictability is 

360
00:18:35,922 --> 00:18:38,670
important to everyone, even if 
we're not aware of it. 

361
00:18:39,420 --> 00:18:43,279
Because I'm yet to meet a 
manager that, uh, likes 

362
00:18:43,279 --> 00:18:46,127
uncertainty, right? 
As a manager, you absolutely 

363
00:18:46,127 --> 00:18:48,044
hate that. 
You want to know when are things

364
00:18:48,044 --> 00:18:51,439
going to get done. 
So I think it's important. 

365
00:18:51,459 --> 00:18:54,729
And as a developer, I also 
strongly dislike uncertainty. 

366
00:18:54,729 --> 00:18:58,999
Because, again, it's causing so 
much stress and overtime. 

367
00:18:59,768 --> 00:19:02,129
So, uh, I think it's 
predictability. 

368
00:19:02,129 --> 00:19:04,319
I think it's very undervalued in
software. 

369
00:19:05,525 --> 00:19:08,378
Predictability, what I mean by 
that is predictable in the sense

370
00:19:08,378 --> 00:19:11,455
that we have an idea on what we 
want to achieve and we can 

371
00:19:11,455 --> 00:19:14,075
express that in source code. 
And it works like we expressed. 

372
00:19:14,564 --> 00:19:17,562
Unpredictability is rather that,
yeah, we have this idea now we 

373
00:19:17,562 --> 00:19:20,522
want to do it in the source code
and we end up in this unhealthy 

374
00:19:20,522 --> 00:19:22,448
code. 
And, uh, we have absolutely no 

375
00:19:22,448 --> 00:19:25,533
idea how we should express our 
idea, because we cannot make 

376
00:19:25,533 --> 00:19:28,344
sense of the code to start with.
Yeah. 

377
00:19:28,344 --> 00:19:31,314
So in my head, I think 
unpredictability could be first,

378
00:19:31,314 --> 00:19:34,645
we can't even reason how to make
the change, because, you know, 

379
00:19:34,645 --> 00:19:37,783
the code is so messy. 
The other one is we make an 

380
00:19:37,783 --> 00:19:40,586
estimate, but we kind of like 
delivered way, way, you know, 

381
00:19:40,586 --> 00:19:44,010
beyond that estimate. 
And maybe the thing is like we 

382
00:19:44,010 --> 00:19:46,174
deliver something, but the 
quality somehow when being 

383
00:19:46,174 --> 00:19:49,268
tested is kind of like not up to
mark for some reasons, right? 

384
00:19:49,598 --> 00:19:53,008
So I think there are many 
possibilities of how this becomes

385
00:19:53,008 --> 00:19:55,004
unpredictable. 
Funny enough, when talking about

386
00:19:55,004 --> 00:19:58,019
code quality, people always talk
about clean code. 

387
00:19:58,709 --> 00:20:02,018
So, uh, is clean code also 
highly associated with code 

388
00:20:02,018 --> 00:20:05,090
quality or is there any kind of 
good practice that you think is 

389
00:20:05,090 --> 00:20:07,082
highly associated with good code
quality? 

390
00:20:08,006 --> 00:20:12,383
So I think there are many 
practices that definitely 

391
00:20:12,383 --> 00:20:15,931
correlate with code quality and 
that I think are prerequisite 

392
00:20:15,931 --> 00:20:20,814
for high quality code. 
And one of them is, of course, 

393
00:20:20,814 --> 00:20:23,107
um, unit testing. 
More specifically, I've been a 

394
00:20:23,107 --> 00:20:26,507
big, big fan of test driven 
development for almost 25 years 

395
00:20:26,507 --> 00:20:29,640
now. 
It's how I write code and, uh, I

396
00:20:29,640 --> 00:20:32,312
don't think it's about software 
testing at all. 

397
00:20:32,332 --> 00:20:35,406
Rather, I think it's a great 
design methodology, because you 

398
00:20:35,406 --> 00:20:38,487
kind of start with like the 
outcomes, what do I want to 

399
00:20:38,487 --> 00:20:40,167
achieve, and that helps driving 
the code. 

400
00:20:41,169 --> 00:20:43,899
It also kind of adds to this 
predictability in the sense that

401
00:20:43,899 --> 00:20:47,269
it takes a potentially large 
task and gives you a method for 

402
00:20:47,269 --> 00:20:50,329
how to break it down into 
smaller steps so that you can 

403
00:20:50,329 --> 00:20:53,089
stay on track. 
So that I think is important. 

404
00:20:53,859 --> 00:20:57,729
Then there are a lot of like 
practices that teams are driving

405
00:20:57,729 --> 00:21:01,877
towards today that I think are 
valuable, like very, very short,

406
00:21:01,877 --> 00:21:04,954
uh, development cycles, frequent
releases. 

407
00:21:05,704 --> 00:21:09,832
Because there's nothing that's 
more useful feedback than 

408
00:21:09,832 --> 00:21:12,958
working software. 
So that's what we try to do 

409
00:21:12,958 --> 00:21:15,544
internally at CodeScene as well,
to get things into production as

410
00:21:15,544 --> 00:21:18,023
soon as possible. 
By using the feature toggles, we

411
00:21:18,023 --> 00:21:21,494
can kind of, you know, start to 
use it, start to validate 

412
00:21:21,494 --> 00:21:24,724
whatever we're building and, uh,
dog food, work in progress. 

413
00:21:24,724 --> 00:21:27,304
That's super useful. 
So highly iterative development 

414
00:21:27,304 --> 00:21:30,573
for sure. 
Yeah, so thanks for, uh, 

415
00:21:30,573 --> 00:21:32,243
mentioning again about TDD, 
right? 

416
00:21:32,243 --> 00:21:35,268
I think in so many conversations
I had with the so-called 

417
00:21:35,268 --> 00:21:38,085
software thought leaders, 
definitely TDD is kind of like 

418
00:21:38,085 --> 00:21:40,584
one of the most mentioned 
practices, right? 

419
00:21:40,624 --> 00:21:43,762
And TDD is, uh, okay, one aspect
is definitely for testing, but 

420
00:21:43,762 --> 00:21:46,349
the other aspect that is 
important is actually driving 

421
00:21:46,349 --> 00:21:49,582
the design through test, right? 
So thinking about in the very 

422
00:21:49,582 --> 00:21:53,116
first, you know, very first step
when you write the code is like 

423
00:21:53,116 --> 00:21:55,091
what behavior that you expect it
to behave. 

424
00:21:55,541 --> 00:21:57,491
So I think that's a very good 
practice. 

425
00:21:57,709 --> 00:22:00,979
Funny enough, right when we talk
about software code quality, we 

426
00:22:00,979 --> 00:22:03,259
have so many resources 
available. 

427
00:22:03,379 --> 00:22:06,979
Uh, now I think it's maybe you 
have books, YouTube, podcasts, 

428
00:22:07,159 --> 00:22:10,906
whatever that is, right? 
But I rarely see people within 

429
00:22:10,906 --> 00:22:14,811
software industry that says, my 
codebase is the best quality. 

430
00:22:15,051 --> 00:22:17,361
So they'll always say, yeah, 
there are parts that are good, 

431
00:22:17,361 --> 00:22:20,391
there are parts which are really
bad or has technical debt. 

432
00:22:20,823 --> 00:22:24,675
Is it always the case in every 
customers that you see that this

433
00:22:24,675 --> 00:22:26,175
is kind of like the normal 
thing? 

434
00:22:26,265 --> 00:22:29,891
Or is it like some people have a
much, much better codebase 

435
00:22:29,891 --> 00:22:31,695
compared to some other software 
development team? 

436
00:22:32,512 --> 00:22:36,818
There are definitely vast 
differences between different 

437
00:22:36,818 --> 00:22:39,680
companies and even between 
different teams inside the same 

438
00:22:39,680 --> 00:22:42,701
company. 
We are actually, and perhaps 

439
00:22:42,701 --> 00:22:45,413
finally, starting to shine some 
light on that. 

440
00:22:46,173 --> 00:22:49,891
So uh, what we did at CodeScene,
I think just earlier this year 

441
00:22:49,891 --> 00:22:53,251
was that we started to, uh, 
publish our benchmarking data. 

442
00:22:54,451 --> 00:22:57,172
I have a blog post where I have 
written about that and, uh, 

443
00:22:57,172 --> 00:22:59,211
there's also a research paper 
behind it. 

444
00:23:00,201 --> 00:23:03,589
But, uh, basically what we show 
is like the top five 

445
00:23:03,589 --> 00:23:06,837
performers, like across the 
industry have healthy codebases.

446
00:23:07,677 --> 00:23:11,694
But the vast majority are a 
little bit further down, in the 

447
00:23:11,694 --> 00:23:16,292
slightly unhealthy space. 
So that seems to be the norm for

448
00:23:16,292 --> 00:23:19,237
software that we struggle with 
maintaining healthy codebases. 

449
00:23:20,277 --> 00:23:23,577
My experience from a more 
subjective point, because I do, 

450
00:23:23,577 --> 00:23:27,231
you know, I'm really blessed in 
that I get to meet a lot of 

451
00:23:27,231 --> 00:23:29,049
different organizations and a 
lot of different software teams,

452
00:23:29,049 --> 00:23:34,022
so I get to see a lot of code. 
My experience is that smaller 

453
00:23:34,022 --> 00:23:38,265
teams with small codebases, 
unsurprisingly, they tend to 

454
00:23:38,265 --> 00:23:42,157
have a healthier place than 
larger projects. 

455
00:23:43,127 --> 00:23:46,311
That said, I have personally 
seen projects that have been 

456
00:23:46,311 --> 00:23:49,257
developed heavily, products that
have been developed heavily for 

457
00:23:49,257 --> 00:23:54,123
a decade that are still healthy.
And that I think is really, 

458
00:23:54,123 --> 00:23:57,335
really a prerequisite for 
remaining innovative and, uh, 

459
00:23:57,335 --> 00:24:00,457
you know, being able to keep the
fun in software development. 

460
00:24:01,226 --> 00:24:03,416
Sounds like an interesting
research, right? 

461
00:24:03,416 --> 00:24:06,040
So definitely we'll put it in 
the show notes for people to 

462
00:24:06,040 --> 00:24:08,998
refer further. 
So let's say you have a customer

463
00:24:08,998 --> 00:24:12,666
now, so, uh, I'm pretty sure 
whenever they wanna, you know, 

464
00:24:12,666 --> 00:24:16,286
implement CodeScene or maybe 
call you for consulting or 

465
00:24:16,286 --> 00:24:18,700
whatever that is, right? 
They think that their codebase 

466
00:24:18,700 --> 00:24:21,356
is pretty bad, right? 
So this is where your concept of

467
00:24:21,356 --> 00:24:24,602
your codebase as a crime scene 
is becoming quite interesting. 

468
00:24:24,973 --> 00:24:27,373
So tell us what exactly the 
first thing that you would do? 

469
00:24:27,560 --> 00:24:30,943
Because you have this opinion 
that the codebase is not just 

470
00:24:30,943 --> 00:24:33,650
technical thing, there's the 
behavioral analysis that you are

471
00:24:33,650 --> 00:24:38,058
doing on top of the codebase to 
see or to sense why the software

472
00:24:38,058 --> 00:24:41,264
code quality is like that. 
So tell us about this very first

473
00:24:41,264 --> 00:24:43,679
step that you have. 
Sure. 

474
00:24:44,396 --> 00:24:48,963
So the first idea in Your Code 
as a Crime Scene is that not all 

475
00:24:48,963 --> 00:24:53,079
code is equally important. 
Some code is simply worked on 

476
00:24:53,079 --> 00:24:56,076
much, much, much more frequently
than other pieces of code. 

477
00:24:56,886 --> 00:25:00,900
And if you plot out the change 
frequency of every single file 

478
00:25:00,900 --> 00:25:03,898
in your codebase, just look at 
how often have you done a 

479
00:25:03,898 --> 00:25:07,099
commit that touched that part of 
the code, you will see an 

480
00:25:07,099 --> 00:25:11,613
extremely steep power law curve.
And that means that at the head 

481
00:25:11,613 --> 00:25:14,802
of that curve, you have maybe 
1-2% of your codebase that 

482
00:25:14,802 --> 00:25:17,289
accounts for the majority of 
your development work. 

483
00:25:18,212 --> 00:25:23,095
Very, very often it's about 25% 
of development work in 1% of the

484
00:25:23,095 --> 00:25:26,942
codebase, and occasionally it 
could be up to 60-70% in just a 

485
00:25:26,942 --> 00:25:30,620
small part of the code. 
So the obvious implications is 

486
00:25:30,620 --> 00:25:34,579
that if you want to improve code
quality or if you want to 

487
00:25:34,579 --> 00:25:37,475
remediate technical debt, then 
you should really start with the

488
00:25:37,475 --> 00:25:40,403
most frequently worked on files.
And these are the ones that I call

489
00:25:40,403 --> 00:25:41,790
hotspots in Your Code as a Crime 
Scene. 

490
00:25:42,010 --> 00:25:45,330
They are development hotspots. 
Because that's where the return 

491
00:25:45,330 --> 00:25:47,128
on investment is. 
That's where we're really going 

492
00:25:47,128 --> 00:25:50,108
to make a difference. 
But it's also a positive message

493
00:25:50,108 --> 00:25:53,510
because what it means is that 
the majority of your code is 

494
00:25:53,510 --> 00:25:55,580
code that's rarely if ever 
touched. 

495
00:25:55,790 --> 00:25:58,430
So that's where you can actually
live with some technical debt. 

496
00:25:59,050 --> 00:26:03,026
So even if you have code that's,
you know, it's unhealthy, it's a

497
00:26:03,026 --> 00:26:05,046
complete mess. 
No one really understands it. 

498
00:26:05,466 --> 00:26:08,346
If it's code that you never have
to touch, I mean, you need to be

499
00:26:08,346 --> 00:26:11,346
aware of the problem because 
it's a potential future risk, 

500
00:26:11,346 --> 00:26:13,616
but it's probably not an urgent 
priority. 

501
00:26:14,216 --> 00:26:17,226
And it would have a very unclear
return on investment if you 

502
00:26:17,616 --> 00:26:21,623
spend time refactoring it. 
So the bulk of Your Code as a 

503
00:26:21,623 --> 00:26:24,519
Crime Scene is a set of 
techniques for prioritizing your

504
00:26:24,519 --> 00:26:28,725
time, your effort, and your 
precious attention to where it's

505
00:26:28,725 --> 00:26:33,142
likely to be needed the most. 
Yeah, so hotspots I think is 

506
00:26:33,142 --> 00:26:35,986
very interesting. 
So I don't think many people 

507
00:26:35,986 --> 00:26:39,356
associate software quality with 
this hotspot most likely in the 

508
00:26:39,356 --> 00:26:41,402
industry, so far that I have 
experienced, at least. 

509
00:26:41,612 --> 00:26:44,252
They will rely on like static 
code analysis tool. 

510
00:26:44,282 --> 00:26:47,513
So maybe think of it like 
SonarQube or whatever linter and

511
00:26:47,513 --> 00:26:50,212
all that, right? 
And they will throw out a bunch 

512
00:26:50,212 --> 00:26:53,723
of issues that you, you know, 
you find in a report and then 

513
00:26:53,723 --> 00:26:57,092
you just have to classify them, 
which is critical, high, medium,

514
00:26:57,092 --> 00:26:59,653
low, and you kind of like go 
through the list and kind of 

515
00:26:59,653 --> 00:27:02,312
like close them. 
So tell us what is the pitfall 

516
00:27:02,312 --> 00:27:04,742
of this approach? 
Because I'm sure in like still 

517
00:27:04,742 --> 00:27:07,152
in the industry, many software 
development teams actually 

518
00:27:07,152 --> 00:27:09,952
practice this rather than doing 
the hotspot analysis. 

519
00:27:10,765 --> 00:27:11,515
Yep. 
Yep. 

520
00:27:11,875 --> 00:27:15,807
Uh, that's correct. 
The obvious challenge is that 

521
00:27:15,807 --> 00:27:20,532
static analysis, it's a great 
way for, uh, catching, um, 

522
00:27:20,532 --> 00:27:23,153
stylistic issues. 
You can even catch some bugs. 

523
00:27:23,213 --> 00:27:24,803
I'm a big fan of static 
analysis. 

524
00:27:24,803 --> 00:27:26,479
I recommend all teams to do 
that. 

525
00:27:27,409 --> 00:27:31,339
But static analysis was never 
ever intended to help you 

526
00:27:31,559 --> 00:27:33,689
prioritize technical debt, 
right? 

527
00:27:33,689 --> 00:27:36,862
It just cannot do that because 
it, maybe it could be used to, 

528
00:27:36,862 --> 00:27:40,611
you know, assess the amount of 
debt or quality issues you have,

529
00:27:40,611 --> 00:27:44,730
but it cannot possibly give you 
any priority on them, because it

530
00:27:44,730 --> 00:27:47,790
doesn't know anything about the 
interest on that debt, right? 

531
00:27:48,570 --> 00:27:51,616
So the big problem I see in 
practice is that you go to an 

532
00:27:51,616 --> 00:27:54,330
organization that is using one 
of these static analysis, 

533
00:27:54,330 --> 00:27:57,754
sort of linting aggregators, 
like Sonar that we discussed. 

534
00:27:58,416 --> 00:28:01,026
You end up seeing that, that 
they have 5,000 issues. 

535
00:28:01,785 --> 00:28:05,999
And they, that simply means that
important stuff will fly under 

536
00:28:05,999 --> 00:28:08,550
the radar. 
It would kind of drown in that 

537
00:28:08,550 --> 00:28:12,053
amount of information. 
So what teams typically do is 

538
00:28:12,053 --> 00:28:14,859
that they say everything that's 
just information or, uh, you 

539
00:28:14,859 --> 00:28:17,807
know, a warning, throw that 
away, let's focus on the major 

540
00:28:17,807 --> 00:28:20,613
stuff. 
And that can actually lead you 

541
00:28:20,613 --> 00:28:23,955
to waste time fixing things that
aren't urgent nor important. 

542
00:28:24,135 --> 00:28:26,527
So you go into that long tail 
code that you never have to 

543
00:28:26,527 --> 00:28:29,089
touch and you make changes to 
it, then you're very likely to 

544
00:28:29,089 --> 00:28:32,823
introduce a new bug. 
And at the same time, you might 

545
00:28:32,823 --> 00:28:35,890
have smaller issues in the 
hotspots, and these are the ones

546
00:28:35,890 --> 00:28:37,515
that keep driving costs every 
single day. 

547
00:28:37,635 --> 00:28:40,665
But, again, they get down 
prioritized because they don't 

548
00:28:40,665 --> 00:28:43,037
have this critical label on 
them. 

549
00:28:43,847 --> 00:28:47,223
So I think that's the big danger
with static analysis, that it 

550
00:28:47,223 --> 00:28:49,863
makes it impossible to 
prioritize fixing the right 

551
00:28:49,863 --> 00:28:53,306
technical debt. 
Also it makes it very easy to 

552
00:28:53,306 --> 00:28:55,145
waste time doing things that 
aren't important. 

553
00:28:56,288 --> 00:28:58,495
Yeah, so, especially these days,
right? 

554
00:28:58,495 --> 00:29:01,015
There are so many different 
types of static code analysis. 

555
00:29:01,015 --> 00:29:03,355
So one could be like software 
quality type. 

556
00:29:03,715 --> 00:29:05,845
Second is about, you know, 
security, right? 

557
00:29:05,845 --> 00:29:08,858
So once you integrate all this, 
no wonder, I think many software

558
00:29:08,858 --> 00:29:11,675
teams will have like hundreds 
of, you know, issues. 

559
00:29:11,735 --> 00:29:14,107
Uh, although prioritization, 
when you mention about 

560
00:29:14,107 --> 00:29:17,579
prioritization, many would just 
focus on the critical or high 

561
00:29:17,579 --> 00:29:19,813
categories, right? 
And then they'll just, you know,

562
00:29:19,813 --> 00:29:21,605
spend some time to actually 
close them. 

563
00:29:21,981 --> 00:29:24,483
But yeah, definitely I think 
sometimes we found the issues 

564
00:29:24,483 --> 00:29:27,938
that is categorized as high, but
is in the code that is rarely 

565
00:29:27,938 --> 00:29:30,022
touched. 
So I think your point there, I 

566
00:29:30,022 --> 00:29:32,831
think makes sense, right? 
So sometimes, you know, like if 

567
00:29:32,831 --> 00:29:36,677
the code doesn't get touched so 
often, why would you want to 

568
00:29:36,677 --> 00:29:38,708
change it? 
Because maybe it doesn't give 

569
00:29:38,708 --> 00:29:42,546
you the high ROI, yeah. 
So hotspots actually is one of 

570
00:29:42,546 --> 00:29:45,791
the pillars, um, within, within 
your behavioral code analysis. 

571
00:29:45,791 --> 00:29:48,239
The other one is actually the so
called the change coupling, 

572
00:29:48,239 --> 00:29:50,478
right? 
So for example, if you make one 

573
00:29:50,478 --> 00:29:53,735
change on the file, most likely 
you will also make a change to 

574
00:29:53,735 --> 00:29:55,835
another file, right? 
This is where the coupling is. 

575
00:29:56,045 --> 00:29:58,925
So tell us why this is also an 
important analysis that you do 

576
00:29:58,925 --> 00:30:01,354
within this behavioral code 
analysis. 

577
00:30:02,272 --> 00:30:04,721
Yep. 
Change coupling is, uh, 

578
00:30:04,721 --> 00:30:08,343
important because it shows you 
the change patterns in your 

579
00:30:08,343 --> 00:30:11,244
codebase. 
And, uh, it has so many 

580
00:30:11,244 --> 00:30:13,749
different use cases. 
The most obvious one is to be 

581
00:30:13,749 --> 00:30:15,573
able to reason about the cost of
change. 

582
00:30:16,323 --> 00:30:19,351
So what do I mean by that? 
Well, simply that code 

583
00:30:19,351 --> 00:30:21,487
complexity can come in two 
different shapes. 

584
00:30:21,487 --> 00:30:24,427
You can either have super 
complicated source code, right? 

585
00:30:24,427 --> 00:30:27,367
The code is written in a bad and
unstructured way, and now it's 

586
00:30:27,367 --> 00:30:30,802
hard to understand. 
But it could also be that the code 

587
00:30:30,802 --> 00:30:34,286
itself is actually fairly easy 
to follow, but you have no idea 

588
00:30:34,286 --> 00:30:37,292
how various modules kinda fit 
together to make a system. 

589
00:30:37,922 --> 00:30:40,747
So whenever you want to make a 
change, maybe implement a new 

590
00:30:40,747 --> 00:30:44,112
feature, then you find yourself 
playing shotgun surgery, like 

591
00:30:44,112 --> 00:30:46,292
traversing, uh, the whole 
system, searching for places 

592
00:30:46,292 --> 00:30:49,361
that need to be modified. 
And change coupling is really 

593
00:30:49,361 --> 00:30:51,952
powerful, because what it does 
is that it looks into the 

594
00:30:51,952 --> 00:30:55,240
history of your code via the Git
history and figures out that, 

595
00:30:55,240 --> 00:30:58,202
you know, other developers that 
worked on this part of the code 

596
00:30:58,202 --> 00:31:00,854
also had to modify that and that
and that file. 

597
00:31:01,154 --> 00:31:03,794
So it kind of gives you a map 
where you can kind of see where 

598
00:31:03,794 --> 00:31:07,331
the changes are going to be. 
And that is something you can 

599
00:31:07,331 --> 00:31:10,375
use not only to onboard yourself
and figure out quicker what you 

600
00:31:10,375 --> 00:31:12,805
need to change in order to 
complete the task. 

601
00:31:13,555 --> 00:31:16,937
It's also super useful if you 
think about like architecture 

602
00:31:16,937 --> 00:31:20,121
refactorings, because you can 
use that information to figure 

603
00:31:20,121 --> 00:31:23,037
out which modules belong 
together, which modules should I

604
00:31:23,037 --> 00:31:26,650
split and so on. 
And it's also important to 

605
00:31:26,650 --> 00:31:30,165
clarify that change coupling in 
itself is neither good nor bad. 

606
00:31:30,165 --> 00:31:32,455
It just shows you that this is 
the way the system is. 

607
00:31:33,433 --> 00:31:36,163
What about if I have something 
like a god class, right? 

608
00:31:36,163 --> 00:31:38,623
So every change actually go 
through that one class. 

609
00:31:38,863 --> 00:31:40,543
So what, how do you classify 
this? 

610
00:31:41,770 --> 00:31:43,040
Yeah. 
God classes. 

611
00:31:43,040 --> 00:31:45,988
They are amazing. 
It's probably the worst code 

612
00:31:45,988 --> 00:31:49,669
smell you can ever come across. 
When I've analyzed systems with 

613
00:31:49,669 --> 00:31:53,363
god classes, and I might even 
have a blog post on that too 

614
00:31:53,363 --> 00:31:56,562
with some examples. 
I typically see that the god 

615
00:31:56,562 --> 00:31:59,300
class, they have like these 
implicit dependencies that you 

616
00:31:59,300 --> 00:32:02,470
reveal via change coupling to 
like forty different places in 

617
00:32:02,470 --> 00:32:05,134
the code. 
It's extremely expensive. 

618
00:32:06,114 --> 00:32:10,350
And what I typically recommend 
is that, first of all, use this 

619
00:32:10,350 --> 00:32:14,157
information to make everyone in 
the organization aware of the 

620
00:32:14,157 --> 00:32:18,251
bottleneck and the cost so that 
we have a shared situation of 

621
00:32:18,251 --> 00:32:20,989
understanding. 
And then we need to do some 

622
00:32:20,989 --> 00:32:23,995
serious refactoring. 
So very often what you find in 

623
00:32:23,995 --> 00:32:27,007
god class is that they become 
god classes because they 

624
00:32:27,007 --> 00:32:29,871
accumulate so many different 
business responsibilities. 

625
00:32:30,501 --> 00:32:33,021
So the first challenge is always
to identify what are these 

626
00:32:33,021 --> 00:32:35,666
different responsibilities? 
And then you need to start to 

627
00:32:35,666 --> 00:32:38,211
split up the god class. 
You need to start modularizing it 

628
00:32:38,211 --> 00:32:41,530
so that you can, you know, put 
each responsibility in its own 

629
00:32:41,530 --> 00:32:44,401
module, and that will help with 
the change coupling. 

630
00:32:44,421 --> 00:32:46,161
But again, it's going to be 
painful. 

631
00:32:46,911 --> 00:32:51,694
So that's, again, where hotspots
down at the function level can 

632
00:32:51,694 --> 00:32:55,383
really, really help. 
So it's, uh, it's a technique I 

633
00:32:55,383 --> 00:32:59,349
call X-ray, hotspots X-ray, and 
it basically gives you a, like a

634
00:32:59,349 --> 00:33:02,044
prioritized list of, you know, 
given these hundreds of 

635
00:33:02,044 --> 00:33:05,999
functions that you might find in
the god class, maybe only 20% of

636
00:33:05,999 --> 00:33:08,519
them are actively worked on. 
So these are obviously 

637
00:33:08,519 --> 00:33:11,320
responsibilities you want to 
start modularizing and 

638
00:33:11,320 --> 00:33:13,854
extracting first. 
So I hope that helps. 

639
00:33:14,878 --> 00:33:16,948
Yeah, so definitely very 
interesting, right? 

640
00:33:17,128 --> 00:33:19,932
The way that you mentioned about
this kind of analysis is like 

641
00:33:19,932 --> 00:33:22,173
investigation, right? 
So it's like investigating crime

642
00:33:22,173 --> 00:33:24,792
scene. 
Crime scene that is done by all 

643
00:33:24,792 --> 00:33:26,848
team members within the software
engineering teams. 

644
00:33:27,118 --> 00:33:29,458
And speaking about software 
engineering team members, right?

645
00:33:29,458 --> 00:33:32,758
So the other pillar of your 
behavioral code analysis is 

646
00:33:32,758 --> 00:33:34,378
actually the social aspect of 
it. 

647
00:33:34,618 --> 00:33:37,379
So tell us what are the social 
aspects that you analyze, why 

648
00:33:37,379 --> 00:33:40,628
they are also important as part 
of this analysis, and how do you

649
00:33:40,628 --> 00:33:42,184
capture it within the CodeScene 
tool? 

650
00:33:43,556 --> 00:33:48,226
Yeah, sure. 
One classic thing is, um, 

651
00:33:48,226 --> 00:33:51,482
unfamiliar code. 
So that is code that we didn't 

652
00:33:51,482 --> 00:33:54,314
write ourselves. 
So quite often you find 

653
00:33:54,314 --> 00:33:57,254
developers, uh, complaining that
a piece of code is hard to 

654
00:33:57,254 --> 00:34:00,239
understand. 
And then you start to measure it

655
00:34:00,239 --> 00:34:02,219
objectively using something like
the Code Health Metric. 

656
00:34:02,219 --> 00:34:04,649
And you find out that no, this 
code is actually healthy. 

657
00:34:05,399 --> 00:34:07,909
And you have a conversation with
the developer and the 

658
00:34:07,909 --> 00:34:09,741
development team, and you 
quickly figure out that the 

659
00:34:09,741 --> 00:34:11,969
reason they thought it was 
complicated was that they have 

660
00:34:11,969 --> 00:34:13,409
never worked on that code 
before. 

661
00:34:13,649 --> 00:34:15,299
So they had some onboarding to 
do, right? 

662
00:34:15,299 --> 00:34:17,549
They need to become familiar 
with the structure of the code 

663
00:34:17,909 --> 00:34:22,524
as well as the whole domain. 
So it's very easy to kind of 

664
00:34:22,524 --> 00:34:25,004
mistake our lack of familiarity 
for complexity. 

665
00:34:25,724 --> 00:34:28,060
And when you do that, it's 
really dangerous because now you

666
00:34:28,060 --> 00:34:30,842
run the risk at, you know, 
prioritizing refactorings to 

667
00:34:30,842 --> 00:34:33,434
something that doesn't need to 
be refactored, right? 

668
00:34:33,494 --> 00:34:35,940
What you need is proper 
onboarding time and time for 

669
00:34:35,940 --> 00:34:38,342
learning. 
So it might make you do the 

670
00:34:38,342 --> 00:34:42,406
wrong thing. 
However, a flip side of that is 

671
00:34:42,406 --> 00:34:46,726
also that you can run into risks
like the truck factor or the bus

672
00:34:46,726 --> 00:34:50,650
factor. 
Which is, you know, a, a fun way

673
00:34:50,650 --> 00:34:53,726
of pointing out the risk 
associated with key person 

674
00:34:53,726 --> 00:34:57,761
dependencies in software. 
And the bus factor, it's kind of

675
00:34:57,761 --> 00:35:00,483
fascinating. 
We did a study last year on it 

676
00:35:00,483 --> 00:35:03,825
and we found that even in larger
teams, larger departments with 

677
00:35:03,825 --> 00:35:07,961
like 50, 60 developers, the bus 
factor is usually just two or 

678
00:35:07,961 --> 00:35:11,625
maximum three people. 
So what that means in practice 

679
00:35:11,625 --> 00:35:14,963
is that if the two wrong 
developers would leave your 

680
00:35:14,963 --> 00:35:18,615
team, then you lose control of 
50% of the codebase, right? 

681
00:35:18,675 --> 00:35:21,405
Because the people who know the 
details of that code are gone. 

682
00:35:22,109 --> 00:35:26,179
And, uh, this is where I think, 
behavioral code analysis can 

683
00:35:26,179 --> 00:35:29,639
really add a different dimension
to how we look at code. 

684
00:35:30,269 --> 00:35:33,370
Because using Git data, it's 
possible to figure out which 

685
00:35:33,370 --> 00:35:37,110
developer that has written which
code, and if they are still 

686
00:35:37,110 --> 00:35:40,360
around. 
And, uh, it's of course not 

687
00:35:40,360 --> 00:35:42,400
anything I recommend using for 
micromanagement. 

688
00:35:42,400 --> 00:35:44,750
I think there are so many 
pitfalls associated with that. 

689
00:35:45,310 --> 00:35:48,294
But it's super useful in order 
to build a knowledge map of your

690
00:35:48,294 --> 00:35:51,334
codebase so that you know that 
if I'm working on this part of the 

691
00:35:51,334 --> 00:35:53,220
code, this is the developer I should
ask. 

692
00:35:53,830 --> 00:35:58,406
It also makes it very easy to 
discover risks like the, like a 

693
00:35:58,406 --> 00:36:01,630
low bus factor. 
And if you're aware of those 

694
00:36:01,630 --> 00:36:04,853
potential problems, then as a 
technical leader, as a tech lead

695
00:36:04,853 --> 00:36:08,620
or a coach, you can help the 
organization remediate that 

696
00:36:08,620 --> 00:36:11,462
problem. 
And in particular, what I always

697
00:36:11,462 --> 00:36:14,353
recommend is to combine this 
with the technical measures. 

698
00:36:15,093 --> 00:36:18,489
So if you find that you have a 
piece of code with the low bus 

699
00:36:18,489 --> 00:36:21,961
factor and that code is also 
unhealthy, then there's an 

700
00:36:21,961 --> 00:36:24,523
extreme off-boarding risk 
associated with that. 

701
00:36:24,973 --> 00:36:28,466
So you probably want to be 
proactive here and, you know, 

702
00:36:28,466 --> 00:36:31,300
refactor that code while the 
developer who wrote it is still 

703
00:36:31,300 --> 00:36:33,643
around. 
So pair them together with 

704
00:36:33,643 --> 00:36:36,525
someone else on that team. 
You're going to distribute 

705
00:36:36,525 --> 00:36:39,943
knowledge in the process as well
as remediating a massive, 

706
00:36:39,943 --> 00:36:43,363
massive future risk. 
So that's one example at the, 

707
00:36:43,363 --> 00:36:46,081
like the individual level. 
And then, of course, there's a 

708
00:36:46,081 --> 00:36:49,887
team analysis aspect of it too. 
Yeah, speaking about the 

709
00:36:49,887 --> 00:36:52,083
unfamiliarity and also bus 
factor, right? 

710
00:36:52,083 --> 00:36:56,070
I think obviously almost every 
leaders understand about this 

711
00:36:56,070 --> 00:36:58,747
criticality, right? 
So some team members seem to be

712
00:36:58,747 --> 00:37:00,788
quite expert in some areas of 
the code. 

713
00:37:00,968 --> 00:37:03,685
But having said that, right, I 
think when we talk about 

714
00:37:03,685 --> 00:37:07,176
developers with executives, I 
think they would think it's 

715
00:37:07,176 --> 00:37:10,036
very... like software developers
easily, are easily 

716
00:37:10,036 --> 00:37:11,876
interchangeable. 
You know, like, okay, one person

717
00:37:11,876 --> 00:37:14,738
quit, maybe we can just replace 
it with another person, right? 

718
00:37:14,978 --> 00:37:17,618
Maybe we can even hire more 
senior, they think, and they 

719
00:37:17,618 --> 00:37:20,793
will be able to solve it. 
So tell us, because I'm a 

720
00:37:20,793 --> 00:37:23,250
software engineer, I know that 
is sometimes not possible, but 

721
00:37:23,250 --> 00:37:26,070
from your point of view how do 
you actually explain the danger 

722
00:37:26,070 --> 00:37:27,660
of, you know, having this 
mindset? 

723
00:37:28,015 --> 00:37:30,761
Because obviously there are some
pitfalls about this approach. 

724
00:37:32,393 --> 00:37:36,206
It's very dangerous and I think 
that explains a lot of the 

725
00:37:36,206 --> 00:37:39,619
problems in the software 
industry that we cannot view 

726
00:37:39,619 --> 00:37:42,013
developers as interchangeable 
cogs in a large machine. 

727
00:37:42,708 --> 00:37:45,834
What I try to do is I try to 
visualize that problem. 

728
00:37:46,410 --> 00:37:51,408
What we do is that we create 
maps like visualizations that 

729
00:37:51,408 --> 00:37:54,420
pretty much show what your 
software system looks like. 

730
00:37:55,240 --> 00:37:58,820
On these maps, you see every 
piece of code visualized, but 

731
00:37:58,820 --> 00:38:01,300
they're visualized in a more 
accessible form, right? 

732
00:38:01,300 --> 00:38:04,350
So they're not visualized as 
source code, they're visualized like 

733
00:38:04,350 --> 00:38:06,298
various circles. 
And the size of the circle 

734
00:38:06,298 --> 00:38:10,302
simply shows the amount of code.
If you can present the bus 

735
00:38:10,302 --> 00:38:12,712
factor using that visualization,
then you can immediately spot 

736
00:38:12,712 --> 00:38:16,409
the problem because you will see
that if like these two people 

737
00:38:16,409 --> 00:38:19,147
leave, you lose control of all 
these pieces of code. 

738
00:38:19,537 --> 00:38:21,667
So I typically use color to 
visualize that. 

739
00:38:21,667 --> 00:38:25,322
So the problem really pops out 
and once you have seen that, you

740
00:38:25,322 --> 00:38:29,314
can never unsee it again. 
And it's also good because it 

741
00:38:29,314 --> 00:38:32,268
gives you a way of getting 
feedback continuously, because 

742
00:38:32,268 --> 00:38:36,811
if you start to act upon this 
risk, then you will see how the 

743
00:38:36,811 --> 00:38:41,504
bus factor decreases over time. 
That said, I don't think the bus

744
00:38:41,504 --> 00:38:43,966
factor is necessarily, it's not 
always a bad thing. 

745
00:38:43,966 --> 00:38:47,528
I mean, I would be worried 
sometimes if we don't have a bus

746
00:38:47,528 --> 00:38:51,331
factor, because individual 
productivity does vary a lot. 

747
00:38:52,021 --> 00:38:55,758
But what it's about is to avoid 
unnecessary risk, ensure that 

748
00:38:55,758 --> 00:38:57,597
everyone on the team gets a 
chance to contribute. 

749
00:38:58,839 --> 00:39:02,007
Yeah, so definitely we can use 
some socio-technical practices 

750
00:39:02,007 --> 00:39:04,459
as well. 
Things like pair programming or 

751
00:39:04,459 --> 00:39:05,859
mob programming, ensemble 
programming. 

752
00:39:06,099 --> 00:39:09,279
Or even just do, I dunno, things
like lunch and learn explaining 

753
00:39:09,279 --> 00:39:12,533
about modules as like your own 
modules to other team members. 

754
00:39:12,533 --> 00:39:16,656
I think that can also help. 
I think also this is a nice plug

755
00:39:16,656 --> 00:39:19,862
for you to actually introduce 
CodeScene for those of us who 

756
00:39:19,862 --> 00:39:22,208
haven't heard or haven't played 
around with CodeScene before. 

757
00:39:22,478 --> 00:39:25,463
Because, um, yeah, there are 
some alternatives out there to 

758
00:39:25,463 --> 00:39:28,056
measure software quality. 
But essentially how does 

759
00:39:28,056 --> 00:39:31,577
CodeScene work and why do 
you think people should try 

760
00:39:31,577 --> 00:39:33,907
CodeScene to measure that 
software quality? 

761
00:39:34,744 --> 00:39:39,090
Yeah, sure, I'd be happy to. 
So I founded CodeScene 10 years 

762
00:39:39,090 --> 00:39:42,121
ago. 
And I founded it after writing 

763
00:39:42,121 --> 00:39:45,098
Your Code as a Crime Scene. 
And in Your Code as a Crime 

764
00:39:45,098 --> 00:39:47,589
Scene, I basically collected a 
bunch of techniques I've been 

765
00:39:47,589 --> 00:39:48,999
using myself throughout my 
career. 

766
00:39:49,659 --> 00:39:51,859
But I also realized that the 
book won't be enough. 

767
00:39:51,859 --> 00:39:55,386
I really wanted to have 
professional tools that could 

768
00:39:55,386 --> 00:39:57,719
automate the analysis from Your 
Code as a Crime Scene. 

769
00:39:58,189 --> 00:40:02,059
So that was my main motivation. 
And the way CodeScene works, 

770
00:40:02,059 --> 00:40:06,027
because we have come a long way 
in 10 years, is that you, it's a

771
00:40:06,027 --> 00:40:08,228
SaaS product. 
So you basically point it to 

772
00:40:08,228 --> 00:40:11,402
your codebase and then you press
a button and then CodeScene does

773
00:40:11,402 --> 00:40:15,528
all the heavy lifting. 
So you end up with not only KPIs

774
00:40:15,528 --> 00:40:19,410
and trends on code quality and 
code health and that stuff, you 

775
00:40:19,410 --> 00:40:20,818
also get complete 
visualizations. 

776
00:40:21,458 --> 00:40:25,703
So you can visually see where 
are the bottlenecks in your 

777
00:40:25,703 --> 00:40:27,918
code, where are the complicated 
hotspots. 

778
00:40:28,658 --> 00:40:32,085
And those maps are intended to 
serve as a communication tool not 

779
00:40:32,085 --> 00:40:35,352
only within the engineering 
team, but also so you can sit 

780
00:40:35,352 --> 00:40:38,410
down with technical leaders and 
managers and have a conversation

781
00:40:38,410 --> 00:40:41,028
around things like technical 
debt and code quality. 

782
00:40:41,876 --> 00:40:44,606
And, uh, I think the thing that 
sets CodeScene apart is, uh, 

783
00:40:44,606 --> 00:40:48,116
it's obviously the first and 
only behavioral code analysis 

784
00:40:48,116 --> 00:40:50,318
tool. 
So the only tool that really, 

785
00:40:50,318 --> 00:40:52,616
really considers the intersection
of people and code. 

786
00:40:53,546 --> 00:40:57,406
And, uh, second, one thing we are
very proud of is that our code 

787
00:40:57,406 --> 00:41:01,964
health metric is the only proven
and validated code quality 

788
00:41:01,964 --> 00:41:06,821
metric with the connection to 
business outcomes like we talked

789
00:41:06,821 --> 00:41:10,181
about before, measurable defect 
reduction, measurable speed up 

790
00:41:10,181 --> 00:41:14,092
in development time. 
So that's the gist of CodeScene.

791
00:41:14,112 --> 00:41:16,422
And then of course, we have all 
these bells and whistles with, 

792
00:41:16,482 --> 00:41:19,026
you know, automated code 
reviews, IDE integrations, and 

793
00:41:19,026 --> 00:41:21,262
so on. 
Could go on for a long time 

794
00:41:21,262 --> 00:41:23,834
about this. 
Yeah, so I think that's the 

795
00:41:23,834 --> 00:41:26,390
first thing, the unique part of 
this CodeScene is the behavior 

796
00:41:26,390 --> 00:41:28,853
analysis part, right? 
So I think, I haven't really 

797
00:41:28,853 --> 00:41:30,033
played around with CodeScene 
much. 

798
00:41:30,033 --> 00:41:33,238
I just saw it from the websites.
But I think it's really 

799
00:41:33,238 --> 00:41:35,471
interesting to see some kind of 
visualization, especially when 

800
00:41:35,471 --> 00:41:38,529
you already have the knowledge 
about this and you have interest

801
00:41:38,529 --> 00:41:41,463
in this, right? 
Because again, still some people

802
00:41:41,463 --> 00:41:44,508
associate code quality with, 
you know, the number of issues 

803
00:41:44,508 --> 00:41:47,096
found by, you know, like a 
software, a static software 

804
00:41:47,096 --> 00:41:49,959
analysis tool, right? 
So I think this one is kind of 

805
00:41:49,959 --> 00:41:52,113
slightly different, right? 
So it's taking the behavioral 

806
00:41:52,113 --> 00:41:55,466
analysis of your code. 
And also the bus factor as well,

807
00:41:55,466 --> 00:41:58,302
I like it, because sometimes 
it's very important for us to 

808
00:41:58,302 --> 00:42:01,116
see, especially when you have a 
few team members and you don't 

809
00:42:01,116 --> 00:42:04,434
actually do hands-on coding. 
You wanna see the kind of bus 

810
00:42:04,434 --> 00:42:06,142
factors associated with some of 
the developers. 

811
00:42:06,832 --> 00:42:10,737
So I think one great thing these
days talking about software 

812
00:42:10,737 --> 00:42:13,642
quality is actually about AI 
produced code. 

813
00:42:14,212 --> 00:42:16,222
So I know this is probably one 
of the hot topics. 

814
00:42:16,342 --> 00:42:19,358
Um, first of all, I would like 
to clarify with you, do you 

815
00:42:19,358 --> 00:42:22,416
think AI will help us in 
software quality? 

816
00:42:22,446 --> 00:42:26,772
Or do you think AI will not help
us, or even, uh, make our code 

817
00:42:26,772 --> 00:42:30,197
quality worse? 
So I think it's very much up to 

818
00:42:30,197 --> 00:42:32,813
us. 
So AI itself, I have some pretty

819
00:42:32,813 --> 00:42:36,293
solid proof that it can help us 
write better software. 

820
00:42:37,493 --> 00:42:41,481
But, uh, there's also a couple 
of trends that worry me, which 

821
00:42:41,481 --> 00:42:45,449
indicate that if we misapply AI
or use it for the wrong purpose,

822
00:42:45,449 --> 00:42:49,046
then it's more likely to serve 
as a technical debt generator 

823
00:42:49,046 --> 00:42:51,812
rather than support. 
So it's like a, we are all like 

824
00:42:51,812 --> 00:42:53,588
at the fork in the road right 
now. 

825
00:42:53,588 --> 00:42:56,370
And I think it's pretty much up 
to us as a community, as 

826
00:42:56,370 --> 00:42:59,844
companies where we're going to 
take AI because we can end up 

827
00:42:59,844 --> 00:43:03,989
in, uh, big, big problems in the
future where technical debt 

828
00:43:03,989 --> 00:43:06,174
grows exponentially at machine 
speed. 

829
00:43:07,063 --> 00:43:10,132
My experience is actually, I 
worry more for the latter, 

830
00:43:10,132 --> 00:43:12,184
right? 
I think the tech debt, the 

831
00:43:12,184 --> 00:43:15,294
amount of code quality that gets
produced can be worse, simply 

832
00:43:15,294 --> 00:43:19,566
because I think now it's very 
easy to produce many lines of 

833
00:43:19,566 --> 00:43:22,732
code, procedural, uh, you know, 
you just can continue on one 

834
00:43:22,732 --> 00:43:26,529
class over and over and over. 
And the third thing, I think 

835
00:43:26,529 --> 00:43:29,558
it's about coherence in terms of
architecture and design and all 

836
00:43:29,558 --> 00:43:32,312
that, right? 
So I think I kind of worry that, 

837
00:43:32,312 --> 00:43:35,677
uh, one day, if let's say all 
developers kind of like 

838
00:43:35,677 --> 00:43:38,898
outsource our writing code to AI
mostly, and especially vibe 

839
00:43:38,898 --> 00:43:41,428
coding and all that, definitely 
we know it's even worse, right? 

840
00:43:41,828 --> 00:43:45,020
So what do you think are some of
the guardrails that we should 

841
00:43:45,020 --> 00:43:48,751
have actually to help us avoid 
these kind of, you know, code 

842
00:43:48,751 --> 00:43:50,371
quality getting even worse, 
right? 

843
00:43:50,371 --> 00:43:54,073
So I don't think everyone aspires
to have this, but we're all kind of

844
00:43:54,073 --> 00:43:57,487
like seduced by the, you know, 
the speed, the amount of work 

845
00:43:57,487 --> 00:43:59,815
that can get done simply by 
using AI. 

846
00:43:59,875 --> 00:44:01,600
So what, uh, do you think are 
some of the guardrails? 

847
00:44:02,661 --> 00:44:05,184
The first guardrail has to be 
code quality. 

848
00:44:05,334 --> 00:44:08,406
And the reason for this is 
because yes, we are using AI 

849
00:44:08,406 --> 00:44:11,430
today and, uh, that's what I 
mean with the big risks. 

850
00:44:11,430 --> 00:44:15,090
I see that we're using AI 
today to automate coding. 

851
00:44:15,810 --> 00:44:19,909
But writing code is a very, very
small part of a developer's work 

852
00:44:19,909 --> 00:44:22,437
week. 
It's like roughly 5% of our work

853
00:44:22,437 --> 00:44:24,342
week. 
That's the time we spend typing 

854
00:44:24,342 --> 00:44:27,335
on the keyboard. 
So development is not about 

855
00:44:27,335 --> 00:44:31,025
typing faster. 
The big bottleneck is instead in

856
00:44:31,025 --> 00:44:34,192
understanding existing code 
where we spend roughly 70% of 

857
00:44:34,192 --> 00:44:37,611
our work week. 
If we're not aware of this and 

858
00:44:37,611 --> 00:44:40,656
if we're not guarding against 
this, then uh, we will have an 

859
00:44:40,656 --> 00:44:44,992
AI that optimizes 5% of our work
week at the expense of the other

860
00:44:44,992 --> 00:44:47,181
70. 
Writing code was never the 

861
00:44:47,181 --> 00:44:48,651
bottleneck, but we pretend it 
was. 

862
00:44:49,401 --> 00:44:52,518
And now we get a lot of code and
now we have to spend a ton of 

863
00:44:52,518 --> 00:44:54,843
time reading through that code 
and trying to understand code 

864
00:44:54,843 --> 00:44:58,037
that someone else, in this case,
an AI wrote, which is arguably a

865
00:44:58,037 --> 00:45:00,376
harder problem. 
So this is something I'm worried

866
00:45:00,376 --> 00:45:02,766
about. 
So the least we can do is to 

867
00:45:02,766 --> 00:45:05,331
ensure that whatever code AI 
generates that it's healthy. 

868
00:45:05,986 --> 00:45:09,318
So what we do internally is that
we have a code health metric in 

869
00:45:09,318 --> 00:45:11,645
the IDE. 
So whenever we write some 

870
00:45:11,645 --> 00:45:15,553
code or if we use AI to 
generate some code, it has to 

871
00:45:15,553 --> 00:45:18,413
pass a certain code quality bar,
a certain code health bar. 

872
00:45:18,413 --> 00:45:20,513
Otherwise, we discard it. 
It's not useful. 

873
00:45:21,773 --> 00:45:23,243
So that I think is the first 
thing. 

874
00:45:24,173 --> 00:45:26,933
The other thing that I think is 
important as a guardrail is that

875
00:45:26,933 --> 00:45:31,629
we need to, we kind of need to 
value this shift in emphasis 

876
00:45:31,629 --> 00:45:35,832
from, uh, writing code to become
really, really good at, at 

877
00:45:35,832 --> 00:45:39,719
reading code. 
So, uh, I think that traditional

878
00:45:39,719 --> 00:45:43,460
software engineering practices 
like TDD that we talked about 

879
00:45:43,460 --> 00:45:47,075
before, like code reviews, like 
pair programming, they will be 

880
00:45:47,075 --> 00:45:49,526
more important than ever before 
in the AI era. 

881
00:45:50,276 --> 00:45:53,055
Because it's a really, really 
hard problem to assess a lot of 

882
00:45:53,055 --> 00:45:54,456
code that we didn't write 
ourselves. 

883
00:45:55,267 --> 00:45:58,949
Yeah, so I think one challenging
aspect I think is like simply 

884
00:45:58,949 --> 00:46:01,757
because it's kind of cheap to 
produce code these days. 

885
00:46:02,097 --> 00:46:04,573
These practices of, you know, 
even generating tests, it could 

886
00:46:04,573 --> 00:46:06,897
be also even much more code to 
review. 

887
00:46:07,362 --> 00:46:11,044
And everyone seems to be 
working, like the expectation, 

888
00:46:11,044 --> 00:46:13,872
everyone seems to be working 
more things, right? 

889
00:46:14,586 --> 00:46:17,826
I have a worry that someday, you
know, our brain capacity just 

890
00:46:17,826 --> 00:46:20,946
couldn't keep up. 
And I think we'll just, you 

891
00:46:20,946 --> 00:46:23,805
know, simply ignore whatever 
code quality problems that we 

892
00:46:23,805 --> 00:46:26,802
have and still continue the 
cycle that is spiraling down, 

893
00:46:26,802 --> 00:46:29,956
you know, over the time. 
So when you mentioned about code

894
00:46:29,956 --> 00:46:32,914
quality, uh, what kind of things
do you support in CodeScene 

895
00:46:32,914 --> 00:46:35,594
that actually can help us reduce
this tendency? 

896
00:46:35,594 --> 00:46:38,619
Because I think it's very easy 
to just produce code, commit, 

897
00:46:38,619 --> 00:46:40,274
and, you know, let other people 
see. 

898
00:46:40,274 --> 00:46:42,274
If not see, then they will just 
deploy it. 

899
00:46:42,274 --> 00:46:45,672
So what do you think are the tools 
that can help us avoid this? 

900
00:46:46,713 --> 00:46:48,716
Uh, I think there are multiple 
tools here. 

901
00:46:48,716 --> 00:46:52,095
But one thing I think is 
important is obviously, and I 

902
00:46:52,095 --> 00:46:55,096
know that some AI platforms are 
already considering building 

903
00:46:55,096 --> 00:46:59,530
this in and some might already 
have it, but I think it's super 

904
00:46:59,530 --> 00:47:02,920
important to have a security 
scan happen, uh, as soon as 

905
00:47:02,920 --> 00:47:05,800
possible. 
Because a lot of the early 

906
00:47:05,800 --> 00:47:09,606
research on AI kind of showed 
that it was very prone to 

907
00:47:09,606 --> 00:47:14,058
shipping vulnerable code. 
So that, that's like a very low 

908
00:47:14,058 --> 00:47:15,870
barrier to quality. 
Make sure it's secure. 

909
00:47:16,650 --> 00:47:19,376
The second thing, and this is 
where I think CodeScene has an 

910
00:47:19,376 --> 00:47:23,261
important role to play, is to 
integrate the code health check 

911
00:47:23,261 --> 00:47:27,402
in your IDE so that whenever you
get some code, human written, AI

912
00:47:27,402 --> 00:47:30,437
written, make sure it's healthy.
Otherwise, discard it. 

913
00:47:30,437 --> 00:47:34,197
Or, you know, use an MCP that 
feeds it back to the AI so it 

914
00:47:34,197 --> 00:47:37,730
can make a new attempt at 
shipping code that meets the right

915
00:47:37,730 --> 00:47:41,626
standard. 
Because, um, uh, I used to say 

916
00:47:41,626 --> 00:47:44,399
this, that acceleration just 
isn't useful if it kind of 

917
00:47:44,399 --> 00:47:46,751
drives our projects into this 
brick wall of technical debt. 

918
00:47:47,591 --> 00:47:50,736
So having healthy, secure code, 
that's like the very basics of 

919
00:47:50,736 --> 00:47:53,249
this. 
And third, I also think we need to

920
00:47:53,249 --> 00:47:55,403
be really, really good at 
questioning whether we need that

921
00:47:55,403 --> 00:47:58,121
code in the first place. 
Because like you say, an AI 

922
00:47:58,121 --> 00:48:01,121
makes it more or less free to 
generate a lot of code. 

923
00:48:02,201 --> 00:48:05,177
But that code, that's still 
expensive, that's still waste if

924
00:48:05,177 --> 00:48:08,156
that code isn't needed. 
So maybe there's a library, 

925
00:48:08,156 --> 00:48:10,714
maybe there's already some 
in-house functionality for doing

926
00:48:10,714 --> 00:48:13,018
this. 
Or maybe we should go back and 

927
00:48:13,018 --> 00:48:15,221
question the requirement: do we 
really have to implement this? 

928
00:48:15,221 --> 00:48:18,341
Because every single line of 
code we add is going to cost us 

929
00:48:18,341 --> 00:48:21,685
a lot of time over the years, 
just to maintain it and just all

930
00:48:21,685 --> 00:48:23,351
the time we have to understand 
it. 

931
00:48:24,151 --> 00:48:26,942
Yes, speaking about that, I 
always remind myself that code 

932
00:48:26,942 --> 00:48:30,026
is a liability, right? 
So more code means more 

933
00:48:30,026 --> 00:48:31,748
liability, it's not necessarily 
a good thing. 

934
00:48:32,082 --> 00:48:35,997
So you mentioned about, you 
know, the acceleration of 

935
00:48:35,997 --> 00:48:39,081
technical debt. 
So maybe if you have experience 

936
00:48:39,081 --> 00:48:41,755
working with some customers 
implementing AI, what do you 

937
00:48:41,755 --> 00:48:44,803
think are some of the top tech 
debts that get produced more, 

938
00:48:44,863 --> 00:48:46,903
simply because, uh, people are 
using AI? 

939
00:48:47,630 --> 00:48:51,769
Yeah, I think the challenge is 
that AI, it kind of works 

940
00:48:51,769 --> 00:48:55,215
really, really well for, you 
know, individuals doing toy 

941
00:48:55,215 --> 00:48:57,307
tasks. 
That way an AI can kind of drive

942
00:48:57,307 --> 00:49:00,399
most of the stuff in particular 
if it's a task that's been done 

943
00:49:00,399 --> 00:49:03,153
many, many times, and that's 
perhaps code that shouldn't even

944
00:49:03,153 --> 00:49:05,555
be written. 
You face challenges when you 

945
00:49:05,555 --> 00:49:07,277
start to scale to the team 
level. 

946
00:49:07,925 --> 00:49:12,021
So my experience is, I mean, 
I've seen companies, I remember 

947
00:49:12,021 --> 00:49:16,222
like six months ago I worked 
with a company that had made a 

948
00:49:16,222 --> 00:49:19,250
massive, massive rollout of an 
AI coding tool. 

949
00:49:19,940 --> 00:49:21,860
And they had done everything by 
the book. 

950
00:49:21,860 --> 00:49:24,692
So they started, they did an 
evaluation at a small scale with

951
00:49:24,692 --> 00:49:26,880
a small team, and that was of 
course, successful. 

952
00:49:27,430 --> 00:49:29,743
And then they decided, hey, 
let's scale this up to a couple 

953
00:49:29,743 --> 00:49:32,290
of thousand developers. 
And, um, that's where the 

954
00:49:32,290 --> 00:49:35,600
problem started, because pretty 
soon they noticed an impact on 

955
00:49:35,600 --> 00:49:38,381
production that they were 
generating poor code leading to 

956
00:49:38,381 --> 00:49:41,072
bugs. 
And this is where it's so 

957
00:49:41,072 --> 00:49:43,558
dangerous, because if that code 
isn't healthy that this AI 

958
00:49:43,558 --> 00:49:46,922
generated, you know, you need to
get the human into the loop to 

959
00:49:46,922 --> 00:49:48,445
figure out, okay, what's causing
this bug? 

960
00:49:49,155 --> 00:49:51,365
Because an AI is not 
particularly good at that. 

961
00:49:51,365 --> 00:49:54,387
The state of AI is simply not 
there, where it can 

962
00:49:54,387 --> 00:49:57,125
automatically repair a system in 
production reliably. 

963
00:49:58,025 --> 00:50:01,085
So the code needs to have a fail
safe where humans can go in. 

964
00:50:01,925 --> 00:50:04,775
And now you have this massive, 
massive onboarding because you 

965
00:50:04,775 --> 00:50:07,739
have thousands of lines of code 
that you didn't write, your 

966
00:50:07,739 --> 00:50:10,403
teammates didn't write, but now 
you need to understand them and 

967
00:50:10,403 --> 00:50:12,185
you need to understand them 
under time pressure. 

968
00:50:12,815 --> 00:50:16,355
So that's why code quality
is so fundamental for succeeding

969
00:50:16,355 --> 00:50:20,467
with AI-assisted development. 
Wow, so yeah, you mentioned this

970
00:50:20,467 --> 00:50:23,663
particular scenario that kind of
like also worries me, right? 

971
00:50:23,683 --> 00:50:26,953
So whenever you produce a lot of
code that gets produced by AI. 

972
00:50:27,177 --> 00:50:29,427
And then when there's a 
production issue, I think still 

973
00:50:29,427 --> 00:50:33,327
we have to kind of like know the
reason, the root cause where in 

974
00:50:33,327 --> 00:50:36,492
particular code that it fails. 
So if we don't actually have a 

975
00:50:36,492 --> 00:50:39,332
good understanding of codebase, 
I think it will be quite 

976
00:50:39,332 --> 00:50:41,817
dangerous, and under time 
pressure as well. 

977
00:50:42,475 --> 00:50:45,375
You also have this, uh, line 
that I think is quite interesting 

978
00:50:45,375 --> 00:50:48,456
to discuss, right? 
You mentioned that code that is 

979
00:50:48,456 --> 00:50:51,962
good, readable by human, is 
actually also beneficial for AI 

980
00:50:51,962 --> 00:50:56,224
or LLM to work, uh, with. 
So tell us, uh, about these 

981
00:50:56,224 --> 00:50:58,832
findings, right? 
And why do you see correlation 

982
00:50:58,832 --> 00:51:02,992
between these two? 
Yeah, that's a topic that I'm 

983
00:51:02,992 --> 00:51:04,925
super interested in at the 
moment. 

984
00:51:05,437 --> 00:51:09,028
This was an, uh, observation 
originally made by Thoughtworks 

985
00:51:09,028 --> 00:51:12,309
on their tech radar earlier this
year, where they started to 

986
00:51:12,309 --> 00:51:14,782
talk about this concept of AI 
friendly code. 

987
00:51:14,812 --> 00:51:17,312
And that kind of really 
resonated with me because it's 

988
00:51:17,312 --> 00:51:21,376
exactly what I see in our own 
internal data lake where we 

989
00:51:21,376 --> 00:51:27,159
study how AI impacts quality. 
And the idea here is that, you 

990
00:51:27,159 --> 00:51:31,059
know, healthy code, an AI will 
benefit from this additional 

991
00:51:31,059 --> 00:51:33,752
structure. 
It will benefit from the richer 

992
00:51:33,752 --> 00:51:36,416
context by having proper 
function names, cohesive 

993
00:51:36,416 --> 00:51:41,036
functions, it simply makes it so
much easier for the AI to do 

994
00:51:41,036 --> 00:51:44,804
a good job. 
So it's kind of fascinating that

995
00:51:44,804 --> 00:51:48,218
we have this potential double 
win here that, you know, if we 

996
00:51:48,218 --> 00:51:51,200
can write healthy code, not only
is life going to be easier 

997
00:51:51,200 --> 00:51:55,106
for us as humans, but we're also
going to be in a place where 

998
00:51:55,106 --> 00:51:59,981
we're able to benefit from AI. 
And that also means the imp- 

999
00:51:59,981 --> 00:52:03,221
obvious implication is that if 
your code isn't good enough, 

1000
00:52:03,221 --> 00:52:06,686
then maybe you're in a situation
where you cannot apply AI 

1001
00:52:06,686 --> 00:52:08,549
safely. 
Maybe you need to refactor first

1002
00:52:08,549 --> 00:52:11,291
to even get to a level where you
can benefit from AI. 

1003
00:52:12,041 --> 00:52:15,863
So this is an area where we 
have, uh, initiated a bunch of 

1004
00:52:15,863 --> 00:52:18,489
research initiatives. 
So I hope to be able to share 

1005
00:52:18,489 --> 00:52:21,729
some actual numbers and put some
actual data on what the problem 

1006
00:52:21,729 --> 00:52:25,625
is and where the cutoff points
are when it comes to AI 

1007
00:52:25,625 --> 00:52:28,602
performance and source code in, 
yeah, hopefully in a month's 

1008
00:52:28,602 --> 00:52:31,328
time. 
I hope, uh, the research can, 

1009
00:52:31,328 --> 00:52:34,421
you know, give us some more 
insights about, you know, good 

1010
00:52:34,421 --> 00:52:37,526
code practices, good code 
quality actually can help human 

1011
00:52:37,526 --> 00:52:39,401
and AI to collaborate much 
better, right? 

1012
00:52:39,401 --> 00:52:40,781
So I think that's pretty 
exciting. 

1013
00:52:41,171 --> 00:52:45,490
So one thing about that, right, 
so do you think also software 

1014
00:52:45,490 --> 00:52:48,444
documentation or maybe now these
days people talk about spec 

1015
00:52:48,444 --> 00:52:51,525
driven development, do you think
this is also a good practice for

1016
00:52:51,525 --> 00:52:54,557
us to try out? 
Uh, simply because AI can have a

1017
00:52:54,557 --> 00:52:57,707
much, much better structure and
context as well. 

1018
00:52:57,746 --> 00:52:59,966
Do you think this is something, 
an area that every software 

1019
00:52:59,966 --> 00:53:04,381
developer has to try out? 
I think, uh, I think that good 

1020
00:53:04,381 --> 00:53:05,923
documentation has always been 
valuable. 

1021
00:53:06,133 --> 00:53:09,145
I know that there are a lot of 
conflicting opinions about that 

1022
00:53:09,145 --> 00:53:12,823
in the software industry. 
One thing I always miss in 

1023
00:53:12,823 --> 00:53:15,859
source code is that I want to 
understand the whys. 

1024
00:53:16,009 --> 00:53:17,929
Why have we chosen to do it this
way? 

1025
00:53:18,724 --> 00:53:20,494
And, uh, what are the 
trade-offs? 

1026
00:53:20,554 --> 00:53:23,104
Because that's something I can 
never read out of source code. 

1027
00:53:23,104 --> 00:53:25,354
I can use the source code to 
understand what the code does. 

1028
00:53:25,834 --> 00:53:28,084
But I don't know why we chose 
just this approach. 

1029
00:53:28,724 --> 00:53:32,214
So that level of documentation I
think is super important for a 

1030
00:53:32,214 --> 00:53:35,487
human maintainer. 
And I would be surprised if an 

1031
00:53:35,487 --> 00:53:38,856
AI didn't benefit from that, 
because if the AI has the 

1032
00:53:38,856 --> 00:53:42,897
context of why, then we can 
probably avoid a lot of 

1033
00:53:42,897 --> 00:53:45,178
pitfalls. 
Um, the classic example is that 

1034
00:53:45,178 --> 00:53:48,534
we have done some really, really
hard performance optimizations, 

1035
00:53:48,534 --> 00:53:50,818
right? 
So as a human, you want to know 

1036
00:53:50,818 --> 00:53:53,140
about that. 
As an AI, you probably also want

1037
00:53:53,140 --> 00:53:56,238
to understand that because it's 
going to influence what path you

1038
00:53:56,238 --> 00:54:00,239
take. 
So, uh, yes, I think that's a 

1039
00:54:00,239 --> 00:54:02,224
very interesting angle with 
documentation. 

1040
00:54:03,551 --> 00:54:05,765
Yeah. 
And also not to mention AI can 

1041
00:54:05,765 --> 00:54:08,776
actually help us to produce that
documentation itself and we can 

1042
00:54:08,776 --> 00:54:11,262
iterate together, right? 
To produce a much better 

1043
00:54:11,262 --> 00:54:13,462
documentation. 
Because I think almost all 

1044
00:54:13,462 --> 00:54:16,778
developers would not like, you 
know, writing documentation for

1045
00:54:16,778 --> 00:54:19,492
whatever reasons. 
Uh, I personally also sometimes 

1046
00:54:19,492 --> 00:54:21,502
drudge, you know, like writing 
documentation. 

1047
00:54:21,502 --> 00:54:24,186
But I think now we can kickstart
the documentation writing by 

1048
00:54:24,186 --> 00:54:27,609
using AI to help us. 
And I hope, yeah, we can iterate

1049
00:54:27,609 --> 00:54:29,842
much better and then the 
documentation gets improved over

1050
00:54:29,842 --> 00:54:33,100
the time as well. 
So I think speaking about 

1051
00:54:33,100 --> 00:54:35,966
software developer and AI, 
obviously one question that 

1052
00:54:35,966 --> 00:54:39,330
is always in the news is about, you
know, the future of software 

1053
00:54:39,330 --> 00:54:41,913
development, right? 
And one thing is about, you 

1054
00:54:41,913 --> 00:54:44,833
know, some people say, more code
will be written by AI. 

1055
00:54:44,893 --> 00:54:47,674
Some big tech companies actually
claim that within their 

1056
00:54:47,674 --> 00:54:50,035
companies. 
People also say we don't need 

1057
00:54:50,035 --> 00:54:53,343
more juniors in the future. 
Some people get laid off simply 

1058
00:54:53,343 --> 00:54:56,963
because they think AI can, you 
know, produce an equal amount of 

1059
00:54:56,963 --> 00:54:59,557
code. 
So what is your view with all 

1060
00:54:59,557 --> 00:55:02,151
this trend and what kind of 
future of software development 

1061
00:55:02,151 --> 00:55:05,548
or software engineer do you 
think would happen in the next, 

1062
00:55:05,548 --> 00:55:08,645
I dunno, one or two years? 
Let's not put five or 10 years, 

1063
00:55:08,645 --> 00:55:11,585
yeah. 
So in two, three years' time, I 

1064
00:55:11,585 --> 00:55:15,023
think what's going to happen is 
that we all get a reality check.

1065
00:55:15,927 --> 00:55:19,249
The reason I say this is because
looking historically, I haven't 

1066
00:55:19,249 --> 00:55:22,572
seen a single technological 
revolution that led to the 

1067
00:55:22,572 --> 00:55:25,600
demand for less work. 
It's always been more work. 

1068
00:55:26,190 --> 00:55:30,005
Because what happens is that 
when we get access to new 

1069
00:55:30,005 --> 00:55:32,905
technologies, like for example a
machine that can code, what 

1070
00:55:32,905 --> 00:55:34,837
happens is that we raise the 
bar. 

1071
00:55:34,897 --> 00:55:38,737
We start to take on larger and 
larger problems and that will 

1072
00:55:38,737 --> 00:55:41,377
only increase the demand for 
software developers. 

1073
00:55:42,292 --> 00:55:46,765
Given the current state of AI, 
I have a very hard time 

1074
00:55:46,765 --> 00:55:49,597
seeing that, uh, an AI will 
replace humans entirely. 

1075
00:55:49,597 --> 00:55:51,127
I don't think that will ever 
happen. 

1076
00:55:52,039 --> 00:55:55,438
I might of course be wrong, but 
there's no evidence for that so 

1077
00:55:55,438 --> 00:55:57,232
far. 
So what I think is going to 

1078
00:55:57,232 --> 00:56:00,099
happen over the next few years 
is that we're going to have this

1079
00:56:00,099 --> 00:56:03,791
hybrid model where we as humans 
have to understand code that's 

1080
00:56:03,791 --> 00:56:06,417
written by machine, and we have 
to kind of work in tandem with 

1081
00:56:06,417 --> 00:56:09,591
them. 
And that's where I think it's so

1082
00:56:09,591 --> 00:56:12,289
important that we, you know, 
reemphasize traditional 

1083
00:56:12,289 --> 00:56:14,669
engineering principles in 
software. 

1084
00:56:15,739 --> 00:56:19,103
But it also means that the 
barrier to entry is going to be 

1085
00:56:19,103 --> 00:56:21,563
really, really challenging, 
because these skills they take, 

1086
00:56:21,563 --> 00:56:25,384
at least for me, it took a 
decade before I even knew what I

1087
00:56:25,384 --> 00:56:28,869
did in software. 
And we kind of need to grow the 

1088
00:56:28,869 --> 00:56:30,179
next generation of developers 
too. 

1089
00:56:30,984 --> 00:56:34,988
This is what I'm mostly worried 
about because I think, as a 

1090
00:56:34,988 --> 00:56:38,426
junior, you need to have a 
chance to start and you need to 

1091
00:56:38,426 --> 00:56:42,415
learn from first principles. 
And having an AI available as a 

1092
00:56:42,415 --> 00:56:46,208
junior, I'm not sure if that's 
the right way to go because it's

1093
00:56:46,208 --> 00:56:49,190
so, it makes it so easy to 
complete tasks without building 

1094
00:56:49,190 --> 00:56:53,788
a proper understanding. 
So to me, true learning has to 

1095
00:56:53,788 --> 00:56:56,565
be effortful, right? 
We really have to struggle in 

1096
00:56:56,565 --> 00:56:59,260
order to learn, and we need a 
chance to do that. 

1097
00:57:00,197 --> 00:57:03,767
Yeah, I might be worried for our
profession 10 or even 15 years 

1098
00:57:03,767 --> 00:57:06,297
from now, because we need to 
grow the next generation of 

1099
00:57:06,297 --> 00:57:08,111
developers. 
There's going to be a lot of 

1100
00:57:08,111 --> 00:57:12,959
them that are needed out there. 
That said, I do hope that AI, if

1101
00:57:12,959 --> 00:57:16,307
we rethink it, then I think AI 
can become a best friend. 

1102
00:57:16,517 --> 00:57:18,737
Like doing all this work that we
dislike to do. 

1103
00:57:18,767 --> 00:57:20,537
You mentioned, uh, 
documentation. 

1104
00:57:21,167 --> 00:57:25,348
Another good example might be 
to, you know, fix technical 

1105
00:57:25,348 --> 00:57:28,784
debt, automate the security 
patches, that type of stuff that

1106
00:57:28,784 --> 00:57:31,819
is repetitive and might be 
boring to a human, right, and 

1107
00:57:31,819 --> 00:57:35,098
often it gets deprioritized.
That's the type of task that I 

1108
00:57:35,098 --> 00:57:36,714
think could be super beneficial 
for a machine. 

1109
00:57:38,078 --> 00:57:39,797
Yeah. 
So yeah, definitely it's a 

1110
00:57:39,797 --> 00:57:42,398
reality check that we are all 
waiting for, right? 

1111
00:57:42,398 --> 00:57:44,760
So I think you mentioned a 
couple of key things for 

1112
00:57:44,760 --> 00:57:46,718
listeners here to put more focus
on, right? 

1113
00:57:46,718 --> 00:57:49,118
The first principles, the 
fundamentals, right? 

1114
00:57:49,418 --> 00:57:52,073
And also I think more 
importantly also do try out 

1115
00:57:52,073 --> 00:57:55,202
using these tools, because 
unless you try to use it, you 

1116
00:57:55,202 --> 00:57:58,465
probably won't see the, you 
know, the so-called negative

1117
00:57:58,465 --> 00:58:01,019
side of using AI. 
So it could be the amount of 

1118
00:58:01,019 --> 00:58:03,464
tech debt that gets 
introduced, security issues that

1119
00:58:03,464 --> 00:58:06,005
AI still doesn't seem to handle 
well, right? 

1120
00:58:06,305 --> 00:58:09,809
So definitely, if you understand
how software gets delivered and 

1121
00:58:09,809 --> 00:58:12,385
also introduce much better 
guardrails in your software 

1122
00:58:12,385 --> 00:58:15,049
development, I think that will 
be much more important. 

1123
00:58:15,499 --> 00:58:18,379
So Adam, we have talked a lot 
about, you know, AI, code 

1124
00:58:18,379 --> 00:58:20,963
quality, is there anything else 
that you think must be covered 

1125
00:58:20,963 --> 00:58:22,819
before we move on to the last 
question? 

1126
00:58:23,615 --> 00:58:25,670
No, I think we covered so much 
ground. 

1127
00:58:25,670 --> 00:58:27,050
So I'm, uh, I'm quite happy with
that. 

1128
00:58:28,041 --> 00:58:29,560
Yeah. 
With that I have one last 

1129
00:58:29,560 --> 00:58:30,826
question that I would like to 
ask you. 

1130
00:58:30,826 --> 00:58:32,986
I call this the three technical 
leadership wisdom. 

1131
00:58:33,388 --> 00:58:35,770
So maybe you can think of it 
just like advice you wanna give 

1132
00:58:35,770 --> 00:58:37,786
to us. 
Maybe you can share the wisdom 

1133
00:58:37,786 --> 00:58:40,265
with us today. 
Yeah, sure. 

1134
00:58:41,465 --> 00:58:45,087
So I think my top three 
recommendations would be, number

1135
00:58:45,087 --> 00:58:49,441
one, learn to learn. 
So we have seen that like the 

1136
00:58:49,441 --> 00:58:52,055
only true constant in software 
development is change. 

1137
00:58:52,535 --> 00:58:55,895
We just discussed AI, and that's
one example, right? 

1138
00:58:55,895 --> 00:58:59,495
So when these changes in 
technology, changes in 

1139
00:58:59,495 --> 00:59:02,257
programming languages, changes 
in whatever technologies we're 

1140
00:59:02,257 --> 00:59:05,950
using, when they come along, we 
need to be able to understand 

1141
00:59:05,950 --> 00:59:10,423
them quickly and follow along. 
And, uh, that to me is a 

1142
00:59:10,423 --> 00:59:13,455
lifelong journey where we have 
to practice and become more 

1143
00:59:13,455 --> 00:59:16,835
efficient at learning new stuff 
and be able to pick it up fast. 

1144
00:59:17,547 --> 00:59:20,419
The second recommendation I 
would give, and that's something

1145
00:59:20,419 --> 00:59:24,053
I wish I understood much, much 
earlier in my career, that is to

1146
00:59:24,053 --> 00:59:26,457
become a domain expert at 
whatever you do. 

1147
00:59:27,837 --> 00:59:31,116
So we all understand the 
importance of being good at 

1148
00:59:31,116 --> 00:59:33,801
programming and development. 
But understanding the actual 

1149
00:59:33,801 --> 00:59:36,807
domain where we build products 
is super important, because if 

1150
00:59:36,807 --> 00:59:39,687
we are domain experts, if we 
truly understand how our customers 

1151
00:59:39,687 --> 00:59:42,927
use the product and what they 
aim to achieve with the product,

1152
00:59:43,707 --> 00:59:46,077
then we can do so many 
optimizations. 

1153
00:59:46,077 --> 00:59:48,177
That's where the real big 
benefits are, right? 

1154
00:59:48,207 --> 00:59:50,967
We might realize that we don't 
even have to build this feature.

1155
00:59:51,417 --> 00:59:54,207
We can take away that feature, 
or this one can be much simpler.

1156
00:59:54,207 --> 00:59:57,167
That's where we make the real 
architectural wins. 

1157
00:59:58,017 --> 01:00:02,029
And finally number three is, uh,
I would recommend to always lead

1158
01:00:02,029 --> 01:00:05,439
by example. 
So this is something I've been 

1159
01:00:05,439 --> 01:00:08,883
trying to do myself now, because
I've found myself in a manager 

1160
01:00:08,883 --> 01:00:10,696
position as a result of founding
CodeScene. 

1161
01:00:11,356 --> 01:00:15,779
I always try to lead by example.
So what that means in practice 

1162
01:00:15,779 --> 01:00:19,196
is that I would never require 
anything from my teammates that 

1163
01:00:19,196 --> 01:00:21,191
I wouldn't be prepared to do 
myself. 

1164
01:00:21,959 --> 01:00:24,506
That I think is important. 
Wow. 

1165
01:00:24,566 --> 01:00:26,396
Beautiful, beautifully said, 
right? 

1166
01:00:26,396 --> 01:00:30,080
So learn to learn, uh, become a 
domain expert and also lead by 

1167
01:00:30,080 --> 01:00:32,697
example. 
So Adam, if people would love to

1168
01:00:32,697 --> 01:00:34,975
connect with you, ask you more 
questions beyond this 

1169
01:00:34,975 --> 01:00:37,259
conversation, is there a place 
where they can find you online? 

1170
01:00:37,773 --> 01:00:40,708
Yeah, I spend most of my online 
time on LinkedIn. 

1171
01:00:40,828 --> 01:00:44,128
So that's my preferred channel 
and I'll be super happy to 

1172
01:00:44,128 --> 01:00:45,928
connect and continue the 
conversation there. 

1173
01:00:47,053 --> 01:00:49,896
Yeah, and I hope more people 
will be able to also try out, 

1174
01:00:49,896 --> 01:00:52,369
uh, using CodeScene to do 
behavioral analysis of their 

1175
01:00:52,369 --> 01:00:55,423
codebase and see the kind of 
code health that 

1176
01:00:55,423 --> 01:00:57,635
they have within their software 
development teams. 

1177
01:00:57,965 --> 01:01:00,335
So again, thank you so much for 
your time today, Adam. 

1178
01:01:00,335 --> 01:01:03,742
I learned a lot about code 
quality and also the danger 

1179
01:01:03,742 --> 01:01:05,656
that using AI poses to code 
quality. 

1180
01:01:05,926 --> 01:01:07,833
So thank you for your time. 
Yeah. 

1181
01:01:07,833 --> 01:01:09,836
Thanks, Henry. 
Thank you very much for, uh, 

1182
01:01:09,836 --> 01:01:10,868
hosting me. 
Thanks.

