1
00:00:00,040 --> 00:00:03,280
Ever feel like you're caught in,
I don't know, an avalanche of 

2
00:00:03,280 --> 00:00:06,600
information, just wishing 
someone could sift through it 

3
00:00:06,600 --> 00:00:08,160
all and hand you the really 
important stuff? 

4
00:00:08,640 --> 00:00:10,640
Well, that's pretty much what we
do here on the Deep Dive. 

5
00:00:10,720 --> 00:00:13,320
We take your sources, articles, 
research, whatever you've got, 

6
00:00:13,640 --> 00:00:17,440
and we act as your guides, 
cutting through the noise to get

7
00:00:17,440 --> 00:00:20,680
to the core knowledge. 
Think of it as a shortcut to 

8
00:00:20,680 --> 00:00:24,080
being genuinely well informed, 
hopefully with a few surprising 

9
00:00:24,080 --> 00:00:25,920
facts, maybe a bit of humor 
along the way. 

10
00:00:26,560 --> 00:00:30,080
Today we're diving into software
engineering, specifically a 

11
00:00:30,080 --> 00:00:34,280
practice called refactoring, and
even more specifically, one 

12
00:00:34,280 --> 00:00:37,920
technique that some people call 
the Swiss Army Knife of code 

13
00:00:37,920 --> 00:00:40,000
improvement. 
That's right, and we're drawing 

14
00:00:40,000 --> 00:00:42,720
our insights from Software 
Engineering: A Modern Approach 

15
00:00:42,760 --> 00:00:45,760
by Marco Tulio Valente. 
We're really zeroing in on this 

16
00:00:45,760 --> 00:00:49,640
concept called Extract method. 
So our mission, if you like, for

17
00:00:49,640 --> 00:00:53,640
this deep dive is to unpack what
extract method actually is, how 

18
00:00:53,640 --> 00:00:56,440
it works, and maybe most 
importantly, why it's such a 

19
00:00:56,520 --> 00:00:59,560
valuable tool for anyone 
interested in how software gets 

20
00:00:59,560 --> 00:01:03,280
built and, crucially, how it's 
maintained over time. 

21
00:01:03,280 --> 00:01:05,800
OK, let's get into it then. 
Let's unpack this. Right. 

22
00:01:06,200 --> 00:01:09,400
So to really get extract method,
we first need to talk about the 

23
00:01:09,400 --> 00:01:12,400
bigger picture: refactoring 
itself. 

24
00:01:12,520 --> 00:01:15,280
Basically, refactoring is all 
about restructuring existing 

25
00:01:15,280 --> 00:01:17,480
computer code. 
But here's the key thing. 

26
00:01:17,880 --> 00:01:20,720
You do it without changing what 
the software actually does. 

27
00:01:20,840 --> 00:01:25,280
From the outside, the behavior 
stays the same, but the inside, 

28
00:01:25,640 --> 00:01:27,680
well, the internal structure 
changes. 

29
00:01:28,160 --> 00:01:30,840
The main goals, according to the
book, are things like removing 

30
00:01:30,840 --> 00:01:33,920
duplication, making it easier 
for developers to communicate 

31
00:01:33,920 --> 00:01:36,880
about the code, and generally 
making the design simpler or 

32
00:01:36,880 --> 00:01:39,160
more flexible. 
So it's not about adding new 

33
00:01:39,160 --> 00:01:42,440
features or fixing bugs then? 
It's more like tidying up a 

34
00:01:42,440 --> 00:01:44,720
really messy workshop, 
but for code. 

35
00:01:44,720 --> 00:01:46,360
I think the book mentioned 
Martin Fowler. 

36
00:01:46,360 --> 00:01:48,480
He really pushed this idea back 
in 2000, right? 

37
00:01:48,720 --> 00:01:50,320
With his big list of 
refactorings. 

38
00:01:50,320 --> 00:01:52,760
He apparently laid out the 
steps, the pros, the cons for 

39
00:01:52,760 --> 00:01:54,600
each one. 
Gave developers a common 

40
00:01:54,600 --> 00:01:56,800
language, sort of. 
Exactly. 

41
00:01:56,840 --> 00:02:00,280
A common vocabulary, a toolkit. 
It really boils down to what 

42
00:02:00,280 --> 00:02:03,440
Kent Beck called good 
programming habits, and the 

43
00:02:03,440 --> 00:02:07,400
source really emphasizes this. 
Beyond just fixing bugs or 

44
00:02:07,400 --> 00:02:11,039
adding features, developers need
to refactor often to ensure 

45
00:02:11,039 --> 00:02:15,280
the system's maintainability. 
Maintainability just means how 

46
00:02:15,280 --> 00:02:18,080
easy it is to understand the 
code, change it, add to it later

47
00:02:18,080 --> 00:02:19,320
on. 
You know, without breaking 

48
00:02:19,320 --> 00:02:22,000
everything. 
If code isn't maintainable, it 

49
00:02:22,000 --> 00:02:24,360
just gets harder and harder and 
more expensive to work with. 

50
00:02:24,400 --> 00:02:27,280
Over time it becomes brittle. 
And that distinction is quite 

51
00:02:27,280 --> 00:02:29,440
important, isn't it, 'cause 
sometimes you hear refactoring 

52
00:02:29,800 --> 00:02:33,080
used for, well, all sorts of 
changes, like making code faster

53
00:02:33,080 --> 00:02:35,640
maybe, or improving the user 
interface. 

54
00:02:35,760 --> 00:02:38,800
That's true, the term can get 
used broadly, but for this deep 

55
00:02:38,800 --> 00:02:41,240
dive, and sticking to the book's 
definition, we're using 

56
00:02:41,240 --> 00:02:42,920
refactoring in its stricter 
sense. 

57
00:02:43,280 --> 00:02:46,240
It means changes specifically 
aimed at improving the internal 

58
00:02:46,240 --> 00:02:49,720
structure, the maintainability 
of the code itself, making the 

59
00:02:49,720 --> 00:02:52,560
code base healthier, essentially. 
Any performance 

60
00:02:52,560 --> 00:02:55,240
gains or easier feature 
development are usually side

61
00:02:55,240 --> 00:02:57,960
effects, not the primary goal of
refactoring itself. 

62
00:02:58,200 --> 00:03:00,920
OK, got it. 
So refactoring is about internal 

63
00:03:00,920 --> 00:03:03,680
tidiness. 
Now let's zoom in on that Swiss 

64
00:03:03,680 --> 00:03:05,800
army knife you mentioned. 
Extract method. 

65
00:03:06,040 --> 00:03:08,120
What's the deal with this 
specific technique? 

66
00:03:08,440 --> 00:03:10,160
Why is it considered so 
versatile? 

67
00:03:10,520 --> 00:03:13,320
Well, its power lies in its 
simplicity, really. 

68
00:03:13,480 --> 00:03:17,440
At its core, extract method 
means you take a specific chunk 

69
00:03:17,440 --> 00:03:19,960
of code out of an existing 
method. Let's call the 

70
00:03:19,960 --> 00:03:24,360
original method F. You identify a
logical block within F, maybe it

71
00:03:24,360 --> 00:03:27,680
does one specific thing. 
Then you move that block into a 

72
00:03:27,680 --> 00:03:30,560
completely new separate method, 
let's call the new one G. 

73
00:03:30,880 --> 00:03:33,800
And then back in the original 
method F, you just replace that 

74
00:03:33,800 --> 00:03:36,760
block of code you removed with a
simple call to your new method 

75
00:03:36,760 --> 00:03:39,400
G. 
OK, so you're taking like one 

76
00:03:39,400 --> 00:03:41,760
step from a complicated recipe, 
putting it in its own little 

77
00:03:41,760 --> 00:03:45,400
labeled jar, and then the main 
recipe just says add contents of

78
00:03:45,400 --> 00:03:47,200
jar G here. 
Is that kind of the idea? 

79
00:03:47,400 --> 00:03:48,800
That's a perfect analogy 
actually. 

80
00:03:48,800 --> 00:03:54,800
Yeah. Visually, if method F does 
A, then B, then C, and B is the bit 

81
00:03:54,800 --> 00:03:57,720
you want to extract. 
After you apply extract method, 

82
00:03:57,840 --> 00:04:02,040
you get a new method G that just
contains B, and method F becomes 

83
00:04:02,200 --> 00:04:06,200
do A, then call G, then do C. It 
just cleans things up visually 

84
00:04:06,200 --> 00:04:08,040
and logically makes the flow 
much clearer. 
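
As a rough illustration of the A, B, C picture just described, here is a minimal Python sketch. The names f_before, f_after, and g are hypothetical stand-ins for the methods F and G from the discussion, and the "steps" are simply strings appended to a list so the behavior can be compared.

```python
# Before: f does step A, then step B, then step C, all inline.
def f_before(steps):
    steps.append("A")
    steps.append("B")  # the block we want to extract
    steps.append("C")


# After: step B lives in its own method g, and f just calls it.
def g(steps):
    steps.append("B")


def f_after(steps):
    steps.append("A")
    g(steps)  # a simple call replaces the extracted block
    steps.append("C")


before, after = [], []
f_before(before)
f_after(after)
print(before == after)  # the observable behavior is unchanged
```

The point of the sketch is only that the externally visible result is identical before and after the extraction, which is what makes it a refactoring.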

85
00:04:08,440 --> 00:04:11,040
And it's not just for pulling 
out one small bit, is it? 

86
00:04:11,680 --> 00:04:14,560
The source mentioned variations.
Absolutely, that's why it's the 

87
00:04:14,560 --> 00:04:17,200
Swiss army knife. 
You could extract several pieces

88
00:04:17,200 --> 00:04:20,959
from one long method, creating 
maybe G1, G2, and G3. 

89
00:04:20,959 --> 00:04:24,280
Or maybe even more powerfully, 
if you see the exact same chunk 

90
00:04:24,280 --> 00:04:27,880
of code repeated in, say, 5 
different methods, you can 

91
00:04:27,880 --> 00:04:31,920
extract that common code into 
just one new method, G, and then

92
00:04:31,960 --> 00:04:34,880
all 5 original methods just 
call G. Boom, duplication gone.

93
00:04:34,880 --> 00:04:37,000
Which is a huge win for 
maintainability. 
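
The duplication case can be sketched the same way. This is an assumed example, not one from the book: normalize plays the role of the shared method G, and greeting and badge_label are two hypothetical callers that previously each carried their own copy of the clean-up logic.

```python
def normalize(name):
    """Extracted method: the logic that used to be copy-pasted."""
    # Collapse repeated whitespace, then capitalize each word.
    return " ".join(name.split()).title()


def greeting(name):
    # Previously repeated the whitespace and capitalization logic inline.
    return f"Hello, {normalize(name)}!"


def badge_label(name):
    # Previously carried a second copy of the same logic.
    return f"[{normalize(name)}]"


print(greeting("  ada   lovelace "))
print(badge_label("grace  hopper"))
```

If the normalization rule ever changes, there is now exactly one place to change it.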

94
00:04:37,080 --> 00:04:39,080
OK, that sounds incredibly 
useful, especially the 

95
00:04:39,080 --> 00:04:41,520
duplication part. 
But what about the details? 

96
00:04:41,520 --> 00:04:44,000
Like if the code you pull out 
needs some information that was 

97
00:04:44,000 --> 00:04:47,080
only available in the original 
method, or if it calculates 

98
00:04:47,080 --> 00:04:49,680
something the original method 
needs later on, how does that 

99
00:04:49,680 --> 00:04:52,080
work? 
Right, good question. That gets 

100
00:04:52,080 --> 00:04:55,520
into the mechanics. 
If the code you extract, the code

101
00:04:55,520 --> 00:04:59,040
going into G, uses variables that
were local to the original 

102
00:04:59,040 --> 00:05:03,120
method F, you have to pass those
variables into G as parameters. 

103
00:05:03,520 --> 00:05:05,000
Think of it like giving 
instructions. 

104
00:05:05,000 --> 00:05:08,040
You have to provide the 
necessary inputs, and similarly,

105
00:05:08,200 --> 00:05:11,600
if that extracted code produces 
some result that method F needs 

106
00:05:11,600 --> 00:05:14,840
to continue its work, then the 
new method G has to return that 

107
00:05:14,840 --> 00:05:18,080
value. 
And one more thing, if there are

108
00:05:18,080 --> 00:05:20,840
variables used only within the 
extracted part, you should move 

109
00:05:20,840 --> 00:05:23,280
their declarations into the new 
method G as well. 

110
00:05:23,640 --> 00:05:26,600
Keep things tidy. 
These rules ensure everything 

111
00:05:26,600 --> 00:05:28,760
still connects and works 
correctly after the change. 
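
Those three rules (pass in what the block reads, return what the caller still needs, move declarations used only by the block) can be sketched as follows. The order-total scenario and all names here are assumptions chosen for illustration.

```python
# Before: everything happens inside one method.
def order_total_before(prices, tax_rate):
    subtotal = 0.0
    for price in prices:
        subtotal += price
    return subtotal * (1 + tax_rate)


# After: the summing loop is extracted into its own method.
def sum_prices(prices):      # 'prices' was local to f, so it becomes a parameter
    running = 0.0            # used only by the extracted code, so declared here
    for price in prices:
        running += price
    return running           # f still needs the result, so g returns it


def order_total_after(prices, tax_rate):
    subtotal = sum_prices(prices)
    return subtotal * (1 + tax_rate)


print(order_total_before([10.0, 20.0], 0.5))
print(order_total_after([10.0, 20.0], 0.5))
```

Both versions compute the same total; only the internal structure differs.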

112
00:05:28,840 --> 00:05:31,880
OK, so it really is versatile, 
handles different situations. 

113
00:05:31,880 --> 00:05:34,480
You break down big methods, make 
them focused, give them clear 

114
00:05:34,480 --> 00:05:37,440
names and the original method 
just becomes this sequence of 

115
00:05:37,440 --> 00:05:39,440
calls. 
Much easier to follow, I 

116
00:05:39,440 --> 00:05:40,640
imagine. 
Exactly. 

117
00:05:40,880 --> 00:05:43,440
The book gives a really great 
real world example. 

118
00:05:43,440 --> 00:05:45,880
It was from an Android system. 
There was this method called 

119
00:05:45,880 --> 00:05:48,000
onCreate. 
It's a standard thing in Android

120
00:05:48,000 --> 00:05:50,920
that runs when a component 
starts up, and this particular 

121
00:05:50,920 --> 00:05:53,600
onCreate method was responsible 
for setting up a whole bunch of 

122
00:05:53,600 --> 00:05:56,320
database tables. 
And how long was it? The source 

123
00:05:56,320 --> 00:05:58,760
said 
over 200 lines, right? 

124
00:05:58,960 --> 00:06:02,680
Just creating tables for things 
like cell signals, SMS messages,

125
00:06:02,680 --> 00:06:05,680
locations. 
Wow, trying to understand or 

126
00:06:05,680 --> 00:06:07,080
debug that must have been really
tough. 

127
00:06:07,120 --> 00:06:09,040
Where would you even start? 
Precisely. 

128
00:06:09,040 --> 00:06:12,720
It was a classic example of a 
method doing way too much, so a 

129
00:06:12,720 --> 00:06:16,080
developer took extract method to
it not just once, but seven 

130
00:06:16,080 --> 00:06:18,520
times. 
Each time they pulled out the 

131
00:06:18,520 --> 00:06:21,920
logic for creating one specific 
database table and they created 

132
00:06:21,920 --> 00:06:24,640
new methods with really clear 
names like createCellSignalTable 

133
00:06:24,640 --> 00:06:26,400
or 
createSilentSmsTable. 

134
00:06:26,400 --> 00:06:28,520
You knew exactly what each one 
did, just from the name. 

135
00:06:28,760 --> 00:06:32,000
OK, and here's the kicker right?
What happened to the original 

136
00:06:32,000 --> 00:06:35,720
200-line onCreate method? 
After those 7 extractions, it 

137
00:06:35,720 --> 00:06:38,640
was reduced to just 7 lines, 
just a list of calls. 

138
00:06:38,800 --> 00:06:41,560
Call createCellSignalTable, 
call createSilentSmsTable, and so

139
00:06:41,560 --> 00:06:43,280
on. 
That's a massive improvement in 

140
00:06:43,280 --> 00:06:45,360
clarity. 
And apparently each new method 

141
00:06:45,560 --> 00:06:48,920
only needed one input, the 
database connection itself. 
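
To make the shape of that result concrete, here is a loose Python analogue using the standard sqlite3 module. It is not the Android code from the book: the table schemas are invented, only three of the seven tables are shown, and on_create is a plain function rather than an Android callback.

```python
import sqlite3


# Each extracted method creates one table and takes only the connection.
def create_cell_signal_table(db):
    db.execute("CREATE TABLE cell_signal (id INTEGER PRIMARY KEY, strength INTEGER)")


def create_silent_sms_table(db):
    db.execute("CREATE TABLE silent_sms (id INTEGER PRIMARY KEY, sender TEXT)")


def create_location_table(db):
    db.execute("CREATE TABLE location (id INTEGER PRIMARY KEY, lat REAL, lon REAL)")


# The original method is reduced to a readable list of calls.
def on_create(db):
    create_cell_signal_table(db)
    create_silent_sms_table(db)
    create_location_table(db)


db = sqlite3.connect(":memory:")
on_create(db)
tables = sorted(
    row[0] for row in db.execute("SELECT name FROM sqlite_master WHERE type='table'")
)
print(tables)
```

Reading on_create now tells you which tables exist without reading any of the SQL.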

142
00:06:49,160 --> 00:06:50,920
Exactly. 
Think about the difference that 

143
00:06:50,920 --> 00:06:53,760
makes. 
A new developer joins the team, 

144
00:06:54,040 --> 00:06:57,480
looks at onCreate and instantly
understands the high level 

145
00:06:57,480 --> 00:06:59,840
process. 
OK, it creates these 7 tables. 

146
00:07:00,280 --> 00:07:03,040
If there's a bug related to the 
SMS table, you know exactly 

147
00:07:03,040 --> 00:07:05,440
where to look. 
The createSilentSmsTable 

148
00:07:05,440 --> 00:07:08,280
method. You don't have to hunt 
through 200 lines anymore. 

149
00:07:08,440 --> 00:07:11,440
It reduces the cognitive load, 
speeds up debugging, makes 

150
00:07:11,440 --> 00:07:14,680
changes less risky. It turns 
complexity into manageable 

151
00:07:14,680 --> 00:07:17,000
units. 
Now understanding the how is 

152
00:07:17,000 --> 00:07:20,240
important, but the why is maybe 
even more interesting. 

153
00:07:20,520 --> 00:07:22,680
Why do developers actually do 
this? 

154
00:07:22,800 --> 00:07:26,400
The book points to a study from 
2016 looking at GitHub projects.

155
00:07:26,800 --> 00:07:29,880
They found developers had, 
believe it or not, 11 distinct 

156
00:07:29,880 --> 00:07:33,240
motivations for using extract 
method. 11 different reasons? 

157
00:07:33,560 --> 00:07:36,360
That really does drive home the 
Swiss army knife idea. It's a solution 

158
00:07:36,360 --> 00:07:38,840
to a lot of different problems. 
So what were the main ones? What 

159
00:07:38,840 --> 00:07:41,520
was the number one reason 
developers reached for this tool?

160
00:07:41,840 --> 00:07:43,240
The 
biggest one by far, 

161
00:07:43,240 --> 00:07:46,440
it came up 43 times in their 
study, was code reuse. 

162
00:07:47,160 --> 00:07:51,000
Often a developer needs a piece 
of logic that's already written,

163
00:07:51,360 --> 00:07:53,280
but it's buried inside some 
other method. 

164
00:07:53,520 --> 00:07:56,880
So they use extract method to 
pull that logic out, make it its

165
00:07:56,880 --> 00:07:59,840
own reusable method, and then 
they can call it from wherever 

166
00:07:59,840 --> 00:08:01,800
they need it. 
No more copy-pasting. 

167
00:08:01,800 --> 00:08:03,320
Yeah, that 
makes sense. The book had a 

168
00:08:03,320 --> 00:08:05,640
quote from a developer about 
that, didn't it? Something like: 

169
00:08:06,160 --> 00:08:08,760
I always try to reuse code 
because when there's a lot of 

170
00:08:08,760 --> 00:08:11,320
code redundancy it gets 
overwhelmingly more complicated.

171
00:08:11,320 --> 00:08:13,800
And when something changes in 
code that is duplicated 

172
00:08:13,800 --> 00:08:16,400
somewhere, it usually needs to 
be changed there also. 

173
00:08:17,240 --> 00:08:20,080
That really highlights the pain 
of duplicated code down the 

174
00:08:20,080 --> 00:08:21,920
line. 
The maintenance nightmare. 

175
00:08:21,920 --> 00:08:24,680
It really is. Find a bug in 
duplicated code and you have to 

176
00:08:24,680 --> 00:08:27,800
find and fix every single copy. 
It's so easy to miss. 

177
00:08:27,800 --> 00:08:30,880
One extraction solves that. 
OK, so the second most common 

178
00:08:30,880 --> 00:08:35,000
reason, found 25 times, was 
introducing an alternative 

179
00:08:35,000 --> 00:08:36,960
signature for an existing method.
An 

180
00:08:36,960 --> 00:08:39,440
alternative signature? 
OK, what does that mean in plain

181
00:08:39,440 --> 00:08:41,400
English? 
Why extract for that? 

182
00:08:41,400 --> 00:08:44,120
It's basically about offering 
slightly different ways to use a

183
00:08:44,120 --> 00:08:46,960
piece of functionality, maybe a 
simpler way or a more advanced way,

184
00:08:47,240 --> 00:08:48,880
without duplicating the core 
logic. 

185
00:08:49,200 --> 00:08:52,960
So imagine you have a log method
that just writes a message to a 

186
00:08:52,960 --> 00:08:54,560
file. 
Simple enough. 

187
00:08:54,560 --> 00:08:58,440
A developer might extract the 
actual file writing part into a 

188
00:08:58,440 --> 00:09:01,800
new internal method. 
Then they could create a new 

189
00:09:01,800 --> 00:09:04,800
version of the log method, maybe
one that takes an extra option 

190
00:09:04,800 --> 00:09:07,440
like a boolean flag saying, 
also print this message to 

191
00:09:07,440 --> 00:09:10,480
the console. The original, simpler log
method just calls the 

192
00:09:10,480 --> 00:09:13,720
extracted core logic with the 
console flag set to false. 

193
00:09:13,840 --> 00:09:16,720
The new one calls it with the 
flag set to true or based on the

194
00:09:16,720 --> 00:09:17,880
parameter. 
OK, gotcha. 

195
00:09:17,880 --> 00:09:20,400
So you end up with maybe 2 log 
methods for the user of that 

196
00:09:20,400 --> 00:09:23,960
code, one simple, one with more 
options, but they both rely on 

197
00:09:23,960 --> 00:09:26,360
that same extracted piece of 
core logic underneath. 

198
00:09:26,680 --> 00:09:29,760
So even a small change like 
adding an if statement based on 

199
00:09:29,760 --> 00:09:33,200
that new flag still leverages 
the extracted method and keeps the 

200
00:09:33,200 --> 00:09:36,800
core logic in one place. 
Precisely, it avoids duplicating

201
00:09:36,800 --> 00:09:38,400
the fundamental file writing 
part. 
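
A minimal sketch of that logging idea, under a few assumptions: Python has no method overloading, so the two "signatures" are shown as two hypothetical functions, log and log_with_console, and the "file" is any writable object passed in so the example stays self-contained. The extracted core is called _write_entry here.

```python
import io


def _write_entry(message, out, to_console):
    """Extracted core logic: the only place that actually writes."""
    out.write(message + "\n")
    if to_console:
        print(message)


def log(message, out):
    # Original, simpler signature: never echoes to the console.
    _write_entry(message, out, to_console=False)


def log_with_console(message, out, to_console):
    # Alternative signature: the caller decides about console output.
    _write_entry(message, out, to_console)


buffer = io.StringIO()  # stands in for a real log file
log("saved", buffer)
log_with_console("loaded", buffer, True)
contents = buffer.getvalue()
```

Both entry points share the same writing code, so a change to the log format only has to be made in _write_entry.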

202
00:09:38,760 --> 00:09:41,400
Beyond those top 2, the study 
found other important reasons 

203
00:09:41,400 --> 00:09:45,560
too, like making code easier to 
test. When you break down a big 

204
00:09:45,560 --> 00:09:49,360
method into smaller ones, each 
small method is easier to test 

205
00:09:49,360 --> 00:09:51,920
in isolation. 
You can write specific tests for

206
00:09:51,920 --> 00:09:53,920
createCellSignalTable, for 
example. 

207
00:09:53,920 --> 00:09:57,120
That's much harder when it's all
tangled up in 200 lines, right? 

208
00:09:57,120 --> 00:09:59,320
Testing becomes more focused, 
more manageable. 

209
00:09:59,480 --> 00:10:01,280
Boosts confidence too, I bet. 
What else? 

210
00:10:01,480 --> 00:10:03,720
Well, 
closely related is improving 

211
00:10:03,720 --> 00:10:05,680
understanding by reducing 
complexity. 

212
00:10:06,080 --> 00:10:08,600
That Android example is perfect 
here, just making the code 

213
00:10:08,600 --> 00:10:12,160
easier to read and grasp by 
breaking it into named logical 

214
00:10:12,160 --> 00:10:15,400
chunks, turning a dense 
paragraph into bullet points. 

215
00:10:15,400 --> 00:10:18,160
Essentially it just lowers the 
mental barrier to understanding 

216
00:10:18,160 --> 00:10:20,480
what's going on. 
And one other common one was 

217
00:10:20,480 --> 00:10:23,200
facilitating new features. 
Sometimes you need to add 

218
00:10:23,200 --> 00:10:26,440
something new that uses part of 
an existing complex method. 

219
00:10:26,720 --> 00:10:29,640
Extracting the relevant part 
first makes it cleaner and safer

220
00:10:29,640 --> 00:10:32,840
to then build the new feature on
top of or alongside that 

221
00:10:32,840 --> 00:10:35,800
extracted logic without messing 
up the original method. 

222
00:10:36,080 --> 00:10:38,160
It prepares the code for change. 
So 

223
00:10:38,160 --> 00:10:41,400
you can really see how extract 
method isn't just some technical

224
00:10:41,400 --> 00:10:43,440
trick, it's a fundamental 
strategy. 

225
00:10:43,800 --> 00:10:47,360
It's about creating code that's 
cleaner, easier to understand, 

226
00:10:47,640 --> 00:10:51,360
and crucially, easier to 
maintain and change over the 

227
00:10:51,360 --> 00:10:53,800
long haul. 
It really is a versatile tool 

228
00:10:53,800 --> 00:10:55,240
for code health. 
Absolutely. 

229
00:10:55,560 --> 00:10:58,280
Understanding these kinds of 
foundational concepts and the 

230
00:10:58,280 --> 00:11:01,040
real world reasons why 
developers use them, like we saw

231
00:11:01,040 --> 00:11:04,160
in that study, really helps 
appreciate the craft involved in

232
00:11:04,160 --> 00:11:06,520
software engineering. 
It shows why these practices are

233
00:11:06,520 --> 00:11:08,920
so important for building 
software that lasts. 

234
00:11:09,280 --> 00:11:12,080
Well, thank you for joining us 
on this deep dive into extract 

235
00:11:12,080 --> 00:11:13,800
method. 
Really fascinating stuff.

