8
00:00:18,880 --> 00:00:22,840
Welcome back to the Deep dive. 
Today we are tackling a topic 

9
00:00:22,840 --> 00:00:27,040
that really acts as a kind of
dividing line in the career of a

10
00:00:27,040 --> 00:00:28,840
software engineer. 
Yeah, it really does. 

11
00:00:28,840 --> 00:00:31,440
It's the difference between 
someone who, you know, knows how

12
00:00:31,440 --> 00:00:34,640
to write Java syntax, how to get
the code to compile, and someone

13
00:00:34,640 --> 00:00:37,440
who actually understands what 
the machine is doing with that 

14
00:00:37,440 --> 00:00:40,920
code. 
We are diving into the abyss of 

15
00:00:40,920 --> 00:00:43,720
the Java Memory Model or the 
JMM. 

16
00:00:43,800 --> 00:00:46,600
It really is the final boss of 
Java interviews, isn't it? 

17
00:00:46,760 --> 00:00:50,480
It's that layer where the 
abstraction we all rely on 

18
00:00:50,480 --> 00:00:52,800
starts to leak. 
It leaks everywhere. 

19
00:00:52,880 --> 00:00:54,640
It does. 
I mean you spend your early 

20
00:00:54,640 --> 00:00:59,040
career being told Java is safe. 
The JVM manages memory for you, 

21
00:00:59,040 --> 00:01:01,760
don't worry about pointers. 
And then you open the JMM 

22
00:01:01,760 --> 00:01:05,080
specification and realize, oh, 
actually there is a tremendous 

23
00:01:05,080 --> 00:01:08,000
amount of invisible chaos 
happening underneath my feet. 

24
00:01:08,000 --> 00:01:10,000
An invisible chaos is the 
perfect phrase for it. 

25
00:01:10,000 --> 00:01:12,360
I was reading through the 
breakdown of the Java Language 

26
00:01:12,360 --> 00:01:15,160
specification for this, and 
honestly the thing that struck 

27
00:01:15,160 --> 00:01:17,600
me most is that the computer is 
basically lying to us. 

28
00:01:17,600 --> 00:01:20,160
It is constantly, yeah, and for 
good reason. 

29
00:01:20,320 --> 00:01:22,560
We grow up thinking of code as a
recipe, right? 

30
00:01:23,360 --> 00:01:26,800
Step one, break the eggs. 
Step 2 whisk the eggs. 

31
00:01:27,080 --> 00:01:30,720
Step three, fry the eggs. 
We assume this strict linear 

32
00:01:30,720 --> 00:01:34,440
order, of course, but looking at
how the JMM works, the

33
00:01:34,440 --> 00:01:37,400
computer might decide to heat 
the pan before it even buys the 

34
00:01:37,440 --> 00:01:40,200
eggs, provided it knows the 
result will be the same to you. 

35
00:01:40,520 --> 00:01:44,240
That is the illusion of serial 
execution, and honestly, it is a

36
00:01:44,240 --> 00:01:46,720
necessary lie. 
If the computer actually 

37
00:01:46,720 --> 00:01:49,640
executed instructions in the
strict order you wrote them, 

38
00:01:49,920 --> 00:01:51,600
modern software would be 
crawling. 

39
00:01:51,600 --> 00:01:54,760
It would be unusably slow. 
Really that slow? 

40
00:01:54,760 --> 00:01:57,160
Oh, absolutely. 
The hardware relies on that 

41
00:01:57,160 --> 00:01:59,960
deception, on that reordering, 
to give you the performance you

42
00:01:59,960 --> 00:02:02,080
expect. 
So our mission today is to 

43
00:02:02,080 --> 00:02:05,400
dismantle that illusion. 
We need to explain why the 

44
00:02:05,400 --> 00:02:08,320
hardware cheats, how it gets 
away with it, and most 

45
00:02:08,320 --> 00:02:10,880
importantly, when that cheating 
causes our programs to crash in 

46
00:02:10,880 --> 00:02:12,600
ways that seem completely 
impossible. 

47
00:02:12,600 --> 00:02:15,120
Right. We're going to cover
happens-before, which sounds

48
00:02:15,120 --> 00:02:18,000
like a time travel paradox, but 
is actually the law of the land.

49
00:02:18,640 --> 00:02:20,320
We're going to talk about 
volatile, which I feel like 

50
00:02:20,320 --> 00:02:22,840
almost everyone uses wrong at 
least once in their career. 

51
00:02:23,040 --> 00:02:25,600
Or they use it everywhere 
because they're scared, which is

52
00:02:25,600 --> 00:02:27,840
also wrong. 
Exactly, that's the other side 

53
00:02:27,840 --> 00:02:31,320
of the coin, and we'll dissect 
the infamous double-checked

54
00:02:31,320 --> 00:02:34,960
locking bug, which is basically 
a horror story for developers.

55
00:02:34,960 --> 00:02:37,240
A true rite of passage. 
Plus we'll look at the modern 

56
00:02:37,240 --> 00:02:39,680
tools like VarHandles.
The very sharp knives in the 

57
00:02:39,680 --> 00:02:41,520
drawer. 
Let's start with the basics, 

58
00:02:41,520 --> 00:02:43,920
then. Segment one: the reality
check. 

59
00:02:44,680 --> 00:02:47,840
Why does the hardware reorder my
code? 

60
00:02:47,960 --> 00:02:52,320
I mean, if I write x = 1
followed by y = 2, why is it so 

61
00:02:52,320 --> 00:02:56,360
hard for the CPU to just, you 
know, do that in order?

62
00:02:56,360 --> 00:02:58,120
To understand that, you have to
understand something called the 

63
00:02:58,120 --> 00:03:01,120
memory wall. 
This is the fundamental massive 

64
00:03:01,120 --> 00:03:04,080
gap between CPU speed and memory
speed. 

65
00:03:04,560 --> 00:03:08,000
A modern CPU is screaming fast. 
It can execute billions of 

66
00:03:08,000 --> 00:03:11,480
cycles per second. 
It is a Ferrari. But main memory,

67
00:03:11,480 --> 00:03:13,800
your RAM sticks?
Comparatively, they're ancient.

68
00:03:13,840 --> 00:03:17,360
They're just so slow. 
So if the CPU is a Ferrari, what

69
00:03:17,360 --> 00:03:19,000
is RAM? 
A horse and buggy. 

70
00:03:19,240 --> 00:03:22,760
If the CPU is a Ferrari, 
fetching data from main RAM is 

71
00:03:22,760 --> 00:03:25,360
like waiting for a postal 
delivery from the next town 

72
00:03:25,360 --> 00:03:28,040
over. 
It takes forever in CPU time. 

73
00:03:29,000 --> 00:03:32,920
Hundreds, sometimes thousands of
CPU cycles can pass in the time 

74
00:03:32,920 --> 00:03:34,640
it takes to get one piece of 
data. 

75
00:03:34,960 --> 00:03:37,480
Wow. 
So imagine the CPU hits a line 

76
00:03:37,480 --> 00:03:39,880
of code that says load data from
variable A. 

77
00:03:40,280 --> 00:03:41,840
Variable A is out in main 
memory. 

78
00:03:41,880 --> 00:03:45,320
The CPU has a choice. 
It can sit there idling, wasting

79
00:03:45,320 --> 00:03:48,160
all those hundreds of cycles 
doing absolutely nothing while 

80
00:03:48,160 --> 00:03:49,680
it waits for that data to 
arrive. 

81
00:03:49,720 --> 00:03:51,840
Which is a massive waste of 
expensive silicon. 

82
00:03:51,840 --> 00:03:54,200
A huge waste. 
Or it can look ahead.

83
00:03:54,320 --> 00:03:57,200
It scans the next few lines of 
your code and sees oh the next 

84
00:03:57,200 --> 00:04:01,040
line is set variable B to 10. 
Now, variable B is already

85
00:04:01,040 --> 00:04:03,600
sitting in the cache, right there
in the CPU's local memory.

86
00:04:03,600 --> 00:04:05,960
The CPU says, you know what, I'm
going to do this second thing 

87
00:04:05,960 --> 00:04:07,600
while I wait for the first thing
to arrive. 

88
00:04:07,600 --> 00:04:11,280
So it reorders the instructions 
purely to hide the latency to 

89
00:04:11,280 --> 00:04:15,040
hide that postal delivery time. 
Exactly. To the CPU, and to the

90
00:04:15,040 --> 00:04:17,200
just-in-time compiler, which does
similar tricks. 

91
00:04:17,200 --> 00:04:20,320
The rule is we can shuffle this 
deck of instructions however we 

92
00:04:20,320 --> 00:04:23,080
want as long as the result 
within the single thread remains

93
00:04:23,080 --> 00:04:24,920
correct. 
That's the key phrase right 

94
00:04:24,920 --> 00:04:27,560
there: within the single thread.
That's everything. 

95
00:04:27,760 --> 00:04:29,600
The CPU keeps track of its own 
mess. 

96
00:04:29,880 --> 00:04:33,880
It knows OK I wrote to B before 
A even though the code said A 

97
00:04:33,920 --> 00:04:36,160
before B. 
So if the user asks for the 

98
00:04:36,160 --> 00:04:39,320
value of A later, I have to make
sure I stall and present the 

99
00:04:39,320 --> 00:04:42,400
correct result. 
It maintains what's called

100
00:04:42,400 --> 00:04:44,360
as-if-serial semantics.
As if. 

101
00:04:44,760 --> 00:04:47,640
As long as you are the only 
thread running, you will never 

102
00:04:47,640 --> 00:04:49,680
ever know the reordering 
happened. 

103
00:04:49,880 --> 00:04:52,680
You live in a happy, consistent,
linear world. 

104
00:04:52,840 --> 00:04:54,920
But the moment we introduce a 
second thread. 

105
00:04:54,920 --> 00:04:56,960
The illusion shatters 
completely. 

106
00:04:57,080 --> 00:05:00,440
Yeah, because thread 2 doesn't 
know about thread one's internal

107
00:05:00,440 --> 00:05:02,440
bookkeeping. 
Thread 2 just looks at shared

108
00:05:02,440 --> 00:05:05,080
memory. 
If thread one wrote B first 

109
00:05:05,080 --> 00:05:08,240
because it was faster and didn't
need a memory fetch, thread 2 

110
00:05:08,240 --> 00:05:11,000
might look at memory and see 
that B has changed, but A is 

111
00:05:11,000 --> 00:05:13,920
still null or its old value. 
Even though in the code I wrote,

112
00:05:14,000 --> 00:05:15,880
a was clearly assigned first. 
Correct. 
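
The scenario the hosts describe can be sketched in a few lines of Java. This is a minimal illustration, not a reliable reproduction: the class name `ReorderingSketch` and its fields are hypothetical, and on any given run you may never observe the surprising outcome. The point is only that, with plain fields and no synchronization, the Java Memory Model permits the reader to see `b == 2` while `a` is still `0`.

```java
public class ReorderingSketch {
    private int a; // plain field: no volatile, no lock
    private int b; // plain field: no volatile, no lock

    // Runs one writer and one reader; returns {valueOfASeen, valueOfBSeen}.
    public static int[] run() {
        ReorderingSketch s = new ReorderingSketch();
        int[] seen = new int[2];
        Thread writer = new Thread(() -> {
            s.a = 1; // program order says a is written first...
            s.b = 2; // ...but other threads are promised nothing about that order
        });
        Thread reader = new Thread(() -> {
            seen[1] = s.b; // read b first
            seen[0] = s.a; // then a; seeing b == 2 with a == 0 is a legal outcome
        });
        writer.start();
        reader.start();
        try {
            writer.join();
            reader.join(); // join makes the reader's results visible to this thread
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen;
    }

    public static void main(String[] args) {
        int[] r = run();
        System.out.println("a=" + r[0] + ", b=" + r[1]);
    }
}
```

Within the writer thread itself the two assignments always appear to happen in order; only an outside observer can tell the difference.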

113
00:05:16,200 --> 00:05:19,320
From Thread 2's perspective, 
Thread one is executing code 

114
00:05:19,320 --> 00:05:22,800
backwards or in random bursts, 
and that is where you get those 

115
00:05:22,800 --> 00:05:25,200
bugs that are literally 
impossible to reproduce on 

116
00:05:25,200 --> 00:05:27,440
purpose. 
We call them Heisenbugs, right 

117
00:05:27,760 --> 00:05:29,520
from the Heisenberg Uncertainty 
principle. 

118
00:05:29,520 --> 00:05:33,760
Yes, because the moment you try 
to observe them, say by adding a

119
00:05:33,760 --> 00:05:37,400
System.out.println or
attaching a debugger that 

120
00:05:37,400 --> 00:05:39,600
changes the timing, it 
introduces overhead. 

121
00:05:39,640 --> 00:05:41,560
And the reordering disappears.
Exactly. 

122
00:05:41,680 --> 00:05:44,760
The CPU might stop optimizing so
aggressively because of the new 

123
00:05:44,760 --> 00:05:47,040
overhead. 
So you attach the debugger and 

124
00:05:47,040 --> 00:05:49,200
the bug vanishes. 
You take it off and your 

125
00:05:49,200 --> 00:05:51,160
application crashes an hour 
later. 

126
00:05:51,520 --> 00:05:53,840
It's maddening.
That is terrifying. 

127
00:05:53,840 --> 00:05:56,240
It's effectively gaslighting by 
the processor. 

128
00:05:56,280 --> 00:05:58,720
It really is. 
And it's not just reordering. 

129
00:05:58,720 --> 00:06:01,640
That's only half the problem. 
The other half is visibility. 

130
00:06:01,640 --> 00:06:04,480
OK, we have to talk about the 
physical architecture again. 

131
00:06:05,040 --> 00:06:08,040
We tend to assume memory is this
one big shared whiteboard that 

132
00:06:08,040 --> 00:06:10,720
everyone sees at the same time, 
but it's not. 

133
00:06:10,960 --> 00:06:14,480
Right, the architecture involves
caches: L1, L2, L3 caches on the

134
00:06:14,480 --> 00:06:15,840
chip. 
Think of it this way. 

135
00:06:16,760 --> 00:06:18,640
You and I are working on a 
shared project. 

136
00:06:18,920 --> 00:06:22,320
I am thread A, you are thread B,
but we are working in different 

137
00:06:22,320 --> 00:06:23,960
offices down the hall from each 
other. 

138
00:06:24,280 --> 00:06:26,360
Main memory is a bulletin board 
in the hallway. 

139
00:06:26,440 --> 00:06:29,240
OK, I like this analogy. 
I have a notebook on my desk. 

140
00:06:29,520 --> 00:06:32,680
That's my super fast L1 cache. 
You have your own notebook on 

141
00:06:32,680 --> 00:06:35,040
your desk. 
I write x = 1 in my

142
00:06:35,040 --> 00:06:37,840
notebook. 
Do you see it? 

143
00:06:38,720 --> 00:06:40,960
No, I'm in the other room. 
I only see what's on the 

144
00:06:40,960 --> 00:06:44,720
bulletin board in the hallway, 
which is the old value of X. 

145
00:06:44,720 --> 00:06:47,680
Exactly. 
And here is the kicker, I am 

146
00:06:47,680 --> 00:06:50,480
lazy. 
Or rather, I am efficient. 

147
00:06:51,080 --> 00:06:53,120
I'm not going to stand up and 
walk down the hall to the 

148
00:06:53,120 --> 00:06:56,120
bulletin board every single time
I write a number that takes too 

149
00:06:56,120 --> 00:06:57,800
much time. 
That's like a main memory 

150
00:06:57,800 --> 00:07:00,280
access, too slow. 
Right, I'm going to scribble in 

151
00:07:00,280 --> 00:07:02,920
my notebook for a while. 
Eventually, maybe when my cache 

152
00:07:02,920 --> 00:07:05,760
line gets full or I'm forced to,
I'll walk down and pin my pages 

153
00:07:05,760 --> 00:07:08,360
to the board. 
That action is flushing to main

154
00:07:08,360 --> 00:07:10,640
memory. 
And even if you do pin it, I 

155
00:07:10,640 --> 00:07:13,360
don't see it immediately, do I? 
Because I'm just staring at my 

156
00:07:13,360 --> 00:07:15,080
own notebook. 
I'm not constantly checking the 

157
00:07:15,080 --> 00:07:16,080
hallway. 
Correct. 

158
00:07:16,080 --> 00:07:19,080
You have to decide to stand up, 
walk to the hall, and check the 

159
00:07:19,080 --> 00:07:20,520
board to update your own 
notebook. 

160
00:07:21,080 --> 00:07:25,360
That's invalidating your cache. 
If we don't have a system, a

161
00:07:25,360 --> 00:07:28,840
protocol to agree on when to walk
to the hallway, we're living in

162
00:07:28,840 --> 00:07:30,600
two different realities. 
I think X is 1. 

163
00:07:30,600 --> 00:07:34,240
You think X is 0 and we are both
right according to our local 

164
00:07:34,240 --> 00:07:36,600
hardware. 
And the Java Memory model is 

165
00:07:36,600 --> 00:07:39,400
basically the schedule for 
walking to the hallway. 

166
00:07:39,640 --> 00:07:42,280
It's the set of rules that 
forces us to communicate. 

167
00:07:42,480 --> 00:07:45,720
That is a perfect way to put it.
The JMM is the contract. 

168
00:07:45,800 --> 00:07:49,320
It tells us if you use these 
specific keywords, the JVM 

169
00:07:49,320 --> 00:07:52,360
guarantees that thread A will 
post to the board and thread B 

170
00:07:52,360 --> 00:07:56,040
will go look at it. 
Without those keywords, the JVM 

171
00:07:56,040 --> 00:07:59,120
promises absolutely nothing. 
The hardware can stay in its 

172
00:07:59,120 --> 00:08:00,960
local cache forever if it wants 
to. 

173
00:08:01,240 --> 00:08:02,960
OK, so let's look at the 
contract. 

174
00:08:03,040 --> 00:08:07,440
This brings us to segment 2. 
The spec uses this term,

175
00:08:07,440 --> 00:08:09,200
happens-before.
It sounds very chronological. 

176
00:08:09,200 --> 00:08:13,120
Action A happens before action 
B, but based on what you just 

177
00:08:13,120 --> 00:08:16,240
said about reordering, getting 
hung up on time seems like a 

178
00:08:16,240 --> 00:08:18,160
huge mistake. 
It's the biggest mistake you can

179
00:08:18,160 --> 00:08:20,040
make. 
Happens-before is a legal term,

180
00:08:20,040 --> 00:08:22,240
not a physics term. 
It's about visibility and 

181
00:08:22,240 --> 00:08:25,160
ordering constraints. 
If we say action A happens 

182
00:08:25,160 --> 00:08:28,120
before action B, we are strictly
saying one thing. 

183
00:08:28,280 --> 00:08:30,880
Whatever result was produced in 
action A is guaranteed to be 

184
00:08:30,880 --> 00:08:32,559
visible to action B, and that's 
it. 

185
00:08:32,679 --> 00:08:36,240
That's the core guarantee. 
Even if in physical wall clock 

186
00:08:36,240 --> 00:08:39,200
time they happen almost 
simultaneously, or if the 

187
00:08:39,200 --> 00:08:42,559
hardware did some weird time 
travel optimization, the final 

188
00:08:42,559 --> 00:08:46,200
observable result must look as 
if a happened first. 

189
00:08:46,640 --> 00:08:49,960
So how do we forge this chain?
Because by default two threads

190
00:08:49,960 --> 00:08:52,120
have no happens-before
relationship. 

191
00:08:52,360 --> 00:08:54,240
They're in those separate 
offices, right? 

192
00:08:54,240 --> 00:08:56,800
We need synchronization actions 
to create a link. 

193
00:08:57,320 --> 00:09:00,320
The most basic one is the program
order rule, which we already 

194
00:09:00,320 --> 00:09:03,480
discussed inside one thread. 
Line 1 happens before line 2. 

195
00:09:03,760 --> 00:09:05,080
That's the easy one, the one 
that

196
00:09:05,080 --> 00:09:07,360
gives us the illusion of sanity.
Exactly. 

197
00:09:07,360 --> 00:09:08,920
The interesting ones are between
threads. 

198
00:09:09,080 --> 00:09:11,360
The classic mechanism is the 
monitor lock. 

199
00:09:11,440 --> 00:09:12,840
This is
when we use the synchronized

200
00:09:12,840 --> 00:09:15,560
keyword? Yes.
The rule is simple but powerful.

201
00:09:15,800 --> 00:09:19,760
An unlock on a monitor happens 
before every subsequent lock on 

202
00:09:19,760 --> 00:09:21,200
that same monitor. 
Let's

203
00:09:21,200 --> 00:09:23,040
apply the hallway analogy to
synchronized. 

204
00:09:23,040 --> 00:09:24,280
What's happening there? 
OK. 

205
00:09:24,560 --> 00:09:27,640
The monitor is like a specific 
pass key to the bulletin board. 

206
00:09:28,560 --> 00:09:31,760
When I finish my synchronized 
block, I release the lock. 

207
00:09:32,280 --> 00:09:35,520
The JMM rule forces me to take 
all my scribbles from my 

208
00:09:35,520 --> 00:09:38,880
notebook, everything I have done
up to that point, and pin them 

209
00:09:38,880 --> 00:09:40,520
to the board. 
I flush everything. 

210
00:09:40,520 --> 00:09:41,720
It's a full
sync. A full

211
00:09:41,720 --> 00:09:44,400
sync.
Now you come along and you want 

212
00:09:44,400 --> 00:09:47,840
to enter a synchronized block on
that same object. You have to

213
00:09:47,840 --> 00:09:49,120
acquire the lock. 
I have to get the

214
00:09:49,120 --> 00:09:51,760
pass key. And that
action of acquiring the lock

215
00:09:51,960 --> 00:09:54,960
forces you to throw away your 
stale notebook pages regarding 

216
00:09:54,960 --> 00:09:57,360
those variables and read fresh 
ones from the board.

217
00:09:57,920 --> 00:10:02,320
You invalidate your cache. 
So because I unlocked, which

218
00:10:02,320 --> 00:10:04,760
means I wrote to the board, and
then you locked, which means you

219
00:10:04,760 --> 00:10:07,760
read from the board, you are 
guaranteed to see all of my 

220
00:10:07,760 --> 00:10:09,280
changes. That makes
perfect sense.
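
The unlock-then-lock rule can be sketched as follows. This is a minimal sketch under stated assumptions: the class `SynchronizedHandoff`, its fields, and the sentinel value `-1` are hypothetical names chosen for illustration. Both fields are plain (not volatile); the reader sees the writer's values only because both sides go through the same monitor.

```java
public class SynchronizedHandoff {
    private final Object lock = new Object(); // the "pass key" (monitor)
    private int data;          // plain field, only touched while holding lock
    private boolean published; // plain field, only touched while holding lock

    void publish(int value) {
        synchronized (lock) {
            data = value;
            published = true;
        } // releasing the monitor happens-before any later acquire of it
    }

    // Returns the published value, or -1 if nothing has been published yet.
    int tryRead() {
        synchronized (lock) { // acquiring the same monitor makes the writes visible
            return published ? data : -1;
        }
    }

    public static int demo() {
        SynchronizedHandoff board = new SynchronizedHandoff();
        Thread writer = new Thread(() -> board.publish(42));
        writer.start();
        int seen;
        while ((seen = board.tryRead()) == -1) {
            Thread.yield(); // keep polling under the lock until the writer has run
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 42
    }
}
```

If the reader polled `published` without taking the lock, nothing in the memory model would oblige it to ever see the update.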

221
00:10:09,400 --> 00:10:12,640
That's why synchronized is safe.
It forces the walk to the 

222
00:10:12,640 --> 00:10:14,920
hallway. 
But synchronized is considered 

223
00:10:14,920 --> 00:10:17,080
heavy. 
It stops other threads, it 

224
00:10:17,080 --> 00:10:18,960
blocks. 
It can cause contention. 

225
00:10:18,960 --> 00:10:20,680
What if I don't want to stop the
world? 

226
00:10:20,760 --> 00:10:22,600
Then you look
at volatile. And this brings us

227
00:10:22,600 --> 00:10:24,920
to segment 3. 
This is possibly the most 

228
00:10:24,920 --> 00:10:28,240
misunderstood keyword in the 
entire Java language. I feel

229
00:10:28,240 --> 00:10:30,400
like I see volatile used
whenever someone is just 

230
00:10:30,400 --> 00:10:34,000
guessing. 
My multithreaded code is buggy.

231
00:10:34,200 --> 00:10:37,040
Let's throw volatile on some 
variables and see if it helps. 

232
00:10:37,160 --> 00:10:38,600
That is
voodoo programming.

233
00:10:39,080 --> 00:10:42,520
You're just poking things hoping
the magic works, but volatile is

234
00:10:42,520 --> 00:10:45,160
much more subtle. 
It is lighter weight than a lock

235
00:10:45,160 --> 00:10:46,880
because it doesn't cause a 
thread to block. 

236
00:10:47,400 --> 00:10:50,040
Thread B doesn't have to wait 
for thread A to finish a big 

237
00:10:50,040 --> 00:10:53,040
block of code, but it has very 
strict rules. 

238
00:10:53,040 --> 00:10:54,920
OK. 
What's the happens-before rule

239
00:10:54,920 --> 00:10:58,560
for volatile? The
rule is: a write to a volatile

240
00:10:58,560 --> 00:11:01,600
variable
happens-before any subsequent

241
00:11:01,600 --> 00:11:03,320
read of that same variable. 
OK, let's

242
00:11:03,320 --> 00:11:05,800
use the analogy again.
I have a volatile boolean called

243
00:11:05,800 --> 00:11:09,320
flag. When
I, in thread A, write flag = true,

244
00:11:09,320 --> 00:11:12,120
because flag is volatile, I am 
forced to go to the bulletin 

245
00:11:12,120 --> 00:11:15,080
board and update it immediately.
But here's the critical part, 

246
00:11:15,320 --> 00:11:18,640
the part most people miss. 
I don't just update flag, I 

247
00:11:18,640 --> 00:11:21,360
update everything I wrote before
that volatile write. Wait,

248
00:11:21,360 --> 00:11:24,920
everything? Even the normal
non-volatile variables? Yes.

249
00:11:25,520 --> 00:11:29,360
This is the piggybacking effect.
This is the aha moment for 

250
00:11:29,360 --> 00:11:32,400
understanding the JMM. 
Imagine I write to a normal

251
00:11:32,400 --> 00:11:34,280
non-volatile variable: data
equals 42.

252
00:11:34,920 --> 00:11:36,520
That's just a scribble in my 
notebook. 

253
00:11:36,840 --> 00:11:39,320
Then on the very next line I 
write to a volatile boolean 

254
00:11:39,320 --> 00:11:42,720
ready = true.
Because ready is volatile, I am 

255
00:11:42,720 --> 00:11:45,880
forced to go to the hallway, but
the rule says I have to flush my

256
00:11:45,880 --> 00:11:48,600
state. 
So I pin up the data = 42 page

257
00:11:48,720 --> 00:11:51,440
first and then I pin up the
ready = true page.

258
00:11:51,440 --> 00:11:53,360
So the
volatile variable acts as a

259
00:11:53,360 --> 00:11:55,400
sweeper. 
It pushes all the prior changes 

260
00:11:55,400 --> 00:11:57,280
along with it. Exactly.
It's a memory fence. 

261
00:11:57,280 --> 00:12:00,440
It says all writes prior to this
point must be made visible

262
00:12:00,440 --> 00:12:02,480
before this volatile write can
be made visible. 

263
00:12:02,760 --> 00:12:06,760
It's called a release fence.
Now on your side, you're in 

264
00:12:06,760 --> 00:12:08,880
thread B. 
You're in a loop checking the 

265
00:12:08,880 --> 00:12:11,120
value of ready. The moment you
read true,

266
00:12:11,120 --> 00:12:12,640
which is a volatile read. Which is a
volatile

267
00:12:12,640 --> 00:12:14,440
read. It acts as an acquire
fence. 

268
00:12:14,760 --> 00:12:17,960
The JMM guarantees that you not 
only see ready is true, but 

269
00:12:17,960 --> 00:12:21,680
you're also guaranteed to see 
data = 42. The normal variable

270
00:12:21,680 --> 00:12:24,120
piggybacked on the visibility of
the volatile one. That is

271
00:12:24,120 --> 00:12:27,440
incredibly powerful.
You can use one volatile flag to

272
00:12:27,440 --> 00:12:31,160
signal that a huge graph of 
objects is ready to be read 

273
00:12:31,360 --> 00:12:34,080
without having to make every 
single field in those objects 

274
00:12:34,120 --> 00:12:35,160
volatile. 
Correct. 
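
The data/ready example from the conversation can be sketched like this. It is a minimal sketch: the class name `VolatilePublish` and the method names are hypothetical, while `data` and `ready` follow the names used by the hosts. Only `ready` is volatile; `data` is a plain field that piggybacks on it.

```java
public class VolatilePublish {
    private int data;               // plain, non-volatile field
    private volatile boolean ready; // the volatile flag

    void writer() {
        data = 42;    // plain write, ordered before the volatile write below
        ready = true; // volatile write: publishes everything written before it
    }

    int reader() {
        while (!ready) {  // volatile read: spins until the flag is seen as true
            Thread.yield();
        }
        return data;      // guaranteed to be 42 once ready was read as true
    }

    public static int demo() {
        VolatilePublish p = new VolatilePublish();
        Thread w = new Thread(p::writer);
        w.start();
        return p.reader();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 42
    }
}
```

The order of the two writes matters: if `ready = true` came first, a reader could see the flag and still read the default value of `data`.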

275
00:12:35,480 --> 00:12:37,960
It is the basis of many lock 
free algorithms. 

276
00:12:38,520 --> 00:12:40,400
But here is where you have to be
so careful. 

277
00:12:40,680 --> 00:12:44,320
Because volatile guarantees 
visibility, people assume it 

278
00:12:44,320 --> 00:12:46,880
guarantees atomicity and it does
not. 

279
00:12:47,000 --> 00:12:48,360
OK. 
Let's define the difference 

280
00:12:48,360 --> 00:12:50,760
clearly. 
Synchronized provides atomicity.

281
00:12:50,760 --> 00:12:53,440
It's like locking the door to 
the office so only one person 

282
00:12:53,440 --> 00:12:55,680
can be in there touching the 
data at the same time. 

283
00:12:55,960 --> 00:12:57,560
Does volatile do that? 
Not at

284
00:12:57,560 --> 00:13:00,440
all. Volatile guarantees
we're all looking at the same

285
00:13:00,440 --> 00:13:02,560
page on the bulletin board. 
It does not guarantee that we 

286
00:13:02,560 --> 00:13:04,800
won't try to write on that page 
at the same time and overwrite 

287
00:13:04,800 --> 00:13:08,520
each other's handwriting. 
The classic interview trap is: is

288
00:13:08,520 --> 00:13:11,600
count++ thread-safe if
the integer count is declared 

289
00:13:11,600 --> 00:13:12,600
volatile? 
My gut

290
00:13:12,600 --> 00:13:15,720
reaction is to say yes, because
if I read it, I'm guaranteed to 

291
00:13:15,720 --> 00:13:18,720
see the very latest value 
written by another thread. And

292
00:13:18,720 --> 00:13:20,640
that's the trap everyone falls
into. 

293
00:13:21,200 --> 00:13:24,640
count++ looks like one
instruction, but under the hood 

294
00:13:24,720 --> 00:13:26,800
it is actually three separate 
operations. 

295
00:13:26,800 --> 00:13:30,520
One: read the current value of
count. Two:

296
00:13:31,040 --> 00:13:34,400
add 1 to that value in a CPU
register. Three:

297
00:13:34,560 --> 00:13:36,960
write the new value back to
count. A

298
00:13:36,960 --> 00:13:40,400
read-modify-write operation. OK, so
imagine count is five.

299
00:13:40,600 --> 00:13:43,320
I am thread A. I read count.
I see five. 

300
00:13:43,320 --> 00:13:47,640
You're thread B, you read count.
You also see 5 because we both 

301
00:13:47,640 --> 00:13:49,560
have perfect visibility thanks 
to volatile. Right,

302
00:13:49,560 --> 00:13:50,760
we're both
up to date. Now I

303
00:13:50,760 --> 00:13:52,960
add 1 to my local copy, I have
6. 

304
00:13:53,160 --> 00:13:56,000
At the same time you add 1 to 
your local copy, you have 6. 

305
00:13:56,200 --> 00:13:59,120
I write my 6 back to the count 
variable, it's now 6. 

306
00:13:59,320 --> 00:14:01,200
Then you write your 6 back to 
the count variable. 

307
00:14:01,280 --> 00:14:03,560
It's still 6. 
We both did the work but the 

308
00:14:03,560 --> 00:14:05,920
counter only went up by one. 
We lost an update because 

309
00:14:05,920 --> 00:14:07,600
volatile didn't
lock the door while we were

310
00:14:07,600 --> 00:14:09,880
doing the math, it just made 
sure we both saw the number 5 on

311
00:14:09,880 --> 00:14:11,720
the board before we went back to
our desks. 

312
00:14:11,720 --> 00:14:13,840
Precisely. 
Yeah, it just made sure we saw 

313
00:14:13,840 --> 00:14:17,120
the door was open. 
So volatile is fantastic for 

314
00:14:17,120 --> 00:14:19,880
status flags. 
Things like "I am done," "stop

315
00:14:19,880 --> 00:14:23,680
working," "configuration loaded":
simple booleans or indicators

316
00:14:23,680 --> 00:14:25,680
that are written by one thread 
and read by many. 

317
00:14:26,360 --> 00:14:29,880
It is not for counters and it is
not for any complex business 

318
00:14:29,880 --> 00:14:32,880
logic where the new value 
depends on the old value. 

319
00:14:33,040 --> 00:14:34,960
So if
I need to do count++

320
00:14:34,960 --> 00:14:37,600
safely,
I need to use an AtomicInteger

321
00:14:37,600 --> 00:14:40,880
or a lock? Yes.
AtomicInteger uses a

322
00:14:40,880 --> 00:14:44,560
special hardware instruction,
CAS, or compare-and-swap, which

323
00:14:44,560 --> 00:14:48,040
essentially does that whole 
read, modify, write cycle as one

324
00:14:48,120 --> 00:14:50,360
indivisible atomic hardware 
operation. 

325
00:14:50,360 --> 00:14:52,400
But that, as you said is a whole
other deep dive. 
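
The lost-update problem and the AtomicInteger fix can be compared side by side. This is a minimal sketch: the class `CounterDemo` and its field names are hypothetical, and the volatile counter's final value varies from run to run, so no specific number is claimed for it. The atomic counter always ends at exactly the number of increments performed.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class CounterDemo {
    private volatile int volatileCount; // visible to all threads, but ++ is not atomic
    private final AtomicInteger atomicCount = new AtomicInteger();

    // Two threads each perform incrementsPerThread increments on both counters.
    // Returns {finalVolatileCount, finalAtomicCount}.
    public static int[] run(int incrementsPerThread) {
        CounterDemo c = new CounterDemo();
        Runnable work = () -> {
            for (int i = 0; i < incrementsPerThread; i++) {
                c.volatileCount++;               // read, add, write: updates can be lost
                c.atomicCount.incrementAndGet(); // one indivisible read-modify-write
            }
        };
        Thread a = new Thread(work);
        Thread b = new Thread(work);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new int[] { c.volatileCount, c.atomicCount.get() };
    }

    public static void main(String[] args) {
        int[] r = run(100_000);
        System.out.println("volatile: " + r[0] + " (may be less than 200000)");
        System.out.println("atomic:   " + r[1]);
    }
}
```

A `synchronized` block around the increment would also be correct; the atomic class simply avoids blocking.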

326
00:14:52,440 --> 00:14:54,040
OK, I want
to pivot to the horror story we

327
00:14:54,040 --> 00:14:56,160
mentioned in the intro. This
seems like a good time for a 

328
00:14:56,160 --> 00:14:58,600
case study. 
Segment four: double-checked

329
00:14:58,600 --> 00:15:01,400
locking. 
This pattern looks so smart on 

330
00:15:01,400 --> 00:15:03,600
paper and yet it has caused so 
much grief. 

331
00:15:03,680 --> 00:15:05,480
It's the
siren song of concurrency.

332
00:15:05,880 --> 00:15:08,240
Every intermediate developer 
thinks they've invented it, and 

333
00:15:08,240 --> 00:15:11,080
every senior developer has to 
patiently explain why they need 

334
00:15:11,080 --> 00:15:12,280
to delete it. 
Let's set

335
00:15:12,280 --> 00:15:16,120
the scene. We have a singleton,
some big heavy object. 

336
00:15:16,480 --> 00:15:18,760
We don't want to create it when 
the application starts, we want 

337
00:15:18,760 --> 00:15:20,840
to create it lazily the first 
time someone asks for it. 

338
00:15:20,840 --> 00:15:23,360
Lazy
loading, a standard requirement.

339
00:15:23,360 --> 00:15:26,080
So
the simple, 100% correct, safe way

340
00:15:26,080 --> 00:15:28,760
is to just make the entire
getInstance method synchronized.

341
00:15:28,920 --> 00:15:33,040
You basically say lock the door,
check if the instance is null, 

342
00:15:33,160 --> 00:15:36,520
if yes, create it, unlock the 
door, return it. 

343
00:15:36,880 --> 00:15:39,360
That works
perfectly. It is 100% correct.

344
00:15:39,360 --> 00:15:42,320
You can ship that code and sleep
well at night. But...

345
00:15:42,720 --> 00:15:45,720
There's always a but. But
synchronization implies

346
00:15:45,720 --> 00:15:48,400
overhead. 
In early Java versions it was 

347
00:15:48,400 --> 00:15:52,720
quite slow and developers being 
clever thought why am I paying 

348
00:15:52,720 --> 00:15:55,280
the cost of locking every single
time I want to get this object? 

349
00:15:55,520 --> 00:15:58,000
I only need to lock once, the 
very first time I create it.

350
00:15:58,000 --> 00:15:59,680
After that it's just a read 
operation. 

351
00:15:59,960 --> 00:16:01,720
Reading shouldn't require a lock.
That

352
00:16:01,720 --> 00:16:03,520
logic sounds solid.
The lock is only for the

353
00:16:03,520 --> 00:16:05,960
creation path. So
they invented double-checked

354
00:16:05,960 --> 00:16:07,280
locking. 
It goes like this. 

355
00:16:07,480 --> 00:16:09,560
Check
if the instance is null without

356
00:16:09,560 --> 00:16:11,760
a lock. If it is not null, just
return it.

357
00:16:11,760 --> 00:16:14,720
That's the fast path.
If it is null, then and only

358
00:16:14,720 --> 00:16:16,360
then do you enter a synchronized
block. 

359
00:16:16,440 --> 00:16:20,720
Step two: inside the lock, you
check if it's still null. 

360
00:16:20,880 --> 00:16:23,920
You have to do this because 
another thread might have beaten

361
00:16:23,920 --> 00:16:26,400
you to the lock while you were 
waiting. To double-check. Exactly.

362
00:16:26,600 --> 00:16:29,560
And if it's still null,
step three: you create the

363
00:16:29,560 --> 00:16:32,760
object: instance =
new Singleton(). It sounds

364
00:16:32,760 --> 00:16:36,240
bulletproof. You're only locking
when absolutely necessary. But

365
00:16:36,240 --> 00:16:40,160
in Java versions before 5, which
was in 2004, and even today, if 

366
00:16:40,160 --> 00:16:43,000
you don't use volatile 
correctly, this code is lethal. 

367
00:16:43,120 --> 00:16:44,360
It's a
ticking time bomb.
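
The three steps just described can be sketched in Java. This is a minimal sketch, assuming Java 5 or later and assuming that the correct use of volatile the hosts allude to means declaring the instance field itself volatile; the class name `LazySingleton` is hypothetical. Without the `volatile` modifier on `instance`, this is exactly the broken pattern under discussion.

```java
public final class LazySingleton {
    // volatile is what makes this pattern safe on Java 5+ (assumption stated above);
    // remove it and a second thread may observe a non-null but unfinished object.
    private static volatile LazySingleton instance;

    private final long createdAtNanos;

    private LazySingleton() {
        createdAtNanos = System.nanoTime(); // stands in for expensive construction
    }

    public static LazySingleton getInstance() {
        if (instance == null) {                       // check 1: no lock, the fast path
            synchronized (LazySingleton.class) {
                if (instance == null) {               // check 2: inside the lock
                    instance = new LazySingleton();   // create and publish
                }
            }
        }
        return instance;
    }

    public long createdAtNanos() {
        return createdAtNanos;
    }

    public static void main(String[] args) {
        System.out.println(getInstance() == getInstance()); // prints true
    }
}
```

The simpler alternative mentioned earlier in the conversation, marking the whole `getInstance` method `synchronized`, is also correct and needs no volatile field.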

368
00:16:44,560 --> 00:16:46,760
Explain
the failure mode, because if I'm

369
00:16:46,760 --> 00:16:49,400
inside the synchronized block,
I'm the only one creating the

370
00:16:49,400 --> 00:16:50,800
object.
I feel safe. You are

371
00:16:50,800 --> 00:16:52,920
safe. Thread A, the creator, is
fine. 

372
00:16:53,200 --> 00:16:55,160
The victim is thread B, who comes
along

373
00:16:55,160 --> 00:16:57,760
a nanosecond later. Thread B
comes along.

374
00:16:57,760 --> 00:16:59,800
It hits that first
check, the one outside the lock.

375
00:17:00,240 --> 00:17:02,160
In
Java bytecode and in machine

376
00:17:02,160 --> 00:17:06,319
code, creating an object is not 
one atomic action, it's 

377
00:17:06,319 --> 00:17:10,839
effectively three high-level steps.
One: allocate memory for the

378
00:17:10,839 --> 00:17:14,960
object, grab some bytes from the
heap. Two: initialize the object,

379
00:17:15,200 --> 00:17:17,480
run the constructor code, set
the fields to their initial

380
00:17:17,480 --> 00:17:21,640
values. Three: point the instance
variable to that newly allocated

381
00:17:21,640 --> 00:17:22,599
memory address.
OK.

382
00:17:23,200 --> 00:17:27,480
Allocate, initialize, publish, in
that order: 1, 2, 3. But remember

383
00:17:27,480 --> 00:17:30,280
our old friend reordering. The
compiler and the CPU are

384
00:17:30,280 --> 00:17:32,400
perfectly allowed to swap steps
two and three

385
00:17:32,400 --> 00:17:34,720
if it's faster. They might say,
you know what, I'll point the 

386
00:17:34,720 --> 00:17:37,160
instance variable to the memory 
address first and then I'll run 

387
00:17:37,160 --> 00:17:39,480
the slow constructor. 
Wait, so there's a brief moment 

388
00:17:39,480 --> 00:17:41,600
where the instance variable is 
not null. 

389
00:17:41,960 --> 00:17:44,920
It points to a valid memory 
address, but the constructor 

390
00:17:45,000 --> 00:17:46,280
hasn't run yet. 
Precisely. 

391
00:17:46,480 --> 00:17:49,480
The memory it points to contains
default values, zeros for 

392
00:17:49,480 --> 00:17:51,680
integers, nulls for any object 
references. 

393
00:17:51,680 --> 00:17:53,960
It is a partially constructed 
object. 

394
00:17:53,960 --> 00:17:56,960
Now thread B hits that first 
check: is instance null?

395
00:17:57,360 --> 00:18:00,040
Well, step three happened. 
The reordered write to instance

396
00:18:00,040 --> 00:18:01,520
happened. 
So instance is pointing to a

397
00:18:01,520 --> 00:18:05,760
memory address, it is not null. 
Thread B says great, it's ready.

398
00:18:06,040 --> 00:18:09,080
It skips the lock entirely, 
grabs the object reference and 

399
00:18:09,080 --> 00:18:10,080
tries to use it. 
But the

400
00:18:10,080 --> 00:18:13,480
constructor hasn't run.
The fields aren't initialized.

401
00:18:13,480 --> 00:18:15,680
Exactly.
Thread B is holding a hollow 

402
00:18:15,680 --> 00:18:18,680
object, a shell. 
It tries to call a method on it,

403
00:18:18,880 --> 00:18:21,840
encounters a null field that
should have been initialized,

404
00:18:22,080 --> 00:18:24,560
and your program throws a
NullPointerException.

405
00:18:25,160 --> 00:18:28,560
Or worse, it calculates a 
financial transaction using 0 as

406
00:18:28,560 --> 00:18:30,320
the interest rate
because the rate field wasn't

407
00:18:30,320 --> 00:18:31,680
set yet. That is
insidious.

408
00:18:31,680 --> 00:18:34,000
The object exists, but it's a
ghost. We

409
00:18:34,000 --> 00:18:37,360
call it unsafe publication.
You publish the reference to the

410
00:18:37,360 --> 00:18:39,520
object before you finish 
building the object. 

411
00:18:39,560 --> 00:18:40,800
So how
does the fix work?

412
00:18:41,120 --> 00:18:44,320
The notes say you just add 
volatile to the instance 

413
00:18:44,320 --> 00:18:46,720
variable. 
How does one word fix all of 

414
00:18:46,720 --> 00:18:49,440
this? Right.
Remember the happens-before rule

415
00:18:49,440 --> 00:18:51,800
for volatile.
A volatile write creates a

416
00:18:51,800 --> 00:18:55,080
release fence. If the instance
field is declared volatile,

417
00:18:55,200 --> 00:18:58,080
the JMM forbids the reordering
of the write to instance with

418
00:18:58,080 --> 00:19:00,600
any of the operations that
came before it in program

419
00:19:00,600 --> 00:19:03,880
order, which includes the
constructor initialization. So

420
00:19:03,880 --> 00:19:06,920
volatile forces the order back
to 1, 2, 3:

421
00:19:07,240 --> 00:19:09,680
Allocate, initialize, then 
publish. 

422
00:19:09,920 --> 00:19:12,480
Exactly. 
It ensures that if thread B sees

423
00:19:12,480 --> 00:19:15,800
a non-null instance, it is
absolutely guaranteed to see the

424
00:19:15,800 --> 00:19:18,560
fully constructed, fully 
initialized object. 

425
00:19:18,880 --> 00:19:20,960
It cannot see the half baked 
version. 
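
A minimal sketch of the volatile fix discussed above, under the same assumptions as the description: the class name LazySingleton and the rate field are hypothetical. The local variable is a common refinement that keeps the fast path to a single volatile read; it is not something the episode specifies.

```java
// Sketch: double-checked locking made correct with volatile (Java 5+).
// Names are hypothetical.
final class LazySingleton {
    // volatile: the write to this field cannot be reordered ahead of the
    // constructor's writes, and a reader that sees non-null sees them too.
    private static volatile LazySingleton instance;

    private int rate;

    private LazySingleton() {
        rate = 5;
    }

    static LazySingleton getInstance() {
        LazySingleton local = instance;          // one volatile read on the fast path
        if (local == null) {
            synchronized (LazySingleton.class) {
                local = instance;                // second check, under the lock
                if (local == null) {
                    local = new LazySingleton(); // allocate + initialize
                    instance = local;            // volatile write: publish last
                }
            }
        }
        return local;
    }

    int rate() {
        return rate;
    }
}
```

Any thread that reads a non-null instance through the volatile field is guaranteed by the happens-before rule to see rate as 5, never the default 0.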

426
00:19:21,280 --> 00:19:22,960
It's
amazing that one keyword has

427
00:19:22,960 --> 00:19:26,960
such a profound effect on the 
CPU pipeline, but the expert 

428
00:19:26,960 --> 00:19:29,480
notes also mentioned that we 
probably shouldn't be writing 

429
00:19:29,480 --> 00:19:31,160
double-checked locking at all
anymore. 

430
00:19:31,200 --> 00:19:33,840
Yeah, it's
basically considered an

431
00:19:33,840 --> 00:19:35,880
anti-pattern now.
It's a great teaching tool for 

432
00:19:35,880 --> 00:19:38,200
the JMM, but in practice there 
are better ways. 

433
00:19:38,600 --> 00:19:41,240
If you want a lazy, thread-safe
Singleton, just use the

434
00:19:41,240 --> 00:19:44,040
initialization-on-demand holder
idiom. Which is

435
00:19:44,120 --> 00:19:45,560
using a static inner class,
right? 

436
00:19:45,560 --> 00:19:47,240
Yes. 
You create a private static 

437
00:19:47,240 --> 00:19:49,840
inner class that holds the 
static final instance. 

438
00:19:50,040 --> 00:19:52,040
You rely on the JVM's class 
loader. 

439
00:19:52,320 --> 00:19:55,560
The Java Language Specification 
guarantees that class initialization is

440
00:19:55,560 --> 00:19:58,000
thread-safe.
It handles all the locking and 

441
00:19:58,000 --> 00:19:59,960
synchronization for you behind 
the scenes. 

442
00:20:00,160 --> 00:20:03,640
It's cleaner, it's often faster,
and it is impossible to mess up.
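
The holder idiom described above can be sketched like this. The names HolderSingleton and Holder are hypothetical. Strictly speaking, the guarantee comes from class initialization (JLS 12.4), which runs at most once and is synchronized by the JVM.

```java
// Sketch: initialization-on-demand holder idiom. Names are hypothetical.
final class HolderSingleton {
    private HolderSingleton() {
    }

    // The nested class is not initialized until getInstance() first touches
    // Holder.INSTANCE, which gives laziness without any explicit lock.
    private static final class Holder {
        static final HolderSingleton INSTANCE = new HolderSingleton();
    }

    static HolderSingleton getInstance() {
        return Holder.INSTANCE;
    }
}
```

There is no volatile and no synchronized block to get wrong here; the JVM's own initialization lock does the work on first access, and later calls are plain static field reads.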

443
00:20:04,080 --> 00:20:05,240
So the
takeaway there is:

444
00:20:05,720 --> 00:20:08,400
don't try to be clever with
low-level locking unless you have a

445
00:20:08,400 --> 00:20:11,760
very, very good reason. Let the
JVM

446
00:20:11,760 --> 00:20:14,440
do the heavy lifting. 100%.
Speaking

447
00:20:14,440 --> 00:20:18,240
of letting the JVM help us,
let's talk about final fields. 

448
00:20:18,320 --> 00:20:21,880
This is segment 5. 
We usually think of final as "I

449
00:20:21,880 --> 00:20:25,560
can't reassign this variable,"
like a constant in other languages,

450
00:20:25,840 --> 00:20:29,320
but in the JMM it has a special
superpower regarding visibility.

451
00:20:29,440 --> 00:20:31,960
This
is one of the coolest and most

452
00:20:31,960 --> 00:20:35,520
useful parts of the spec. 
It's called freeze semantics. 

453
00:20:36,000 --> 00:20:38,320
We just talked about the 
nightmare of seeing a partially 

454
00:20:38,320 --> 00:20:41,080
constructed object. 
Well, final is the antidote. 

455
00:20:41,440 --> 00:20:45,680
The JMM gives a very strong
guarantee if a field is declared

456
00:20:45,680 --> 00:20:48,560
final and the object is
constructed correctly. Define

457
00:20:48,560 --> 00:20:51,840
"correctly." What can go wrong? It
means you didn't let the this

458
00:20:51,840 --> 00:20:53,760
reference escape during the 
constructor. 

459
00:20:53,760 --> 00:20:56,200
You didn't do something silly 
like passing this to another 

460
00:20:56,200 --> 00:20:59,200
object or adding this to a 
global list inside the 

461
00:20:59,200 --> 00:21:00,520
constructor code. 
Why is that

462
00:21:00,520 --> 00:21:03,080
bad? Because
if another thread can get a

463
00:21:03,080 --> 00:21:05,160
reference to your object while 
the constructor is still 

464
00:21:05,160 --> 00:21:07,880
running, then even the final 
fields might not be visible yet.

465
00:21:08,080 --> 00:21:10,040
The freeze happens at the end of
the constructor. 

466
00:21:10,200 --> 00:21:12,320
You let someone look at the 
object before it was frozen. 

467
00:21:12,440 --> 00:21:14,040
Got it. 
So as long as you don't leak 

468
00:21:14,040 --> 00:21:16,240
this, if you
construct it cleanly,

469
00:21:16,400 --> 00:21:21,040
then any thread that sees that
object at any time is guaranteed to see the

470
00:21:21,040 --> 00:21:24,760
correct initialized values of all
its final fields. So they

471
00:21:24,760 --> 00:21:28,200
can never see the default 0 or
null value for a final field?

472
00:21:28,200 --> 00:21:29,880
Never. 
Even if there is no 

473
00:21:29,880 --> 00:21:33,360
synchronization whatsoever. 
Even if there is a data race to 

474
00:21:33,360 --> 00:21:37,240
publish the object itself, the 
final values are guaranteed to 

475
00:21:37,240 --> 00:21:40,680
be visible and correct. 
They are frozen into the object.
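
As a small illustration of the final-field guarantee, here is a sketch of an immutable value class. The name InterestRate and its fields are hypothetical, chosen to echo the interest-rate example earlier in the episode. The constructor never hands this to anyone, which is the "constructed correctly" condition.

```java
// Sketch: an immutable class relying on final-field freeze semantics.
// Names are hypothetical.
final class InterestRate {
    private final int basisPoints;
    private final String currency;

    InterestRate(int basisPoints, String currency) {
        this.basisPoints = basisPoints;
        this.currency = currency;
        // 'this' is not passed to any other object or stored in any shared
        // structure here, so it does not escape before the constructor ends.
    }

    int basisPoints() {
        return basisPoints;
    }

    String currency() {
        return currency;
    }

    // "Changing" the value means building a new object; the old one is untouched.
    InterestRate withBasisPoints(int newBasisPoints) {
        return new InterestRate(newBasisPoints, currency);
    }
}
```

A thread that obtains a reference to an InterestRate, even through an unsynchronized field, sees the values assigned in the constructor for both final fields.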

476
00:21:40,680 --> 00:21:42,960
This is
why everyone says immutable

477
00:21:42,960 --> 00:21:44,920
objects are inherently
thread-safe?

478
00:21:45,040 --> 00:21:46,440
This
is the exact reason.

479
00:21:46,600 --> 00:21:49,360
If you make a class where every 
field is final and there are no 

480
00:21:49,360 --> 00:21:53,120
setters, you can pass instances 
of that class between threads 

481
00:21:53,240 --> 00:21:56,760
without any locks, without any 
volatile, without any worry at 

482
00:21:56,760 --> 00:21:58,840
all. 
It is impossible for another 

483
00:21:58,840 --> 00:22:01,440
thread to see it in an 
inconsistent state. That is a

484
00:22:01,440 --> 00:22:03,680
massive design tip.
If you're struggling with 

485
00:22:03,680 --> 00:22:06,640
concurrency bugs, maybe the 
answer there isn't more locks, 

486
00:22:06,840 --> 00:22:08,680
maybe it's to stop using mutable
state. 

487
00:22:08,680 --> 00:22:11,960
Just make everything final. It
solves 90% of the problems.

488
00:22:12,240 --> 00:22:15,080
You move the problem from
"How do I synchronize access to

489
00:22:15,080 --> 00:22:18,720
this changing variable?" to
"I'll just create a new immutable

490
00:22:18,720 --> 00:22:21,440
object with the new value."
It's the functional programming 

491
00:22:21,440 --> 00:22:23,920
approach and it works 
beautifully in Java for 

492
00:22:23,920 --> 00:22:25,240
concurrency. 
OK. 

493
00:22:25,240 --> 00:22:27,640
We have covered the safe 
recommended stuff. 

494
00:22:28,200 --> 00:22:32,800
Now we have to go to the danger 
zone. Segment 6: the modern era.

495
00:22:33,080 --> 00:22:35,560
For a long time we had 
synchronized and volatile and 

496
00:22:35,560 --> 00:22:38,720
then there was this hidden 
unofficial class called

497
00:22:38,720 --> 00:22:43,320
sun.misc.Unsafe.
Unsafe, the forbidden fruit.

498
00:22:43,440 --> 00:22:45,560
It gave you direct C style 
memory access. 

499
00:22:45,560 --> 00:22:47,480
You could allocate memory off 
the Java heap. 

500
00:22:47,480 --> 00:22:49,280
You could perform atomic 
operations. 

501
00:22:49,840 --> 00:22:52,280
It was incredibly fast, 
incredibly dangerous, and 

502
00:22:52,280 --> 00:22:53,800
technically unsupported. 
But

503
00:22:53,800 --> 00:22:56,240
everyone used it. Every
major high-performance library,

504
00:22:56,240 --> 00:22:59,000
Cassandra, Kafka, Netty, you
name it, used it because they

505
00:22:59,000 --> 00:23:02,400
needed that last ounce of speed.
And Java 9 finally said, OK, we

506
00:23:02,400 --> 00:23:04,440
can't stop you, so let's at
least give you a standard

507
00:23:04,440 --> 00:23:07,280
supported API, and that's where
VarHandle comes from.

508
00:23:07,520 --> 00:23:11,320
VarHandle is basically a safe
standard way to do the things 

509
00:23:11,320 --> 00:23:14,200
Unsafe used to do. 
It gives you extremely fine 

510
00:23:14,200 --> 00:23:16,920
grained control over the JMM 
memory barriers. 

511
00:23:17,480 --> 00:23:19,520
With volatile you have a heavy 
hammer. 

512
00:23:19,760 --> 00:23:23,000
You get perfect visibility and 
perfect ordering always, but 

513
00:23:23,000 --> 00:23:26,040
sometimes that's too expensive. 
VarHandle lets you choose your

514
00:23:26,040 --> 00:23:27,800
mode of memory
access. The

515
00:23:27,800 --> 00:23:32,160
notes list these modes.
We have Plain, Opaque,

516
00:23:32,160 --> 00:23:34,320
Acquire/Release, and Volatile.
Let's walk through them briefly 

517
00:23:34,320 --> 00:23:35,240
so we know what we're dealing 
with. 

518
00:23:35,440 --> 00:23:37,640
Plain mode? Plain
mode is the Wild West.

519
00:23:37,640 --> 00:23:40,360
It treats the variable as a 
normal Java variable. 

520
00:23:40,360 --> 00:23:42,760
No visibility guarantees. 
Reordering is allowed. 

521
00:23:43,120 --> 00:23:46,000
It's as fast as possible, but 
effectively useless for cross 

522
00:23:46,000 --> 00:23:48,320
thread coordination. 
It's for single-threaded work.

523
00:23:48,440 --> 00:23:50,040
Next
up: Opaque.

524
00:23:50,240 --> 00:23:51,960
That's a strange name.
Opaque

525
00:23:51,960 --> 00:23:53,440
is weird.
It basically just tells the 

526
00:23:53,440 --> 00:23:56,080
compiler do not delete this line
of code. 

527
00:23:56,440 --> 00:24:00,200
Sometimes the JIT optimizer is 
so smart it sees you write to a

528
00:24:00,200 --> 00:24:02,920
variable that nobody reads 
nearby and it just deletes the 

529
00:24:02,920 --> 00:24:04,600
instruction entirely to save 
time. 

530
00:24:05,200 --> 00:24:08,120
Opaque says no really I need you
to perform this memory access, 

531
00:24:08,280 --> 00:24:10,120
but it still doesn't guarantee 
other threads see it 

532
00:24:10,120 --> 00:24:12,560
immediately. 
It's mostly for ensuring 

533
00:24:12,560 --> 00:24:15,680
progress, making sure the code 
actually executes. Then we

534
00:24:15,680 --> 00:24:18,920
get to Acquire/Release.
This sounds like the

535
00:24:18,920 --> 00:24:21,400
piggybacking effect we talked
about earlier with volatile. It

536
00:24:21,400 --> 00:24:23,240
is
exactly that, but decoupled.

537
00:24:23,480 --> 00:24:25,920
This is the sweet spot for 
library writers. 

538
00:24:26,400 --> 00:24:29,000
A standard volatile variable is 
a two-way street.

539
00:24:29,280 --> 00:24:32,720
A volatile write is a release 
fence and a volatile read is an 

540
00:24:32,720 --> 00:24:35,200
acquire fence. 
Acquire/release lets you split

541
00:24:35,200 --> 00:24:37,720
them.
A setRelease says, "Make all my

542
00:24:37,720 --> 00:24:39,520
previous writes visible to other
threads,

543
00:24:39,520 --> 00:24:43,320
then write this value," and
getAcquire says, "Read this value and

544
00:24:43,320 --> 00:24:45,080
make sure I see all the writes
that happened before the

545
00:24:45,080 --> 00:24:46,600
corresponding release."
So if

546
00:24:46,600 --> 00:24:49,720
I only need one-way
coordination, like a producer 

547
00:24:49,720 --> 00:24:51,880
thread pushing data into a 
queue

548
00:24:51,880 --> 00:24:56,320
for a consumer thread, I can use
acquire/release and save the

549
00:24:56,320 --> 00:24:59,360
cost of the full volatile fence 
on one side of the operation. 
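
A rough sketch of that one-way handoff using the real VarHandle methods setRelease and getAcquire. The Mailbox class, its fields, and the -1 "not ready" sentinel are assumptions made for illustration; a real queue would be considerably more involved.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

// Sketch: one-shot producer/consumer handoff with release/acquire.
// Class and field names are hypothetical.
final class Mailbox {
    private int payload;    // plain field, published through the flag below
    private boolean ready;  // only accessed through the READY VarHandle

    private static final VarHandle READY;

    static {
        try {
            READY = MethodHandles.lookup()
                    .findVarHandle(Mailbox.class, "ready", boolean.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // Producer side: write the data, then release-store the flag.
    void publish(int value) {
        payload = value;
        READY.setRelease(this, true);
    }

    // Consumer side: acquire-load the flag; if it is set, the payload write
    // that preceded the release is visible. Returns -1 when nothing is ready.
    int tryTake() {
        if ((boolean) READY.getAcquire(this)) {
            return payload;
        }
        return -1;
    }
}
```

The design choice here is that only the flag carries ordering; the payload itself stays a plain field and rides along on the release/acquire pair.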

550
00:24:59,360 --> 00:25:02,400
Exactly. 
You are shaving off nanoseconds,

551
00:25:02,920 --> 00:25:05,720
but if you're writing something 
like the LMAX Disruptor or a

552
00:25:05,720 --> 00:25:09,360
high-frequency trading platform,
those nanoseconds add up to

553
00:25:09,360 --> 00:25:10,880
millions of dollars.
But the

554
00:25:10,880 --> 00:25:13,080
warning in the expert notes is
pretty stark. 

555
00:25:13,360 --> 00:25:16,960
It says these are sharp tools. 
Most application developers 

556
00:25:16,960 --> 00:25:19,640
should stick to volatile and
synchronized. Absolutely.

557
00:25:19,640 --> 00:25:22,600
The complexity explodes. 
You are now manually managing 

558
00:25:22,600 --> 00:25:25,280
memory barriers. 
It's incredibly easy to get 

559
00:25:25,280 --> 00:25:27,160
wrong. 
There was a case study in the 

560
00:25:27,160 --> 00:25:30,040
notes about a system where two 
worker threads try to claim an 

561
00:25:30,040 --> 00:25:33,480
e-mail to send it. 
If you use plain or opaque modes

562
00:25:33,480 --> 00:25:35,640
via VarHandles because you
think, "Oh, a little race

563
00:25:35,640 --> 00:25:37,400
condition won't hurt, it's a
benign race,"

564
00:25:37,680 --> 00:25:40,560
you end up in a disaster.
You end up with double sends. 

565
00:25:41,120 --> 00:25:43,840
Both workers think they claimed
the task because worker A's

566
00:25:43,840 --> 00:25:47,240
write to the claimed flag wasn't
visible to worker B in time. 
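
The episode does not say how the case study was fixed. One common approach, shown here purely as an assumption, is to make the claim a single atomic compare-and-set, so that exactly one worker can win. The EmailTask class is hypothetical.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Sketch: claiming a task atomically so it is sent at most once.
// EmailTask is a hypothetical name; the CAS-based fix is an assumption,
// not something taken from the case study itself.
final class EmailTask {
    private final AtomicBoolean claimed = new AtomicBoolean(false);

    // Returns true for exactly one caller: the one whose compare-and-set
    // flips the flag from false to true. Everyone else gets false.
    boolean tryClaim() {
        return claimed.compareAndSet(false, true);
    }
}
```

A worker would call tryClaim() and send the email only when it returns true. Checking the flag and then setting it as two separate steps would still allow double sends, whatever access mode the flag uses.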

567
00:25:47,480 --> 00:25:50,720
You have to really, really 
understand the hardware memory 

568
00:25:50,720 --> 00:25:53,560
model you're running on to use 
VarHandles correctly.

569
00:25:54,080 --> 00:25:57,320
If you aren't writing a library
that lives inside

570
00:25:57,320 --> 00:26:00,000
java.util.concurrent, you probably
don't need them. 

571
00:26:00,400 --> 00:26:01,000
That's
fair.

572
00:26:01,160 --> 00:26:03,920
It's like owning a blowtorch.
It's very useful for a 

573
00:26:03,920 --> 00:26:06,840
professional plumber, but 
extremely dangerous for the guy 

574
00:26:06,840 --> 00:26:08,480
who's just trying to make toast.
That is

575
00:26:08,480 --> 00:26:10,560
a perfect analogy.
Stick to the toaster. 

576
00:26:10,560 --> 00:26:14,600
Use synchronized and volatile. 
So we've been deep in the weeds.

577
00:26:14,600 --> 00:26:16,640
Let's surface for a moment and 
summarize. 

578
00:26:17,160 --> 00:26:19,760
If a listener is driving to work
right now and wants to retain 

579
00:26:19,760 --> 00:26:23,040
the core survival guide for the 
Java Memory Model, what are the 

580
00:26:23,040 --> 00:26:24,720
three commandments they need to 
remember? 

581
00:26:24,720 --> 00:26:27,120
OK. 
Commandment #1: the hardware is

582
00:26:27,120 --> 00:26:29,280
not your friend, the JMM spec 
is. 

583
00:26:29,520 --> 00:26:31,360
Don't rely on "it works on my
machine."

584
00:26:31,880 --> 00:26:36,080
Your laptop is likely an x86
processor from Intel or AMD.

585
00:26:36,080 --> 00:26:39,160
x86 has a strong memory model.
It acts nice. 

586
00:26:39,280 --> 00:26:41,600
It often keeps things in order 
even when the spec says it 

587
00:26:41,600 --> 00:26:42,520
doesn't have to. 
It's

588
00:26:42,520 --> 00:26:43,320
forgiving.
Very

589
00:26:43,320 --> 00:26:46,600
forgiving. But if you deploy
that code to an AWS Graviton 

590
00:26:46,600 --> 00:26:51,120
instance or a modern Mac which 
uses ARM chips, those have weak

591
00:26:51,120 --> 00:26:53,600
memory models. 
They reorder aggressively to get

592
00:26:53,600 --> 00:26:56,280
performance. 
Your code that worked perfectly 

593
00:26:56,280 --> 00:26:59,200
on your laptop for years will 
suddenly break in production. 

594
00:26:59,520 --> 00:27:03,080
You must code to the JMM 
contract, not the hardware 

595
00:27:03,080 --> 00:27:05,160
behavior you happen to observe. 
That's a big

596
00:27:05,160 --> 00:27:08,800
one. Testing on x86 does not
prove thread safety on ARM.

597
00:27:08,840 --> 00:27:12,680
Commandment #2: know your tools
and use them for their intended 

598
00:27:12,680 --> 00:27:15,360
purpose. 
Use volatile for simple status 

599
00:27:15,360 --> 00:27:18,320
flags like booleans. 
Use synchronized or atomic 

600
00:27:18,320 --> 00:27:21,160
classes for counters and 
operations where the new state 

601
00:27:21,160 --> 00:27:23,680
depends on the old state. 
Do not mix them up. 
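
A short sketch of that division of labor: a volatile boolean for a stop flag, and an AtomicInteger for a counter whose new value depends on the old one. The Worker class and the countWithThreads helper are hypothetical names used only for this illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch: volatile for a status flag, an atomic class for a counter.
// Names are hypothetical.
final class Worker {
    // A flag is a plain overwrite, so visibility is all that is needed.
    private volatile boolean running = true;

    // A counter is read-modify-write, so it needs atomicity as well.
    private final AtomicInteger processed = new AtomicInteger();

    void stop() {
        running = false;
    }

    boolean isRunning() {
        return running;
    }

    int recordProcessed() {
        return processed.incrementAndGet();
    }

    int processedCount() {
        return processed.get();
    }

    // Demo helper: several threads increment the same counter concurrently.
    static int countWithThreads(int threads, int perThread) {
        Worker w = new Worker();
        Thread[] pool = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            pool[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    w.recordProcessed();
                }
            });
            pool[i].start();
        }
        for (Thread t : pool) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return w.processedCount();
    }
}
```

Had the counter been a volatile int incremented with ++, concurrent increments could be lost, because the read, the add, and the write are three separate steps.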

602
00:27:24,120 --> 00:27:26,560
Volatile is not a lock. 
It will not protect you from 

603
00:27:26,560 --> 00:27:29,400
race conditions in
read-modify-write sequences. And the

604
00:27:29,400 --> 00:27:31,360
third?
Commandment #3: embrace

605
00:27:31,360 --> 00:27:33,800
immutability. 
Final fields are the cheat code 

606
00:27:33,800 --> 00:27:36,480
for safe concurrency. 
If you can design your objects 

607
00:27:36,480 --> 00:27:39,280
to be immutable, set all their 
values in the constructor, and 

608
00:27:39,280 --> 00:27:42,160
never change them, you bypass 
this entire headache of 

609
00:27:42,160 --> 00:27:45,400
visibility and ordering. 
The JMM does all the work for 

610
00:27:45,400 --> 00:27:47,240
you. Freeze
semantics for the win.

611
00:27:47,400 --> 00:27:50,720
This has been a fascinating look
into the engine room of the JVM.

612
00:27:50,960 --> 00:27:55,040
It's a bit unsettling to realize
that our human concept of time 

613
00:27:55,280 --> 00:27:57,680
is just a friendly suggestion to
a modern computer. 

614
00:27:57,920 --> 00:27:59,200
And that's
the final thought I'd want to

615
00:27:59,200 --> 00:28:02,400
leave people with. We humans are
obsessed with what's called 

616
00:28:02,400 --> 00:28:06,720
sequential consistency. 
We want a single global timeline

617
00:28:06,720 --> 00:28:10,640
for the universe. 
Event A happened at

618
00:28:10,640 --> 00:28:15,160
12:00:00.01.
Event B happened at

619
00:28:15,160 --> 00:28:19,480
12:00:00.02, and everyone agrees. Right,
a single source of truth. But

620
00:28:19,480 --> 00:28:21,080
for
hardware designers and compiler

621
00:28:21,080 --> 00:28:23,240
writers,
strict time, strict sequential

622
00:28:23,240 --> 00:28:26,360
consistency is death. 
It absolutely kills performance.

623
00:28:26,640 --> 00:28:28,840
The deeper you go into 
concurrent systems, the more you

624
00:28:28,840 --> 00:28:32,480
realize time in computers isn't 
a straight line, it's a graph of

625
00:28:32,480 --> 00:28:34,440
causal dependencies. 
The deeper you go into 

626
00:28:34,440 --> 00:28:37,160
concurrent systems, the more you
realize the question isn't "When

627
00:28:37,160 --> 00:28:38,920
did this happen?" What is it,
though? The question is,

628
00:28:38,920 --> 00:28:40,960
"Who is allowed to see that this
happened?"

629
00:28:41,040 --> 00:28:42,680
"Who is
allowed to see." That changes the

630
00:28:42,680 --> 00:28:44,560
whole perspective. 
It's not about clocks, it's 

631
00:28:44,560 --> 00:28:46,800
about permissions and visibility
between threads. 

632
00:28:47,360 --> 00:28:49,800
Thank you for guiding us through
the invisible chaos. 

633
00:28:49,800 --> 00:28:50,640
It was a
pleasure.

634
00:28:50,680 --> 00:28:54,440
It's a fun, if scary, topic. And
thank you to everyone listening.

635
00:28:54,840 --> 00:28:58,560
Hopefully the next time you type
synchronized or volatile, you'll

636
00:28:58,640 --> 00:29:00,560
picture that little notebook in 
the bulletin board in the 

637
00:29:00,560 --> 00:29:03,920
hallway and appreciate the 
incredible heavy lifting that 

638
00:29:03,920 --> 00:29:06,640
the JVM is doing to keep your
reality consistent. 

639
00:29:06,680 --> 00:29:08,800
This has been the deep dive. 
We'll see you next time.

