1
00:00:00,040 --> 00:00:04,040
As software engineers, only a 
fraction of your time is spent 

2
00:00:04,120 --> 00:00:05,920
coding. 
A lot of your time is spent 

3
00:00:05,920 --> 00:00:08,960
thinking, how should I lay out 
the interfaces? 

4
00:00:08,960 --> 00:00:12,040
How do I work with the other 
software within the company? 

5
00:00:12,280 --> 00:00:15,960
How do I make sure I've got 
really clear requirements for my

6
00:00:15,960 --> 00:00:18,640
code? 
Like, all of these things are 

7
00:00:18,640 --> 00:00:22,040
the really big problems that 
still humans have to wrestle 

8
00:00:22,040 --> 00:00:25,360
with, and I'm not seeing LLMs 
taking that away from us anytime

9
00:00:25,360 --> 00:00:34,040
soon, at least for now. 
Hey everyone, my name is Henry 

10
00:00:34,040 --> 00:00:38,160
Suryawan and you're 
listening to the Tech Lead 

11
00:00:38,160 --> 00:00:41,320
Journal Podcast, the show where 
I'll be bringing you the 

12
00:00:41,320 --> 00:00:44,480
greatest technical leaders, 
practitioners and thought 

13
00:00:44,480 --> 00:00:47,880
leaders in the industry to 
discuss about their journey, 

14
00:00:48,160 --> 00:00:52,640
ideas and practices that we all 
can learn and apply to build a 

15
00:00:52,640 --> 00:00:56,200
highly performing technical team
and to make an impact in your 

16
00:00:56,200 --> 00:00:59,440
personal work. 
So let's dive into our journal. 

17
00:01:04,480 --> 00:01:07,040
Hello Doctor Dan, Doctor Leo, 
good to see you in Tech Lead 

18
00:01:07,040 --> 00:01:08,920
Journal Podcast. 
Welcome to the show. 

19
00:01:09,640 --> 00:01:11,080
No, thank you for having us. 
We're excited to be here. 

20
00:01:11,720 --> 00:01:15,080
So I always love to start my 
conversation by asking my guests

21
00:01:15,080 --> 00:01:17,560
to actually share a little bit 
more about your career. 

22
00:01:17,720 --> 00:01:20,240
If you can maybe mention your 
highlights or turning points 

23
00:01:20,240 --> 00:01:22,280
that we all can learn from, that
will be great. 

24
00:01:22,960 --> 00:01:24,240
Thanks, Henry. 
I'd be happy to. 

25
00:01:24,640 --> 00:01:27,600
I think the beginning might not 
be too surprising. 

26
00:01:27,600 --> 00:01:32,400
So I started in university in 
computer science and I did my 

27
00:01:32,520 --> 00:01:35,240
undergrad degree in computer 
science. 

28
00:01:35,240 --> 00:01:37,960
And then I started a grad 
program. 

29
00:01:37,960 --> 00:01:41,480
So I started a master's program 
and I was doing something called

30
00:01:41,480 --> 00:01:44,160
formal methods. 
So like normally when people 

31
00:01:44,160 --> 00:01:46,960
want to get confidence in their 
programs, they run them with a 

32
00:01:46,960 --> 00:01:49,560
bunch of test cases. 
Like you know, as many test 

33
00:01:49,560 --> 00:01:52,560
cases as they can come up with. 
But formal methods is different.

34
00:01:52,560 --> 00:01:56,360
In formal methods, you try to prove
mathematically that the program 

35
00:01:56,360 --> 00:01:59,240
is correct. 
And I was pretty interested in 

36
00:01:59,240 --> 00:02:03,560
that, although I could kind of 
tell as I was working on it that

37
00:02:03,600 --> 00:02:05,440
it was just extremely difficult 
for me. 

38
00:02:05,440 --> 00:02:07,760
And I don't mean to say, oh, if 
you're, you know, if you're not 

39
00:02:07,760 --> 00:02:09,280
getting something right away, 
you should just give up. 

40
00:02:09,280 --> 00:02:11,320
Like that's not what I mean. 
But I mean, there were some 

41
00:02:11,320 --> 00:02:15,200
extremely impressive people in 
this field and I was happy to be

42
00:02:15,200 --> 00:02:17,240
part of it. 
But I also just sort of 

43
00:02:17,480 --> 00:02:20,880
realistically realized that 
I wasn't going to be able to 

44
00:02:20,880 --> 00:02:23,800
make a huge impact in that area.
But I was still having a great 

45
00:02:23,800 --> 00:02:26,040
time with it and I guess that's 
all that mattered to me. 

46
00:02:26,680 --> 00:02:31,520
But then my supervisor one day 
happened not to be able to teach.

47
00:02:31,520 --> 00:02:34,320
I think he wasn't feeling well. 
So he called me sort of last 

48
00:02:34,320 --> 00:02:36,720
minute and said, can you cover 
my class for me? 

49
00:02:36,720 --> 00:02:41,720
It was a compilers class and I 
was worried because I hadn't 

50
00:02:41,720 --> 00:02:44,840
taught a class before, but I, 
you know, I gave it a shot and 

51
00:02:44,920 --> 00:02:49,360
that was the turning point for 
me, going from this kind of 

52
00:02:49,360 --> 00:02:51,800
abstract research to education 
research. 

53
00:02:51,800 --> 00:02:54,360
So I taught that lecture and 
then it was over for me. 

54
00:02:54,360 --> 00:02:57,760
I was just, how can I start 
teaching more and studying 

55
00:02:57,760 --> 00:02:59,880
education. 
So it was a complete shift at 

56
00:02:59,880 --> 00:03:02,280
that point. 
I think to this day my 

57
00:03:02,280 --> 00:03:05,120
supervisor would probably say 
that like, his biggest mistake 

58
00:03:05,120 --> 00:03:08,720
was not getting himself to class
that day because we were 

59
00:03:08,720 --> 00:03:11,880
planning on working together 
further in that area, and it 

60
00:03:11,880 --> 00:03:15,800
didn't happen. 
And then maybe in 2010 or 2011 

61
00:03:15,800 --> 00:03:21,080
or so, another big career change
happened, which is I met Leo at 

62
00:03:21,080 --> 00:03:24,320
a conference. We were both at
an education conference and just

63
00:03:24,320 --> 00:03:27,600
sort of met up there and 
immediately just had a lot in 

64
00:03:27,600 --> 00:03:32,160
common, you know, not just about
our research but just hobbies 

65
00:03:32,160 --> 00:03:35,720
and sports and video games and I
think just like worldview. 

66
00:03:35,720 --> 00:03:40,080
And so we connected immediately 
and just we've worked on dozens 

67
00:03:40,080 --> 00:03:43,800
of papers since then and our
book most recently. 

68
00:03:43,800 --> 00:03:48,240
And it's a career highlight. 
It's a real honor to be working 

69
00:03:48,240 --> 00:03:50,240
with him in so many 
capacities. 

70
00:03:50,680 --> 00:03:53,720
So I have a bit of a 
non-traditional career path in that 

71
00:03:53,720 --> 00:03:56,120
I did my undergrad in computer 
science. 

72
00:03:56,120 --> 00:03:58,240
I actually switched into 
computer science as a major and 

73
00:03:58,240 --> 00:04:00,600
then I did 4 years as an officer
in the United States Navy. 

74
00:04:00,640 --> 00:04:02,760
I was a navigator on a guided 
missile destroyer. 

75
00:04:03,160 --> 00:04:07,520
And so a lot of my lessons about
leadership and building teams 

76
00:04:07,520 --> 00:04:09,080
and ethics actually come from my
time there. 

77
00:04:09,320 --> 00:04:11,560
After I finished my time in the 
Navy, I went back for a PhD in 

78
00:04:11,560 --> 00:04:16,240
computer architecture, did lots 
of processor design, very 

79
00:04:16,240 --> 00:04:18,920
similar to kind of how Dan's 
describing formal methods. 

80
00:04:18,920 --> 00:04:21,640
I enjoyed the work, but it was 
really when I started teaching 

81
00:04:21,640 --> 00:04:23,640
and being in the classroom that
I got the most excited. 

82
00:04:24,080 --> 00:04:27,000
Right towards the end of my PhD,
I started shifting into 

83
00:04:27,000 --> 00:04:28,360
computer science education 
research. 

84
00:04:28,360 --> 00:04:30,840
And it started really with a 
colleague, Simon, kind of 

85
00:04:30,840 --> 00:04:34,000
introducing me to Dan. 
And as Dan kind of pointed out, 

86
00:04:34,000 --> 00:04:37,920
it was just incredibly 
fortuitous for me because I didn't 

87
00:04:37,920 --> 00:04:40,320
know the methods of computing 
education at the time. 

88
00:04:40,320 --> 00:04:44,280
Dan was in an education PhD and 
it was really through the two of

89
00:04:44,280 --> 00:04:47,880
us defining our research 
direction together that we did 

90
00:04:47,880 --> 00:04:50,640
tons of really productive work. 
And I'm super appreciative to 

91
00:04:50,640 --> 00:04:53,680
Dan for all we've done. 
I mean, Dan's been in terms of 

92
00:04:53,760 --> 00:04:56,320
the kind of big stuff we've done
in our careers. 

93
00:04:56,520 --> 00:04:59,760
We've investigated how effective
this pedagogy called peer 

94
00:04:59,760 --> 00:05:03,200
instruction is in computer 
science classes and done most of

95
00:05:03,200 --> 00:05:07,440
the main research on that topic,
at least up to like 2017 or so. 

96
00:05:07,440 --> 00:05:08,880
And then other people kind of 
took over the research from there. 

97
00:05:08,880 --> 00:05:13,160
We built this 
assessment of how people learn 

98
00:05:13,160 --> 00:05:15,960
basic data structures that's 
actually validated and is used 

99
00:05:15,960 --> 00:05:18,160
by the community, called the 
Basic Data Structures Inventory.

100
00:05:18,160 --> 00:05:20,440
The two of us used machine 
learning to predict which 

101
00:05:20,440 --> 00:05:22,520
students are likely to succeed 
or fail, and we were 

102
00:05:22,520 --> 00:05:25,040
pretty accurate about predicting
student success very early in 

103
00:05:25,040 --> 00:05:28,200
the quarter. 
And just in terms of our book, 

104
00:05:28,280 --> 00:05:32,840
that happened because I happened
to hear about some of these 

105
00:05:32,920 --> 00:05:36,760
generative AI tools like ChatGPT
and GitHub Copilot. 

106
00:05:36,760 --> 00:05:40,840
And I tried them out, and I 
immediately was worried about 

107
00:05:40,840 --> 00:05:43,240
our introductory programming 
courses. 

108
00:05:43,280 --> 00:05:46,320
And I thought, everybody's going
to be panicking. 

109
00:05:46,640 --> 00:05:48,080
It did end up happening like 
that. 

110
00:05:48,080 --> 00:05:50,080
Everybody was panicking, trying 
to figure out what to do. 

111
00:05:50,520 --> 00:05:54,480
And it's like I had gone on a 
call with Leo and I said, Leo,

112
00:05:54,680 --> 00:05:56,840
there are these tools. 
We have to look at these. 

113
00:05:57,080 --> 00:05:58,760
Maybe we should write a book. 
Somebody has to. 

114
00:05:59,080 --> 00:06:03,880
And it didn't take Leo very 
long, maybe 15 minutes or so of 

115
00:06:03,880 --> 00:06:06,040
playing with the tools himself 
before he agreed. 

116
00:06:06,040 --> 00:06:08,400
He was like, yeah, I mean, 
somebody's got to do this for 

117
00:06:08,400 --> 00:06:13,640
the community, just to give some
direction, some perspective on 

118
00:06:13,640 --> 00:06:17,080
what was happening. 
And then, as Dan said, he 

119
00:06:17,080 --> 00:06:20,400
said, hey, I started playing 
with these LLM things and they 

120
00:06:20,400 --> 00:06:23,120
are terrifying for trying to 
teach programming because 

121
00:06:23,120 --> 00:06:26,160
they're solving essentially all 
the tasks we used to give them. 

122
00:06:26,480 --> 00:06:29,160
And so we're going to need to 
change how we teach introductory 

123
00:06:29,160 --> 00:06:31,480
programming. 
And as Dan said, it took me 

124
00:06:31,480 --> 00:06:34,240
about probably 15 minutes 
playing with this thing before I

125
00:06:34,240 --> 00:06:36,080
went, Oh my gosh, we're in 
serious trouble. 

126
00:06:36,400 --> 00:06:39,400
And we then sat down and started
figuring out how are we going to

127
00:06:39,400 --> 00:06:43,600
build a class that would adjust 
to the fact these amazing tools 

128
00:06:43,600 --> 00:06:45,840
are available. 
And then as we kept going with 

129
00:06:45,840 --> 00:06:47,840
that, we said, well, wait a 
second, we need a book to help 

130
00:06:47,840 --> 00:06:50,240
structure the class. 
And that's when Dan kind of 

131
00:06:50,240 --> 00:06:53,480
suckered me into writing a book.
And then we had a lot of fun

132
00:06:53,480 --> 00:06:55,440
writing it after that. 
So it was great. 

133
00:06:56,200 --> 00:06:57,520
Thank you for sharing your 
story. 

134
00:06:57,520 --> 00:07:00,560
I think one thing that I really 
picked up is like, how did you 

135
00:07:00,560 --> 00:07:03,560
start writing this book? 
Right, Learn AI-Assisted Python 

136
00:07:03,560 --> 00:07:05,640
Programming. 
I think you just thought about 

137
00:07:05,640 --> 00:07:07,520
it in 15 minutes. 
That's really cool. 

138
00:07:07,840 --> 00:07:10,680
So today we're gonna talk a 
little bit more about what 

139
00:07:10,680 --> 00:07:13,600
you've done in terms of doing 
the research and also from your 

140
00:07:13,600 --> 00:07:16,840
experience playing around with 
this LLM AI assistant. 

141
00:07:17,120 --> 00:07:20,160
And also maybe the impact to 
people learning about 

142
00:07:20,160 --> 00:07:23,080
programming. 
Hey, thank you for being part of

143
00:07:23,080 --> 00:07:26,400
the Tech Lead Journal community. 
This show wouldn't be the same 

144
00:07:26,400 --> 00:07:29,840
without your ears and you are 
the reason this show exists. 

145
00:07:30,600 --> 00:07:33,600
If you're loving TLJ and want to
see it keep on growing, 

146
00:07:34,000 --> 00:07:38,480
consider becoming a patron at 
techleadjournal.dev/patron or

147
00:07:38,480 --> 00:07:41,920
buying me a coffee at 
techleadjournal.dev/coffee. 

148
00:07:42,720 --> 00:07:46,560
Every little bit helps fuel the
research, editing, and sleepless

149
00:07:46,560 --> 00:07:49,560
nights that go into making this 
show the best it can be. 

150
00:07:50,360 --> 00:07:53,160
Thanks for being the best 
listeners any podcast could ask 

151
00:07:53,160 --> 00:07:55,280
for. 
And now let's get back to our 

152
00:07:55,280 --> 00:07:57,280
episode. 
So let's start probably in the 

153
00:07:57,280 --> 00:08:01,000
beginning to just level set our 
understanding what is actually 

154
00:08:01,000 --> 00:08:03,360
AI Assistant. 
You know some people heard about

155
00:08:03,360 --> 00:08:06,600
Copilot, but maybe if you can 
maybe describe what is AI 

156
00:08:06,600 --> 00:08:09,640
Assistant? 
So I guess I would define an AI 

157
00:08:09,640 --> 00:08:14,480
assistant as a piece of software
that helps you get work done 

158
00:08:14,800 --> 00:08:18,400
more efficiently. 
But the key to it is the way you

159
00:08:18,400 --> 00:08:21,640
communicate with it. 
So typically when people think 

160
00:08:21,640 --> 00:08:27,040
about using computers, they 
think you need to be very rigid 

161
00:08:27,040 --> 00:08:29,040
in how you communicate with a 
computer. 

162
00:08:29,440 --> 00:08:32,000
And that's what we would say 
when we teach programming 

163
00:08:32,000 --> 00:08:34,360
courses, right? 
Every symbol matters, every 

164
00:08:34,360 --> 00:08:37,200
space matters. 
Every key press that you make is

165
00:08:37,200 --> 00:08:39,120
important. 
And that's the language of 

166
00:08:39,120 --> 00:08:41,240
computing, right? 
It's very precise. 

167
00:08:41,640 --> 00:08:47,440
What an AI assistant allows you to 
do is communicate in English or 

168
00:08:47,440 --> 00:08:50,560
any natural language. 
And there are way more people in

169
00:08:50,560 --> 00:08:52,840
the world who know natural 
languages compared to 

170
00:08:52,840 --> 00:08:56,280
programming languages. 
And so the hope and the goal is 

171
00:08:56,280 --> 00:09:00,360
that people will be able to use their
own language and have the 

172
00:09:00,360 --> 00:09:04,640
computer translate that, because
computers still can't run or do 

173
00:09:04,640 --> 00:09:07,960
anything with these languages, 
like English for example. 

174
00:09:07,960 --> 00:09:10,880
It's the only one I know, 
so I keep talking about it. But 

175
00:09:11,080 --> 00:09:13,640
still it has to be translated 
into something a computer can 

176
00:09:13,640 --> 00:09:16,840
work with. 
That's what we're trying to 

177
00:09:16,920 --> 00:09:19,000
automate with these AI 
assistants. 

178
00:09:19,160 --> 00:09:22,320
It's like making it so that 
people can communicate in their 

179
00:09:22,320 --> 00:09:26,040
language and have it 
automatically translated over to

180
00:09:26,040 --> 00:09:29,360
lower level stuff that the 
computer understands. 

181
00:09:29,960 --> 00:09:33,640
And so I would expand and just 
say that what's great about 

182
00:09:33,800 --> 00:09:37,560
these AI assistants is it's 
almost a step in the natural 

183
00:09:37,560 --> 00:09:41,160
progression of making 
programming and interacting with

184
00:09:41,160 --> 00:09:46,200
computers easier for humans. 
And so we've seen this evolution

185
00:09:46,400 --> 00:09:50,520
from having to write assembly 
code or actually before that you

186
00:09:50,520 --> 00:09:52,720
had to push buttons on a 
machine to make it do things. 

187
00:09:53,040 --> 00:09:57,120
And then writing assembly code 
was a huge improvement with the 

188
00:09:57,120 --> 00:10:00,120
stored program computer. 
And then we moved to the point 

189
00:10:00,120 --> 00:10:02,840
that we could start writing in 
higher level languages that were

190
00:10:02,840 --> 00:10:05,640
more English readable than 
assembly, compile down to 

191
00:10:05,640 --> 00:10:07,720
assembly and actually then the 
assembly ran. 

192
00:10:08,000 --> 00:10:10,640
And then over time, we've 
developed more and more advanced

193
00:10:10,640 --> 00:10:16,240
languages that become easier to 
express our goals in the 

194
00:10:16,240 --> 00:10:20,360
language available to us. 
Now, what's unclear to us is 

195
00:10:20,360 --> 00:10:24,360
whether or not these LLMs are 
going to be just the next 

196
00:10:24,360 --> 00:10:27,680
language in which we interact. 
Right now you can interact with 

197
00:10:27,680 --> 00:10:31,600
them and get working code fairly
often, but it's not always 

198
00:10:31,600 --> 00:10:34,360
correct, and so it's not quite 
the same as a compiler which is 

199
00:10:34,360 --> 00:10:35,680
deterministically going to be 
correct. 

200
00:10:36,080 --> 00:10:39,320
But it does seem like the next 
step in a really nice evolution 

201
00:10:39,600 --> 00:10:42,800
of making it easier and more 
accessible to write software, 

202
00:10:43,480 --> 00:10:45,000
right? 
So I think that's a very 

203
00:10:45,000 --> 00:10:47,640
interesting thing. 
I've not played around with all 

204
00:10:47,640 --> 00:10:50,480
these copilot tools a lot, I 
mean in my day-to-day role 

205
00:10:50,480 --> 00:10:53,920
because of the nature of my job.
But one thing that I think is very 

206
00:10:53,920 --> 00:10:56,920
interesting when I heard 
developers using it right, it's 

207
00:10:56,920 --> 00:10:59,360
like it seems to improve their 
productivity. 

208
00:10:59,680 --> 00:11:01,600
So just like what you said, 
right, it could be the next 

209
00:11:01,600 --> 00:11:03,720
evolution of how we write 
software. 

210
00:11:04,040 --> 00:11:08,160
And I think in your lecturing 
role as well, I think it will be

211
00:11:08,200 --> 00:11:11,200
different now that you teach 
programming to new students, 

212
00:11:11,440 --> 00:11:14,600
maybe a little bit, you know, 
how do you find the difference 

213
00:11:14,600 --> 00:11:18,080
now that there is this LLM AI 
assistant and students now 

214
00:11:18,080 --> 00:11:21,960
learning programming, does that 
become much easier or does it 

215
00:11:21,960 --> 00:11:24,920
actually make it harder? 
Is there any kind of a stark 

216
00:11:24,920 --> 00:11:28,120
difference that you can tell? 
Oh geez, this is kind of a long 

217
00:11:28,120 --> 00:11:32,040
answer, so I'll give it a go. 
So we've been experimenting with

218
00:11:32,200 --> 00:11:35,880
how do you teach new learners 
how to program in the presence 

219
00:11:36,120 --> 00:11:39,000
of LLMs? 
And I'll be upfront that we 

220
00:11:39,000 --> 00:11:41,280
don't have all the answers yet 
by any means. 

221
00:11:41,280 --> 00:11:44,560
I think that's going to be 
probably a decade before we 

222
00:11:44,560 --> 00:11:48,640
actually really know from the 
research what the best way is to

223
00:11:48,760 --> 00:11:50,480
teach students 
now that these new tools are 

224
00:11:50,480 --> 00:11:54,240
available, but I can say a few 
things, which is that students 

225
00:11:54,240 --> 00:11:56,560
very quickly recognize how 
powerful the tools are. 

226
00:11:56,880 --> 00:11:59,800
I actually demoed it in my 
class, the very first class, and

227
00:11:59,800 --> 00:12:02,240
they all were. 
I got a gasp from the crowd. 

228
00:12:02,520 --> 00:12:04,720
They couldn't believe that I was
just basically writing the code 

229
00:12:04,720 --> 00:12:07,560
for them. 
And so the question then is what

230
00:12:07,560 --> 00:12:10,040
are we teaching the students? 
What changes here? 

231
00:12:10,440 --> 00:12:13,160
And I think there's still a lot 
to teach students. 

232
00:12:13,200 --> 00:12:16,160
It's just the scoping has 
shifted and what the skills are 

233
00:12:16,160 --> 00:12:18,520
has shifted. 
We also have to decide what our goal is.

234
00:12:18,520 --> 00:12:22,080
Are we training the next 
software engineers or are we 

235
00:12:22,080 --> 00:12:25,360
training someone who works in 
business or data science or 

236
00:12:25,360 --> 00:12:28,560
accounting to be able to write 
software that does something 

237
00:12:28,560 --> 00:12:30,360
useful for them. 
And I think those are actually 

238
00:12:30,360 --> 00:12:32,800
slightly different audiences in 
terms of what we want to teach. 

239
00:12:33,320 --> 00:12:37,880
What we've done is we've kind of
realized that the skills that 

240
00:12:37,880 --> 00:12:41,360
you need to interact with an LLM
are actually fundamental 

241
00:12:41,360 --> 00:12:43,200
programming and software 
development skills. 

242
00:12:43,640 --> 00:12:47,920
So the general workflow in 
working with an LLM is you give 

243
00:12:47,920 --> 00:12:51,000
it some, you have some desired 
function that you want to write,

244
00:12:51,000 --> 00:12:55,160
you know, you want to accomplish 
some small task, and you then 

245
00:12:55,160 --> 00:12:58,200
describe that task. 
And then the LLM is going to 

246
00:12:58,200 --> 00:13:00,760
generate code for you. 
Now the code it's going to 

247
00:13:00,760 --> 00:13:03,200
generate may not be right. 
It may not even be close to kind

248
00:13:03,200 --> 00:13:07,040
of addressing what you want. 
If you read through it, you can 

249
00:13:07,040 --> 00:13:09,120
quickly recognize this isn't 
what I want. 

250
00:13:09,400 --> 00:13:11,400
And then you can pull up 
basically a window of saying, 

251
00:13:11,520 --> 00:13:13,160
are there other solutions that 
are good for me? 

252
00:13:13,720 --> 00:13:17,000
And so already students need 
to know how to read code, 

253
00:13:17,400 --> 00:13:19,920
understand what it's doing, 
potentially be able to trace 

254
00:13:19,920 --> 00:13:23,320
code, and be able to pick from 
multiple code examples which 

255
00:13:23,320 --> 00:13:26,320
one's going to work. 
And then the next step for them,

256
00:13:26,320 --> 00:13:29,560
once they've kind of picked one 
they think works, is to write

257
00:13:29,560 --> 00:13:32,840
tests because you can't trust the
LLM. 

258
00:13:32,840 --> 00:13:35,360
And this is actually a point 
that I think is a really 

259
00:13:35,360 --> 00:13:39,400
encouraging piece for new 
students is testing has been a 

260
00:13:39,400 --> 00:13:40,800
point where new students 
struggle. 

261
00:13:41,080 --> 00:13:44,040
They tend to write code and by 
definition assume it's right, 

262
00:13:44,280 --> 00:13:46,160
which is basically the opposite 
of what we all do as software 

263
00:13:46,160 --> 00:13:47,520
engineers, right? 
You write the code and you 

264
00:13:47,800 --> 00:13:50,160
gather tons of evidence for it 
being correct before you have 

265
00:13:50,160 --> 00:13:52,760
any faith in it. 
And so the students, I think 

266
00:13:52,760 --> 00:13:55,120
because it's coming from a 
machine that they know 

267
00:13:55,120 --> 00:13:57,680
makes mistakes, they're actually
more willing to test. 

268
00:13:57,920 --> 00:13:59,080
And so that's a research 
question. 

269
00:13:59,080 --> 00:14:00,760
We haven't. 
I don't have the data to support

270
00:14:00,760 --> 00:14:02,800
that yet, but I have suspicions 
that they're more willing to 

271
00:14:02,800 --> 00:14:05,320
test when it's coming from a 
tool that they know can make 

272
00:14:05,320 --> 00:14:07,200
mistakes. 
So then you test the code. 

273
00:14:07,200 --> 00:14:08,920
Writing good tests is super 
important. 

274
00:14:08,920 --> 00:14:10,720
And something that we haven't 
taught as well as we should have

275
00:14:10,880 --> 00:14:13,680
candidly in the past. 
And then once they've tested it,

276
00:14:13,680 --> 00:14:15,760
now they know that that piece of
code is working. 

277
00:14:16,280 --> 00:14:18,880
However, the one catch is, 
sometimes the code doesn't work 

278
00:14:18,880 --> 00:14:21,840
and sometimes you can't get the 
LLM to give you the exact right 

279
00:14:21,840 --> 00:14:23,600
answer. 
And so there's still this last 

280
00:14:23,600 --> 00:14:25,680
step of being able to debug. 
And so you have to teach 

281
00:14:25,680 --> 00:14:30,520
explicitly how do you modify 
code that's slightly buggy to do

282
00:14:30,520 --> 00:14:32,560
what you want. 
So we're still teaching 

283
00:14:32,560 --> 00:14:35,400
debugging skills, and that also 
is a fundamental skill, right? 

284
00:14:35,400 --> 00:14:37,640
So, so far your listeners are 
probably saying, well, what's 

285
00:14:37,640 --> 00:14:40,600
different here? 
But there is a shift onto 

286
00:14:40,880 --> 00:14:46,240
reading and modifying and 
testing code away from looking 

287
00:14:46,240 --> 00:14:48,840
at a blank screen and writing 
code from scratch, which is what

288
00:14:48,840 --> 00:14:51,960
we used to do the most of. 
Yeah, and Leo, maybe I I could 

289
00:14:51,960 --> 00:14:55,200
just say this may not be obvious
to people who have been 

290
00:14:55,240 --> 00:14:59,760
programming for a while, but the
syntax of a programming language

291
00:15:00,040 --> 00:15:05,000
is something that takes some 
students weeks and some students

292
00:15:05,000 --> 00:15:08,760
just don't get past it. 
It's extremely stressful for 

293
00:15:08,760 --> 00:15:12,600
students, and also artificial, 
right? 

294
00:15:12,600 --> 00:15:14,760
Like the only reason we need all
this syntax. 

295
00:15:15,000 --> 00:15:17,320
And you know, maybe for this 
nice little bit of discussion, 

296
00:15:17,320 --> 00:15:20,640
people can think about some 
horrendous syntax, like the way 

297
00:15:20,640 --> 00:15:24,240
that like C function pointers 
are defined or something like that. 

298
00:15:24,240 --> 00:15:26,760
The only reason it's like that 
is because the compiler has to 

299
00:15:26,760 --> 00:15:31,040
be able to unambiguously 
understand what your goal is. 

300
00:15:31,400 --> 00:15:36,000
And if you're getting stuck on 
syntax... Some of us do program 

301
00:15:36,000 --> 00:15:39,040
for the sake of programming. 
I get that a lot of people 

302
00:15:39,040 --> 00:15:41,520
probably listening to this 
podcast just love programming, 

303
00:15:41,760 --> 00:15:44,720
present company included. 
But there are many people who 

304
00:15:44,720 --> 00:15:48,800
just want to do something with 
the code, right? 

305
00:15:49,080 --> 00:15:52,400
For example, many of us may not 
care how our appliances work. 

306
00:15:52,640 --> 00:15:55,680
Like, if my microwave makes 
my popcorn for me, I'm happy. 

307
00:15:56,000 --> 00:15:59,000
And that's how some people are 
with code, right? 

308
00:15:59,000 --> 00:16:03,120
They don't need to know what 
exactly each piece of syntax is 

309
00:16:03,120 --> 00:16:05,600
doing. 
What's super exciting for me and

310
00:16:05,600 --> 00:16:11,240
Leo about LLMs is I think for 
the first time we see a future 

311
00:16:11,240 --> 00:16:15,520
where people who don't know how 
to program could be afforded 

312
00:16:15,520 --> 00:16:18,120
some of the same benefits as 
people who do. 

313
00:16:18,960 --> 00:16:22,120
Thanks for the explanation, right,
of how it changes the dynamics 

314
00:16:22,120 --> 00:16:24,600
now for people to learn 
programming languages. 

315
00:16:24,840 --> 00:16:27,240
I think the skill set that you 
mentioned, you know, like still 

316
00:16:27,320 --> 00:16:30,120
people need to be able to read 
the code, need to be able to 

317
00:16:30,120 --> 00:16:33,320
test it, need to be able to 
debug it and maybe also need to 

318
00:16:33,320 --> 00:16:36,280
be able to express the task that
they want to solve, right? 

319
00:16:36,440 --> 00:16:39,160
Because an LLM cannot just solve 
everything in one go. 

320
00:16:39,320 --> 00:16:41,880
You will probably need to play 
around, do a little bit back and

321
00:16:41,880 --> 00:16:45,720
forth before you come to a 
perfect solution so to speak. 

322
00:16:46,240 --> 00:16:49,000
But one thing I think I want to 
highlight about this tool is 

323
00:16:49,000 --> 00:16:52,320
that there's a risk of not 
getting it right the first time,

324
00:16:52,320 --> 00:16:53,720
right? 
I mean, even if you ask the same

325
00:16:53,720 --> 00:16:56,480
questions, it might spit out 
different answers, right? 

326
00:16:56,640 --> 00:17:00,600
Maybe a little bit of the 
underlying why, why the tool 

327
00:17:00,600 --> 00:17:02,160
doesn't seem to be 
deterministic. 

328
00:17:02,280 --> 00:17:04,480
That's the first thing. 
And maybe a little bit about LLM

329
00:17:04,480 --> 00:17:06,440
how it works for people to 
understand. 

330
00:17:07,280 --> 00:17:10,160
So I really appreciate you 
pointing out that you can't just 

331
00:17:10,160 --> 00:17:12,359
give a large task to an LLM in 
one go. 

332
00:17:12,839 --> 00:17:16,000
And that's actually a super 
important skill that we're now 

333
00:17:16,000 --> 00:17:18,800
teaching in classes that 
candidly we did not in the past.

334
00:17:19,200 --> 00:17:21,960
So what we used to do in the 
past, and this is common across 

335
00:17:21,960 --> 00:17:24,640
pretty much all of computing 
education and I bet many of your

336
00:17:24,640 --> 00:17:28,400
listeners had this in their 
classes is students are given 

337
00:17:28,640 --> 00:17:31,240
essentially just a function. 
The function's basically 

338
00:17:31,240 --> 00:17:33,760
perfectly described because you 
need to be able to auto grade it

339
00:17:33,960 --> 00:17:37,560
and so every possible case is 
covered in that description, 

340
00:17:37,560 --> 00:17:39,920
right. 
And then they have to just fill 

341
00:17:39,920 --> 00:17:42,840
in the code for that function. 
Now, LLMs do that incredibly 

342
00:17:42,840 --> 00:17:45,680
well. 
And so the shift now is, if I 

343
00:17:45,680 --> 00:17:49,040
give a fairly vague task, like a
large project to work through, 

344
00:17:49,360 --> 00:17:53,080
how do they break apart that 
large project into smaller tasks

345
00:17:53,080 --> 00:17:55,080
that the LLM can then help them 
solve? 

346
00:17:55,440 --> 00:17:58,280
Now this for your audience and 
for all of us as software 

347
00:17:58,280 --> 00:18:00,120
engineers, this is problem 
decomposition. 

348
00:18:00,120 --> 00:18:03,360
This is probably one of the most
important skills we learn as 

349
00:18:03,360 --> 00:18:06,920
software engineers, and it used 
to be that we didn't teach that 

350
00:18:06,920 --> 00:18:10,040
to new students learning how to 
program until much later in 

351
00:18:10,040 --> 00:18:13,560
their careers, and now it's 
actually front and centre. 

352
00:18:13,800 --> 00:18:16,360
Incredibly important to learn in
your first programming class 

353
00:18:16,360 --> 00:18:18,800
because that's how you have to 
interact with these LLMs. 

354
00:18:19,160 --> 00:18:21,960
And so in my class this last 
fall, I had students doing 

355
00:18:21,960 --> 00:18:24,080
things that they would have 
never been able to do in a 

356
00:18:24,080 --> 00:18:27,080
previous CS1 class. 
I gave projects like find a data

357
00:18:27,080 --> 00:18:31,560
set on Kaggle, ask a question of 
the data, and then write the 

358
00:18:31,560 --> 00:18:33,280
software to answer that 
question. 

359
00:18:33,680 --> 00:18:36,360
That's way beyond the scope of 
what we'd ever asked in a CS1,

360
00:18:36,880 --> 00:18:39,840
especially given the scope of 
what many of my students did, 

361
00:18:40,080 --> 00:18:42,520
where they often did really nice
visualizations. 

362
00:18:42,520 --> 00:18:44,400
They pulled in really 
interesting data sets from the 

363
00:18:44,400 --> 00:18:47,120
domains that they cared about. 
Some of them had interactive 

364
00:18:47,120 --> 00:18:50,840
programs where you 
could interact with it and ask, 

365
00:18:50,840 --> 00:18:54,400
I want to see the relationship 
between there's a stroke data 

366
00:18:54,400 --> 00:18:57,000
set and you could say I want to 
see the relationship between age

367
00:18:57,000 --> 00:18:59,480
and stroke, and it would 
actually plot age against stroke

368
00:18:59,800 --> 00:19:02,320
in really clever ways. 
These are things again, you've 

369
00:19:02,320 --> 00:19:05,120
never seen in a CS1. 
It's really open-ended and the 

370
00:19:05,120 --> 00:19:08,040
students are doing all the 
problem decomposition on their 

371
00:19:08,040 --> 00:19:09,920
own. 
And so we're really excited 

372
00:19:09,920 --> 00:19:13,720
about teaching this skill. 
And if I'm real reflective, I'm 

373
00:19:13,720 --> 00:19:17,200
a bit disappointed that we as a 
community stopped prioritizing 

374
00:19:17,200 --> 00:19:19,880
that so early in the careers. 
It really should be a first-order

375
00:19:19,880 --> 00:19:22,080
priority. 
You mentioned about problem 

376
00:19:22,080 --> 00:19:24,200
decomposition, right? 
I think that's a really important 

377
00:19:24,200 --> 00:19:27,440
skill for any programmer, with 
or without an AI assistant, right? 

378
00:19:27,640 --> 00:19:30,640
It's like to be able to break 
down problems or even 

379
00:19:30,640 --> 00:19:35,000
requirements into small tasks, 
into modules, design classes and

380
00:19:35,000 --> 00:19:38,520
things like that and decompose 
that such that we can make 

381
00:19:38,520 --> 00:19:40,920
good software, right? 
That is maintainable rather than

382
00:19:40,920 --> 00:19:44,120
just one big function that does 
everything in one go. 

383
00:19:44,400 --> 00:19:47,440
So I think problem decomposition
definitely is very important and

384
00:19:47,440 --> 00:19:49,920
I want to come back to the 
question earlier about LLM 

385
00:19:49,920 --> 00:19:52,000
because it's nondeterministic 
so far. 

386
00:19:52,240 --> 00:19:56,800
Maybe you can explain why that 
is nondeterministic. How does

387
00:19:56,800 --> 00:19:59,280
it work underneath, so that 
people actually understand that?

388
00:19:59,520 --> 00:20:01,960
I mean, it's not going to 
replace programmers in one day, 

389
00:20:01,960 --> 00:20:03,880
right? 
Yeah, thanks, Henry. 

390
00:20:03,880 --> 00:20:07,000
It's a really important point 
and it's also counterintuitive 

391
00:20:07,400 --> 00:20:11,440
for a lot of computing people 
because, you know, we all joke 

392
00:20:11,440 --> 00:20:14,600
about shutting your computer 
off, turning it back on if the 

393
00:20:14,600 --> 00:20:16,080
thing you're trying to do 
doesn't work. 

394
00:20:16,080 --> 00:20:20,160
But I think many of us go under 
the assumption in our day-to-day

395
00:20:20,160 --> 00:20:24,320
computing lives that computers 
are deterministic. 

396
00:20:24,400 --> 00:20:27,400
If you do something, you're 
going to get the same response. 

397
00:20:27,760 --> 00:20:30,920
I mean, there are always exceptions 
to this, like race conditions and 

398
00:20:30,920 --> 00:20:33,280
stuff, but overall, that's how 
we feel about computing. 

399
00:20:33,280 --> 00:20:36,000
If you write a program and you 
run it, you're going to expect 

400
00:20:36,000 --> 00:20:38,120
that if you run it a second 
time, the same thing is going to

401
00:20:38,120 --> 00:20:41,160
happen. 
And if it doesn't, you probably 

402
00:20:41,160 --> 00:20:44,600
start thinking, oh, 
maybe I have a memory allocation

403
00:20:44,600 --> 00:20:48,080
bug or some sort of like 
transient behavior problem in my

404
00:20:48,080 --> 00:20:51,160
program. 
But Henry, as you mentioned, 

405
00:20:51,520 --> 00:20:53,880
LLMs are inherently 
nondeterministic. 

406
00:20:54,240 --> 00:20:59,480
So you ask for some code and you
get the code, and then you ask a

407
00:20:59,480 --> 00:21:01,640
second time and you'll get 
different code and you ask a 

408
00:21:01,640 --> 00:21:03,000
third time and you'll get 
different code. 

409
00:21:03,400 --> 00:21:06,840
And this is first of all kind of
challenging as an instructor. 

410
00:21:07,360 --> 00:21:10,040
We have reported recently that 
it makes it very difficult to 

411
00:21:10,040 --> 00:21:12,960
plan sometimes because 
typically, you know, our 

412
00:21:12,960 --> 00:21:16,000
lectures are sort of scripted in
some ways where we need to 

413
00:21:16,000 --> 00:21:18,800
demonstrate specific things in 
our lecture. 

414
00:21:19,040 --> 00:21:21,720
And it's very hard to do that 
when you don't know what the LLM

415
00:21:21,720 --> 00:21:24,400
is going to respond in real time
in class. 

416
00:21:24,760 --> 00:21:27,120
So there is that. 
But on the other hand there's 

417
00:21:27,120 --> 00:21:31,360
also a benefit, believe it or 
not, of being nondeterministic.

418
00:21:31,520 --> 00:21:34,760
And that is because these things
can make mistakes. 

419
00:21:35,160 --> 00:21:39,400
Imagine how frustrating it would
be if you asked it for some code

420
00:21:39,400 --> 00:21:41,560
and it gave you some code and 
it was wrong and then you were 

421
00:21:41,560 --> 00:21:43,720
like, well, OK, now what do I do?
Do I try it again? 

422
00:21:44,000 --> 00:21:46,600
And in this case you don't want 
it to be deterministic, right? 

423
00:21:46,600 --> 00:21:48,560
Because then you're just going 
to get the same wrong code every

424
00:21:48,560 --> 00:21:51,520
time. 
So the fact that it is 

425
00:21:51,520 --> 00:21:54,840
nondeterministic means that you 
have a chance even if the most 

426
00:21:54,840 --> 00:21:59,320
probable response is wrong. 
Maybe you can ask again, or look

427
00:21:59,320 --> 00:22:03,520
at maybe the top five or top ten
and maybe you could pick out the

428
00:22:03,920 --> 00:22:07,960
correct code from that list. 
And this is a skill, right? 

429
00:22:07,960 --> 00:22:11,320
This is a skill that our 
students or learners did not 

430
00:22:11,320 --> 00:22:13,800
need before, but they do now, 
right? 

431
00:22:13,800 --> 00:22:16,800
So because the first response 
may not be correct, students 

432
00:22:16,800 --> 00:22:20,920
have to know how to go through 
the list of potential solutions 

433
00:22:21,360 --> 00:22:26,240
and figure out which ones are 
perhaps not correct immediately,

434
00:22:26,240 --> 00:22:30,480
but which ones are worth further
testing, further consideration. 

435
00:22:30,920 --> 00:22:35,040
And so for that reason, Leo and 
I are very careful to continue 

436
00:22:35,040 --> 00:22:37,600
to teach the programming 
language. 

437
00:22:37,640 --> 00:22:44,560
So it's true, our book and Leo's
students are working with LLMs 

438
00:22:44,800 --> 00:22:47,800
throughout the course, but at 
the same time, we are still 

439
00:22:47,800 --> 00:22:53,240
teaching the students Python, 
because right now Python is the 

440
00:22:53,240 --> 00:22:56,480
language that we teach in the 
introductory computing courses, 

441
00:22:56,480 --> 00:22:59,280
and it's still an important part 
of the loop. 

442
00:22:59,680 --> 00:23:03,920
And so we're not yet at the 
point where you can give English

443
00:23:03,920 --> 00:23:08,240
or whatever natural language 
instructions to the LLM and get 

444
00:23:08,240 --> 00:23:11,600
back your language. 
You're still getting back Python

445
00:23:11,600 --> 00:23:16,680
code, and so learners still need
to understand and work with 

446
00:23:16,720 --> 00:23:20,160
Python, not at the syntax level.
Like we said, we're spending 

447
00:23:20,160 --> 00:23:24,760
less time on the low level 
syntax details, but they still 

448
00:23:24,760 --> 00:23:28,360
need to understand Python. And 
one of the reasons is so that 

449
00:23:28,560 --> 00:23:31,400
when this nondeterminism is 
happening, they can look at and 

450
00:23:31,400 --> 00:23:35,000
evaluate a bunch of different 
solutions for which ones may be 

451
00:23:35,000 --> 00:23:38,040
correct. 
So I think the interactions that

452
00:23:38,040 --> 00:23:40,640
you mentioned, right, asking 
back and forth, you know if you 

453
00:23:40,640 --> 00:23:44,400
got the first solution probably 
not quite right, you ask again 

454
00:23:44,400 --> 00:23:47,160
and you ask again back and forth
until you find the right 

455
00:23:47,160 --> 00:23:49,640
solution. 
I think this comes to the term 

456
00:23:49,640 --> 00:23:51,760
prompting, right? 
So I think many people would 

457
00:23:51,760 --> 00:23:53,840
have heard now prompt 
engineering is kind of like a 

458
00:23:53,840 --> 00:23:56,680
new job even. 
So tell us about this prompting,

459
00:23:56,680 --> 00:23:57,960
right. 
I think it's a skill set. 

460
00:23:57,960 --> 00:24:01,120
It's a new skill set that 
everyone needs to learn in order

461
00:24:01,120 --> 00:24:03,160
to get the best out of an AI 
assistant. 

462
00:24:03,560 --> 00:24:05,920
Maybe a little bit about 
prompting, like what do you feel

463
00:24:05,920 --> 00:24:09,680
about this new skill set? 
So prompt engineering is just 

464
00:24:09,680 --> 00:24:13,960
the task of writing a prompt in 
such a way that the LLM will 

465
00:24:13,960 --> 00:24:18,760
give you back a good response. 
And what is tricky there when 

466
00:24:18,760 --> 00:24:22,840
we're teaching people who don't 
know how to program to start, is

467
00:24:22,920 --> 00:24:25,920
the LLMs do very well if you 
describe problems in a technical

468
00:24:25,920 --> 00:24:29,120
language. 
So if you say I want this 

469
00:24:29,120 --> 00:24:32,120
function to find the largest 
value in this list, right? 

470
00:24:32,120 --> 00:24:34,440
Like that. 
I'm using terms that we know as 

471
00:24:34,440 --> 00:24:36,480
computer scientists, right? 
I want to make sure I describe 

472
00:24:36,480 --> 00:24:39,040
it as a list if it's in Python 
or as an array if it's in Java 

473
00:24:39,040 --> 00:24:42,240
or things like that, right? 
And I'm saying I want to find 

474
00:24:42,240 --> 00:24:44,240
the maximum value. 
So I'm specifically saying 

475
00:24:44,240 --> 00:24:47,120
exactly the behavior I want. 
I may even describe it even 

476
00:24:47,120 --> 00:24:50,600
better and say something like: 
write a function that returns 

477
00:24:50,600 --> 00:24:53,480
the largest value in the 
parameter list. 

478
00:24:53,760 --> 00:24:57,160
Now I really specify the 
behavior and the LLM does a much 

479
00:24:57,160 --> 00:24:59,160
better job of that. 
The problem is, I'm using 

480
00:24:59,160 --> 00:25:03,240
keywords that you have to teach 
learners, and so there is still 

481
00:25:03,240 --> 00:25:07,520
this task of teaching them how 
we would speak about these 

482
00:25:07,600 --> 00:25:10,640
functions. 
And I think, and I'm by no means an 

483
00:25:10,640 --> 00:25:14,000
expert on how LLMs work. 
However, they are reading from 

484
00:25:14,000 --> 00:25:16,520
large code bases and they're 
learning from those large code 

485
00:25:16,520 --> 00:25:18,360
bases. 
And so in a sense, what we're 

486
00:25:18,360 --> 00:25:22,080
trying to get them to do is just
give that function header that a

487
00:25:22,080 --> 00:25:24,200
human would have written to 
describe their function. 

488
00:25:24,560 --> 00:25:26,800
And if you can generate one 
that's very close to the behavior 

489
00:25:26,800 --> 00:25:29,920
you want, the model is going to 
find something very similar in 

490
00:25:29,920 --> 00:25:33,520
its training set and then generate 
code that's going to be paired 

491
00:25:33,520 --> 00:25:35,000
with that. 
The other thing you do with 

492
00:25:35,000 --> 00:25:37,080
prompts is there's a whole bunch
of ways in which you can 

493
00:25:37,400 --> 00:25:41,720
basically tell the LLM to behave 
in a particular way, and Dan has

494
00:25:41,720 --> 00:25:44,160
a lot more experience with that,
so I'll let him take it from 

495
00:25:44,160 --> 00:25:46,520
there. 
Yeah, we had a good time near 

496
00:25:46,520 --> 00:25:49,760
the end of our book, sort of 
going into these other prompt 

497
00:25:49,760 --> 00:25:54,280
interaction patterns. People 
have started to catalogue the 

498
00:25:54,280 --> 00:25:57,160
different ways of interacting 
with the LLM. 

499
00:25:57,560 --> 00:26:01,400
It kind of reminds me of the 
object oriented patterns, you 

500
00:26:01,400 --> 00:26:04,480
know Henry, like Observer 
pattern and Model-View-

501
00:26:04,480 --> 00:26:08,360
Controller pattern and Visitor 
pattern and all these patterns 

502
00:26:08,360 --> 00:26:11,960
that people have identified and 
they documented over the past 

503
00:26:12,200 --> 00:26:14,560
several decades. 
And people are starting to do 

504
00:26:14,560 --> 00:26:18,400
that now with LLMs. 
And I probably could have kept 

505
00:26:18,400 --> 00:26:20,360
going and going about it in our 
textbook, but I managed to 

506
00:26:20,360 --> 00:26:22,200
control myself and I only, I 
think I only talked about a 

507
00:26:22,200 --> 00:26:24,080
couple, but they're very 
interesting. 

508
00:26:24,360 --> 00:26:27,960
And so, for example, one of them
is what if you don't know what 

509
00:26:28,080 --> 00:26:32,400
information the LLM needs to 
perform your task? 

510
00:26:32,440 --> 00:26:34,840
And this links back to what Leo 
just said. 

511
00:26:34,840 --> 00:26:37,760
You have to be very precise 
sometimes in your natural 

512
00:26:37,760 --> 00:26:40,840
language, hopefully not as 
precise as you do with 

513
00:26:40,840 --> 00:26:42,320
programming. 
But like Leo said, you would 

514
00:26:42,320 --> 00:26:46,040
still need to know a lot of the 
terminology that you might not 

515
00:26:46,040 --> 00:26:48,920
know. 
And so one thing you can do is 

516
00:26:49,000 --> 00:26:52,160
you can use this flipped 
interaction pattern where you 

517
00:26:52,160 --> 00:26:58,000
ask the LLM to ask you questions.
I think the example we have in 

518
00:26:58,000 --> 00:27:02,280
the book is you want a function 
to validate a password and you 

519
00:27:02,280 --> 00:27:05,960
might not know how to ask for 
such a function. 

520
00:27:06,200 --> 00:27:09,920
And so what you can do is you 
can ask the LLM to ask you for 

521
00:27:09,920 --> 00:27:12,600
all the information it needs, 
and once it's done asking you, 

522
00:27:12,600 --> 00:27:16,120
it will write the function. 
So one of its first questions to

523
00:27:16,120 --> 00:27:18,120
you might be, OK, what are the 
parameters? 

524
00:27:18,320 --> 00:27:20,360
And you might be like, well I 
don't know what parameters are. 

525
00:27:20,560 --> 00:27:22,400
And then it will tell you what 
the parameters are. 

526
00:27:22,400 --> 00:27:25,560
And hopefully then you can be 
able to answer that question. 

527
00:27:25,960 --> 00:27:29,400
I mean, there is a risk here of 
you going down a rabbit hole 

528
00:27:29,800 --> 00:27:32,920
that you don't understand. 
And so we need to balance these 

529
00:27:33,080 --> 00:27:35,720
prompt patterns against 
teaching the fundamentals of 

530
00:27:35,800 --> 00:27:38,320
programming. 
But there's another pattern that

531
00:27:38,440 --> 00:27:40,320
I find kind of interesting, 
which is called the persona 

532
00:27:40,320 --> 00:27:42,680
pattern. 
And a lot of educators are using

533
00:27:42,680 --> 00:27:47,840
this to good effect right now, 
where you can ask the AI to act 

534
00:27:47,840 --> 00:27:52,680
like a specific kind of person. 
And so what educators right now 

535
00:27:52,680 --> 00:27:54,880
are doing is they're using the 
persona pattern and they're 

536
00:27:54,880 --> 00:28:00,120
saying, OK, LLM, you are a CS1 
instructor. 

537
00:28:00,320 --> 00:28:03,440
And that conveys a lot of 
information like don't use 

538
00:28:03,440 --> 00:28:06,000
advanced programming concepts 
that have not been taught yet. 

539
00:28:06,240 --> 00:28:10,080
Or the persona pattern could be 
like you are a student in an 

540
00:28:10,080 --> 00:28:13,080
introductory computer science 
course, things like that. 

541
00:28:13,080 --> 00:28:16,720
So that you try to scope the 
types of responses that you get.

542
00:28:16,720 --> 00:28:20,040
You try to change the types of 
responses from the default ones,

543
00:28:20,040 --> 00:28:22,320
because there are a lot of 
situations where the default 

544
00:28:22,600 --> 00:28:25,760
responses might include code 
that students have not seen 

545
00:28:25,760 --> 00:28:29,400
before, might use types of code 
that you don't want them to see 

546
00:28:29,400 --> 00:28:30,760
yet. 
It's like not in the scope of 

547
00:28:30,760 --> 00:28:34,040
the course, and it's kind of 
amazing to me how much leverage 

548
00:28:34,040 --> 00:28:39,160
you can get by just telling the 
LLM how to behave in the 

549
00:28:39,160 --> 00:28:41,600
upcoming interactions. 
So it's definitely an ongoing 

550
00:28:41,600 --> 00:28:44,280
area of research and I'm 
definitely listening 

551
00:28:44,280 --> 00:28:45,520
carefully to what's going on 
there. 

552
00:28:46,280 --> 00:28:48,280
Right. 
Very interesting to hear some of

553
00:28:48,280 --> 00:28:50,440
the patterns right. 
I've read your book as well, so 

554
00:28:50,440 --> 00:28:52,400
I find it also very very 
interesting. 

555
00:28:52,760 --> 00:28:55,880
So for people who might have 
used an AI assistant, like 

556
00:28:56,040 --> 00:28:59,160
ChatGPT or Bard, right, they 
would have seen this pattern as 

557
00:28:59,160 --> 00:29:00,840
well. 
Some people also share if you 

558
00:29:00,840 --> 00:29:04,600
wanna do something, here are the
catalogs of prompts that you can

559
00:29:04,600 --> 00:29:07,560
use to solve the problem, right.
I think same thing applies for 

560
00:29:07,560 --> 00:29:09,880
programming. 
So I mean, flipped interaction 

561
00:29:09,880 --> 00:29:12,640
pattern, persona pattern, those 
are definitely interesting. 

562
00:29:12,640 --> 00:29:16,120
So it's not just one way we ask 
question and the AI assistant 

563
00:29:16,120 --> 00:29:18,800
will just give the solution. 
You can also be creative and 

564
00:29:18,800 --> 00:29:20,640
sometimes use it differently, 
right? 

565
00:29:20,640 --> 00:29:22,520
So thanks for mentioning
the patterns. 

566
00:29:22,800 --> 00:29:24,920
I would love to see more 
patterns in the future. 

567
00:29:25,160 --> 00:29:28,040
So I think we'll leave it to those 
creative people to come up with 

568
00:29:28,040 --> 00:29:31,120
the patterns, right. 
So maybe if you can share from 

569
00:29:31,120 --> 00:29:33,880
your experience so far using the
tool, you know, cracking the 

570
00:29:33,880 --> 00:29:36,120
code right? 
Are there any techniques that 

571
00:29:36,200 --> 00:29:39,800
are probably a little bit less 
utilized for now, but for people

572
00:29:39,800 --> 00:29:42,800
to try it so that they can 
actually see the true power of 

573
00:29:43,040 --> 00:29:44,920
LLM in their day-to-day 
workflow? 

574
00:29:45,760 --> 00:29:48,240
It's a good question Henry. 
So I think a lot of people are

575
00:29:48,360 --> 00:29:51,400
used to, if they've played with 
these tools at all, they're kind

576
00:29:51,400 --> 00:29:56,240
of used to asking ChatGPT or 
GitHub Copilot for code and 

577
00:29:56,520 --> 00:29:59,160
that's a great use case. 
Something else Leo and I have 

578
00:29:59,480 --> 00:30:04,800
learned is you can also use 
these tools to ask for libraries

579
00:30:04,840 --> 00:30:07,800
or modules that you might be 
able to use to make your task 

580
00:30:07,800 --> 00:30:10,680
easier. 
So the chapter we have in our 

581
00:30:10,680 --> 00:30:16,440
book is called Automating 
Tedious Tasks and it's amazing 

582
00:30:16,440 --> 00:30:21,560
to me how many libraries in 
Python are available to help 

583
00:30:21,560 --> 00:30:23,440
you. 
If you didn't know about these 

584
00:30:23,440 --> 00:30:25,520
libraries like I'll just pick a 
random example. 

585
00:30:25,720 --> 00:30:29,880
We have an example in the 
chapter about automating the 

586
00:30:29,880 --> 00:30:36,080
tedious tasks where you've got 
two huge directories of images 

587
00:30:36,240 --> 00:30:39,600
and the back story in the book 
is maybe they came from 

588
00:30:39,600 --> 00:30:43,320
different phones. 
So like your partner has a 

589
00:30:43,320 --> 00:30:45,640
bunch of pictures on their phone
and you have a bunch of pictures 

590
00:30:45,640 --> 00:30:47,760
on your phone and they're 
duplicates because you've been 

591
00:30:47,760 --> 00:30:50,600
sending them back and forth. 
I think everybody listening kind

592
00:30:50,600 --> 00:30:52,480
of knows what kind of mess you 
can get yourself into. 

593
00:30:52,960 --> 00:30:56,160
And so the idea is we want to 
remove the duplicate pictures. 

594
00:30:56,560 --> 00:31:00,680
And this I think sounds like a 
super daunting task until you 

595
00:31:00,680 --> 00:31:05,040
realize that if you ask Copilot 
or ChatGPT, hey, here's a task I

596
00:31:05,040 --> 00:31:07,440
want to perform. 
Is there a Python module or 

597
00:31:07,440 --> 00:31:10,720
library that I can use? 
It will come back and tell you 

598
00:31:10,720 --> 00:31:12,960
about the libraries that are 
available that might help you. 

599
00:31:13,320 --> 00:31:16,880
For example, something that 
tells you if two pictures are 

600
00:31:16,880 --> 00:31:19,640
the same picture, like all the 
pixels are the same. 

601
00:31:20,120 --> 00:31:24,360
And then you can ask Copilot for
clarifications. 

602
00:31:24,360 --> 00:31:25,840
You can say is this module built
in? 

603
00:31:25,840 --> 00:31:27,240
Is it something I have to 
install? 

604
00:31:27,240 --> 00:31:30,440
Are there other alternatives? 
And actually I want to throw it 

605
00:31:30,440 --> 00:31:34,240
over to Leo for a second, 
because in your class at the end

606
00:31:34,240 --> 00:31:41,200
of 2023, I think you managed in 
one lecture to do an example of 

607
00:31:41,440 --> 00:31:45,360
adding up all the word counts in
like a ton of documents. 

608
00:31:45,680 --> 00:31:48,320
And this is something I'm 
assuming you would not have done

609
00:31:48,320 --> 00:31:50,640
in a normal CS1. 
I guess, you know, like, is 

610
00:31:50,640 --> 00:31:52,280
that accurate? 
Absolutely. 

611
00:31:52,360 --> 00:31:54,720
I think it's really hard for
students to work with new 

612
00:31:54,720 --> 00:31:58,480
libraries, and so the LLM having
that conversation is really 

613
00:31:58,480 --> 00:32:00,280
clean. 
It gives you nice examples. 

614
00:32:00,480 --> 00:32:02,280
And so I meant for this to be a 
whole lecture. 

615
00:32:02,520 --> 00:32:05,440
And what happened was that I ran
long from the previous lecture 

616
00:32:06,000 --> 00:32:08,320
and we just had that quick 
conversation with the students. 

617
00:32:08,360 --> 00:32:11,080
But I'm asking Copilot, what's a
good library for me to find out 

618
00:32:11,080 --> 00:32:12,560
how many words are in a Word 
document? 

619
00:32:12,960 --> 00:32:15,880
It gave me a great answer. 
It gave me some starter code to 

620
00:32:15,880 --> 00:32:18,440
work with. 
And within I think it was about 

621
00:32:18,520 --> 00:32:20,800
15 minutes I actually spent with
my students and we'd solved 

622
00:32:20,800 --> 00:32:23,640
the problem. 
And that's just way beyond the 

623
00:32:23,640 --> 00:32:25,680
scope of what we normally teach 
in CS1. I completely agree 

624
00:32:25,680 --> 00:32:27,040
with you again. 
Yeah. 

625
00:32:27,040 --> 00:32:30,960
And again, it gets at that 
higher level of abstraction, 

626
00:32:31,120 --> 00:32:32,400
right? 
Like a lot of listeners have 

627
00:32:32,400 --> 00:32:38,120
probably had to dig into API 
docs reading about how functions

628
00:32:38,120 --> 00:32:39,640
are called. 
Oh, this thing takes 5 

629
00:32:39,640 --> 00:32:42,000
parameters, the first one is a 
pointer to a pointer to a 

630
00:32:42,000 --> 00:32:44,840
stream, and the second one is 
blah blah blah blah blah and all

631
00:32:44,840 --> 00:32:48,560
you want to do is just use this 
thing and the sample code is out

632
00:32:48,560 --> 00:32:50,880
there and these LLMs could access
it. 

633
00:32:50,880 --> 00:32:52,600
And I just found the example 
really cool. 

634
00:32:52,600 --> 00:32:56,480
Leo, that's something we would not
have even attempted in a 

635
00:32:56,480 --> 00:32:58,720
previous introductory computing 
course. 

636
00:32:59,240 --> 00:33:02,040
We wouldn't have attempted it 
because we would have wasted too

637
00:33:02,040 --> 00:33:06,880
much time learning or showing 
students how to use the library,

638
00:33:06,880 --> 00:33:09,080
how to call these functions 
correctly, and now we can just 

639
00:33:09,080 --> 00:33:10,880
do it. 
And I think it speaks to the 

640
00:33:10,880 --> 00:33:13,760
resilience of the students right
when they learn this way, when 

641
00:33:13,760 --> 00:33:15,480
they're working with a whole 
bunch of different libraries, 

642
00:33:15,480 --> 00:33:18,000
because they're working with a 
whole bunch of different samples

643
00:33:18,000 --> 00:33:21,200
from the LLM. I think they become 
more resilient than when we give 

644
00:33:21,200 --> 00:33:23,520
them small little code 
snippets that perform specific 

645
00:33:23,520 --> 00:33:26,800
tasks in specific domains. 
Jumping from domain to domain, I 

646
00:33:26,800 --> 00:33:29,640
think helps them a lot in 
building kind of a robust 

647
00:33:29,640 --> 00:33:31,440
understanding. 
Right. 

648
00:33:31,840 --> 00:33:35,040
So I think another important 
thing about AI assistants, right,

649
00:33:35,040 --> 00:33:36,760
I think Leo mentioned it a 
little bit. 

650
00:33:36,960 --> 00:33:39,320
It is trained on a large code 
base. 

651
00:33:39,560 --> 00:33:43,120
So if you're using Copilot, I 
think maybe most likely it is 

652
00:33:43,120 --> 00:33:45,400
trained on GitHub 
repositories, right. 

653
00:33:45,600 --> 00:33:47,360
I think there's a risk about 
copyright. 

654
00:33:47,640 --> 00:33:50,200
There's a risk about just 
blatantly copying, you know, from 

655
00:33:50,200 --> 00:33:52,840
those repositories. 
Any kind of risk that you have 

656
00:33:52,840 --> 00:33:55,920
seen so far from the 
introduction of these tools. 

657
00:33:56,680 --> 00:34:00,560
Henry, that's a fantastic point 
about the ethics of using these 

658
00:34:00,560 --> 00:34:04,200
tools, and I think there's a few
directions we could take this 

659
00:34:04,200 --> 00:34:06,840
discussion. 
And the first is the ownership 

660
00:34:06,840 --> 00:34:09,120
of that code base that they used
to learn from. 

661
00:34:09,440 --> 00:34:13,440
I don't think we as a society 
have figured out first how we 

662
00:34:13,440 --> 00:34:15,840
should view this ethically and 
2nd, how should we view this 

663
00:34:15,840 --> 00:34:18,360
legally? 
We are obviously building tools 

664
00:34:18,360 --> 00:34:21,440
that can help empower people. 
And so in some light we would 

665
00:34:21,440 --> 00:34:23,000
say this is a good ethical 
thing. 

666
00:34:23,199 --> 00:34:27,320
But we do have to ask how are 
the tools built and who aren't 

667
00:34:27,320 --> 00:34:30,199
benefiting from their code 
potentially being taken or 

668
00:34:30,199 --> 00:34:32,320
things like this? 
So that's kind of the first 

669
00:34:32,400 --> 00:34:34,880
concern. 
The second concern would be 

670
00:34:34,880 --> 00:34:38,840
copyright. 
Are the LLMs commonly parroting 

671
00:34:39,120 --> 00:34:41,760
code which might be under 
someone's ownership? 

672
00:34:42,199 --> 00:34:44,719
It is hard to assess, 
particularly for the kind of 

673
00:34:44,719 --> 00:34:47,960
small pieces of code that LLMs 
tend to be able to generate 

674
00:34:47,960 --> 00:34:50,159
well. 
But occasionally, I mean, I've 

675
00:34:50,159 --> 00:34:51,719
seen it in my interactions with 
Copilot. 

676
00:34:51,719 --> 00:34:57,000
Occasionally it will generate an
author name, like, in its 

677
00:34:57,000 --> 00:34:59,160
recommendations to me, and then 
clearly like, I don't know if 

678
00:34:59,160 --> 00:35:01,880
that's the author or if it's 
just doing next word prediction 

679
00:35:01,880 --> 00:35:04,360
and it happened to say author 
and then predicted some words 

680
00:35:04,360 --> 00:35:07,400
after that. 
But it does give you some doubts

681
00:35:07,400 --> 00:35:09,720
about where this came from and 
whether or not we have ownership

682
00:35:09,720 --> 00:35:11,880
over it. 
And so we said fairly early on 

683
00:35:11,880 --> 00:35:14,240
in our book, this hasn't been 
resolved legally yet. 

684
00:35:14,480 --> 00:35:17,280
Feel free to use these tools 
kind of for your own use, but if

685
00:35:17,280 --> 00:35:19,520
you were to go try to build a 
company off the software that 

686
00:35:19,520 --> 00:35:22,640
you're writing, you should be a 
little careful until these laws 

687
00:35:22,640 --> 00:35:25,000
get resolved. 
And the third piece I'd say in 

688
00:35:25,000 --> 00:35:28,720
terms of the ethics is models. 
And we've seen this across 

689
00:35:28,720 --> 00:35:32,640
artificial intelligence. 
Models reflect biases within 

690
00:35:32,640 --> 00:35:36,200
society. 
And so if you ask for a list of 

691
00:35:36,200 --> 00:35:40,960
names, it will probably give you
a list of Caucasian male names 

692
00:35:41,240 --> 00:35:44,000
just on first try. 
And you have to ask a question, 

693
00:35:44,000 --> 00:35:47,000
why would it do that? 
Like why is that its default? 

694
00:35:47,200 --> 00:35:49,320
And it's obviously learning from
a code base that probably has 

695
00:35:49,320 --> 00:35:53,280
those more represented, but 
that's not a good sign for 

696
00:35:53,280 --> 00:35:55,280
students who are coming in not 
from those groups. 

697
00:35:55,920 --> 00:35:58,920
What I think is important to do,
since I don't think, again, 

698
00:35:58,920 --> 00:36:03,280
we've worked out these issues as
a society, is to bring in the 

699
00:36:03,280 --> 00:36:06,520
readers of our book and bring in 
the students in our class into 

700
00:36:06,520 --> 00:36:09,040
this conversation and say these 
are the ethical concerns of 

701
00:36:09,040 --> 00:36:12,320
these models, and have a 
direct conversation about it and

702
00:36:12,320 --> 00:36:14,200
be frank about what we know and 
what we don't know. 

703
00:36:14,480 --> 00:36:17,000
I think the fear is if we 
aren't, if we kind of pretend 

704
00:36:17,000 --> 00:36:19,320
these models don't exist and we 
try not to let the students use 

705
00:36:19,320 --> 00:36:21,920
the models and they go on to use
them on their own, they're going

706
00:36:21,920 --> 00:36:24,480
to run into these issues. 
And so it's better for us to 

707
00:36:24,480 --> 00:36:28,120
teach them upfront than to 
just leave them blind on it.

708
00:36:28,640 --> 00:36:32,240
It also just sort of seems just 
to keep going off what Leo just 

709
00:36:32,240 --> 00:36:34,520
said. 
It also seems a little to me as 

710
00:36:34,520 --> 00:36:39,240
a teacher, it seems a little 
dishonest to not show students 

711
00:36:39,240 --> 00:36:43,160
these tools as soon as we do. 
But then what Leo said comes 

712
00:36:43,160 --> 00:36:45,960
into the picture, right? Then 
we cannot pretend that these 

713
00:36:45,960 --> 00:36:49,560
issues don't exist. 
But there are people who try, 

714
00:36:49,560 --> 00:36:53,000
you know, to pretend these tools
don't exist and ban them so the 

715
00:36:53,000 --> 00:36:54,560
students can't use them in their
courses. 

716
00:36:54,560 --> 00:36:59,600
And I totally get why it's a 
very upsetting thing that has 

717
00:36:59,600 --> 00:37:01,840
happened. 
I don't mean upsetting in terms 

718
00:37:01,840 --> 00:37:03,800
of, you know, like making me 
sad. 

719
00:37:03,800 --> 00:37:06,000
I mean in terms of, like, 
upsetting the status quo of how 

720
00:37:06,000 --> 00:37:09,280
courses are taught. 
And it's very tempting to just 

721
00:37:09,280 --> 00:37:14,600
try to pretend these tools away.
But the tools are out there and 

722
00:37:14,600 --> 00:37:16,480
our students are going to be 
using them. 

723
00:37:17,000 --> 00:37:19,720
And I think more importantly, 
they're going to be using them 

724
00:37:19,720 --> 00:37:24,120
when they get their next Co-op 
position or their next industry 

725
00:37:24,120 --> 00:37:26,120
job. 
Or at the very least, they're 

726
00:37:26,120 --> 00:37:29,760
going to be asked about these at
future companies and asked about

727
00:37:29,760 --> 00:37:33,800
their opinions of these tools. 
And I just have to super agree 

728
00:37:33,800 --> 00:37:36,520
with Leo on this. 
We need to be teaching these 

729
00:37:36,520 --> 00:37:39,040
ethical concerns. 
We may not have solutions to 

730
00:37:39,040 --> 00:37:43,440
them, but I don't think a 
solution is to try to scare 

731
00:37:43,440 --> 00:37:47,520
students away from using these 
tools or somehow trying to 

732
00:37:47,520 --> 00:37:50,040
prevent them from using these 
tools, because it's never going 

733
00:37:50,040 --> 00:37:53,120
to happen. 
And I think it's more useful if 

734
00:37:53,360 --> 00:37:56,560
we teach the tools along with 
the concerns that we have. 

735
00:37:56,560 --> 00:37:59,240
Like, I think it goes without 
saying, but we have a lot of 

736
00:37:59,240 --> 00:38:01,360
work to do, right? 
Like, the issues that Leo just 

737
00:38:01,360 --> 00:38:04,160
mentioned are not small. 
There's a reason that they're at

738
00:38:04,160 --> 00:38:05,800
the beginning of our book and 
not at the end. 

739
00:38:06,000 --> 00:38:08,160
Right? 
Like, these are not like a oh, 

740
00:38:08,360 --> 00:38:10,880
by the way, you know, these 
things are going to reproduce 

741
00:38:10,880 --> 00:38:13,280
like cultural norms. 
Like, still, this is a big deal,

742
00:38:13,400 --> 00:38:15,320
right? 
We can't just say, 

743
00:38:15,360 --> 00:38:17,480
Oh, this is like, look at 
Appendix A for the problems. 

744
00:38:17,640 --> 00:38:20,160
This is not Appendix A stuff.
This is like chapter one stuff. 

745
00:38:20,800 --> 00:38:24,640
So Leo, you know in your course 
you talk about these early on, 

746
00:38:24,920 --> 00:38:28,400
but I don't think that that 
means we can't use these tools. 

747
00:38:28,400 --> 00:38:31,640
I think actually it makes it 
more likely that our students 

748
00:38:31,640 --> 00:38:34,160
will use these tools 
appropriately. 

749
00:38:34,480 --> 00:38:37,640
I think one of the worst things 
we can do is introduce students 

750
00:38:37,640 --> 00:38:42,200
to these tools and then not help
them understand what the costs 

751
00:38:42,200 --> 00:38:44,160
are. 
Because I even think once 

752
00:38:44,160 --> 00:38:47,800
students understand what's going
on, they'll be on the lookout 

753
00:38:47,800 --> 00:38:51,120
for this and they won't just 
accept whatever the LLM tells 

754
00:38:51,120 --> 00:38:53,560
them as the correct answer, 
right. 

755
00:38:53,800 --> 00:38:57,440
So we're trying to balance the 
fact that they're out there and 

756
00:38:57,440 --> 00:39:00,880
students are going to be using 
these tools with also training 

757
00:39:00,880 --> 00:39:02,800
students to understand the 
deficits. 

758
00:39:03,160 --> 00:39:05,640
And who knows, our students 
might be the ones who end up in 

759
00:39:05,640 --> 00:39:07,960
positions where they can make 
these kinds of improvements. 

760
00:39:08,160 --> 00:39:12,440
Like you know, students are 
potentially a couple of years 

761
00:39:12,440 --> 00:39:16,840
away from graduating and being 
able to inform how these tools 

762
00:39:16,840 --> 00:39:18,880
are deployed and how these tools
are used. 

763
00:39:19,000 --> 00:39:23,080
So I definitely think that this 
is a very important part and a 

764
00:39:23,080 --> 00:39:26,200
new component of an introductory
computer science course. 

765
00:39:27,040 --> 00:39:30,360
So thank you for highlighting 
this potential risk of using AI.

766
00:39:30,360 --> 00:39:32,960
I think it's not just for coding 
or programming, right? 

767
00:39:33,000 --> 00:39:35,880
I think it's a bigger 
conversation, responsible use of

768
00:39:35,960 --> 00:39:39,720
AI, copyright and, for example, 
ownership, as well as bias. 

769
00:39:40,120 --> 00:39:42,040
I think all of this is, like, it's a
new thing, right? 

770
00:39:42,040 --> 00:39:45,320
So people are trying to grasp it; 
some countries also try to come 

771
00:39:45,320 --> 00:39:48,160
up with the guidelines, right? 
But I think you are right, maybe

772
00:39:48,160 --> 00:39:50,920
banning it altogether might 
not be a wise idea. 

773
00:39:50,960 --> 00:39:54,600
We have to adapt to this tool.
And I think we all, as users of 

774
00:39:54,600 --> 00:39:57,360
this AI assistant, right? 
At the end of the day, when you 

775
00:39:57,400 --> 00:40:00,400
use the code and apply it to 
your system, right, it is also 

776
00:40:00,400 --> 00:40:03,480
your responsibility to actually 
make sure that the thing that 

777
00:40:03,480 --> 00:40:06,400
you apply is correct, right? 
Because it might potentially 

778
00:40:06,520 --> 00:40:08,360
affect other people's lives as 
well. 

779
00:40:08,880 --> 00:40:12,760
So I think one related question 
about using this tool right in 

780
00:40:12,760 --> 00:40:16,440
our day-to-day life is 
definitely people are afraid of 

781
00:40:16,640 --> 00:40:19,280
being replaced. 
Many people think that, oh, we 

782
00:40:19,280 --> 00:40:20,960
don't need so many developers 
anymore. 

783
00:40:21,160 --> 00:40:24,040
You can probably cut down the 
number of people that we have in

784
00:40:24,040 --> 00:40:26,960
the companies, right? 
The potential is there for 

785
00:40:26,960 --> 00:40:30,440
people to think that we may not 
need so many, you know, 

786
00:40:30,600 --> 00:40:33,040
developers anymore. 
What's your take about this? 

787
00:40:33,040 --> 00:40:37,040
I know it's probably hard to 
know the actual impact, but 

788
00:40:37,040 --> 00:40:40,320
what's your take about some 
people being afraid of OK, AI is

789
00:40:40,320 --> 00:40:43,240
going to take over the world 
and, you know, replace so many 

790
00:40:43,240 --> 00:40:46,520
people. 
OK, I think if any of your 

791
00:40:46,520 --> 00:40:50,080
listeners spend a little time 
with Copilot, their fears will 

792
00:40:50,080 --> 00:40:52,760
be quickly taken away. 
So, I mean, these tools are 

793
00:40:52,760 --> 00:40:54,920
fantastic. 
They do great things, but they 

794
00:40:54,920 --> 00:40:57,680
make mistakes. 
You realize very quickly there 

795
00:40:57,680 --> 00:41:00,880
are still essential skills that 
are required to use them 

796
00:41:00,880 --> 00:41:02,760
properly. 
And so I don't think we as 

797
00:41:02,760 --> 00:41:05,120
programmers are going to go away
and that's kind of the 

798
00:41:05,120 --> 00:41:07,360
first takeaway. 
The second is our jobs as 

799
00:41:07,360 --> 00:41:09,360
software engineers. 
And you know this far better 

800
00:41:09,360 --> 00:41:12,160
than I do. 
But as software engineers, only 

801
00:41:12,280 --> 00:41:14,760
a fraction of your time is spent
coding. 

802
00:41:15,080 --> 00:41:17,920
A lot of your time is spent 
thinking, how should I lay out 

803
00:41:18,080 --> 00:41:21,440
the interfaces, how do I work 
with the other software within 

804
00:41:21,440 --> 00:41:23,960
the company? 
How do I make sure I've got 

805
00:41:23,960 --> 00:41:26,760
really clear requirements for my
code? 

806
00:41:27,000 --> 00:41:30,760
Like, all of these things are 
the really big problems that 

807
00:41:30,760 --> 00:41:34,120
still humans have to wrestle 
with, and I'm not seeing LLMs 

808
00:41:34,120 --> 00:41:36,760
taking that away from us anytime
soon, at least for now. 

809
00:41:37,800 --> 00:41:40,920
And Leo, I guess just to add, I 
think if you look back at 

810
00:41:40,920 --> 00:41:45,040
computing evolution, I wonder if
people have had the same 

811
00:41:45,040 --> 00:41:49,640
discussion when Visual Basic 
came out, you know, where you 

812
00:41:49,640 --> 00:41:52,160
could drag and drop components 
onto a form. 

813
00:41:52,160 --> 00:41:56,120
I wonder if back in 1995 people 
were saying, Oh, well, that's 

814
00:41:56,120 --> 00:41:57,680
it. 
We have like rapid application 

815
00:41:57,680 --> 00:41:59,080
development. 
I heard that term. 

816
00:41:59,200 --> 00:42:02,520
It was called RAD, I think, and 
I wasn't around really. 

817
00:42:02,760 --> 00:42:04,560
I was like a kid having a good 
time. 

818
00:42:04,560 --> 00:42:07,960
But I think probably people back
then were saying the same sorts 

819
00:42:07,960 --> 00:42:10,320
of things, right? 
Like, oh, look at this, we can 

820
00:42:10,320 --> 00:42:12,520
develop these applications by 
dragging and dropping. 

821
00:42:12,920 --> 00:42:17,360
And I think these advances, I 
don't know if they lead to more 

822
00:42:17,360 --> 00:42:21,560
or less jobs, but I think it's 
likely that it's going to be a 

823
00:42:21,560 --> 00:42:26,880
steady state and perhaps we'll be
more productive with what we're 

824
00:42:26,880 --> 00:42:29,040
able to do. 
Just to reiterate what Leo said,

825
00:42:29,040 --> 00:42:32,920
I don't think that they make 
jobs in programming go away. 

826
00:42:33,320 --> 00:42:36,560
I should also add, and Leo, I 
wonder what your opinion is on 

827
00:42:36,560 --> 00:42:40,000
this. 
Most of what we've been talking 

828
00:42:40,000 --> 00:42:43,560
about and reading has been for 
introductory programming. 

829
00:42:43,840 --> 00:42:49,560
I don't know if we know what the
impact will be on industry-level

830
00:42:49,560 --> 00:42:52,200
projects. 
We know people are using these 

831
00:42:52,200 --> 00:42:56,480
tools in industry and we know 
they're more productive, but I 

832
00:42:56,480 --> 00:43:01,400
don't know if we know whether 
there are more or fewer jobs or 

833
00:43:01,400 --> 00:43:04,760
if there will be in the future. 
I just have a feeling that the

834
00:43:04,760 --> 00:43:08,160
result will be that the existing
programmers are just going to be

835
00:43:08,160 --> 00:43:10,240
more efficient. 
I agree. 

836
00:43:10,440 --> 00:43:14,200
I suspect there's going to be, I
think there'll be a shift too, 

837
00:43:14,280 --> 00:43:16,560
like with all the other major 
advances in technology. 

838
00:43:16,880 --> 00:43:19,680
When Python came out, we didn't 
say, oh OK, well, we need fewer 

839
00:43:19,680 --> 00:43:23,400
people to write code. 
It was, oh geez, we can write 

840
00:43:23,480 --> 00:43:26,720
larger software or more quickly 
do data analysis or now deal 

841
00:43:26,720 --> 00:43:31,120
with the influx of big data. 
Like we've just adjusted and 

842
00:43:31,120 --> 00:43:33,960
done bigger and better things as
the technology got better. 

843
00:43:34,320 --> 00:43:36,640
And so naively, I think that's 
the case. 

844
00:43:36,800 --> 00:43:38,600
But I do think there's gonna be 
a bunch of research on this 

845
00:43:38,600 --> 00:43:40,200
topic. 
In the next 10 years, 

846
00:43:40,200 --> 00:43:41,720
probably in the software 
engineering community. 

847
00:43:43,200 --> 00:43:46,680
Yeah, maybe a few things that 
I picked up from the industry point 

848
00:43:46,680 --> 00:43:48,680
of view, right? 
I mean there are maybe people 

849
00:43:48,680 --> 00:43:52,640
saying that it improves their 
productivity, maybe like 30-40%. 

850
00:43:52,960 --> 00:43:56,320
Maybe the gap between junior and 
senior might be smaller now 

851
00:43:56,320 --> 00:43:59,520
because the juniors might be 
able to take on more advanced 

852
00:43:59,520 --> 00:44:02,680
and complicated problems. 
But I agree with Leo that 

853
00:44:02,720 --> 00:44:05,680
writing code is not the only job
for software developers, right? 

854
00:44:05,680 --> 00:44:07,720
So they still need to understand
requirements. 

855
00:44:07,920 --> 00:44:10,440
And we know in the industry a 
lot of times requirements are 

856
00:44:10,440 --> 00:44:12,840
vague or not well specified, 
right? 

857
00:44:13,240 --> 00:44:15,640
So I think it's the software 
developer's job to actually 

858
00:44:15,640 --> 00:44:18,440
translate that into a good 
design, proper design. 

859
00:44:18,480 --> 00:44:21,560
And also don't forget about 
evolving the code right, writing

860
00:44:21,560 --> 00:44:24,480
it in such a maintainable way, 
writing it in such a way it can 

861
00:44:24,480 --> 00:44:26,720
scale. 
I think those things I still 

862
00:44:26,720 --> 00:44:29,280
haven't heard that the LLMs can 
do for us. 

863
00:44:29,640 --> 00:44:33,120
For example, you tell them build
me a few microservices that can 

864
00:44:33,120 --> 00:44:34,560
interact with these kind of 
APIs. 

865
00:44:34,560 --> 00:44:38,080
I think that will be too much of a 
task for LLMs to solve, but 

866
00:44:38,240 --> 00:44:41,360
maybe one day it would. Happy to 
see that future. 

867
00:44:41,600 --> 00:44:44,400
But for now, I think my 
take as well is that we have to be 

868
00:44:44,400 --> 00:44:47,240
able to live with it, leverage 
it to improve the 

869
00:44:47,240 --> 00:44:50,400
productivity so that we can move
on to solve bigger and bigger 

870
00:44:50,400 --> 00:44:52,320
problems just like Leo said, 
right? 

871
00:44:52,640 --> 00:44:56,560
And I'd add that it's not great 
at writing efficient code. 

872
00:44:57,000 --> 00:44:59,880
So if you say, no, this is
an inefficient 

873
00:44:59,880 --> 00:45:02,160
algorithm, could you use dynamic
programming to solve this? 

874
00:45:02,520 --> 00:45:04,520
At least in my experience it 
hasn't done very well. 

875
00:45:04,840 --> 00:45:07,840
And then I did try. 
I teach a really specialized 

876
00:45:07,840 --> 00:45:10,920
class on writing high 
performance software that's 

877
00:45:10,960 --> 00:45:13,680
architecture aware, so like 
knowing about caches and like 

878
00:45:13,880 --> 00:45:17,000
super high efficient code, 
extracting cache locality, 

879
00:45:17,000 --> 00:45:19,240
things like that. 
And it did terribly. 

880
00:45:19,240 --> 00:45:22,000
Like I asked it to write like a 
blocked matrix-matrix multiply 

881
00:45:22,000 --> 00:45:24,200
and it could not do that in any 
way. 

882
00:45:24,440 --> 00:45:26,600
So I think, still, for the
advanced code, 

883
00:45:26,600 --> 00:45:29,400
there's a lot of room for us as 
software engineers to be 

884
00:45:29,440 --> 00:45:31,680
developing those ourselves. 
Yeah. 

885
00:45:31,680 --> 00:45:33,720
I guess what I find kind of 
interesting about the discourse 

886
00:45:33,720 --> 00:45:37,320
right now is because it's so new
that people want to be able to 

887
00:45:37,320 --> 00:45:41,240
make these claims like LLMs are 
crap, right? 

888
00:45:41,240 --> 00:45:44,560
Or LLMs are amazing. 
And, you know, it's very early, 

889
00:45:44,560 --> 00:45:47,160
so people are going to make 
these kinds of claims right now.

890
00:45:47,240 --> 00:45:51,200
But I guess I'm more interested 
in what happens when the dust 

891
00:45:51,200 --> 00:45:54,320
settles. 
And I think all polarizing 

892
00:45:54,320 --> 00:45:56,920
opinions right now, I don't 
think any of them are going to 

893
00:45:56,920 --> 00:45:58,880
end up being what actually 
happens, right? 

894
00:45:58,880 --> 00:46:01,280
Like, is every software engineer
going to be fired? 

895
00:46:01,280 --> 00:46:03,680
No. 
Are we going to have a different

896
00:46:03,680 --> 00:46:06,480
number of software engineers? 
Probably, right. 

897
00:46:06,480 --> 00:46:08,520
Like it's to some extent. 
I don't know if it's going to be

898
00:46:08,520 --> 00:46:11,560
more or less, but I think there 
are many, many opinions right 

899
00:46:11,560 --> 00:46:14,160
now. 
Leo, what's that statement you 

900
00:46:14,160 --> 00:46:17,800
have about overestimating the 
effects of technology? 

901
00:46:18,040 --> 00:46:20,040
Oh, there's this famous quote. 
Let me see. 

902
00:46:20,080 --> 00:46:22,160
Yeah, track it down. 
But there's a famous quote which

903
00:46:22,160 --> 00:46:25,200
is like we tend to overestimate 
the effects of technology in the

904
00:46:25,200 --> 00:46:27,120
short term and underestimate it 
in the long term. 

905
00:46:27,600 --> 00:46:30,120
Yeah. 
And so perhaps that's what's 

906
00:46:30,120 --> 00:46:32,880
happening, and that's the kind 
of stuff that grabs headlines 

907
00:46:32,880 --> 00:46:34,520
too, right? 
So I mean, we're still in the 

908
00:46:34,520 --> 00:46:37,840
throes of this thing where it's 
very difficult right now to 

909
00:46:37,840 --> 00:46:40,560
separate hype from what's 
actually happening. 

910
00:46:40,880 --> 00:46:44,160
I guess I look forward to maybe 
getting to the point where we 

911
00:46:44,160 --> 00:46:49,200
have more research backing, 
because until then it's fun and 

912
00:46:49,200 --> 00:46:52,160
everything, but it's just people
talking about the things that we

913
00:46:52,160 --> 00:46:54,320
don't really know the answer 
until the research gets done. 

914
00:46:54,800 --> 00:46:57,520
And Amara's Law is the
name of it. 

915
00:46:57,560 --> 00:47:01,800
So Amara's Law is that we tend to
underestimate the effects in the

916
00:47:01,800 --> 00:47:03,840
long run, but overestimate in 
the short term. 

917
00:47:05,040 --> 00:47:06,600
Yep. 
So I think, yeah, one thing 

918
00:47:06,600 --> 00:47:10,240
is clear for sure, right, if you 
rely too much on LLMs, I think 

919
00:47:10,240 --> 00:47:11,880
still we are not there yet, 
right? 

920
00:47:11,880 --> 00:47:14,640
So I think in your book you also
mentioned it is not an expert, 

921
00:47:14,680 --> 00:47:18,040
it is actually trained from 
existing code bases, right. 

922
00:47:18,040 --> 00:47:20,640
So for example, if you want to 
solve a new problem, maybe 

923
00:47:20,640 --> 00:47:23,800
quantum computing, let's say, it
may not be able to even give you

924
00:47:23,800 --> 00:47:26,600
a proper solution, right. 
So let's not forget about that. 

925
00:47:26,760 --> 00:47:30,680
I think we still need to use our
judgment as well as a human to 

926
00:47:30,680 --> 00:47:34,160
actually apply what the LLM is 
suggesting to us into our 

927
00:47:34,160 --> 00:47:36,400
software. 
So maybe one last point I would 

928
00:47:36,400 --> 00:47:39,080
like to ask. 
Since you are also part of the 

929
00:47:39,200 --> 00:47:41,000
university teaching students, 
right? 

930
00:47:41,120 --> 00:47:44,440
You mentioned equity of 
opportunity. Probably in the past, 

931
00:47:44,440 --> 00:47:47,480
for people to learn about 
programming and computer science, 

932
00:47:47,480 --> 00:47:48,920
right? 
There were only a limited number of

933
00:47:48,920 --> 00:47:51,800
people. Now, with this 
introduction of AI systems, 

934
00:47:51,800 --> 00:47:54,920
potentially more people will be 
able to get into computer 

935
00:47:54,920 --> 00:47:57,240
science and learn about 
programming. 

936
00:47:57,480 --> 00:48:00,840
Maybe the syntax problem 
will soon become less of a 

937
00:48:00,840 --> 00:48:03,200
challenge. 
So what is your take on this? 

938
00:48:03,200 --> 00:48:06,560
Creating a more equitable kind 
of a society for people to learn

939
00:48:06,680 --> 00:48:08,840
computer science? 
Yeah. 

940
00:48:08,840 --> 00:48:10,760
Thanks, Henry. 
So this is something that Leo 

941
00:48:10,760 --> 00:48:14,920
and I have been thinking a lot 
about and we're excited by the 

942
00:48:14,920 --> 00:48:18,240
possibilities here. 
But we don't want to say 

943
00:48:18,240 --> 00:48:20,640
anything too early because 
again, we don't know what's 

944
00:48:20,640 --> 00:48:22,920
going to end up happening. 
But just to summarize for 

945
00:48:22,920 --> 00:48:27,560
everybody, the deal is that 
people who already have prior 

946
00:48:27,560 --> 00:48:32,040
programming experience, it's
unsurprising, but they tend to 

947
00:48:32,040 --> 00:48:35,480
do better in introductory CS 
courses. 

948
00:48:35,840 --> 00:48:38,560
So if they had more 
opportunities in high school, 

949
00:48:38,600 --> 00:48:42,880
for example, you know, their 
parents had access to some, maybe,

950
00:48:42,880 --> 00:48:47,200
computing courses, or they 
directed the students into this 

951
00:48:47,200 --> 00:48:49,360
field, then they tend to perform
better. 

952
00:48:49,760 --> 00:48:52,200
And I wouldn't necessarily have 
a problem with this except that 

953
00:48:52,360 --> 00:48:56,400
these opportunities are not 
evenly distributed and so 

954
00:48:56,400 --> 00:49:00,400
they're made more accessible to 
dominant groups. 

955
00:49:00,800 --> 00:49:05,720
And so then this gap in prior 
experience leads to a gap across

956
00:49:05,720 --> 00:49:09,080
different types of students, 
which is obviously not OK. 

957
00:49:09,680 --> 00:49:13,440
What we're hoping, and the 
research is ongoing, our hope is

958
00:49:13,440 --> 00:49:19,720
that because there's a reduced 
emphasis on syntax using LLMs, 

959
00:49:19,960 --> 00:49:22,560
we're hoping that prior 
experience, while it will 

960
00:49:22,560 --> 00:49:25,440
certainly still exist, the gaps 
will still exist. 

961
00:49:25,760 --> 00:49:29,800
Perhaps the gaps in prior 
experience will not lead to the 

962
00:49:29,800 --> 00:49:34,320
gaps in outcomes that we've been
seeing in introductory CS 

963
00:49:34,320 --> 00:49:36,440
courses. 
So again, there are a lot of 

964
00:49:36,440 --> 00:49:39,000
caveats here. 
One of them, for example, is 

965
00:49:39,000 --> 00:49:44,920
that maybe the students with 
privilege are going to be using 

966
00:49:44,920 --> 00:49:49,160
LLMs earlier than other students,
and then they'll have prior 

967
00:49:49,160 --> 00:49:54,000
experience using LLMs too, and 
that may convey an advantage just

968
00:49:54,000 --> 00:49:56,760
like prior programming 
experience does right now. 

969
00:49:57,080 --> 00:50:01,040
I guess our hope stems from the 
fact that learning syntax is so 

970
00:50:01,040 --> 00:50:04,720
difficult and it's such a 
barrier for so many students. 

971
00:50:05,040 --> 00:50:10,720
And then maybe with these LLM skills,
maybe the gap can be made 

972
00:50:10,720 --> 00:50:13,920
smaller more quickly. 
I want to ask Leo to jump in 

973
00:50:13,920 --> 00:50:17,160
here too, because this is a 
question that's definitely 

974
00:50:17,160 --> 00:50:19,160
worthy of multiple discussion 
points. 

975
00:50:19,920 --> 00:50:21,720
Oh, absolutely. 
I think you've summarized the 

976
00:50:21,720 --> 00:50:24,320
issue really well. 
I think there's a couple of 

977
00:50:24,600 --> 00:50:27,880
other reasons for optimism and I
am being very cautiously 

978
00:50:27,880 --> 00:50:29,720
optimistic. 
As Dan points out, we have to do

979
00:50:29,720 --> 00:50:33,720
the research, but I mentioned 
earlier the kind of status quo 

980
00:50:33,720 --> 00:50:36,280
of how we assess students in 
computer science classes and 

981
00:50:36,280 --> 00:50:40,240
it's solving these really small 
functions that aren't 

982
00:50:40,240 --> 00:50:42,200
particularly exciting to be 
quite frank. 

983
00:50:42,560 --> 00:50:46,200
And there's been a whole bunch 
of work within the community 

984
00:50:46,480 --> 00:50:50,840
that has shown that students 
from demographic groups that are

985
00:50:50,840 --> 00:50:54,960
currently underrepresented in 
computing tend to want to see 

986
00:50:55,000 --> 00:50:58,920
that their work is going to help
society, it's going to be for 

987
00:50:58,920 --> 00:51:02,760
the societal good and they want 
to see that the computing can 

988
00:51:02,880 --> 00:51:07,760
serve that good. 
I think when we move to LLMs, you

989
00:51:07,760 --> 00:51:10,800
end up, unless you want to do 
these outdated assignments that 

990
00:51:10,800 --> 00:51:13,800
the LLMs solve for you, you have 
to move to these kind of 

991
00:51:13,800 --> 00:51:16,480
open-ended large projects, which
is what we were using in our 

992
00:51:16,480 --> 00:51:18,560
class. 
And then they can pick the 

993
00:51:18,560 --> 00:51:20,960
domain that matters to them and 
then it can be something that's 

994
00:51:20,960 --> 00:51:24,160
meaningful to them personally. 
And I think if you can do that, 

995
00:51:24,160 --> 00:51:26,400
I think we're going to bring in 
a broader audience of people who

996
00:51:26,400 --> 00:51:28,520
are interested in computing 
because they see how it matters 

997
00:51:28,520 --> 00:51:31,040
for them as people. 
So that'd be the first reason 

998
00:51:31,040 --> 00:51:33,120
for optimism. 
And then the second one is one 

999
00:51:33,120 --> 00:51:35,160
where I'm also kind of 
cautiously optimistic. 

1000
00:51:35,680 --> 00:51:38,160
And there's been a whole 
bunch of research already 

1001
00:51:38,160 --> 00:51:40,720
started by members of our 
community, that's really 

1002
00:51:40,720 --> 00:51:45,840
interesting in terms of how can 
we turn these AI assistants into

1003
00:51:45,840 --> 00:51:48,960
tutors, essentially intelligent 
tutoring systems. 

1004
00:51:48,960 --> 00:51:52,440
How could we help through prompt
engineering, through really 

1005
00:51:52,440 --> 00:51:55,040
careful crafting of the 
introductory prompts. 

1006
00:51:55,360 --> 00:51:59,000
How can we make it so
when a student's struggling, 

1007
00:51:59,200 --> 00:52:02,280
they don't have to wait till the
next office hours of the instructor,

1008
00:52:02,520 --> 00:52:05,000
they can just have a quick 
conversation and they're going 

1009
00:52:05,000 --> 00:52:08,160
to get mostly correct answers. 
Which is how with LLMs you got 

1010
00:52:08,160 --> 00:52:10,640
to get correct answers, you're 
going to get encouraging answers,

1011
00:52:10,640 --> 00:52:12,280
ones that'll encourage them to 
keep trying. 

1012
00:52:12,520 --> 00:52:16,040
How can we get them the help 
they need, when they need it? 

1013
00:52:16,040 --> 00:52:19,000
If there's a gap in terms of how
much support different groups 

1014
00:52:19,000 --> 00:52:20,960
need, 
making sure everyone has lots of

1015
00:52:20,960 --> 00:52:23,720
support will help everyone. 
That will help disadvantaged 

1016
00:52:23,720 --> 00:52:25,920
groups more. 
Yeah, and Leo, it's not 

1017
00:52:25,920 --> 00:52:28,080
impossible that this happens. 
Like in case people are 

1018
00:52:28,080 --> 00:52:31,280
skeptical out there. 
Leo and I, of course are 

1019
00:52:31,280 --> 00:52:33,760
disinterested as well, 
slash skeptical, because we're 

1020
00:52:33,760 --> 00:52:38,400
scientists, but there is 
precedent for something good to 

1021
00:52:38,400 --> 00:52:40,480
happen here. 
And Leo specifically, I'm 

1022
00:52:40,480 --> 00:52:42,800
thinking about the way that we 
teach our classes. 

1023
00:52:43,120 --> 00:52:47,360
So for example, using student 
discussion in classes through 

1024
00:52:47,360 --> 00:52:50,520
something called peer 
instruction seems to be able to 

1025
00:52:50,520 --> 00:52:53,920
reduce this gap. 
Yeah, it seems that these 

1026
00:52:54,120 --> 00:52:57,120
techniques like active learning 
disproportionately help students

1027
00:52:57,200 --> 00:53:00,560
who are underprivileged, and so 
it helps everyone, kind of a rising 

1028
00:53:00,560 --> 00:53:03,520
tide lifts all boats, but the folks
who are struggling are raised 

1029
00:53:03,520 --> 00:53:06,120
more, and you see a larger 
impact for those struggling 

1030
00:53:06,120 --> 00:53:07,360
groups. 
Yeah. 

1031
00:53:07,360 --> 00:53:10,280
And that's because the new 
supports are there, right? 

1032
00:53:10,280 --> 00:53:14,160
Like other students, in the case
of peer instruction, it's I 

1033
00:53:14,160 --> 00:53:16,720
think perhaps partially a 
community aspect. 

1034
00:53:17,000 --> 00:53:20,960
So now they have more students 
who can help them kind of catch 

1035
00:53:20,960 --> 00:53:22,960
up. 
And so this is the hope right 

1036
00:53:22,960 --> 00:53:24,560
now. 
So people are already referring 

1037
00:53:24,560 --> 00:53:26,880
to LLMs as, like, one-on-one 
tutors. 

1038
00:53:27,200 --> 00:53:29,440
And I'm not willing to go there 
quite yet. 

1039
00:53:29,440 --> 00:53:30,800
But I think that's the dream, 
right? 

1040
00:53:30,800 --> 00:53:35,640
Like Leo said, the dream is that
they can reduce the time delay 

1041
00:53:35,640 --> 00:53:37,560
between having a question and 
getting an answer. 

1042
00:53:37,560 --> 00:53:40,560
Because if we can reduce that to
0, like just imagine that any 

1043
00:53:40,560 --> 00:53:42,800
question a student has could be 
answered immediately. 

1044
00:53:43,240 --> 00:53:45,600
That bodes well for students to 
catch up, right? 

1045
00:53:45,920 --> 00:53:49,000
A lot of the time, I think 
the limiting factor is just 

1046
00:53:49,000 --> 00:53:51,800
resources, right? 
Like I only have office hours 

1047
00:53:51,800 --> 00:53:54,800
once a week for example. 
So if a student gets stuck 

1048
00:53:55,000 --> 00:53:59,000
before, maybe they have to wait 
for me to get them unstuck and 

1049
00:53:59,000 --> 00:54:03,480
maybe they can get unstuck 
sooner with LLMs and then catch 

1050
00:54:03,480 --> 00:54:05,520
up. 
So again, this is just kind of 

1051
00:54:05,680 --> 00:54:08,840
the hope right now. 
Maybe in a few years we can 

1052
00:54:08,840 --> 00:54:11,440
revisit this and say yes, we 
were right or no we were not. 

1053
00:54:11,440 --> 00:54:13,920
But for now, it's definitely 
something we're interested in. 

1054
00:54:13,920 --> 00:54:15,400
I 
think that brings up a really 

1055
00:54:15,400 --> 00:54:18,520
good point of kind of comparison
groups, which is where I've kind

1056
00:54:18,520 --> 00:54:21,160
of shifted my thinking about how
are we comparing. 

1057
00:54:21,160 --> 00:54:23,000
And so I'll give you kind of 
three examples here. 

1058
00:54:23,000 --> 00:54:26,960
One is, you'll hear folks say we
can't change what we're teaching

1059
00:54:26,960 --> 00:54:29,240
in our introductory courses 
right now because students are 

1060
00:54:29,240 --> 00:54:31,400
learning the fundamentals and 
they'll start kind of 

1061
00:54:31,400 --> 00:54:33,040
hammering on how great the 
current CS1 

1062
00:54:33,040 --> 00:54:37,440
class is, but the evidence 
is that, among students finishing

1063
00:54:37,440 --> 00:54:41,880
an intro programming class, 
the majority of them can't find 

1064
00:54:41,920 --> 00:54:44,600
the average of positive numbers 
in a list. 

1065
00:54:45,040 --> 00:54:48,040
That's like a super easy task 
for computer scientists, like 

1066
00:54:48,040 --> 00:54:51,040
for software engineers. 
And the majority of students 

1067
00:54:51,040 --> 00:54:52,120
can't do that at the end of a 
CS1. 

1068
00:54:52,120 --> 00:54:54,560
So we need to make sure that 
we're very clear about what 

1069
00:54:54,560 --> 00:54:56,880
we're comparing against. 
What's happening now isn't 

1070
00:54:56,880 --> 00:54:59,400
successful for everyone. 
The second, and this is what 

1071
00:54:59,400 --> 00:55:02,240
you made me think of with the 
tutors, is that we've done a whole 

1072
00:55:02,240 --> 00:55:05,720
bunch of research out of my lab,
finding that both students and 

1073
00:55:05,720 --> 00:55:10,120
tutors have significant 
incentives to essentially just 

1074
00:55:10,120 --> 00:55:13,040
give away the answer and just 
fix the problem for the student 

1075
00:55:13,040 --> 00:55:14,720
right there. 
Basically act as human debuggers

1076
00:55:15,040 --> 00:55:16,840
without actually teaching the 
process. 

1077
00:55:17,240 --> 00:55:21,320
And so I think when we imagine 
that human tutor interacting 

1078
00:55:21,320 --> 00:55:24,400
with a student, we imagine the 
great teacher like Dan, like 

1079
00:55:24,400 --> 00:55:28,240
sitting down and going back to 
like step one and diagnosing the

1080
00:55:28,240 --> 00:55:30,840
problem and giving them the 
right instruction at the right 

1081
00:55:30,840 --> 00:55:32,520
time to address their 
misconceptions. 

1082
00:55:33,040 --> 00:55:36,120
When the reality is, it's mostly
students giving this kind of 

1083
00:55:36,120 --> 00:55:39,640
tutoring help and they're maybe 
not giving the best instruction.

1084
00:55:39,960 --> 00:55:42,800
And so we have to be honest with
ourselves about what are these 

1085
00:55:42,960 --> 00:55:46,040
AI assistants being compared 
against, and then we can 

1086
00:55:46,040 --> 00:55:48,160
actually do a fair comparison, 
right? 

1087
00:55:48,440 --> 00:55:49,760
I think it's like what you said,
right? 

1088
00:55:49,760 --> 00:55:53,080
We can be cautiously optimistic 
about this kind of equitable 

1089
00:55:53,080 --> 00:55:55,280
future, right. 
So I think really looking 

1090
00:55:55,280 --> 00:55:58,200
forward for more chances, more 
opportunities for people. 

1091
00:55:58,720 --> 00:56:00,600
It's been a great conversation 
so far, right? 

1092
00:56:00,600 --> 00:56:04,080
So I think we would have a lot 
more topics if we didn't cut it 

1093
00:56:04,080 --> 00:56:06,400
short for now, right? 
I have one last question before 

1094
00:56:06,400 --> 00:56:08,640
we wrap up, which I normally ask
all my guests. 

1095
00:56:08,640 --> 00:56:10,920
I call this 3 technical 
leadership wisdom. 

1096
00:56:11,200 --> 00:56:13,680
You can think of it just like 
advice as well for people to 

1097
00:56:13,680 --> 00:56:15,960
learn from you. 
Maybe if you can share your 

1098
00:56:15,960 --> 00:56:18,040
version of three technical 
leadership wisdom. 

1099
00:56:18,880 --> 00:56:21,440
Henry, I love that question, and
if you don't mind, I've got a 

1100
00:56:21,440 --> 00:56:24,240
slightly long answer for my 
first one, and it's I had a 

1101
00:56:24,240 --> 00:56:27,320
really close colleague who is 
just a fantastic cyclist. 

1102
00:56:27,320 --> 00:56:31,960
His name is Allen Snavely here 
at UC San Diego, and he was part

1103
00:56:31,960 --> 00:56:35,720
of a race and he was a fantastic
cyclist and they were in the 

1104
00:56:35,720 --> 00:56:37,440
second pack. 
Anyone who follows cycling knows 

1105
00:56:37,440 --> 00:56:40,840
that you're in packs and the 
first pack is up ahead of them, 

1106
00:56:40,840 --> 00:56:43,680
they can't catch them. 
But at one point along the race,

1107
00:56:43,680 --> 00:56:48,440
like the front pack seems to go 
the wrong direction and Allen 

1108
00:56:48,440 --> 00:56:50,560
pretty quickly realizes, wait, 
that's actually not the 

1109
00:56:50,560 --> 00:56:52,360
direction to the finish. What 
are they doing? 

1110
00:56:52,680 --> 00:56:56,280
And so he steers the second pack
towards the finish and it's the 

1111
00:56:56,280 --> 00:56:58,720
one race that he ever got to win
because the main pack went off 

1112
00:56:58,720 --> 00:57:01,160
in the wrong direction. 
And whenever he tells that 

1113
00:57:01,160 --> 00:57:02,520
story, I always get a kick out
of it. 

1114
00:57:02,520 --> 00:57:05,800
But it reminds me from a 
leadership perspective that it's

1115
00:57:05,800 --> 00:57:08,040
important to be good. 
It's important to be fast, it's 

1116
00:57:08,040 --> 00:57:11,360
important to be able to be 
productive, but it's just as 

1117
00:57:11,360 --> 00:57:13,760
important or even more important
to know where we're going. 

1118
00:57:14,200 --> 00:57:16,640
And so I spend a lot of time 
with my group and with my lab 

1119
00:57:16,640 --> 00:57:18,960
making sure we have a vision
for where we're going. 

1120
00:57:19,240 --> 00:57:21,400
It's that we are going the right
direction. 

1121
00:57:22,040 --> 00:57:23,520
Thanks Leo. 
That's a powerful one, 

1122
00:57:23,520 --> 00:57:26,040
especially for researchers like 
us to remember. 

1123
00:57:26,040 --> 00:57:29,080
I have another one I think that 
relates to research too, which 

1124
00:57:29,080 --> 00:57:32,560
is actually even more important 
now I think for people who are 

1125
00:57:32,560 --> 00:57:35,840
not researchers as well because 
of the LLM discourse right now. 

1126
00:57:36,120 --> 00:57:40,840
And that is always test 
assumptions or always be aware 

1127
00:57:40,840 --> 00:57:44,160
of assumptions that people are 
making. 

1128
00:57:44,560 --> 00:57:47,520
And I bring this up 
specifically now because I think

1129
00:57:47,520 --> 00:57:51,640
we're at the beginning of 
a flood of research and 

1130
00:57:51,840 --> 00:57:54,720
commentary that's going to come 
out about LLMs. 

1131
00:57:54,720 --> 00:57:56,320
I mean, obviously this applies 
to everything, right? 

1132
00:57:56,320 --> 00:57:59,960
Always, you know, take the time 
to understand where the writer 

1133
00:57:59,960 --> 00:58:02,640
is coming from or where your own
assumptions are coming from. 

1134
00:58:03,200 --> 00:58:05,960
But especially now I just want 
to caution that people are going

1135
00:58:05,960 --> 00:58:10,080
to be making sweeping statements
about LLMs. Leo 

1136
00:58:10,080 --> 00:58:14,920
and I read a lot of research 
around LLMs, and often, you 

1137
00:58:14,920 --> 00:58:18,080
know, if you're a busy 
researcher, busy professor, you 

1138
00:58:18,080 --> 00:58:22,360
can get some summary of the 
paper by reading the abstract. 

1139
00:58:22,360 --> 00:58:25,200
Not a great practice, 
but if you're very busy, you can

1140
00:58:25,560 --> 00:58:27,280
get a sense of what the paper's 
doing. 

1141
00:58:27,720 --> 00:58:31,560
I don't think this necessarily 
works for LLM papers. 

1142
00:58:32,120 --> 00:58:36,040
There's so many assumptions that
are baked in to the experiments 

1143
00:58:36,040 --> 00:58:40,200
that people are doing right now.
We can't even agree on the right

1144
00:58:40,200 --> 00:58:44,040
skills that we want students to 
have when they're working with 

1145
00:58:44,280 --> 00:58:47,040
LLMs anymore. 
And so I think we're seeing a 

1146
00:58:47,040 --> 00:58:51,280
lot of papers where there 
are the headlines, like LLMs 

1147
00:58:51,400 --> 00:58:53,440
suck or LLMs are amazing or 
whatever. 

1148
00:58:53,720 --> 00:58:56,360
But I think we need to dig 
beneath the headlines to see 

1149
00:58:56,360 --> 00:59:00,760
exactly what's going on, 
especially in a new area like 

1150
00:59:00,920 --> 00:59:05,120
LLMs, where there are so many 
assumptions that have not even 

1151
00:59:05,120 --> 00:59:07,920
been written down yet that 
people might be making. 

1152
00:59:08,760 --> 00:59:11,480
That's a brilliant point, Dan. 
And we see with the new papers 

1153
00:59:11,480 --> 00:59:13,560
coming in, they're coming in 
very quickly. 

1154
00:59:13,920 --> 00:59:17,160
And because we're all in 
such a race to get the research 

1155
00:59:17,160 --> 00:59:19,880
done, it's really important for 
us to go to the methods and 

1156
00:59:19,880 --> 00:59:22,280
actually read the paper fully. 
And I know you're 

1157
00:59:22,280 --> 00:59:25,680
fantastic at that. 
And so I really hope all the 

1158
00:59:25,720 --> 00:59:28,120
other practitioners, all the 
people teaching programming, do 

1159
00:59:28,120 --> 00:59:30,000
the same thing. 
Spend their time making sure 

1160
00:59:30,000 --> 00:59:31,640
they understand the studies that
have been done. 

1161
00:59:32,160 --> 00:59:34,800
Yeah. 
And it's not even that anybody 

1162
00:59:34,800 --> 00:59:36,240
who's involved is being 
deceptive. 

1163
00:59:36,240 --> 00:59:39,360
I think everybody's being super 
honest about what's happening. 

1164
00:59:39,720 --> 00:59:43,520
But the assumptions, I think, 
are so new that we're not even 

1165
00:59:43,520 --> 00:59:46,880
necessarily writing them down. 
Like if we're not being careful 

1166
00:59:46,880 --> 00:59:50,960
enough, we may be making 
assumptions about LLMs. So, 

1167
00:59:50,960 --> 00:59:55,880
for example, I could just think 
in my head, OK, students still 

1168
00:59:55,880 --> 00:59:59,480
must know syntax. 
And maybe that's true, maybe 

1169
00:59:59,480 --> 01:00:02,520
it's not true, but it might be 
so obvious to me one way or the 

1170
01:00:02,520 --> 01:00:05,880
other that I just might not even
take it into account in my 

1171
01:00:05,880 --> 01:00:07,400
research. 
And this is one of the most 

1172
01:00:07,400 --> 01:00:08,960
dangerous things for 
researchers, right, Leo? 

1173
01:00:08,960 --> 01:00:13,360
It's like an assumption that is 
apparently super obvious that 

1174
01:00:13,360 --> 01:00:15,640
you don't even question it, or 
even worse, you don't even write

1175
01:00:15,640 --> 01:00:17,120
it down. 
And I think, 

1176
01:00:17,480 --> 01:00:21,080
as a community, we're at risk of
doing this right now because 

1177
01:00:21,080 --> 01:00:23,440
everything is moving so quickly.
Exactly. 

1178
01:00:23,440 --> 01:00:26,120
We've been studying how to teach
programming for the last 40 

1179
01:00:26,120 --> 01:00:29,360
years, and so we have so many 
assumptions built in about that.

1180
01:00:29,800 --> 01:00:33,120
I think just even the assumption
of what is the end goal, like, is 

1181
01:00:33,120 --> 01:00:35,920
syntax an end goal of an intro 
programming class? 

1182
01:00:36,160 --> 01:00:37,720
We don't know. 
Like, I think there's going to 

1183
01:00:37,720 --> 01:00:38,920
be a whole bunch of discussion 
about that. 

1184
01:00:39,440 --> 01:00:41,880
Yeah. 
Or like, does it make sense to 

1185
01:00:41,880 --> 01:00:46,400
compare what students learn with
LLMs against what they learn 

1186
01:00:46,400 --> 01:00:49,120
without LLMs? 
Like, what do you compare? 

1187
01:00:49,200 --> 01:00:50,320
Right. 
What's important? 

1188
01:00:50,320 --> 01:00:52,760
Like, I don't think we know the 
answers to these questions. 

1189
01:00:52,760 --> 01:00:56,280
So I guess I'm asking more 
questions than I'm answering, 

1190
01:00:56,280 --> 01:00:58,960
which I don't think was what 
Henry wanted for this section. 

1191
01:01:00,080 --> 01:01:00,800
Yeah. 
Yeah. 

1192
01:01:01,080 --> 01:01:03,200
And Leo, you have one more I 
think you wanted to share. 

1193
01:01:03,360 --> 01:01:05,320
Oh yeah, absolutely. 
So the last piece, and this is 

1194
01:01:05,320 --> 01:01:08,440
going to be me, honestly kind of
just taking from the great 

1195
01:01:08,440 --> 01:01:11,680
wisdom that Henry's already 
shared previously with some of 

1196
01:01:11,680 --> 01:01:14,160
his guests. 
And that's I believe everything 

1197
01:01:14,160 --> 01:01:16,080
is done with people. 
Like, Dan and I work 

1198
01:01:16,080 --> 01:01:18,920
fantastically together, I love 
working with my lab. 

1199
01:01:19,320 --> 01:01:22,440
And so I believe very fervently 
in the notion of empowered 

1200
01:01:22,440 --> 01:01:25,760
teams. 
And the message from Marty Cagan

1201
01:01:25,840 --> 01:01:28,000
really resonates with me. 
I first heard from Monty Hammond

1202
01:01:28,000 --> 01:01:31,720
Tree at Microsoft and it's just 
a really powerful message of you

1203
01:01:31,720 --> 01:01:33,880
want to make sure your teams are
empowered to be able to do the 

1204
01:01:33,880 --> 01:01:36,400
work they want to do and solve 
important problems. 

1205
01:01:36,760 --> 01:01:42,040
And I think Dan and I have both 
seen this as PhD advisors in 

1206
01:01:42,600 --> 01:01:46,240
empowering our PhD students to 
find their own path is probably 

1207
01:01:46,240 --> 01:01:50,400
one of the best things we get to
do as faculty of watching them, 

1208
01:01:50,400 --> 01:01:52,680
not really knowing what they 
want to study initially and us 

1209
01:01:52,800 --> 01:01:55,600
really being close to them on 
every project they do to six 

1210
01:01:55,600 --> 01:01:58,440
years later, five or six years 
later when they are now 

1211
01:01:58,520 --> 01:02:00,640
essentially running their own 
research program and we're just 

1212
01:02:00,640 --> 01:02:04,240
giving them occasional advice. 
And so I think for the tech 

1213
01:02:04,240 --> 01:02:06,160
leaders out there who've been 
listening to Marty Cagan's 

1214
01:02:06,160 --> 01:02:09,920
message of empowered teams, I 
think it applies more than just 

1215
01:02:09,920 --> 01:02:12,600
software engineering teams. 
I think you empower all the 

1216
01:02:12,600 --> 01:02:14,400
people who work with you and 
you end up in a better 

1217
01:02:14,400 --> 01:02:16,280
place. 
Yeah, it's like tutoring, 

1218
01:02:16,280 --> 01:02:18,720
really. 
One-on-one work is 

1219
01:02:18,720 --> 01:02:20,480
really the most powerful work 
you can do. 

1220
01:02:20,760 --> 01:02:24,320
I'll take my classes of 300 or 
400 or whatever. 

1221
01:02:24,320 --> 01:02:27,680
I'll, you know, I'll do my best. 
But you can't match a small team

1222
01:02:27,680 --> 01:02:29,920
just empowered to do great work.
So I totally agree. 

1223
01:02:31,120 --> 01:02:34,120
So yeah, I think for people who 
want to learn more about this AI

1224
01:02:34,200 --> 01:02:37,480
assistant from your book, are 
there any resources or a place 

1225
01:02:37,480 --> 01:02:39,720
where they can find you online? 
Yes. 

1226
01:02:39,720 --> 01:02:42,120
If people are interested in 
getting our book and trying to 

1227
01:02:42,120 --> 01:02:46,600
learn how to write software with
the AI assistant, they can just 

1228
01:02:46,600 --> 01:02:49,520
look for our book on Amazon. 
It's freely available in all 

1229
01:02:49,520 --> 01:02:54,080
countries and we candidly are 
very open to feedback. 

1230
01:02:54,080 --> 01:02:58,240
This is a very, very new space. 
As readers work through it, we 

1231
01:02:58,240 --> 01:03:02,800
would appreciate the emails or 
the comments on LinkedIn that 

1232
01:03:02,800 --> 01:03:05,000
would let us know how they're 
appreciating the book and what 

1233
01:03:05,000 --> 01:03:06,560
we could do better for a second 
edition. 

1234
01:03:07,120 --> 01:03:08,640
Thanks for organizing this for 
us, Henry. 

1235
01:03:11,720 --> 01:03:15,000
Thank you for listening to this 
episode and for staying right 

1236
01:03:15,000 --> 01:03:17,720
until the end. 
If you highly enjoyed it, I 

1237
01:03:17,720 --> 01:03:20,440
would appreciate if you share it
with your friends and colleagues

1238
01:03:20,680 --> 01:03:23,680
who you think would also benefit
from listening to this episode. 

1239
01:03:24,120 --> 01:03:26,920
And if you're new to the 
podcast, make sure to subscribe 

1240
01:03:26,920 --> 01:03:29,280
and leave me your valuable 
review and feedback. 

1241
01:03:29,640 --> 01:03:32,520
It helps me a lot in order to 
grow this podcast better. 

1242
01:03:33,040 --> 01:03:35,920
You can also find the full show 
notes of this conversation on 

1243
01:03:35,920 --> 01:03:38,920
the episode page at 
techleadjournal.dev website, 

1244
01:03:39,200 --> 01:03:42,800
including the full transcript, 
interesting quotes, and links to

1245
01:03:42,800 --> 01:03:45,200
the resources mentioned from the
conversation. 

1246
01:03:45,640 --> 01:03:48,680
And lastly, make sure to 
subscribe to the show's mailing 

1247
01:03:48,680 --> 01:03:52,480
list on techleadjournal.dev 
to get notified for any future 

1248
01:03:52,480 --> 01:03:55,080
episodes. 
Stay tuned for the next Tech Lead 

1249
01:03:55,080 --> 01:03:58,000
Journal episode, and until then,
goodbye.