1
00:00:00,000 --> 00:00:03,500
If you use Excel for something, 
there is a lot of, you know, 

2
00:00:03,500 --> 00:00:06,100
stereotyping that, you know, you
are not so good, you know, not

3
00:00:06,100 --> 00:00:08,200
the typical data science
profile that people expect you

4
00:00:08,200 --> 00:00:11,000
to be. Setting up,
getting started with

5
00:00:11,000 --> 00:00:12,400
Python itself is a big
challenge.

6
00:00:12,400 --> 00:00:13,700
I mean, like, I'm a Python
programmer.

7
00:00:13,700 --> 00:00:15,800
I don't want people to, you
know, get angry at me.

8
00:00:15,808 --> 00:00:19,300
But I think the initial friction
that you have with Python is

9
00:00:19,300 --> 00:00:22,500
really, really high.
But from the beginner to 

10
00:00:22,500 --> 00:00:24,700
intermediate stage, you would
see very little content on the

11
00:00:24,700 --> 00:00:26,400
internet.
No boot camp

12
00:00:26,400 --> 00:00:28,400
trains them, no course
teaches them.

13
00:00:28,400 --> 00:00:31,600
There is very little content
in the middle space, and I think

14
00:00:31,600 --> 00:00:34,700
because of that gap, a lot of
people actually get stuck in the

15
00:00:34,700 --> 00:00:54,600
beginner space.
Hello, and welcome to Data

16
00:00:54,600 --> 00:00:57,700
Chatter, the podcast on all things
data. 

17
00:00:58,500 --> 00:01:01,900
This podcast is a series of 
conversations with experts and 

18
00:01:01,900 --> 00:01:04,500
industry leaders in data.
Each week,

19
00:01:04,500 --> 00:01:07,400
we aim to unpack a different
compartment of the data

20
00:01:07,400 --> 00:01:10,100
universe.
I am your host, Karthik Shashidhar.

21
00:01:10,100 --> 00:01:14,400
I'm a blogger, newspaper
columnist, book author, and a

22
00:01:14,408 --> 00:01:17,700
former data and strategy
consultant. I currently head

23
00:01:17,700 --> 00:01:20,700
analytics and business
intelligence for Delhivery, one of

24
00:01:20,700 --> 00:01:22,600
India's largest logistics
companies.

25
00:01:23,200 --> 00:01:28,300
You can follow me on Twitter at 
@karthiks, that is K-A-R-T-H-I-

26
00:01:28,300 --> 00:01:32,300
K-S, and read my blog at
noenthuda.com.

27
00:01:32,500 --> 00:01:38,300
That is N-O-E-N-T-
H-U-D-A dot com.

28
00:01:38,800 --> 00:01:41,100
All opinions expressed in this
podcast belong to me and my

29
00:01:41,100 --> 00:01:43,400
podcast guests,
and do not reflect the views

30
00:01:43,400 --> 00:01:45,700
of any organizations
we might be associated with.

31
00:01:46,000 --> 00:01:48,200
Nothing discussed in this podcast
should be taken as

32
00:01:48,200 --> 00:01:57,700
advice.
My initial plan was to record a 

33
00:01:57,700 --> 00:02:02,600
podcast on R versus Python, and I
imagined a little fist fight on

34
00:02:02,600 --> 00:02:06,400
it, as a guest and I would battle
about whether R is more suited

35
00:02:06,400 --> 00:02:09,699
for data science, or Python.
Over time, better sense

36
00:02:09,699 --> 00:02:11,500
prevailed,
and I decided to be more

37
00:02:11,500 --> 00:02:15,200
constructive.
So what we have now is a rather

38
00:02:15,200 --> 00:02:18,700
healthy conversation about the 
merits and demerits of the two 

39
00:02:18,700 --> 00:02:21,400
big languages used for doing 
data science.

40
00:02:21,800 --> 00:02:24,500
When do you use R?
And when do you use Python?

41
00:02:25,100 --> 00:02:27,900
What are the advantages of being
proficient in both languages? 

42
00:02:28,500 --> 00:02:30,600
How do you integrate both into 
your workflow? 

43
00:02:31,700 --> 00:02:35,500
Our guest is Abdul Majed Raja, a
data scientist at Atlassian.

44
00:02:36,100 --> 00:02:39,400
He's a heavy user of both R and
Python, and a founder of

45
00:02:39,400 --> 00:02:43,500
the Bangalore R users group.
He also offers tutorials in

46
00:02:43,500 --> 00:02:52,100
programming in both languages.
So today we will be talking 

47
00:02:52,100 --> 00:02:56,600
about R and Python, probably
the two most popularly used

48
00:02:56,800 --> 00:02:58,700
programming languages,
statistical packages,

49
00:02:58,700 --> 00:03:00,700
whatever you call them, for
analytics.

50
00:03:01,200 --> 00:03:05,200
So before we start, can you sort
of take us through your journey 

51
00:03:05,500 --> 00:03:07,500
into analytics, how you got into
it,

52
00:03:07,800 --> 00:03:12,400
and also how you got introduced
to both R and Python?

53
00:03:13,300 --> 00:03:15,600
So I come from a digital 
analytics background. 

54
00:03:15,600 --> 00:03:18,500
So I was predominantly into 
digital analytics, and once the

55
00:03:18,500 --> 00:03:20,400
data volume started increasing a
lot,

56
00:03:20,400 --> 00:03:23,000
there was a need for me to
use some programming language.

57
00:03:23,400 --> 00:03:26,700
I actually first learned Python.
After I learned Python, after a

58
00:03:26,700 --> 00:03:29,600
point, the packages that I wanted to
use for digital analytics were

59
00:03:29,600 --> 00:03:33,000
easier in R. For example, the
Google Analytics library, making

60
00:03:33,200 --> 00:03:36,900
R Markdown reports, making
Shiny applications for doing

61
00:03:36,900 --> 00:03:38,800
path analysis like a Sankey
diagram.

62
00:03:39,000 --> 00:03:41,800
So a lot of things became very
easy in R, so I started slowly

63
00:03:41,800 --> 00:03:45,200
migrating into R, and then it
slowly became clear to me that, you

64
00:03:45,200 --> 00:03:46,900
know, using one language was not
enough.

65
00:03:46,900 --> 00:03:50,900
So I started equipping myself
with both the languages, and

66
00:03:51,000 --> 00:03:54,800
that's how I... but formally,
I learned Python first myself and

67
00:03:54,800 --> 00:03:58,700
then I moved into R to learn
R, and then I went back and forth

68
00:03:58,700 --> 00:04:02,000
between both the languages.
And what was your background like

69
00:04:02,000 --> 00:04:05,000
before you got into digital 
analytics and Python and so on?

70
00:04:05,000 --> 00:04:07,400
Like, in terms of programming,
were you a programmer

71
00:04:07,400 --> 00:04:09,400
already?
If so, what were you

72
00:04:09,400 --> 00:04:12,100
programming in?
I was a trained graduate in

73
00:04:12,100 --> 00:04:13,200
computer science
engineering.

74
00:04:13,200 --> 00:04:15,200
But, you know, like most computer
science engineers,

75
00:04:15,200 --> 00:04:17,600
I was not good at programming
when I was at university.

76
00:04:17,600 --> 00:04:20,800
I struggled because
there was Java, and I didn't

77
00:04:20,800 --> 00:04:22,500
quite like Java at the
time.

78
00:04:23,000 --> 00:04:25,600
I think I still hate Java. 
Probably that is why I ended up 

79
00:04:25,600 --> 00:04:26,900
in analytics. 
Otherwise, I would have become, 

80
00:04:26,900 --> 00:04:29,000
like, a traditional IT
programmer.

81
00:04:29,300 --> 00:04:32,100
So, I didn't code. 
And after my university, I was

82
00:04:32,100 --> 00:04:34,800
trained in COBOL.
I didn't like COBOL either.

83
00:04:34,800 --> 00:04:36,700
So I left my job as soon as
I

84
00:04:36,700 --> 00:04:39,500
could.
So this actually frustrated

85
00:04:39,500 --> 00:04:40,800
me. 
That's where digital analytics 

86
00:04:40,800 --> 00:04:43,600
actually gave me a good 
impression. It was a time

87
00:04:43,600 --> 00:04:46,300
when, you know, analytics was
not like a great thing.

88
00:04:46,300 --> 00:04:49,300
Like, there was
no data science at the time, and

89
00:04:49,300 --> 00:04:52,600
I slowly migrated into the space
as the space was growing.

90
00:04:52,600 --> 00:04:54,300
And that's how I moved into R
and Python.

91
00:04:54,500 --> 00:04:57,000
But initially, if you ask me, I 
think I ended up in R and

92
00:04:57,000 --> 00:04:59,100
Python only because I didn't
like any other programming 

93
00:04:59,100 --> 00:05:01,400
language. 
I think my background is sort of

94
00:05:01,400 --> 00:05:02,300
similar, right? 
I,

95
00:05:02,300 --> 00:05:03,700
again,
did my undergrad

96
00:05:03,700 --> 00:05:06,400
in computer science, was asked to
do a lot of coding in Java.

97
00:05:06,400 --> 00:05:08,200
And so I decided I don't want to
be a programmer. 

98
00:05:08,200 --> 00:05:11,200
So I just went down a
completely different path. Later

99
00:05:11,200 --> 00:05:12,600
on,
I think when I learned Python, I

100
00:05:12,600 --> 00:05:16,600
was like, okay, if they had introduced
Python when I was studying

101
00:05:16,600 --> 00:05:18,200
computer science,
maybe I might have stayed a

102
00:05:18,207 --> 00:05:21,600
programmer, because, like, as a
programming language, it's, like,

103
00:05:21,900 --> 00:05:26,400
at least in my opinion,
far easier to use than

104
00:05:26,400 --> 00:05:29,000
Java, right?
So yeah, that's

105
00:05:29,000 --> 00:05:31,100
sort of my background as well.
So did you have any other 

106
00:05:31,100 --> 00:05:33,000
experience with any other 
statistical

107
00:05:33,000 --> 00:05:36,600
software, or with
Excel or something, before you

108
00:05:36,600 --> 00:05:38,600
got into analytics?
Okay,

109
00:05:38,600 --> 00:05:40,300
Excel.
I was quite good at Excel all

110
00:05:40,300 --> 00:05:41,700
the time.
I still think

111
00:05:41,700 --> 00:05:45,100
Excel is quite useful, even
if you are an R or Python programmer.

112
00:05:45,500 --> 00:05:47,700
But I think the problem that I
feel in industry,

113
00:05:47,900 --> 00:05:49,500
I don't know if I'm jumping 
ahead. 

114
00:05:49,500 --> 00:05:52,600
But the problem that I feel is 
if you use Excel for something, 

115
00:05:52,900 --> 00:05:55,500
there is a lot of, you know, 
stereotyping that, you know, you

116
00:05:55,500 --> 00:05:58,200
are not so good, you know, not
the typical data science profile

117
00:05:58,200 --> 00:06:01,500
that people expect you to be.
But I think having Excel

118
00:06:01,600 --> 00:06:04,100
also as part of your data
science workflow is very

119
00:06:04,100 --> 00:06:06,400
helpful.
Maybe you're not going to

120
00:06:06,600 --> 00:06:09,900
build a dashboard or something. 
I use a lot of G Sheets in my current

121
00:06:09,900 --> 00:06:12,200
profile.
It is quite helpful for me

122
00:06:12,200 --> 00:06:15,100
to, you know, convert a small...
Like, not every time am I going

123
00:06:15,100 --> 00:06:17,900
to use ggplot for making charts.
So sometimes G Sheets gives

124
00:06:17,900 --> 00:06:21,000
me the flexibility to quickly
turn out a very powerful

125
00:06:21,000 --> 00:06:24,200
visualization. 
So I think Excel is still useful

126
00:06:24,200 --> 00:06:26,500
and I used to be using Excel at 
the time as well. 

127
00:06:27,500 --> 00:06:30,100
Okay, I think as we talk about 
R and Python today, I think

128
00:06:30,100 --> 00:06:33,500
we'll also keep Excel in
the loop, because I'm also a big

129
00:06:33,500 --> 00:06:36,400
fan of Excel, right?
The downside of Excel, in my

130
00:06:36,400 --> 00:06:37,900
opinion,
is that, like, you can't sort

131
00:06:37,900 --> 00:06:41,300
of automate stuff, and you have
to do...

132
00:06:41,300 --> 00:06:45,500
Although with, with
some of the Microsoft tools that

133
00:06:45,500 --> 00:06:49,000
is actually becoming easier
now, but still, I think you

134
00:06:49,008 --> 00:06:52,000
can't automate easily. 
But in terms of just analyzing 

135
00:06:52,000 --> 00:06:55,100
data, in terms of just looking at
data, right, like, the thing with

136
00:06:55,100 --> 00:06:57,800
either R or Python, you don't
really necessarily need to

137
00:06:57,800 --> 00:07:00,000
look at the data.
Excel forces you to look at

138
00:07:00,000 --> 00:07:02,000
the data,
and sometimes that will give you

139
00:07:02,000 --> 00:07:06,700
a lot of insight that R or
Python doesn't necessarily

140
00:07:07,000 --> 00:07:10,800
provide. Jeremy Howard,
the co-founder of fast.ai,

141
00:07:10,800 --> 00:07:13,100
who is, like, very well respected
in the deep learning community,

142
00:07:13,400 --> 00:07:16,000
he actually puts a lot of emphasis on
looking at the data.

143
00:07:16,000 --> 00:07:17,500
Even when you are doing a deep
learning problem,

144
00:07:17,500 --> 00:07:20,100
he says, like, you have to
actually look at the raw data

145
00:07:20,100 --> 00:07:22,400
and, you know, how the neural
network progresses.

146
00:07:22,400 --> 00:07:24,600
So I think your point is very 
valid in that case. 

147
00:07:25,400 --> 00:07:27,600
Yeah, because, I mean, one of my
first experiences was, like,

148
00:07:27,600 --> 00:07:29,500
again, in an analytics job.
I was asked to do

149
00:07:29,500 --> 00:07:32,900
forecasting, and because the company
was using Java for all its

150
00:07:32,900 --> 00:07:34,300
coding,
they were like, why don't you write

151
00:07:34,300 --> 00:07:37,200
your forecasting programs in Java?
I spent, like, six months

152
00:07:37,700 --> 00:07:38,600
and got nowhere
with that.

153
00:07:38,600 --> 00:07:41,000
Then suddenly, some day,
I was like, okay, I am going to

154
00:07:41,000 --> 00:07:42,600
try Excel.
I tried

155
00:07:42,600 --> 00:07:44,400
Excel.
Within two days it was clear

156
00:07:44,400 --> 00:07:47,700
where I was going wrong, and the
whole thing was turned upside

157
00:07:47,700 --> 00:07:50,200
down. 
So now, like, I mean,

158
00:07:50,200 --> 00:07:52,900
apart from Excel, I mean, I
assume you continue to use

159
00:07:52,900 --> 00:07:56,100
Excel and so on, but within R
and Python, like, do you

160
00:07:56,100 --> 00:08:00,200
have a kind of a preference or
a default for what

161
00:08:00,200 --> 00:08:01,800
you use? Or do you have
something

162
00:08:01,800 --> 00:08:04,500
like, there are certain
situations where you use R,

163
00:08:04,500 --> 00:08:06,500
certain kinds of problem
statements where you use Python?

164
00:08:06,500 --> 00:08:10,300
Like, can you take us through,
like, what you use in what

165
00:08:10,300 --> 00:08:13,400
context? 
Yep, so I will just basically 

166
00:08:13,400 --> 00:08:16,800
define the workflow that I
follow. For data collection from,

167
00:08:16,800 --> 00:08:19,500
you know, the big data system, I
use SQL, because it's Spark

168
00:08:19,500 --> 00:08:22,400
SQL.
So I try to write most of the

169
00:08:22,400 --> 00:08:26,200
things in there, like create,
like, a temporary table for me.

170
00:08:26,500 --> 00:08:29,600
And then for EDA,
mostly I prefer R. And then

171
00:08:29,600 --> 00:08:33,000
for actual modeling, if I have to
do modeling, then for machine

172
00:08:33,000 --> 00:08:37,000
learning, I prefer Python.
The reason why I prefer Python

173
00:08:37,000 --> 00:08:40,200
for machine learning is, I think
Python's API for machine learning,

174
00:08:40,200 --> 00:08:42,700
the scikit-learn API,
is quite unmatchable.

175
00:08:43,299 --> 00:08:47,000
So there is a very clear
four-step process: define, fit,

176
00:08:47,300 --> 00:08:50,800
predict, and then evaluate. I
think tidymodels is coming

177
00:08:50,800 --> 00:08:54,200
closer to that in the R universe,
but still, I think scikit-learn

178
00:08:54,200 --> 00:08:57,000
gives me that ability: if I
want to change the model, I just

179
00:08:57,000 --> 00:08:59,200
have to change the class that
I'm defining at the top.

180
00:08:59,600 --> 00:09:02,600
Then I'll have a new model.
So, like, there

181
00:09:02,600 --> 00:09:06,100
is a very strong preference for
using Python, especially scikit-learn,

182
00:09:06,100 --> 00:09:10,100
for machine learning.
But if you ask me about EDA, I

183
00:09:10,100 --> 00:09:12,900
am one of those people who still
struggle with matplotlib. For every

184
00:09:12,900 --> 00:09:14,500
simple thing,
I have to go online and then

185
00:09:14,500 --> 00:09:17,500
find it.
So I strongly prefer using R's

186
00:09:17,600 --> 00:09:20,100
tidyverse, which is amazing.
So I don't want to, you know,

187
00:09:20,100 --> 00:09:23,400
miss that opportunity of using
tidyverse, writing pipe-based

188
00:09:23,400 --> 00:09:27,100
data pipelines. To do some
analysis or EDA,

189
00:09:27,100 --> 00:09:31,200
I prefer R. The other place where I
prefer R is if you have to

190
00:09:31,200 --> 00:09:34,300
create, like, a web report, a
simple report, not a server-side

191
00:09:34,300 --> 00:09:35,400
report. 
I'm not talking about, you know,

192
00:09:35,400 --> 00:09:38,100
Shiny, or I'm not talking about
Streamlit, but if you want to

193
00:09:38,100 --> 00:09:41,900
prepare, like, a client-side
report, like a small panel such

194
00:09:41,900 --> 00:09:44,900
as a report,
then I think R Markdown is,

195
00:09:44,900 --> 00:09:48,300
again, quite unmatchable there.
So, the other option that I have
So, the other option that I have

196
00:09:48,300 --> 00:09:51,200
in the Python world is I write
something in a Jupyter notebook.

197
00:09:51,200 --> 00:09:54,100
I format everything and then I,
you know, I export it as HTML.

198
00:09:54,100 --> 00:09:57,200
And then I try to publish it,
but I don't find that workflow

199
00:09:57,200 --> 00:10:00,000
that interesting or that good.
There are a couple of libraries,

200
00:10:00,000 --> 00:10:03,600
like fastpages,
that is again coming from the

201
00:10:03,600 --> 00:10:05,500
fast.ai community.
There,

202
00:10:05,500 --> 00:10:08,000
you can write a Jupyter notebook
and publish it as a blog post.

203
00:10:08,100 --> 00:10:12,200
It's there, but I think the
workflow that you have in R

204
00:10:12,200 --> 00:10:14,200
is really amazing.
So I have a website now. 

205
00:10:14,200 --> 00:10:15,800
That's called
programmingwithr.com.

206
00:10:15,900 --> 00:10:18,700
So the entire website is
actually created with R Markdown.

207
00:10:19,300 --> 00:10:22,100
I create an R Markdown file,
push it to GitHub, there

208
00:10:22,100 --> 00:10:24,100
is GitHub Actions running, and
then it publishes it.

209
00:10:24,100 --> 00:10:27,300
I think it's quite seamless. And
with RStudio Connect, if

210
00:10:27,300 --> 00:10:29,900
your organization uses RStudio
Connect, R Markdown is like

211
00:10:29,900 --> 00:10:33,100
a superpower to share
internal analysis reports.

212
00:10:33,100 --> 00:10:35,200
So that is a place I would
strongly prefer R.

213
00:10:36,100 --> 00:10:38,100
Yeah. 
So why is it, why do you

214
00:10:38,108 --> 00:10:40,000
think it's got that?
I mean, I don't know how much

215
00:10:40,000 --> 00:10:42,900
you, like, know of the
history of the development of the

216
00:10:42,900 --> 00:10:45,600
two languages, or software, if
you were to call it that.

217
00:10:46,000 --> 00:10:50,300
Why do you think R is, like, optimized
more in terms of, like, usability,

218
00:10:50,400 --> 00:10:53,100
user-friendliness and so on, while
Python, if you think about it, is

219
00:10:53,108 --> 00:10:55,500
more, like, sort of, in some way,
developer-friendly, right?

220
00:10:55,500 --> 00:10:58,100
I mean, like, overall, because
it's easier to integrate into

221
00:10:58,100 --> 00:11:01,400
the overall tech
workflow and things like that.

222
00:11:01,400 --> 00:11:05,600
So why do you think
this difference has come about?

223
00:11:05,600 --> 00:11:09,100
Yeah, I think it's a very valid 
point that you brought up, and I strongly

224
00:11:09,100 --> 00:11:11,400
think R is developed for
people

225
00:11:11,600 --> 00:11:14,200
who are not technically strong
in mind. I could be wrong,

226
00:11:14,200 --> 00:11:17,600
but if you see people who are
using R, people from humanities,

227
00:11:17,600 --> 00:11:20,800
people from social science,
people who work in bioscience,

228
00:11:20,800 --> 00:11:24,000
I don't think they're,
like, you know, core programmers,

229
00:11:24,000 --> 00:11:28,700
like how programmers tend to be,
but they can get their work done

230
00:11:29,000 --> 00:11:31,700
using the tool. To give you an
example,

231
00:11:31,700 --> 00:11:35,000
I actually teach at a local
university in Bangalore, and I've

232
00:11:35,000 --> 00:11:37,500
tried Python before. This is for
economics students.

233
00:11:37,700 --> 00:11:39,900
I've tried Python before.
So these students have never

234
00:11:39,900 --> 00:11:42,700
coded in their life at all.
Even if you tell them to install

235
00:11:42,700 --> 00:11:44,900
software,
like, somebody has to sit next to

236
00:11:44,900 --> 00:11:47,500
them and tell them. It's
easy for them to use Facebook,

237
00:11:47,500 --> 00:11:50,300
but it's very hard for them to
use anything technical, because they

238
00:11:50,300 --> 00:11:54,800
have this prejudice in mind.
So for them, I've seen them,

239
00:11:54,800 --> 00:11:57,500
you know, learning R much, much
faster than Python.

240
00:11:57,500 --> 00:12:00,400
For example, I'll give you an
example: setting up, getting

241
00:12:00,400 --> 00:12:02,300
started with Python itself is a
big challenge.

242
00:12:02,300 --> 00:12:03,600
I mean, like, I'm a Python
programmer.

243
00:12:03,600 --> 00:12:06,400
I don't want people to, you
know, get angry at me, but I

244
00:12:06,408 --> 00:12:09,500
think the initial friction that
you have with Python is really,

245
00:12:09,500 --> 00:12:12,200
really high.
In fact, that is why platforms

246
00:12:12,200 --> 00:12:16,000
like Replit, you know, have
that initial friction reduced.

247
00:12:16,000 --> 00:12:18,000
Replit is being used in a
lot of schools

248
00:12:18,000 --> 00:12:21,200
internationally. But if you want
to set it up on your own machine

249
00:12:21,200 --> 00:12:25,000
as a workflow, if
you're on Mac, do you know which

250
00:12:25,000 --> 00:12:27,100
Python to use?
Because Mac comes by default with

251
00:12:27,100 --> 00:12:29,100
one Python; that Python doesn't
help you a lot.

252
00:12:29,200 --> 00:12:31,400
If you install Anaconda, now
you have, like, probably three

253
00:12:31,400 --> 00:12:34,500
Pythons on your machine.
Do you open an Anaconda prompt

254
00:12:34,500 --> 00:12:37,000
and then try something?
Or do you open your existing

255
00:12:37,000 --> 00:12:39,900
terminal and do something?
Then when you use pip, do

256
00:12:39,900 --> 00:12:43,000
you use pip or pip3?
Then you just come

257
00:12:43,000 --> 00:12:45,700
into this range of problems.
Then you have, like, your code

258
00:12:45,700 --> 00:12:49,100
editor: which code editor to use?
Somebody would be very strong

259
00:12:49,100 --> 00:12:51,600
on PyCharm.
Somebody would have a strong

260
00:12:51,600 --> 00:12:53,600
opinion on Spyder.
Somebody would have a strong

261
00:12:53,600 --> 00:12:56,800
opinion on Visual Studio.
So at the end of this process, 

262
00:12:56,800 --> 00:13:00,000
somebody who tries to learn, 
let's say like I'm a doctor and 

263
00:13:00,000 --> 00:13:02,900
I want to, you know, learn data 
science for something that I'm 

264
00:13:02,900 --> 00:13:04,400
doing at the end of this 
process. 

265
00:13:04,400 --> 00:13:06,500
If I'm a doctor, right? 
I would probably drop out.

266
00:13:06,500 --> 00:13:09,900
I don't expect anybody to stay
until this point and then

267
00:13:09,900 --> 00:13:11,700
still
go ahead and learn a

268
00:13:11,700 --> 00:13:13,300
language. In R, it's quite
simple.

269
00:13:13,900 --> 00:13:17,700
Go to R, download R; go to
RStudio, download RStudio.

270
00:13:17,700 --> 00:13:20,100
You don't have 10 different
versions. Install it, you're up

271
00:13:20,100 --> 00:13:22,200
and running immediately;
like, in the next five minutes,

272
00:13:22,200 --> 00:13:25,200
you are up and running.
Yeah, actually, forget doctors

273
00:13:25,200 --> 00:13:27,900
and economists.
I think I, with a

274
00:13:27,900 --> 00:13:29,800
degree in computer science,
I found it

275
00:13:29,800 --> 00:13:32,500
hard to get started with Python.
I mean, my experience wasn't good.

276
00:13:32,500 --> 00:13:35,700
But I think at one point, I
was doing a machine learning

277
00:13:35,700 --> 00:13:38,400
workshop in collaboration with
someone, and they

278
00:13:38,500 --> 00:13:42,700
wanted to do it in Python.
And so that person actually

279
00:13:42,700 --> 00:13:45,800
taught me Python over two or three
days, creating bills to pay and

280
00:13:45,800 --> 00:13:49,000
that's how I learnt Python.
Otherwise, sort of,

281
00:13:49,700 --> 00:13:54,300
the initial learning curve was
actually very, very steep for me

282
00:13:54,300 --> 00:13:56,500
in that sense. And it's, like, later,
actually,

283
00:13:56,500 --> 00:13:58,700
I got connected with a few
organizations which were

284
00:13:58,900 --> 00:14:00,900
completely in the Python
world.

285
00:14:00,900 --> 00:14:03,800
Like, so they asked me to use
Python. There again, what I

286
00:14:03,800 --> 00:14:06,600
remember was, like, they had
virtual machines which had to

287
00:14:06,600 --> 00:14:10,200
be set up, and things like that,
which I had no clue

288
00:14:10,200 --> 00:14:12,800
how to use.
And then there was this AWS

289
00:14:12,800 --> 00:14:15,500
space where you had Jupyter, but
there, like, how do you get

290
00:14:15,500 --> 00:14:18,100
files in and out of that
directory was not very

291
00:14:18,100 --> 00:14:22,300
intuitive, and things like that.
So even as somebody with

292
00:14:22,400 --> 00:14:25,700
sort of a fairly strong 
programming, though not software

293
00:14:25,700 --> 00:14:28,600
engineering, background, I found
it very difficult to kind of

294
00:14:28,600 --> 00:14:30,900
get started
in Python.

295
00:14:31,000 --> 00:14:33,600
I was very confused by the, like, you
were talking about Anaconda,

296
00:14:33,608 --> 00:14:36,900
Spyder and things like that.
The other thing is that you also

297
00:14:36,900 --> 00:14:40,800
have, like, within Python, syntax
varying significantly,

298
00:14:40,800 --> 00:14:42,800
right?
Like, there is the NumPy

299
00:14:43,100 --> 00:14:46,100
syntax, there's the pandas
syntax, then there is the core

300
00:14:46,100 --> 00:14:46,700
Python
syntax.

301
00:14:46,700 --> 00:14:50,900
And in some places
there are some small differences,

302
00:14:50,900 --> 00:14:53,700
and you can just end up writing
fairly inefficient code.

303
00:14:53,800 --> 00:14:56,100
And the thing is, even R, if
you think about it, like,

304
00:14:56,100 --> 00:14:58,600
you have the tidyverse way
of doing things,

305
00:14:58,600 --> 00:15:01,600
and also the base R way of
doing things, which a lot of

306
00:15:01,900 --> 00:15:05,000
people, especially in academia,
I think they still do base R

307
00:15:05,000 --> 00:15:07,200
and stuff.
So what happens, I mean,

308
00:15:07,200 --> 00:15:10,100
like, when you have these
sort of versions, sort of,

309
00:15:10,100 --> 00:15:13,200
does it make it
easier and more user-friendly

310
00:15:13,200 --> 00:15:15,700
for the coders?
Or is it just that, like, the

311
00:15:15,700 --> 00:15:19,100
communication between coders 
sort of drops and things like 

312
00:15:19,100 --> 00:15:21,900
that? 
No, I think if

313
00:15:21,900 --> 00:15:24,700
you're a coder for quite some
time, these kinds of changes

314
00:15:24,700 --> 00:15:27,300
do not trouble you much. But
if you're somebody who

315
00:15:27,300 --> 00:15:29,700
is starting new,
so, like you said,

316
00:15:29,900 --> 00:15:31,300
there's a dot notation in
Python,

317
00:15:31,300 --> 00:15:33,500
there is a function that you use for
something,

318
00:15:33,500 --> 00:15:35,000
there's a method that you use for
something else,

319
00:15:35,000 --> 00:15:37,900
there's an object that you
define. And even in the R world you

320
00:15:37,908 --> 00:15:39,800
have the same problem, right?
What is a pipe operator?

321
00:15:39,800 --> 00:15:42,100
Some people do not understand
why there have to be two pipe

322
00:15:42,100 --> 00:15:44,200
operators,
one from the tidyverse universe, one

323
00:15:44,200 --> 00:15:46,400
from core R.
So I think these are

324
00:15:46,400 --> 00:15:49,800
the problems that beginners
do face, but I strongly

325
00:15:49,800 --> 00:15:52,200
believe, once you get started,
especially if you're a

326
00:15:52,200 --> 00:15:55,900
programmer who learns from
documentation, every time there

327
00:15:55,900 --> 00:15:57,800
is a new release,
you understand

328
00:15:57,800 --> 00:16:00,700
what is in the release.
And I think the tidyverse team

329
00:16:00,700 --> 00:16:03,700
actually does a great job in 
communicating that, especially if

330
00:16:03,700 --> 00:16:07,800
they're deprecating a function.
And also any library for Python,

331
00:16:07,800 --> 00:16:10,200
for that matter,
they also communicate that. The

332
00:16:10,200 --> 00:16:12,600
problem arises when most people 
do not look into it. 

333
00:16:12,900 --> 00:16:15,100
Most people... when there is a new
release, for example,

334
00:16:15,100 --> 00:16:16,700
last week,
I think last week,

335
00:16:16,700 --> 00:16:19,500
there was a new scikit-learn
release, scikit-learn 1.0.

336
00:16:20,000 --> 00:16:22,600
I don't know how many
programmers, especially working

337
00:16:22,600 --> 00:16:25,200
in the data
science community, have had a

338
00:16:25,200 --> 00:16:27,600
look at it.
I strongly believe data

339
00:16:27,600 --> 00:16:30,000
scientists still don't think
like software developers a lot.

340
00:16:30,000 --> 00:16:33,100
I don't know which universe they
live in, but mostly, if you

341
00:16:33,100 --> 00:16:36,300
actually see, they are always, you
know, in the race of, you know,

342
00:16:36,300 --> 00:16:38,700
improving the accuracy of their
model, the latest algorithm.

343
00:16:39,400 --> 00:16:41,100
In this process,
they actually leave out this,

344
00:16:41,100 --> 00:16:43,800
you know, there are certain good
practices that software

345
00:16:43,800 --> 00:16:46,300
engineering can actually teach
us, and we started realizing it

346
00:16:46,300 --> 00:16:49,200
only now, like, after maturity
has increased in this domain.

347
00:16:49,700 --> 00:16:51,500
But still a lot of people
actually leave it out.

348
00:16:51,600 --> 00:16:54,200
I think that is a problem.
Otherwise, like,

349
00:16:54,200 --> 00:16:56,600
developers who release libraries
are doing a great job.

350
00:16:56,600 --> 00:16:59,300
It's just that we don't notice
it. Actually, just to take a little

351
00:16:59,300 --> 00:17:01,000
digression.
So you mentioned that

352
00:17:01,000 --> 00:17:03,800
there are things that,
like, data scientists can learn

353
00:17:03,800 --> 00:17:05,700
from the software
engineering community,

354
00:17:05,700 --> 00:17:07,300
right?
Like, what are some of those

355
00:17:07,300 --> 00:17:09,200
things? 
So there are a lot of things 

356
00:17:09,200 --> 00:17:12,000
data scientists can learn from 
software engineering. To start

357
00:17:12,000 --> 00:17:15,099
with, one of the problems that I
have is I write very bad code,

358
00:17:15,300 --> 00:17:17,700
and the way I'm trying to
overcome it right now is... the

359
00:17:17,700 --> 00:17:20,599
Jupyter environment, like, you
can, you can Google the talk, Joel Grus

360
00:17:20,599 --> 00:17:23,300
has given a very
popular talk about why

361
00:17:23,300 --> 00:17:27,099
Jupyter Notebook is not a great
tool for writing software code,

362
00:17:27,099 --> 00:17:29,800
like, development.
I completely agree with that. 

363
00:17:29,800 --> 00:17:31,800
Jupyter
notebook is like a REPL

364
00:17:31,800 --> 00:17:33,900
environment. 
You write something you iterate 

365
00:17:33,900 --> 00:17:36,900
on it, you write something,
embed a visualization along

366
00:17:36,900 --> 00:17:40,300
with it and then you can publish
a report but using jupyter 

367
00:17:40,300 --> 00:17:43,400
notebook only as a final tool, 
like a lot of people do this, 

368
00:17:43,400 --> 00:17:45,100
right? 
They write a Jupyter notebook, and

369
00:17:45,100 --> 00:17:46,700
when they want
Python code,

370
00:17:46,700 --> 00:17:48,500
they wouldn't write the Python
code from the start.

371
00:17:48,600 --> 00:17:51,200
They'll just go to the
notebook, download the Python

372
00:17:51,200 --> 00:17:53,300
script and then they would try 
to start, you know, embedding 

373
00:17:53,300 --> 00:17:54,600
it. 
I think that's a very bad 

374
00:17:54,600 --> 00:17:56,800
practice. 
What we can do is, like, what I 

375
00:17:56,800 --> 00:17:59,800
started doing is using Visual,
I use Visual Studio Code, VS

376
00:17:59,800 --> 00:18:02,800
Code, extensively.
So VS Code has Jupyter

377
00:18:02,800 --> 00:18:04,200
notebook,
and VS Code

378
00:18:04,300 --> 00:18:08,600
lets me write Python code as
well. And the advantage is, I use VS

379
00:18:08,600 --> 00:18:10,500
Code's
Jupyter notebook for any REPL,

380
00:18:10,500 --> 00:18:12,600
like, whatever
I want to iterate, small things, I

381
00:18:12,600 --> 00:18:15,400
use it and then I just copy it 
and then I paste it in the 

382
00:18:15,400 --> 00:18:17,800
python script. 
So I have a decent python script

383
00:18:17,800 --> 00:18:20,900
at the end. 
So this way I am, in turn,

384
00:18:20,900 --> 00:18:23,900
not ruining my python script 
development experience. 

385
00:18:24,000 --> 00:18:26,200
At the same time. 
I'm not losing my ability to 

386
00:18:26,200 --> 00:18:28,400
iterate on something. 
Like if I want to make a chart 

387
00:18:28,400 --> 00:18:31,400
and then see how it looks this 
is this is one thing that I 

388
00:18:31,400 --> 00:18:33,800
would I would strongly say. 
And the second thing that I 

389
00:18:33,800 --> 00:18:36,900
think software engineers, software
developers do a lot is going 

390
00:18:36,900 --> 00:18:39,000
through documentation. 
I think this is what I touched 

391
00:18:39,000 --> 00:18:41,600
upon slightly before. The data
science community seems to be so

392
00:18:41,600 --> 00:18:44,700
obsessed with the new algorithm,
you know, like cutting-edge

393
00:18:44,700 --> 00:18:47,600
tools, but I don't know how many
people like if you go online 

394
00:18:47,600 --> 00:18:49,500
like, you would see a lot of people
asking questions.

395
00:18:49,600 --> 00:18:51,400
It's like, how do I become a data
scientist?

396
00:18:51,400 --> 00:18:55,100
How do I become a data scientist? But
the initial becoming a data

397
00:18:55,100 --> 00:18:58,500
scientist is what everybody is trying,
but from the beginner to

398
00:18:58,500 --> 00:19:00,700
intermediate stage, you would
see very little content on

399
00:19:00,700 --> 00:19:02,300
the internet.
No boot camp

400
00:19:02,300 --> 00:19:04,400
trains them.
No course teaches them.

401
00:19:04,400 --> 00:19:07,500
There is very little content
in the middle space, and I think

402
00:19:07,500 --> 00:19:10,700
because of that gap, a lot of
people actually get stuck in the

403
00:19:10,700 --> 00:19:12,900
beginner space. 
They are only searching for job 

404
00:19:12,900 --> 00:19:16,300
and then they never move ahead. 
And that again leads to a lot of

405
00:19:16,300 --> 00:19:19,200
bad code that we end up writing.
Technical debt is a big problem 

406
00:19:19,200 --> 00:19:21,500
in order. 
When we write jupyter notebook, 

407
00:19:21,500 --> 00:19:24,500
that nobody can understand. 
A lot of people still do not lay

408
00:19:24,500 --> 00:19:26,900
out their dependencies properly.
Like, if you are using python 

409
00:19:26,900 --> 00:19:30,400
for that matter, a lot of people
don't have a dependencies file,

410
00:19:30,400 --> 00:19:32,400
right?
If you have a library

411
00:19:32,400 --> 00:19:34,400
dependency, what is the library
version?

412
00:19:34,400 --> 00:19:36,700
We still do not know. If there is
a vulnerability, then are we

413
00:19:36,700 --> 00:19:38,300
upgrading?
Is there any track?

414
00:19:38,400 --> 00:19:40,500
We don't know. 
So a lot of these problems come 

415
00:19:40,500 --> 00:19:43,800
in because we just jump
directly into a Jupyter notebook,

416
00:19:43,800 --> 00:19:45,200
and then we don't do anything 
before. 

417
00:19:45,200 --> 00:19:47,000
I think like these are a couple 
of things. 

418
00:19:47,000 --> 00:19:48,800
There are a lot of things,
actually. Joel Grus's

419
00:19:48,800 --> 00:19:51,600
talk about why Jupyter notebook
is bad for software development

420
00:19:51,600 --> 00:19:52,900
will actually tell you a lot
more.

421
00:19:53,500 --> 00:19:55,900
Okay, I think we'll link to
that talk in the show

422
00:19:55,900 --> 00:19:58,400
notes,
so that listeners can go

423
00:19:58,400 --> 00:20:00,600
and check it out.
And you mentioned online

424
00:20:00,600 --> 00:20:03,300
support and things like that.
So, in the two

425
00:20:03,300 --> 00:20:06,300
years with Python, what I found
is that, especially in

426
00:20:06,300 --> 00:20:09,700
terms of, like, online help,
Stack Overflow and so on, the

427
00:20:09,700 --> 00:20:14,300
support for Python seemed much
harder than the support for

428
00:20:14,300 --> 00:20:16,300
R. It was very difficult
there.

429
00:20:16,300 --> 00:20:18,500
For example, one day,
I was, like, coming from the R

430
00:20:18,500 --> 00:20:20,400
world,
I was like, how do I make a data frame

431
00:20:20,400 --> 00:20:22,800
with Python?
I asked some of the people

432
00:20:22,800 --> 00:20:25,200
I was working with, and they
didn't understand what making a

433
00:20:25,200 --> 00:20:28,000
data frame means.
And then, like, Stack Overflow

434
00:20:28,000 --> 00:20:30,900
again, like, it was very
difficult. It gave either,

435
00:20:30,900 --> 00:20:33,700
like, some R links or
something, but nothing related

436
00:20:33,700 --> 00:20:37,100
to pandas.
So is my perception of

437
00:20:37,100 --> 00:20:38,900
the online support of the two
things correct?

438
00:20:38,900 --> 00:20:43,200
And if so, why do you think that is?
Regarding online support,

439
00:20:43,200 --> 00:20:46,300
I strongly believe python has a 
better support in terms of, you 

440
00:20:46,308 --> 00:20:50,300
know, the because of because of 
the nature of the Language, 

441
00:20:50,300 --> 00:20:52,800
which is very, you know, data 
scientist. 

442
00:20:52,900 --> 00:20:54,400
The support for python is really
good. 

443
00:20:54,400 --> 00:20:56,700
I would say. 
But the problem arises, when 

444
00:20:56,700 --> 00:20:59,700
some R developer actually
comes to the Python world and then

445
00:20:59,700 --> 00:21:02,300
tries to use the same keywords
that they are used to

446
00:21:02,300 --> 00:21:04,400
using in R.
So, for example, melt in

447
00:21:04,400 --> 00:21:08,600
reshape is not melt in Python.
So melt is, like, meant

448
00:21:08,600 --> 00:21:10,100
differently, like pivot wider and
pivoting.

449
00:21:10,400 --> 00:21:13,200
I think that is the first gap, and
the second thing, if you see,

450
00:21:13,200 --> 00:21:15,400
again, Python is not one
entity, right?

451
00:21:15,400 --> 00:21:18,200
For example, R tidyverse.
Tidyverse, one of the complaints

452
00:21:18,200 --> 00:21:21,100
that people make about
tidyverse is that RStudio is

453
00:21:21,100 --> 00:21:24,500
tightly controlling tidyverse,
but because they are tightly

454
00:21:24,500 --> 00:21:26,600
controlling tidyverse, everything
in

455
00:21:26,600 --> 00:21:30,600
tidyverse has, you know, some
unified theme. But that is not

456
00:21:30,600 --> 00:21:32,500
how it is in the Python
universe, right?

457
00:21:32,500 --> 00:21:35,400
So pandas is from a
different group, NumPy is from a

458
00:21:35,408 --> 00:21:37,100
different group,
scikit-learn is from a

459
00:21:37,100 --> 00:21:39,900
different group, even though
all of these are, like, part of

460
00:21:39,900 --> 00:21:42,400
the scientific stack. So it is
all from different

461
00:21:42,400 --> 00:21:44,700
entities or different
developers.

462
00:21:45,000 --> 00:21:47,700
So that makes it slightly 
different from what

463
00:21:47,700 --> 00:21:50,100
they have been using, and that is
why you don't see that

464
00:21:50,100 --> 00:21:52,600
uniformity there.
So what do you do?

465
00:21:52,600 --> 00:21:56,000
I mean, as an R user,
let's say, how do I seek

466
00:21:56,000 --> 00:21:59,000
help in Python? Assuming I
use the right keywords,

467
00:21:59,000 --> 00:22:02,000
like, I use the keywords that
are more relevant to Python,

468
00:22:02,400 --> 00:22:05,200
what is the, is
Googling still the answer,

469
00:22:05,200 --> 00:22:07,300
Googling and going to
Stack

470
00:22:07,300 --> 00:22:08,600
Overflow links, like in the
R world?

471
00:22:08,600 --> 00:22:11,200
Or is it that Python has a
different kind of support

472
00:22:11,200 --> 00:22:13,100
system?
Yeah.

473
00:22:13,100 --> 00:22:16,700
So mostly these days I use 
DuckDuckGo only for one reason. 

474
00:22:16,700 --> 00:22:19,400
So I think Google's answers are
mostly cluttered

475
00:22:19,600 --> 00:22:22,400
with a lot of, you know, SEO,
strong-SEO blog posts.

476
00:22:22,500 --> 00:22:24,300
So I use DuckDuckGo for that 
purpose. 

477
00:22:24,600 --> 00:22:27,000
A lot of blogs are not optimized
for DuckDuckGo.

478
00:22:27,000 --> 00:22:31,100
At least, I think.
So I use DuckDuckGo, and mostly

479
00:22:31,100 --> 00:22:32,700
I go to the Stack Overflow
answer.

480
00:22:32,700 --> 00:22:35,400
And I would, at least go to 
couple of answers. 

481
00:22:35,700 --> 00:22:38,500
One of the problems that you 
would probably see in Python is 

482
00:22:39,200 --> 00:22:41,000
most of the time,
if you go to the top

483
00:22:41,000 --> 00:22:44,500
Stack Overflow answer, you
would see some answer that is

484
00:22:44,500 --> 00:22:48,300
not very recent, and that is
probably because this feature

485
00:22:48,300 --> 00:22:50,800
was not available, you know,
five years back, a few

486
00:22:50,800 --> 00:22:53,200
years back, when somebody asked the
question, and somebody had given

487
00:22:53,200 --> 00:22:55,600
an answer to that, and somebody
marked that answer as right.

488
00:22:56,000 --> 00:22:58,400
So, I would strongly encourage 
anybody to actually go through a

489
00:22:58,408 --> 00:23:01,000
couple of answers below
that also, to understand if

490
00:23:01,000 --> 00:23:03,900
there is a new feature that has
been added. To give you an

491
00:23:03,900 --> 00:23:07,100
example, one of my Stack Overflow
answers in the R community: how to

492
00:23:07,100 --> 00:23:09,800
fill missing values.
So, fill underscore NA is a

493
00:23:09,800 --> 00:23:11,900
function that was added in the
tidyverse lately,

494
00:23:11,900 --> 00:23:14,700
I think, like,
within the last two years.

495
00:23:15,400 --> 00:23:17,700
If you go to that Stack
Overflow question, you would see

496
00:23:17,700 --> 00:23:19,400
a lot of answers that have
nothing to do with

497
00:23:19,600 --> 00:23:21,100
this, but this is the easiest
way.

498
00:23:21,500 --> 00:23:24,100
And once I added this answer, I 
think probably this is my most 

499
00:23:24,100 --> 00:23:26,800
upvoted answer or something
because a lot of people started 

500
00:23:27,300 --> 00:23:29,900
then seeing this answer as one 
of the best answers, but still 

501
00:23:29,900 --> 00:23:31,400
this would not be marked as
the answer.

502
00:23:31,900 --> 00:23:34,900
So anybody who's looking for an
answer on Stack Overflow,

503
00:23:35,500 --> 00:23:37,000
I think, not just the marked
answer,

504
00:23:37,000 --> 00:23:39,700
they should scroll down and see,
because all these libraries keep

505
00:23:39,700 --> 00:23:42,200
on enhancing things.
And when this is being

506
00:23:42,200 --> 00:23:45,100
added as a new answer, a lot of
beginners, or even

507
00:23:45,100 --> 00:23:47,800
practitioners, might miss out
on the most efficient way of

508
00:23:47,800 --> 00:23:49,200
doing this
just by looking at, you know,

509
00:23:49,200 --> 00:23:51,300
some four-year, five-year-old answer.
Super. And one other thing:

510
00:23:51,300 --> 00:23:53,400
I think you touched upon 
research some time back. 

511
00:23:53,400 --> 00:23:57,200
So again, like, as you and
everybody else knows,

512
00:23:57,300 --> 00:24:00,000
I come from the R world and
have sort of

513
00:24:00,000 --> 00:24:02,300
experimented with Python and
sort of gone back.

514
00:24:02,400 --> 00:24:05,300
So I'm very visual in
the way I think.

515
00:24:05,300 --> 00:24:09,400
So if somebody sends me a large 
Excel sheet unless it's probably

516
00:24:09,400 --> 00:24:12,800
sort of properly formatted with 
conditional, formatting and 

517
00:24:12,800 --> 00:24:15,400
things, it's very difficult
for me to kind of really make 

518
00:24:15,400 --> 00:24:17,400
sense of it. 
So the way I make sense of data 

519
00:24:17,400 --> 00:24:19,400
is to sort of plot it
graphically.

520
00:24:20,200 --> 00:24:23,800
And I found the plotting support
in Python to be, like, fairly

521
00:24:23,800 --> 00:24:25,700
weak in some sense.
I mean, it's again.

522
00:24:25,700 --> 00:24:30,000
So, how do you kind of navigate 
this sort of difference in in 

523
00:24:30,000 --> 00:24:32,300
visualizations between the two 
platforms? 

524
00:24:32,300 --> 00:24:34,500
Or is it that,
like, if you are heavy on

525
00:24:34,500 --> 00:24:40,400
analysis and visualization, you
should just use R? Yeah, I feel

526
00:24:40,500 --> 00:24:42,800
the thing that you just
mentioned last, and I do that.

527
00:24:42,800 --> 00:24:46,500
So every platform has an 
advantage and disadvantages in 

528
00:24:46,500 --> 00:24:48,000
that way. 
I strongly believe, like if you 

529
00:24:48,008 --> 00:24:51,100
want to do very extensive viz,
at least I am very comfortable

530
00:24:51,100 --> 00:24:53,500
in doing that in R. I'll give
an example.

531
00:24:53,500 --> 00:24:56,900
So I tried Plotly
when just Plotly was there, like

532
00:24:56,900 --> 00:24:59,400
a couple of years back.
So you have to build a go object,

533
00:24:59,400 --> 00:25:01,300
you have to do, you know, like,
the code itself

534
00:25:01,300 --> 00:25:02,900
looks like a JSON script or
something.

535
00:25:03,100 --> 00:25:06,200
So I completely disregarded
Plotly, but now Plotly has come

536
00:25:06,200 --> 00:25:07,800
up with Plotly Express, which,
they are saying,

537
00:25:08,000 --> 00:25:11,500
was inspired by ggplot. Now,
for every single plot,

538
00:25:12,000 --> 00:25:14,500
you have the same syntax: you
have a data frame,

539
00:25:14,600 --> 00:25:17,600
you have the x axis, you have the y axis,
a couple of other attributes,

540
00:25:17,800 --> 00:25:20,000
or a
couple of other arguments.

541
00:25:20,000 --> 00:25:21,400
You don't touch any of the other
arguments.

542
00:25:21,700 --> 00:25:26,300
And with just three, data frame,
x, y, you can make a bar chart.

543
00:25:26,300 --> 00:25:28,100
You can convert that bar chart
to a scatter plot.

544
00:25:28,100 --> 00:25:30,900
You can do all of these things 
and that is how very much, you 

545
00:25:30,908 --> 00:25:34,700
know, ggplot used to look right 
or at least till you have the G 

546
00:25:34,700 --> 00:25:37,300
blotnick. 
So, I think now libraries also 

547
00:25:37,300 --> 00:25:39,900
started realizing that, you
know, the R world is the best.

548
00:25:39,900 --> 00:25:42,300
And I don't think anybody would
disagree with that statement,

549
00:25:42,300 --> 00:25:46,100
that ggplot and the tidyverse
universe is the best for data

550
00:25:46,100 --> 00:25:48,100
analysis. 
Even I've seen a lot of python 

551
00:25:48,100 --> 00:25:51,200
data and data science between 
success accepting this, this 

552
00:25:51,200 --> 00:25:55,300
fact, so if you ask me, I would 
say if I want to make an 

553
00:25:55,300 --> 00:25:58,000
analysis, make an analysis 
report, make something that is 

554
00:25:58,008 --> 00:26:00,600
shareable that I can, you know, 
share it across internal 

555
00:26:00,600 --> 00:26:02,900
organizations, community,
interactive,

556
00:26:03,100 --> 00:26:06,800
I would still stick to R for
that rather than, you know, going

557
00:26:06,800 --> 00:26:09,200
to Python and Matplotlib. That is a
place

558
00:26:09,200 --> 00:26:13,000
I would strongly use R. Got it,
got it, okay.

559
00:26:13,100 --> 00:26:15,800
So that's instead. 
I mean like well where you 

560
00:26:15,800 --> 00:26:18,500
belong apart from legit machine 
learning models, like for 

561
00:26:18,500 --> 00:26:21,300
example, tidymodels for me
has been, like, anything but

562
00:26:21,300 --> 00:26:27,200
tidy. It just seems a very, very
verbose way of defining a model.

563
00:26:27,200 --> 00:26:29,400
And
having used scikit-learn's three

564
00:26:29,400 --> 00:26:31,800
lines of code,
it's very difficult to

565
00:26:31,800 --> 00:26:33,600
come to terms with
tidymodels.

566
00:26:34,000 --> 00:26:35,700
One more thing is recently for 
some work. 

567
00:26:35,700 --> 00:26:38,500
I had to write a random forest 
and then I found that the 

568
00:26:38,500 --> 00:26:42,400
natural, the sort of default
random forest in

569
00:26:42,400 --> 00:26:46,000
R, it has been
adapted from Fortran.

570
00:26:46,100 --> 00:26:48,900
So it is sort of, I think it's
32-bit code or something.

571
00:26:48,900 --> 00:26:52,100
So because of that, if you have a
large data set, it just hangs,

572
00:26:52,500 --> 00:26:55,000
right?
So at least, I guess, machine

573
00:26:55,000 --> 00:26:57,900
learning is one thing where,
like, Python is sort of,

574
00:26:57,900 --> 00:26:59,800
yes,
I don't think there's any

575
00:26:59,800 --> 00:27:02,600
comparison. Again,
well ahead of R. Do you see

576
00:27:02,600 --> 00:27:04,700
R catching up in this?
Or do you think that, like, I

577
00:27:04,700 --> 00:27:08,300
mean, it's best for people to be
sort of bilingual to kind of

578
00:27:08,700 --> 00:27:10,100
address
these two worlds?

579
00:27:10,900 --> 00:27:12,800
No, if you're bilingual, you
should definitely stick to

580
00:27:12,800 --> 00:27:15,300
scikit-learn.
There is no way to avoid it, but I

581
00:27:15,300 --> 00:27:18,600
see RStudio doing a lot of,
lot of, lot of work

582
00:27:18,600 --> 00:27:21,900
in terms of tidymodels. To
give you an example, Julia

583
00:27:21,900 --> 00:27:24,200
Silge,
so, the creator of, like, one

584
00:27:24,200 --> 00:27:26,000
of the creators of tidy text
mining.

585
00:27:26,000 --> 00:27:30,000
So, she creates a video every 
week, and I am seeing a pattern 

586
00:27:30,000 --> 00:27:31,900
that every week
she uses tidymodels.

587
00:27:32,300 --> 00:27:35,100
So, this is a pattern, and I see
RStudio doing a lot of

588
00:27:35,100 --> 00:27:38,400
webinars related to tidymodels.
And tidymodels, if you know,

589
00:27:38,400 --> 00:27:41,900
like, initially,
initially only caret

590
00:27:41,900 --> 00:27:44,700
was there, Max Kuhn had
developed it, but tidymodels is

591
00:27:44,700 --> 00:27:46,800
something that now they've
pushed. They have created a

592
00:27:46,800 --> 00:27:49,500
universe like tidyverse.
They've created a universe

593
00:27:49,500 --> 00:27:51,000
around, you know, the
tidymodels.

594
00:27:51,200 --> 00:27:54,600
So if you are in the R universe, I
think there is a strong value in

595
00:27:54,600 --> 00:27:57,700
getting into, you know,
tidymodels than, you know,

596
00:27:57,700 --> 00:28:00,000
using, like you said, right?
You have like three different

597
00:28:00,000 --> 00:28:02,200
ways to do random forest in R.
Which one do you pick?

598
00:28:02,200 --> 00:28:04,800
I think that is the problem. 
We didn't have initially but we 

599
00:28:04,800 --> 00:28:07,300
have now I think we should like 
at least practitioners should 

600
00:28:07,300 --> 00:28:09,100
solve it by just getting into 
one universe. 

601
00:28:09,800 --> 00:28:12,000
That would that would Help them,
you know, do everything in 

602
00:28:12,000 --> 00:28:14,700
standardized manner. 
But if you are bilingual, there 

603
00:28:14,700 --> 00:28:17,100
is no question about, you know, 
leaving scikit-learn, especially

604
00:28:17,100 --> 00:28:19,600
the fact that, you know, deep 
learning libraries, right? 

605
00:28:19,600 --> 00:28:21,200
There are a couple of deep
learning bindings

606
00:28:21,200 --> 00:28:23,700
in R. Like, for example, you
have the Keras binding in R, you

607
00:28:23,700 --> 00:28:27,500
have the TensorFlow binding
in R, but it's still a

608
00:28:27,508 --> 00:28:30,100
binding.
You still have to have Python

609
00:28:30,100 --> 00:28:32,900
installed on your machine, put
reticulate on top of it, and then

610
00:28:32,900 --> 00:28:36,100
do it. Rather,
do it in, you know,

611
00:28:36,100 --> 00:28:37,700
the same language, Python itself,
right?

612
00:28:37,700 --> 00:28:43,200
So why use bindings? So, I
mean, again, like, I mean, as

613
00:28:43,200 --> 00:28:46,100
you know, like, I mean, we've been
recruiting fairly heavily

614
00:28:46,100 --> 00:28:50,200
over the last one year, and since
I personally use R, I sort of

615
00:28:50,200 --> 00:28:53,200
have been specifying R for all
the people that I recruit.

616
00:28:53,500 --> 00:28:57,800
But what I find is that in
India, among all the CVs that

617
00:28:57,800 --> 00:29:00,800
I get, like, a very, very large
majority of the data scientists

618
00:29:00,800 --> 00:29:04,900
seem to only know Python. Even
for EDA, they use Python and, in

619
00:29:04,900 --> 00:29:06,800
my opinion, produce a lot of
ugly graphs.

620
00:29:06,900 --> 00:29:10,500
And, and so like, I mean, why 
why? 

621
00:29:10,600 --> 00:29:13,600
Why is it that, like...
I mean, it's only been now,

622
00:29:13,600 --> 00:29:17,400
like, the last one or two months
that I found at least some sort

623
00:29:17,400 --> 00:29:20,400
of a critical mass of people who
are actually using a

624
00:29:20,400 --> 00:29:23,000
fair amount of R in their
regular work.

625
00:29:23,000 --> 00:29:26,000
So why do you think the
data science community in India

626
00:29:26,000 --> 00:29:29,000
especially leans
so heavily towards Python?

627
00:29:29,000 --> 00:29:31,700
I mean, that is just my
empirical experience.

628
00:29:32,100 --> 00:29:35,100
And is there a difference in the
sort of background of people

629
00:29:35,100 --> 00:29:40,300
who, in your opinion, seem to
prefer Python

630
00:29:40,600 --> 00:29:43,300
for the work
they do? To your first question,

631
00:29:43,300 --> 00:29:45,700
I think there was a myth, and
that myth has, you know,

632
00:29:45,700 --> 00:29:47,300
become like a self-fulfilling
prophecy.

633
00:29:47,300 --> 00:29:51,100
Now, the myth was that Python is
a better language for data 

634
00:29:51,100 --> 00:29:53,000
science. 
Okay, so which I strongly 

635
00:29:53,000 --> 00:29:56,600
disagree, again even today, but 
this myth was told again and 

636
00:29:56,600 --> 00:29:59,500
again and again and everybody 
who is getting started to become

637
00:29:59,500 --> 00:30:02,700
a data scientist. 
They believe this myth and now 

638
00:30:02,700 --> 00:30:06,000
they actually, you know, learned
only Python, and

639
00:30:06,200 --> 00:30:09,000
they were bought
into this myth, and now it became

640
00:30:09,000 --> 00:30:11,300
like a self-fulfilling prophecy,
because everybody is now a

641
00:30:11,308 --> 00:30:13,500
Python developer.
And now, everybody will say,

642
00:30:13,500 --> 00:30:14,600
okay,
see, everybody is a Python

643
00:30:14,600 --> 00:30:16,000
developer,
that's why Python is best.

644
00:30:16,000 --> 00:30:17,700
So it is, it has become like 
that. 

645
00:30:17,800 --> 00:30:20,300
I think that myth has been told
multiple times in India's R

646
00:30:20,300 --> 00:30:24,000
and data science community.
I run, like, for the record,

647
00:30:24,000 --> 00:30:27,800
I run the Bangalore R user group.
We have done, we have

648
00:30:27,800 --> 00:30:30,100
partnered with PyData
Bangalore before, and then we

649
00:30:30,100 --> 00:30:33,100
have run events.
But yeah, so we have people

650
00:30:33,100 --> 00:30:35,200
coming.
Like, there are a lot.

651
00:30:35,200 --> 00:30:37,700
I have spoken to a lot of people
who use R in their daily

652
00:30:37,700 --> 00:30:40,400
workflow, but still people
believe what they

653
00:30:40,500 --> 00:30:42,800
want to believe.
So this myth has become

654
00:30:42,800 --> 00:30:45,900
like a fact that is, that is one
of the primary reasons why you 

655
00:30:45,900 --> 00:30:48,200
would see a lot of python-based 
resumes. 

656
00:30:48,700 --> 00:30:51,100
Second thing is, I, I don't 
know. 

657
00:30:51,100 --> 00:30:54,000
Somehow, I feel, you know,
people who are technically good

658
00:30:54,600 --> 00:30:57,800
are always bashing R.
I don't know for what reason.

659
00:30:58,000 --> 00:30:59,900
So I am a computer science
engineer.

660
00:30:59,900 --> 00:31:03,200
I use Python, but still I love
R, and I would still, like, if

661
00:31:03,200 --> 00:31:05,500
you give me a choice, for a lot
of things I would still prefer

662
00:31:05,500 --> 00:31:08,900
R. But still, I see this.
So this is a big problem,

663
00:31:08,900 --> 00:31:10,400
and that is one of the reasons
why a lot of

664
00:31:10,500 --> 00:31:13,000
people are not learning R
itself.

665
00:31:13,000 --> 00:31:15,300
I mean like you don't lose 
anything by learning a new 

666
00:31:15,300 --> 00:31:16,700
language, honestly
speaking.

667
00:31:16,700 --> 00:31:20,100
It's a new tool in your
arsenal, so you can do a lot of

668
00:31:20,100 --> 00:31:21,700
things.
But yeah, that is why a lot of

669
00:31:21,700 --> 00:31:24,900
people again have not learned R.
But if you see a

670
00:31:24,908 --> 00:31:27,400
lot of veterans you would see 
them, you would see them, you 

671
00:31:27,400 --> 00:31:29,700
know, liking the other tool as 
well like appreciating what the 

672
00:31:29,700 --> 00:31:32,500
tool is offering whether you use
the tool or not. 

673
00:31:32,500 --> 00:31:34,400
But at least, you know, you 
should appreciate what the tool 

674
00:31:34,400 --> 00:31:36,700
is offering like in the world 
that we live in. 

675
00:31:36,900 --> 00:31:38,600
So these are the two reasons why
I strongly

676
00:31:38,600 --> 00:31:42,900
think you would see a lot
of Python-based resumes, at least

677
00:31:42,900 --> 00:31:45,900
because of that myth that
people started talking

678
00:31:45,900 --> 00:31:49,400
about. Does it also have to do with
some sort of a background? Like,

679
00:31:49,400 --> 00:31:54,700
for example, for me, before I
got into R or Python, a long time, a

680
00:31:54,700 --> 00:31:57,800
very long time back,
I used to use mostly...

681
00:31:57,800 --> 00:32:01,500
I was mostly an Excel guy, and I
used to write, like, VBA code to

682
00:32:01,500 --> 00:32:04,600
kind of do some interesting
things on top of Excel. I

683
00:32:04,600 --> 00:32:09,000
even briefly used SAS when one of
my employers had a license for

684
00:32:09,000 --> 00:32:12,500
it. On the other hand, do you think,
like, the Python guys are mostly

685
00:32:12,500 --> 00:32:15,300
the sort of people who come from
more of a software engineering

686
00:32:15,300 --> 00:32:18,100
background rather
than a sort of an analytics

687
00:32:18,100 --> 00:32:19,100
background?
Yeah, you're right.

688
00:32:19,100 --> 00:32:20,700
Actually, you're not just 
speculating. 

689
00:32:20,700 --> 00:32:22,800
If you see students coming from 
engineering colleges. 

690
00:32:22,800 --> 00:32:25,900
I wouldn't necessarily say, you 
know, like, only IT background,

691
00:32:25,900 --> 00:32:29,300
but still, students coming from
engineering colleges do well

692
00:32:29,300 --> 00:32:32,100
with Python, again, because,
you know, their peers are using 

693
00:32:32,100 --> 00:32:34,000
python. 
So they can talk to somebody who

694
00:32:34,000 --> 00:32:35,700
uses Python and then get the 
answer. 

695
00:32:35,700 --> 00:32:40,100
But if you see non-engineering
backgrounds, I see a strong... or

696
00:32:40,100 --> 00:32:44,300
rather, I strongly encourage them to
use R. Like I told you, right,

697
00:32:44,400 --> 00:32:47,800
everywhere
I teach, I try to use R, because

698
00:32:47,800 --> 00:32:49,600
people who are not from
an engineering background,

699
00:32:49,800 --> 00:32:52,300
I think it's very, very easy for
them to get started with R. The

700
00:32:52,300 --> 00:32:55,600
initial friction is less.
And again, the problem is those

701
00:32:55,600 --> 00:32:59,100
kind of people applying for a, 
you know, typical data science 

702
00:32:59,100 --> 00:33:01,000
role is also not very common in 
India. 

703
00:33:01,500 --> 00:33:02,900
That is another reason why you 
wouldn't see. 

704
00:33:02,900 --> 00:33:05,100
Like, for example, if you see 
outside, India, right? 

705
00:33:05,300 --> 00:33:09,200
Somebody who had done some 
bioscience course, they might 

706
00:33:09,200 --> 00:33:10,400
be, you know, looking into data 
science. 

707
00:33:10,600 --> 00:33:12,400
Not even in Pharmaceuticals, all
these things. 

708
00:33:12,600 --> 00:33:15,100
But if you see in India, 
especially like, mostly, you see

709
00:33:15,100 --> 00:33:17,800
Engineers, we are all engineers 
and that is how we have been. 

710
00:33:17,800 --> 00:33:20,600
And that is why you see this 
strong python here. 

711
00:33:23,600 --> 00:33:25,600
Now. 
I mean, let's talk about

712
00:33:25,600 --> 00:33:28,200
database access, right?
So I think it again like the big

713
00:33:28,200 --> 00:33:31,100
come here, literally discovered 
like because to my nine years of

714
00:33:31,100 --> 00:33:32,700
consulting
before I joined Delhivery,

715
00:33:32,700 --> 00:33:35,800
people kept asking me if I
know how to use big data, and I

716
00:33:35,808 --> 00:33:37,200
was like,
I know the principle of big

717
00:33:37,200 --> 00:33:40,400
data, but I don't think your org
really needs

718
00:33:40,600 --> 00:33:42,900
big data. And then I joined
Delhivery, and here, obviously,

719
00:33:42,900 --> 00:33:46,400
our data size is, like, absolutely
humongous.

720
00:33:46,400 --> 00:33:48,300
So like thanks. 
So for the first time I've had 

721
00:33:48,300 --> 00:33:51,000
to like kind of continuously 
have database access and stuff.

722
00:33:51,200 --> 00:33:53,900
And the way I have sort of solved
that problem for the last 10 

723
00:33:53,900 --> 00:33:57,200
months of being here is to get
used to this R package called

724
00:33:57,200 --> 00:34:00,400
dbplyr, where I just write code
in R. It's okay if the code

725
00:34:00,400 --> 00:34:05,000
that gets generated is maybe
slightly inefficient compared to

726
00:34:05,000 --> 00:34:09,199
writing it natively in SQL, but 
I have personally become a very 

727
00:34:09,199 --> 00:34:14,199
big fan of it, because I find it far
easier to debug compared to

728
00:34:14,207 --> 00:34:16,300
SQL.
I think SQL, because of its sort

729
00:34:16,300 --> 00:34:18,600
of, I don't know if we should 
call it an infix format or

730
00:34:18,607 --> 00:34:22,000
something, but you have to keep 
looking up and down and up and 

731
00:34:22,000 --> 00:34:24,199
down. 
If you have to sort of

732
00:34:24,199 --> 00:34:27,400
debug an SQL statement and
it is a 100-line statement, which

733
00:34:27,400 --> 00:34:30,000
is very easy with the kind of 
data that we have in my company.

734
00:34:30,400 --> 00:34:33,199
I find it,
I find it sort of mentally

735
00:34:33,199 --> 00:34:36,000
taxing to do.
So but I will gather for the 

736
00:34:36,100 --> 00:34:38,600
benefit of the rock python guys 
in my team have been sort of 

737
00:34:38,900 --> 00:34:40,400
trying to figure out a dbplyr
equivalent,

738
00:34:40,500 --> 00:34:43,699
but have so far not been
able to find it, and so on.

739
00:34:43,699 --> 00:34:46,600
So how easy is database access
in Python?

740
00:34:46,600 --> 00:34:51,000
I mean, I assume you have
done a fair bit of this.
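
[Editor's note: a minimal Python sketch of the idea the host describes, namely writing chained code and letting it generate the SQL. `LazyTable`, the `shipments` table and its columns are hypothetical names made up for illustration. This is not dbplyr's API and not a real Python library, only a toy run against an in-memory SQLite database.]

```python
import sqlite3


class LazyTable:
    """Hypothetical helper: records pipeline steps and emits one SQL statement."""

    def __init__(self, name, wheres=(), cols=("*",)):
        self.name = name
        self.wheres = tuple(wheres)
        self.cols = tuple(cols)

    def filter(self, condition):
        # Return a new object so each step can be inspected on its own.
        return LazyTable(self.name, self.wheres + (condition,), self.cols)

    def select(self, *cols):
        return LazyTable(self.name, self.wheres, cols)

    def to_sql(self):
        # Conditions are plain strings here: fine for a sketch, unsafe for user input.
        sql = f"SELECT {', '.join(self.cols)} FROM {self.name}"
        if self.wheres:
            sql += " WHERE " + " AND ".join(f"({w})" for w in self.wheres)
        return sql

    def collect(self, conn):
        # Nothing touches the database until this call.
        return conn.execute(self.to_sql()).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE shipments (id INTEGER, city TEXT, weight_kg REAL)")
conn.executemany(
    "INSERT INTO shipments VALUES (?, ?, ?)",
    [(1, "Delhi", 2.5), (2, "Mumbai", 12.0), (3, "Delhi", 30.0)],
)

heavy_delhi = (
    LazyTable("shipments")
    .filter("city = 'Delhi'")
    .filter("weight_kg > 10")
    .select("id", "weight_kg")
)
print(heavy_delhi.to_sql())
print(heavy_delhi.collect(conn))
```

[Each step can be printed with `to_sql()` before anything runs, which is the debugging convenience being contrasted with reading a long hand-written SQL statement up and down.]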

741
00:34:52,000 --> 00:34:54,000
Yeah. 
So dbplyr is amazing, again, for

742
00:34:54,000 --> 00:34:56,100
a lot of reasons like you 
mentioned, and again, SQL

743
00:34:56,100 --> 00:34:57,400
debugging is also very 
difficult. 

744
00:34:57,400 --> 00:35:00,500
Like you said. So what is
currently working out for me at 

745
00:35:00,500 --> 00:35:01,900
least in my current 
organization? 

746
00:35:02,000 --> 00:35:04,300
The greatest thing that I have 
found is Databricks.

747
00:35:04,800 --> 00:35:07,100
So we have one, you know Big 
Data System, where we have 

748
00:35:07,100 --> 00:35:10,300
connected it's like one cluster 
and within that one cluster. 

749
00:35:10,300 --> 00:35:13,400
I can write Spark SQL,
I can write R, I can write

750
00:35:13,400 --> 00:35:15,200
Python. 
And so I have the ability to 

751
00:35:15,200 --> 00:35:19,300
write all the three languages in
the same notebook environment so

752
00:35:19,300 --> 00:35:21,900
I can just barely switch between
languages much much easier. 

753
00:35:21,900 --> 00:35:23,600
I don't have to you know, make 
new connections. 

754
00:35:23,800 --> 00:35:25,500
I think that is a really
good option.

755
00:35:25,800 --> 00:35:29,500
But I think if you do not have
this option, your only other

756
00:35:29,500 --> 00:35:32,000
solution is. 
Let's say you have Spark:

757
00:35:32,000 --> 00:35:35,000
then you use PySpark, make a
separate connection, and then do

758
00:35:35,000 --> 00:35:37,700
things separately.
PySpark is, you know,

759
00:35:37,800 --> 00:35:41,400
closer to what you
would see with pandas. And then

760
00:35:41,400 --> 00:35:44,200
Koalas is something that people
are using; Koalas,

761
00:35:44,200 --> 00:35:47,200
I think it's called.
Dask is something people

762
00:35:47,200 --> 00:35:50,500
are using. All these have a very
similar, pandas-related

763
00:35:50,500 --> 00:35:53,300
syntax.
But in my job, I don't have

764
00:35:53,300 --> 00:35:55,900
to do all those things because I
have this flexibility. 

765
00:35:55,900 --> 00:35:58,700
But other than that, you cannot,
I don't think you have a lot of 

766
00:35:58,700 --> 00:36:01,200
options to write a native, at 
least from what I know native 

767
00:36:01,200 --> 00:36:03,900
Python, and that would be one. And
the other thing, like you spoke 

768
00:36:03,900 --> 00:36:06,200
about, right? 
Making database connection 

769
00:36:06,200 --> 00:36:08,400
itself, using RStudio, is
quite straightforward, very 

770
00:36:08,400 --> 00:36:09,700
easy. 
You see the connection, and it's

771
00:36:09,700 --> 00:36:12,100
like how you would see
it

772
00:36:12,100 --> 00:36:15,700
If you are using data reports 
from a database, from a SQL client.

773
00:36:15,800 --> 00:36:19,200
It has a very similar nature. 
If you use RStudio for making

774
00:36:19,200 --> 00:36:21,800
database connections and you can
have multiple connections, that 

775
00:36:21,800 --> 00:36:24,900
is, again, very easy to make.
But in Python,

776
00:36:24,900 --> 00:36:27,300
It may not be easy. 
But the advantage in Python,

777
00:36:27,300 --> 00:36:31,700
if you ask me, is that with most of these
database solutions companies,

778
00:36:31,900 --> 00:36:34,700
you will easily get a starter 
code in Python, but you may not 

779
00:36:34,700 --> 00:36:37,300
easily get starter code
in R. Like, any time you see

780
00:36:37,300 --> 00:36:39,000
this thing. 
So that is another place where I

781
00:36:39,008 --> 00:36:42,500
actually use Python a lot. 
Any time, I see an API, you go 

782
00:36:42,500 --> 00:36:44,700
to the documentation. 
You would actually get a starter

783
00:36:44,700 --> 00:36:47,600
code in Python, very fairly 
simply like you can easily get 

784
00:36:47,600 --> 00:36:50,800
it and you can get started with 
Python, but it may not be as

785
00:36:50,800 --> 00:36:53,500
easy as Python in R to do
that. 

786
00:36:53,600 --> 00:36:55,200
One other thing. 
I mean like, this is again, 

787
00:36:55,200 --> 00:36:57,800
something, which I've been doing
a lot of and like so. 

788
00:36:57,800 --> 00:37:01,900
As part of my role as the head of
BI, I ended up having to build

789
00:37:01,900 --> 00:37:04,900
a lot of dashboards and reports.
I have, in my life,

790
00:37:04,900 --> 00:37:09,100
never figured out these sort of
BI packages like Tableau or

791
00:37:09,100 --> 00:37:12,300
Qlik or Power BI or any of
these; I have never figured out any

792
00:37:12,300 --> 00:37:13,800
of those. 
I probably have too much of a 

793
00:37:13,800 --> 00:37:18,000
control freak for that.
So I do things using Shiny, or

794
00:37:18,000 --> 00:37:21,200
like I kind of generate reports 
using R Markdown and things

795
00:37:21,200 --> 00:37:23,500
like that. 
So I actually have come

796
00:37:23,500 --> 00:37:26,900
across a few blog posts and
videos that you have made on

797
00:37:27,200 --> 00:37:29,500
Streamlit, which I think is a
Python-based

798
00:37:29,500 --> 00:37:32,600
sort of dashboard
creator. So can you talk a

799
00:37:32,600 --> 00:37:36,300
little bit about creating
dashboards, how that world works

800
00:37:36,300 --> 00:37:38,300
in both R and in Python? Or
rather,

801
00:37:38,300 --> 00:37:40,400
If we were to put it this way, 
dashboards

802
00:37:40,500 --> 00:37:43,500
without the sort of,
broadly speaking, that Tableau

803
00:37:43,500 --> 00:37:45,500
universe.
How do you, how do you program a

804
00:37:45,500 --> 00:37:49,100
dashboard? 
How easy is it in both R and

805
00:37:49,100 --> 00:37:52,200
Python, and where would you pick,
let's say, something like

806
00:37:52,207 --> 00:37:54,100
Streamlit,
and where you would pick 

807
00:37:54,100 --> 00:37:55,500
something like a shiny or 
something. 

808
00:37:56,900 --> 00:37:59,000
Yeah. 
Yeah, I have a strong opinion in

809
00:37:59,000 --> 00:38:01,600
this as well. 
So maybe I'll start with why you

810
00:38:01,600 --> 00:38:03,800
should use a programming 
language for a dashboard over 

811
00:38:03,800 --> 00:38:05,900
Tableau or Power
BI or any of these proprietary

812
00:38:05,900 --> 00:38:08,000
tools. Forget about the fact that
this is open source for a

813
00:38:08,008 --> 00:38:10,400
moment. 
Even, you know, if it's not, why

814
00:38:10,400 --> 00:38:12,100
do you think? 
Why do I think that you should 

815
00:38:12,100 --> 00:38:13,600
use? 
I think, the first most 

816
00:38:13,600 --> 00:38:16,000
important reason, again, I would
talk about technical debt.

817
00:38:16,700 --> 00:38:19,200
In Tableau,
if you see, you can have

818
00:38:19,200 --> 00:38:22,100
a table and you can make some 
changes in table. 

819
00:38:22,100 --> 00:38:25,100
Now, you can have another column
or some changes in the Tableau 

820
00:38:25,100 --> 00:38:28,500
when you import the data, and
third, you can have calculated

821
00:38:28,500 --> 00:38:30,700
fields, and fourth, you can have
aliases. 

822
00:38:31,200 --> 00:38:34,400
And when you do like this is the
under industry till the chart. 

823
00:38:34,400 --> 00:38:35,300
You could have changed the 
label. 

824
00:38:35,300 --> 00:38:37,100
Okay. 
Now, if somebody is coming to 

825
00:38:37,100 --> 00:38:40,300
debug, they will first see the 
sheet, and they have to go back to

826
00:38:40,300 --> 00:38:43,000
all these places to understand
what like, if I have to change 

827
00:38:43,000 --> 00:38:47,000
a field name. And this is, this
is a very difficult task. 

828
00:38:47,100 --> 00:38:49,500
And in terms of technical debt, 
it just gets accumulated and 

829
00:38:49,500 --> 00:38:52,400
accumulated, and accumulated. 
And, again, it's a, it's a very 

830
00:38:52,400 --> 00:38:56,700
bad thing for an organization. 
So, what R or Python offers

831
00:38:56,700 --> 00:38:59,500
is simple, right? 
So if my colleague has built a 

832
00:38:59,500 --> 00:39:02,100
dashboard. 
All I have to do is import the 

833
00:39:02,100 --> 00:39:04,800
code in RStudio or import the
code in Visual Studio Code.

834
00:39:04,900 --> 00:39:07,700
I can basically see where all 
these variables are, you know, 

835
00:39:07,700 --> 00:39:11,100
referenced, and then you
can start using it.

836
00:39:11,300 --> 00:39:14,300
So it is very straightforward. 
In terms of technical debt, in 

837
00:39:14,300 --> 00:39:17,400
terms of reusability. 
I have built something, you take

838
00:39:17,400 --> 00:39:19,600
it. 
You keep it for yourself, change

839
00:39:19,600 --> 00:39:21,200
the data source, change, the 
column names. 

840
00:39:21,200 --> 00:39:25,300
You have a dashboard up and 
running. It might look easier in the

841
00:39:25,300 --> 00:39:27,100
Tableau
universe, but because of all the

842
00:39:27,100 --> 00:39:29,900
complications that I said, it is
also not very straightforward. 

843
00:39:30,300 --> 00:39:35,600
I prefer Tableau at only one
place, which is a lot of CXO

844
00:39:35,600 --> 00:39:37,600
dashboards, right?
People at, you know, the higher

845
00:39:37,600 --> 00:39:40,700
level, they like things visually
appealing. 

846
00:39:40,800 --> 00:39:43,400
I still strongly believe a lot of
Shiny developers,

847
00:39:43,400 --> 00:39:47,300
a lot of Streamlit developers, do not
have enough tools to make their

848
00:39:47,300 --> 00:39:49,100
dashboards
look as good as Tableau

849
00:39:49,100 --> 00:39:52,100
dashboards. 
I should agree that, I think,

850
00:39:52,100 --> 00:39:55,600
R Shiny actually recently,
like a year back at least,

851
00:39:56,000 --> 00:39:59,000
started to give you a UI where
you can actually select the

852
00:39:59,000 --> 00:40:01,500
theme of the elements, like:
buttons should be like this,

853
00:40:01,600 --> 00:40:03,700
they should be like this, like a
dynamic theme creator.

854
00:40:03,700 --> 00:40:06,700
That was there in Shiny, but
still, I don't see a lot of lot 

855
00:40:06,700 --> 00:40:09,200
of shiny developers using it. 
So, most of the times, if you 

856
00:40:09,200 --> 00:40:13,000
see Shiny applications or Shiny
dashboards, it wouldn't look as 

857
00:40:13,000 --> 00:40:16,100
great as a tableau. 
So only for visually aesthetic 

858
00:40:16,100 --> 00:40:18,600
appearance, Tableau is fine, but
for everything else, I think 

859
00:40:18,600 --> 00:40:20,400
people should use either Shiny or
Streamlit.

860
00:40:20,700 --> 00:40:23,500
Okay, now coming back to shiny 
like, let's say, talk about the 

861
00:40:23,500 --> 00:40:26,200
R universe.
I think the R universe has something that is

862
00:40:26,200 --> 00:40:29,400
amazing, which is R Markdown,
especially if you want to

863
00:40:29,408 --> 00:40:31,800
build only a client-side 
dashboard. 

864
00:40:31,900 --> 00:40:33,500
You can use flexdashboard,
right? 

865
00:40:33,600 --> 00:40:34,900
So you don't need a server to 
host

866
00:40:34,900 --> 00:40:38,500
it. It's simply an HTML file that looks
like a dashboard. 

867
00:40:38,500 --> 00:40:41,500
So, to even demonstrate
this, I actually built a Kaggle

868
00:40:41,500 --> 00:40:42,500
notebook.
So, a Kaggle

869
00:40:42,500 --> 00:40:44,600
notebook is actually only a
client-side notebook. 

870
00:40:44,600 --> 00:40:47,100
But, ultimately it renders a 
dashboard at the front end.

871
00:40:47,200 --> 00:40:49,300
So flexdashboard is really,
really amazing. 

872
00:40:49,400 --> 00:40:51,100
If you do not want server-side 
interaction. 

873
00:40:51,100 --> 00:40:55,100
You should always opt for client...
sorry, flexdashboard, where you

874
00:40:55,100 --> 00:40:57,900
can just Share the HTML file. 
Anybody can open it, it would 

875
00:40:57,900 --> 00:41:00,600
open in a browser.
And if your organization can 

876
00:41:00,600 --> 00:41:02,200
afford RStudio Connect... RStudio
Connect:

877
00:41:02,200 --> 00:41:04,600
I've spoken to a
couple of people who used RStudio

878
00:41:04,600 --> 00:41:07,300
Connect; people have said
it's magical. You know, one click:

879
00:41:07,300 --> 00:41:09,900
you can just publish your report
into RStudio Connect.

880
00:41:09,900 --> 00:41:12,500
And then, you know, somebody can
make use of it. 

881
00:41:12,800 --> 00:41:15,200
So flexdashboard is a free
solution. 

882
00:41:15,300 --> 00:41:17,300
Now, coming to R Shiny, or
Shiny:

883
00:41:17,300 --> 00:41:19,400
this was the sole king for quite

884
00:41:19,400 --> 00:41:21,000
a long time before Streamlit came
in. 

885
00:41:21,000 --> 00:41:23,800
So I have made multiple 
presentations on why R Shiny is

886
00:41:23,800 --> 00:41:25,500
a very important tool 
unmatchable. 

887
00:41:25,800 --> 00:41:29,300
I think this was the time when 
slowly in Python Universe, 

888
00:41:29,300 --> 00:41:31,600
people started realizing that we
need some alternative. 

889
00:41:31,700 --> 00:41:35,600
So two powerful tools came, and I
would say one is Plotly Dash.

890
00:41:35,600 --> 00:41:38,600
Plotly,
as in this Plotly visualization

891
00:41:38,600 --> 00:41:40,200
company. 
They also pushed Dash. 

892
00:41:40,200 --> 00:41:42,000
And now they're pushing a lot of
Enterprise stuff. 

893
00:41:42,000 --> 00:41:44,600
Especially Dash pokers, and then
we have Streamlit.

894
00:41:45,100 --> 00:41:48,000
I still prefer Streamlit to
Plotly Dash.

895
00:41:48,000 --> 00:41:50,500
The reason I prefer Streamlit
is because I find Streamlit

896
00:41:50,500 --> 00:41:52,700
quite similar to how I used R
Shiny.

897
00:41:52,800 --> 00:41:55,300
It's not in the same way to 
write the code, but how simple 

898
00:41:55,300 --> 00:41:56,900
it is to, you know, spin up
something? 

899
00:41:57,100 --> 00:42:00,100
See the problem in these kind of
things is if you are a data 

900
00:42:00,100 --> 00:42:02,800
scientist, your focus is not to 
develop a full stack 

901
00:42:02,800 --> 00:42:04,400
application. 
That is something that a

902
00:42:04,408 --> 00:42:06,800
lot of people actually forget 
when they develop these kind of 

903
00:42:06,800 --> 00:42:09,500
applications. 
And a lot of companies actually 

904
00:42:09,500 --> 00:42:12,100
have a software developer or a 
data engineering team tied up

905
00:42:12,100 --> 00:42:13,600
with them.
So, as a typical data scientist,

906
00:42:13,600 --> 00:42:15,900
It's not necessarily every time 
you are going to develop a full 

907
00:42:15,900 --> 00:42:19,100
stack application.
But if there is a need for you 

908
00:42:19,100 --> 00:42:22,000
to develop realistic 
application, do you have tools 

909
00:42:22,200 --> 00:42:25,600
that have very little friction to
help you make like an MVP? 

910
00:42:25,800 --> 00:42:28,000
I would see this like more like 
an MVP. 

911
00:42:28,200 --> 00:42:30,500
So I think that's where 
Streamlit and Shiny

912
00:42:30,500 --> 00:42:33,100
actually do a great job, but
that doesn't mean you cannot 

913
00:42:33,100 --> 00:42:35,600
develop a full-stack production
grade application.

914
00:42:35,800 --> 00:42:38,000
Now, the question is, what is
production grade?

915
00:42:38,500 --> 00:42:41,500
Do I want to build the next 
Facebook using shiny or 

916
00:42:41,500 --> 00:42:43,100
Streamlit?
No, I would, of course, not,

917
00:42:43,100 --> 00:42:45,200
right. 
I would probably recruit 

918
00:42:45,200 --> 00:42:48,300
somebody who has a React
background, MERN stack, and

919
00:42:48,300 --> 00:42:52,400
then we'll let them develop it. 
But if I want to develop a full 

920
00:42:52,400 --> 00:42:56,000
stack Facebook, next Facebook, 
if I want to raise funding,

921
00:42:56,100 --> 00:42:58,100
if I want to pitch it in my
organization, let's say like I 

922
00:42:58,100 --> 00:43:00,700
want to give a tool to my 
internal, let's say some team as

923
00:43:00,700 --> 00:43:03,000
part of an initiative that like 
data science has figured out

924
00:43:03,000 --> 00:43:03,800
something. 
Now. 

925
00:43:03,800 --> 00:43:06,200
I don't have to find a resource 
to do this pilot. 

926
00:43:06,200 --> 00:43:08,500
I can myself develop this tool 
and then give it to them. 

927
00:43:09,000 --> 00:43:13,000
So it's not that you cannot 
develop production grade. 

928
00:43:13,000 --> 00:43:14,900
But what production grade is the
first question. 

929
00:43:15,000 --> 00:43:18,500
The second question is, you can 
still make these tools work

930
00:43:18,500 --> 00:43:20,400
much better than how most people
are developing. 

931
00:43:20,400 --> 00:43:23,400
If you see Shiny: two years back,
the RStudio conference focused

932
00:43:23,400 --> 00:43:27,300
solely on a lot of scalable
applications. There is a library

933
00:43:27,300 --> 00:43:31,100
by Colin Fay called golem, G-O-L-E-M.
golem is an amazing library

934
00:43:31,100 --> 00:43:34,100
that tries to bring some 
standardization recently. 

935
00:43:34,100 --> 00:43:35,900
I came across another library,
brochure.

936
00:43:35,900 --> 00:43:38,400
I think that is also by Colin
Fay, and that is also, again,

937
00:43:38,400 --> 00:43:40,700
trying to make more optimization
in this area. 

938
00:43:40,900 --> 00:43:44,900
So, yeah, to quickly sum up,
R Shiny and Streamlit are quite

939
00:43:44,900 --> 00:43:47,200
equivalent, or, from a
developer perspective,

940
00:43:47,200 --> 00:43:50,000
if you ask me, very similar. It's
just that they are available in different

941
00:43:50,000 --> 00:43:52,400
universes. But I
don't think there is anything

942
00:43:52,400 --> 00:43:55,100
like R Markdown in the Python
universe, and R Markdown is,

943
00:43:55,100 --> 00:43:57,300
like, really amazing.
Even if you are a software 

944
00:43:57,300 --> 00:43:59,600
engineer. I see.
Why don't you talk a little bit

945
00:43:59,600 --> 00:44:02,600
more about R Markdown?
I think there

946
00:44:02,600 --> 00:44:05,400
might be a lot of our listeners,
who may not be very familiar 

947
00:44:05,400 --> 00:44:08,300
with the tool. 
So, can you talk about how you

948
00:44:08,300 --> 00:44:10,100
use R
Markdown as sort of a

949
00:44:10,400 --> 00:44:13,200
client-side dashboarding?
Yeah. 

950
00:44:13,200 --> 00:44:16,700
So there is a very nice blog post
written by minimaxir (so we can

951
00:44:16,700 --> 00:44:19,500
link that in the show notes as
well) that compares R Notebook

952
00:44:19,500 --> 00:44:21,600
and Jupyter Notebook,
and that does a very good

953
00:44:21,600 --> 00:44:24,000
comparison. 
So now R Notebook is nothing

954
00:44:24,000 --> 00:44:26,100
but a slight extension of R
Markdown.

955
00:44:26,100 --> 00:44:28,100
So we'll just talk about R
Markdown alone.

956
00:44:28,400 --> 00:44:32,100
So R Markdown is this
script type that is

957
00:44:32,107 --> 00:44:36,100
available in R that lets you
write R code and also lets you

958
00:44:36,100 --> 00:44:38,800
write Markdown.
So R plus Markdown is what you

959
00:44:38,800 --> 00:44:43,500
call R Markdown, but that is like
an oversimplification.

960
00:44:43,700 --> 00:44:45,500
So what R Markdown
lets you do is that

961
00:44:45,500 --> 00:44:49,200
it primarily lets you create
like a lot of different formats 

962
00:44:49,200 --> 00:44:52,400
and it's not just you know, 
typical standard report that you

963
00:44:52,400 --> 00:44:54,700
can do in your company. 
For example, let's say, like you

964
00:44:54,700 --> 00:44:57,500
have a doc: you want to build
a simple analytics report

965
00:44:58,100 --> 00:45:00,100
and you want that report to be 
customized. 

966
00:45:00,200 --> 00:45:02,600
For example, like, let's say, it
could be like a stock market 

967
00:45:02,600 --> 00:45:04,600
report or
something, for example, and based

968
00:45:04,600 --> 00:45:07,200
on the stock market value,
you want to do some natural

969
00:45:07,200 --> 00:45:09,400
language change within the 
document and you want it to be 

970
00:45:09,400 --> 00:45:11,200
sent out to, you know, bunch of 
people every day. 

971
00:45:11,500 --> 00:45:14,000
Now, this is quite easy to 
implement it in our universe. 

972
00:45:14,000 --> 00:45:17,500
So you just need an R
Markdown, and that R Markdown and

973
00:45:17,600 --> 00:45:19,500
probably a package like
sendmail.

974
00:45:19,500 --> 00:45:22,000
And that R Markdown has R
script that gets customized.

975
00:45:22,000 --> 00:45:24,400
Like let's say maybe you're 
building a linear regression 

976
00:45:24,400 --> 00:45:25,600
model or you're building a 
cheap. 

977
00:45:25,700 --> 00:45:28,100
Model that will do some 
forecasting and all the results 

978
00:45:28,100 --> 00:45:30,200
are there,
visualized very nicely, and you send

979
00:45:30,200 --> 00:45:32,500
that email as PDF to the 
customer. 
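
[Editor's note: a minimal Python sketch of the kind of report described above, where the wording changes with the data before the document is sent out. It is not R Markdown. The ticker `ACME`, the prices, the 0.5% threshold and the function names are all hypothetical, and the emailing and PDF steps are left out.]

```python
from string import Template

# Report skeleton: fixed text with placeholders filled in per run.
REPORT = Template(
    "Daily report for $ticker\n"
    "Close: $close ($change_pct% vs. previous close)\n"
    "Summary: $ticker $verdict today.\n"
)


def describe(change_pct):
    """Pick the wording from the size of the move (threshold is an assumption)."""
    if change_pct > 0.5:
        return "rose"
    if change_pct < -0.5:
        return "fell"
    return "was roughly flat"


def render_report(ticker, prev_close, close):
    """Compute the daily change and fill the template with data-driven text."""
    change_pct = (close - prev_close) / prev_close * 100
    return REPORT.substitute(
        ticker=ticker,
        close=f"{close:.2f}",
        change_pct=f"{change_pct:+.2f}",
        verdict=describe(change_pct),
    )


report = render_report("ACME", 100.0, 103.0)
print(report)
```

[A scheduler would call `render_report` once a day with fresh numbers; the point is only that the prose in the document is computed from the data rather than typed by hand.]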

980
00:45:33,000 --> 00:45:35,200
Very simple. 
This is one of the use cases of 

981
00:45:35,200 --> 00:45:39,400
R Markdown. But leaving out
the organization part (the

982
00:45:39,400 --> 00:45:42,600
organization part is, like, you
build a client-side dashboard,

983
00:45:42,600 --> 00:45:45,000
you build an analytics report, you
build HTML pages,

984
00:45:45,000 --> 00:45:48,700
you build PDFs), leaving out
this part, R Markdown has still

985
00:45:48,700 --> 00:45:51,500
grown a lot.
You can publish blog posts using

986
00:45:51,500 --> 00:45:53,900
R Markdown.
You can write books using

987
00:45:53,900 --> 00:45:56,800
R Markdown.
A lot of people started writing

988
00:45:56,800 --> 00:45:59,100
books using R Markdown with
the library called bookdown;

989
00:45:59,500 --> 00:46:01,000
bookdown
is an extension of R

990
00:46:01,000 --> 00:46:03,700
Markdown.
So, the advantages, when you 

991
00:46:03,700 --> 00:46:06,800
like mini books, at least I see 
a couple of people I respect in 

992
00:46:06,800 --> 00:46:09,100
the industry. 
They are not very famous, like, but

993
00:46:09,100 --> 00:46:11,000
you see them actually writing a
mini book.

994
00:46:11,100 --> 00:46:13,100
So that mini book is like a
reference for me, as well as a

995
00:46:13,100 --> 00:46:15,100
reference for the entire
community as well.

996
00:46:15,300 --> 00:46:19,100
So the greatest thing
that I like about R Markdown

997
00:46:19,100 --> 00:46:20,900
is...
So this is a very classical

998
00:46:20,900 --> 00:46:23,900
example of once you give the 
tool to somebody how they 

999
00:46:23,900 --> 00:46:25,600
leverage it is something that you
cannot even imagine. 

1000
00:46:26,100 --> 00:46:28,700
Like I wouldn't imagine 
publishing a blog post using R

1001
00:46:28,700 --> 00:46:32,000
Markdown, but I'm doing it.
So blogdown is there, bookdown

1002
00:46:32,000 --> 00:46:36,000
is there, a lot of extensions
of R Markdown. But basically,

1003
00:46:36,000 --> 00:46:38,900
R Markdown is like one script,
where you can write markdown 

1004
00:46:38,900 --> 00:46:41,900
where you can write R and you 
can write anything and all and 

1005
00:46:41,900 --> 00:46:43,700
that gets rendered into some 
formats. 

1006
00:46:43,700 --> 00:46:45,900
And how you render it is what
the different parameters control.

1007
00:46:45,900 --> 00:46:48,100
Does it render like a book or 
blog post? 

1008
00:46:48,100 --> 00:46:51,900
Like a dashboard with, you know,
subtle changes in it, and that's

1009
00:46:51,900 --> 00:46:53,900
something that the Python community
definitely misses.

1010
00:46:54,900 --> 00:46:57,100
Correct, correct. 
Okay, so I think they're coming.

1011
00:46:57,300 --> 00:46:59,500
So I think, when I initially
planned this episode, I mean,

1012
00:46:59,500 --> 00:47:01,500
before I decided to go ahead and
have you on it,

1013
00:47:01,500 --> 00:47:05,300
I thought we should have a sort
of an R-versus-Python kind of

1014
00:47:05,300 --> 00:47:07,200
debate on this.
But I think what's happened now,

1015
00:47:07,200 --> 00:47:10,500
is like, you provided very nice 
intro into, like, what is it 

1016
00:47:10,500 --> 00:47:12,700
like to kind of use
both R and Python.

1017
00:47:12,700 --> 00:47:16,100
Because I see that you use both 
in your in your regular work. 

1018
00:47:16,400 --> 00:47:19,200
So can we talk about like, I 
mean, like, I mean, you just

1019
00:47:19,200 --> 00:47:21,700
spoke a little bit about this, 
but normally how you get the 

1020
00:47:21,700 --> 00:47:24,000
best of both worlds. 
I mean like if you if, you know,

1021
00:47:24,000 --> 00:47:26,500
both R and Python,
like, how do you sort of, like,

1022
00:47:26,700 --> 00:47:28,400
harness this?
Like I mean, you were talking 

1023
00:47:28,400 --> 00:47:30,900
about how you do your EDA in
R and then the modeling in

1024
00:47:30,900 --> 00:47:33,600
Python, but my thought 
immediately was like, okay, how 

1025
00:47:33,600 --> 00:47:35,300
do you transfer the data from 
one to the other? 

1026
00:47:35,300 --> 00:47:38,400
Like, I mean, I can't imagine 
writing a CSV or something as a

1027
00:47:38,900 --> 00:47:41,900
go-between and so on. 
So how do you integrate

1028
00:47:41,900 --> 00:47:46,500
these two into your workflow to
kind of become an even

1029
00:47:46,500 --> 00:47:49,700
better data scientist? 
Yeah, so I have actually called 

1030
00:47:49,700 --> 00:47:53,200
this as a superpower previously,
in couple of my talks where you 

1031
00:47:53,200 --> 00:47:55,200
can combine, you know, it's 
like, I don't know if you have 

1032
00:47:55,200 --> 00:47:57,800
seen seen superheroes. 
Like they usually you have like 

1033
00:47:57,800 --> 00:47:59,300
combination. 
Like, if you see Justice League,

1034
00:47:59,300 --> 00:48:01,700
they have like multiple Powers 
together, and they form a league. In the

1035
00:48:01,700 --> 00:48:03,600
same way,
R has some superpower, Python

1036
00:48:03,600 --> 00:48:06,000
has a superpower.
And if you can combine both of 

1037
00:48:06,000 --> 00:48:07,800
this, it's like a massive super 
power. 

1038
00:48:07,800 --> 00:48:09,700
And that is what reticulate 
library is for.

1039
00:48:10,000 --> 00:48:12,900
R has a very nice library
called reticulate that lets you 

1040
00:48:12,900 --> 00:48:16,200
combine both the worlds, and like
I said at the start, right? 

1041
00:48:16,200 --> 00:48:19,100
So after reticulate came in, you
actually started seeing a lot of

1042
00:48:19,100 --> 00:48:23,000
Python libraries coming into the R
world just by using binding. 

1043
00:48:23,100 --> 00:48:24,500
So at the start like we said, 
right? 

1044
00:48:24,700 --> 00:48:27,800
You know, if you are bilingual, 
it is better to use that

1045
00:48:27,800 --> 00:48:31,400
language itself. 
So now the question is, you 

1046
00:48:31,400 --> 00:48:33,700
don't want to use binding, but 
if you want to use the same 

1047
00:48:33,700 --> 00:48:37,000
language, then you can write 
Python code inside

1048
00:48:37,000 --> 00:48:40,300
RStudio using reticulate, and
you can interchange objects 

1049
00:48:40,300 --> 00:48:42,700
between both the languages. 
Let me give you a very simple 

1050
00:48:42,700 --> 00:48:46,200
example, you know, let's say I 
want to do NLP, okay, natural

1051
00:48:46,200 --> 00:48:49,300
language processing and I 
strongly think natural language 

1052
00:48:49,300 --> 00:48:52,000
processing, couple of libraries 
like spaCy, Hugging Face

1053
00:48:52,000 --> 00:48:54,400
Transformers are like super
awesome in natural language

1054
00:48:54,600 --> 00:48:56,300
processing.
Even if I have tidytext, I would

1055
00:48:56,300 --> 00:48:59,000
probably prefer these libraries
for me to do natural language 

1056
00:48:59,000 --> 00:49:01,200
processing. 
But before I do natural language

1057
00:49:01,200 --> 00:49:04,000
processing, I want to import 
data. Maybe, like you said, I

1058
00:49:04,008 --> 00:49:06,900
would probably use dbplyr.
I want to,

1059
00:49:06,900 --> 00:49:11,000
I want to do some EDA, like
unigrams, bigrams, or whatever. I

1060
00:49:11,000 --> 00:49:12,400
want to do some kind of 
cleaning. 

1061
00:49:12,400 --> 00:49:15,000
I would probably use tidytext
to do cleaning because

1062
00:49:15,100 --> 00:49:17,200
tidytext has a very English-like
syntax, right?

1063
00:49:17,200 --> 00:49:19,100
When you compare it with NLTK
or something,

1064
00:49:19,200 --> 00:49:22,100
It's a very English syntax. 
So, after I do all these things 

1065
00:49:22,800 --> 00:49:26,600
now, I would take this data. 
And immediately, in the same code, I

1066
00:49:26,607 --> 00:49:29,600
would pass it on to my python 
environment and all these other.

1067
00:49:29,600 --> 00:49:31,800
Within the same code, if you use
reticulate, everything is in the

1068
00:49:31,800 --> 00:49:35,600
same code and now within python,
I would probably, let's say, do POS

1069
00:49:35,600 --> 00:49:38,800
tagging, NER (named entity
recognition), and probably, let's

1070
00:49:38,800 --> 00:49:42,200
say, like, even I'm going to use
scikit-learn and XGBoost to build

1071
00:49:42,200 --> 00:49:44,800
a text classification model. 
Now I have the result. 

1072
00:49:44,900 --> 00:49:48,300
I can interpret it. 
So in one code, I basically used

1073
00:49:48,300 --> 00:49:52,000
R and Python, and basically managed
to use best of both the worlds. 

1074
00:49:52,300 --> 00:49:54,800
And this is simply possible
only because I'm using

1075
00:49:54,800 --> 00:49:59,100
reticulate. reticulate handles
this: I have a data frame in

1076
00:49:59,100 --> 00:50:02,900
R; how should it be
in the Python world?

1077
00:50:03,000 --> 00:50:04,800
I don't have to be worried about
it. 

1078
00:50:04,900 --> 00:50:08,500
reticulate has a simple map, like
a table that would say, okay, if

1079
00:50:08,500 --> 00:50:12,500
you have a list in Python, it's 
a, it's a, it's a vector in R. 

1080
00:50:12,500 --> 00:50:15,300
So they just give you the 
mapping to understand for you, 

1081
00:50:15,300 --> 00:50:17,500
to mentally, understand. 
What is this object here? 

1082
00:50:17,500 --> 00:50:20,200
And what is this object here? 
But other than that, you as a 

1083
00:50:20,207 --> 00:50:21,700
developer, do not have to do 
anything. 

1084
00:50:21,700 --> 00:50:25,000
Just basically have the 
installation and reticulate also

1085
00:50:25,000 --> 00:50:27,800
lets you call a
specific Python version.

1086
00:50:27,800 --> 00:50:29,700
What does it mean? 
Let's say the initial problem 

1087
00:50:29,700 --> 00:50:30,700
that I talked about. 
Right? 

1088
00:50:30,700 --> 00:50:32,500
Let's say you have a conda
environment and you have a bunch

1089
00:50:32,500 --> 00:50:34,700
of libraries there or let's say 
you have created a virtual 

1090
00:50:34,700 --> 00:50:36,700
environment. 
Where I think that is quite 

1091
00:50:36,700 --> 00:50:39,600
common with python developers to
create virtual environment to 

1092
00:50:39,600 --> 00:50:42,100
keep the libraries only within
the environment. reticulate

1093
00:50:42,100 --> 00:50:44,000
lets you specifically
call only

1094
00:50:44,000 --> 00:50:46,900
that environment, use libraries 
from that environment, keep your

1095
00:50:46,900 --> 00:50:48,600
entire session within that
environment. 

1096
00:50:48,700 --> 00:50:52,100
And the greatest thing about 
reticulate, again, is reticulate

1097
00:50:52,100 --> 00:50:56,400
can be combined with R Shiny
and R Markdown as well.

1098
00:50:56,700 --> 00:51:00,000
Which means I can now finally 
create a report, send an email 

1099
00:51:00,000 --> 00:51:02,400
also, which uses both R and 
python code. 

1100
00:51:03,500 --> 00:51:05,400
All right, okay.
That's, that's very interesting.

1101
00:51:05,400 --> 00:51:08,900
I personally am aware of
the existence of reticulate, but

1102
00:51:09,100 --> 00:51:12,800
I haven't really used it, though
I'm oftentimes finding cases where I

1103
00:51:12,808 --> 00:51:16,400
have to use it, because I run a 
team, which sort of which works 

1104
00:51:16,400 --> 00:51:19,500
both in R and in Python.
So at some stage we'll have to

1105
00:51:19,500 --> 00:51:21,300
integrate our code and things 
like that. 

1106
00:51:21,300 --> 00:51:25,700
So we will have to do that.
So by the end of the super, as 

1107
00:51:25,700 --> 00:51:28,900
we were sort of talking about 
like reticulate, R and Python

1108
00:51:28,900 --> 00:51:31,000
directly, one of the other 
things is like, what about other

1109
00:51:31,000 --> 00:51:32,400
languages? 
Like, for example, I think it was

1110
00:51:32,400 --> 00:51:35,000
right in the beginning, we
spoke a little bit about Excel. 

1111
00:51:35,000 --> 00:51:38,100
So can we talk about a little 
bit of how do you interface 

1112
00:51:38,100 --> 00:51:40,000
between? 
Let's say R and Python on one

1113
00:51:40,000 --> 00:51:42,900
side and Excel on the other. 
What do you use each one for?

1114
00:51:43,200 --> 00:51:45,900
And also I don't know if you 
have any experience with

1115
00:51:46,100 --> 00:51:49,400
Julia, which I think is, again,
becoming popular with a lot of

1116
00:51:49,500 --> 00:51:52,500
data scientists.
Yeah, okay. 

1117
00:51:52,500 --> 00:51:55,200
So starting with Julia, so I 
tried a little bit. 

1118
00:51:55,200 --> 00:51:58,800
I, at least if you ask me, 
personally, I'm more of an 

1119
00:51:58,800 --> 00:52:02,200
applied person. 
So, for my, whatever I do is 

1120
00:52:02,200 --> 00:52:06,200
mostly driven by business 
question, so I still don't see, 

1121
00:52:06,200 --> 00:52:08,500
you know, Julia fitting in my 
work line anytime soon. 

1122
00:52:08,500 --> 00:52:09,900
Probably. 
Like, next, 2-3 years. 

1123
00:52:09,900 --> 00:52:12,800
I think R and Python would
completely keep me safe. 

1124
00:52:12,800 --> 00:52:14,600
At least from the job market or 
whatever. 

1125
00:52:14,600 --> 00:52:16,700
It is, right, for the next
half a decade. 

1126
00:52:16,700 --> 00:52:19,700
So that way I don't use Julia.
And Julia's pitch, if you see,

1127
00:52:19,700 --> 00:52:21,300
their pitch is to solve the
two-language problem.

1128
00:52:21,800 --> 00:52:24,300
Like one language for prototype,
one language for production.

1129
00:52:24,300 --> 00:52:28,500
So I think Julia would be more 
applicable for people who are 

1130
00:52:28,500 --> 00:52:31,900
facing the two-language problem.
So let's say, like, if I'm an IoT

1131
00:52:31,900 --> 00:52:34,100
developer or something.
I mean, edge computing.

1132
00:52:34,100 --> 00:52:36,600
I develop something in Python. 
Then I have to translate it to 

1133
00:52:36,700 --> 00:52:39,200
into, you know, like, that
device's code.

1134
00:52:39,200 --> 00:52:41,500
Maybe Julia will help me. 
But again, if you actually see 

1135
00:52:41,500 --> 00:52:45,700
TensorFlow ecosystem, PyTorch
ecosystem, a lot of these tools

1136
00:52:45,700 --> 00:52:48,100
are also developing tools to 
help you stick to python 

1137
00:52:48,100 --> 00:52:51,400
ecosystem and then export,
you know, the model into the edge

1138
00:52:51,400 --> 00:52:52,600
computing
or that side of the

1139
00:52:52,600 --> 00:52:54,800
world. 
So both are actually growing a 

1140
00:52:54,800 --> 00:52:57,400
lot. 
So I don't see, you know, Julia 

1141
00:52:57,400 --> 00:53:00,300
being a main tool for a typical 
data scientist who's like me 

1142
00:53:00,400 --> 00:53:03,600
using it; maybe separately for
machine learning engineers. So

1143
00:53:03,800 --> 00:53:06,900
that is about Julia. And I'm
sorry. 

1144
00:53:06,900 --> 00:53:08,200
Sorry. 
What is the first one? 

1145
00:53:08,500 --> 00:53:12,500
Yeah, so I think Excel is good. 
If you, if you accept, if you 

1146
00:53:12,500 --> 00:53:15,400
are not using dates, so you 
would probably have seen a lot of

1147
00:53:15,400 --> 00:53:18,200
memes around Excel dates. Excel is
painful.

1148
00:53:18,200 --> 00:53:20,400
Many times. 
I would again put Excel like

1149
00:53:20,400 --> 00:53:22,000
Tableau.
The problem that I face in

1150
00:53:22,000 --> 00:53:25,000
Excel is reusability.
A lot of things that you do in

1151
00:53:25,000 --> 00:53:26,500
Excel. 
Like if I have to redo it again,

1152
00:53:26,500 --> 00:53:29,600
it's very difficult for me. 
But again, there are places 

1153
00:53:29,600 --> 00:53:31,400
where I would strongly prefer 
Excel first. 

1154
00:53:31,400 --> 00:53:33,600
The first point is like you 
said, right? 

1155
00:53:33,800 --> 00:53:37,600
If I have to do df.head() in
pandas or R, the data frame head,

1156
00:53:37,900 --> 00:53:39,500
like top ten rows. 
I want to see. 

1157
00:53:39,800 --> 00:53:43,700
I think I can make better
use of my same time. 

1158
00:53:43,700 --> 00:53:46,400
If I do the same thing on Excel,
right? 

1159
00:53:46,400 --> 00:53:49,200
So, if I have to see a part of 
the data, like you said, right, 

1160
00:53:49,200 --> 00:53:51,100
conditional
formatting. I'm known for doing

1161
00:53:51,100 --> 00:53:52,700
a lot of
conditional formatting and then,

1162
00:53:52,700 --> 00:53:55,700
you know, making colorful colors
but it actually gives you a lot 

1163
00:53:55,700 --> 00:53:58,400
of nuances that you might
otherwise miss when you simply 

1164
00:53:58,400 --> 00:54:01,100
look at, you know, the plain
screen that you use R or

1165
00:54:01,100 --> 00:54:03,600
Python for.
So there are places where I have

1166
00:54:03,600 --> 00:54:06,600
I want to look at data. 
Simple pivoting is sometimes, you

1167
00:54:06,607 --> 00:54:08,400
know, much, much easier to do
in Excel.

1168
00:54:08,400 --> 00:54:11,300
So we all have to accept, right?
If the data is not very big I 

1169
00:54:11,300 --> 00:54:13,200
can just use pivot, make a 
simple chart. 

1170
00:54:13,200 --> 00:54:17,200
So Excel is so useful in that 
these days I find making charts 

1171
00:54:17,200 --> 00:54:20,200
very thematic using Excel too.
Okay, whenever I say Excel, it's

1172
00:54:20,300 --> 00:54:21,500
either G Sheet or Excel.
I don't

1173
00:54:21,700 --> 00:54:24,600
differentiate between
spreadsheet-based tools. So

1174
00:54:24,800 --> 00:54:28,100
making charts that are
quite fitting in my organization's

1175
00:54:28,100 --> 00:54:30,800
theme or the theme that I want
to produce, pretty minimalistic

1176
00:54:30,800 --> 00:54:33,700
charts, and maybe it's because I
didn't put much effort in, you 

1177
00:54:33,700 --> 00:54:36,600
know, creating a theme myself. 
But yeah, Excel, Excel.

1178
00:54:36,600 --> 00:54:39,100
I started finding it
easier.

1179
00:54:39,100 --> 00:54:40,500
And so these are these are 
certain places. 

1180
00:54:40,500 --> 00:54:44,000
But again, I would strongly 
discourage using Excel in data 

1181
00:54:44,000 --> 00:54:46,200
science workflow.
The reason I would say is still 

1182
00:54:46,200 --> 00:54:48,200
a lot of people. 
I see them using, you know, like

1183
00:54:48,200 --> 00:54:51,200
Excel library in R or like pandas
read_excel to read

1184
00:54:51,200 --> 00:54:53,000
XLS files.
I think we should try to

1185
00:54:53,000 --> 00:54:57,100
keep it as clean as possible.
Which means like we should deal 

1186
00:54:57,100 --> 00:54:59,300
mostly with the CSV. 
We shouldn't, you know, get into

1187
00:54:59,300 --> 00:55:02,000
Excel world because that world 
is really messy in terms of 

1188
00:55:02,100 --> 00:55:03,800
putting something inside your 
pipeline. 

1189
00:55:03,800 --> 00:55:05,800
If something goes wrong, for 
example, let's say there is a G

1190
00:55:05,800 --> 00:55:08,700
Sheet and the output of G Sheet
is what you are using inside. 

1191
00:55:09,500 --> 00:55:11,400
your, like, pandas workflow.
Now,

1192
00:55:11,400 --> 00:55:15,100
the problem is that G Sheet
has different authorization.

1193
00:55:15,200 --> 00:55:16,800
If somebody leaves the 
organization, right? 

1194
00:55:16,800 --> 00:55:19,500
What do you do with the G Sheet?
So the, like, the more, you know,

1195
00:55:19,500 --> 00:55:21,600
proprietary solutions you get
into your data

1196
00:55:21,700 --> 00:55:24,600
workflow, I think the messier
it becomes; the technical debt

1197
00:55:24,600 --> 00:55:26,600
increases. 
So keeping it simple is

1198
00:55:26,600 --> 00:55:28,500
cool.
Like for an automated workflow. 

1199
00:55:28,500 --> 00:55:31,300
I think we should keep it as 
simple as possible as, you know,

1200
00:55:31,300 --> 00:55:33,600
standardized as possible:
R or Python, CSV.

1201
00:55:34,000 --> 00:55:36,500
But if you want to use Excel, 
these are the cases I would use 

1202
00:55:36,500 --> 00:55:38,500
Excel for. 
This has been great.

1203
00:55:38,500 --> 00:55:40,700
It's been a really
fascinating conversation. And

1204
00:55:40,700 --> 00:55:43,000
to end,
I mean, this is the question that I

1205
00:55:43,100 --> 00:55:45,500
ask in most of my
conversations.

1206
00:55:45,500 --> 00:55:48,300
So let's say if you are, if you 
are kind of an aspiring data 

1207
00:55:48,300 --> 00:55:51,000
scientist and from the 
programming perspective lately, 

1208
00:55:51,000 --> 00:55:53,800
how would you? 
Or let's say you have seen

1209
00:55:53,800 --> 00:55:56,400
some sort of very interesting 
statistics in yourself or like 

1210
00:55:56,500 --> 00:55:59,300
some sort of interested in 
looking at numbers and so on and

1211
00:55:59,300 --> 00:56:02,000
you want to be a data scientist.
So how do you approach it from 

1212
00:56:02,000 --> 00:56:04,500
the programming perspective? 
Like, what do you learn?

1213
00:56:04,500 --> 00:56:08,100
How do you sequence
it? And as you get into the work

1214
00:56:08,100 --> 00:56:11,300
environment? 
How do you navigate between the 

1215
00:56:11,500 --> 00:56:13,500
different programming languages 
that are available? 

1216
00:56:14,400 --> 00:56:15,900
Yeah. 
Yeah, the the very first thing 

1217
00:56:15,900 --> 00:56:18,700
that I tell everybody is don't 
get into language wars, like

1218
00:56:18,700 --> 00:56:21,600
people who are taking part
in language wars or, you know,

1219
00:56:21,600 --> 00:56:24,000
like language fanatics.
You shouldn't be a

1220
00:56:24,000 --> 00:56:27,200
language fanatic. You might like both
the languages, you might like one

1221
00:56:27,200 --> 00:56:29,200
language, but just don't get
into language wars.

1222
00:56:29,300 --> 00:56:31,600
It doesn't help anybody and 
people who are doing it. 

1223
00:56:31,600 --> 00:56:33,900
They are already professionals. 
Probably they have they have 

1224
00:56:33,900 --> 00:56:35,400
achieved something. 
So, if you're a beginner and 

1225
00:56:35,400 --> 00:56:37,700
when you get into language wars,
you don't get anything out of

1226
00:56:37,700 --> 00:56:40,500
it, you're basically losing out 
a super power that is available 

1227
00:56:40,500 --> 00:56:42,400
in some other language. 
That's the first thing. 

1228
00:56:42,600 --> 00:56:45,000
The second thing is. 
There is no point in, you know, 

1229
00:56:45,000 --> 00:56:47,400
trying to figure out which is 
better or not. 

1230
00:56:47,700 --> 00:56:50,000
Just get started with some 
language, whether it is R or

1231
00:56:50,000 --> 00:56:51,800
Python.
So if you want some clue or 

1232
00:56:51,800 --> 00:56:54,100
guidance, I would say, like if 
you are not from engineering 

1233
00:56:54,100 --> 00:56:56,800
background, pick R. If you are
from engineering background,

1234
00:56:56,800 --> 00:56:59,400
pick Python, just purely
because, you know, you will have

1235
00:56:59,400 --> 00:57:01,800
a community. Learning something
with the community is much

1236
00:57:01,800 --> 00:57:03,400
better than learning something
individually.

1237
00:57:03,600 --> 00:57:07,900
So pick R or Python: if you
are not from engineering, pick R;

1238
00:57:07,900 --> 00:57:11,000
if you are from engineering, pick
Python. And after you pick that,

1239
00:57:11,000 --> 00:57:13,800
I would strongly suggest first 
to go through the language

1240
00:57:14,000 --> 00:57:16,400
basics. This is again a mistake
that I've seen a lot of data

1241
00:57:16,400 --> 00:57:18,900
science enthusiasts
doing: just straight away

1242
00:57:18,900 --> 00:57:20,900
get started with machine
learning. Don't do that.

1243
00:57:20,900 --> 00:57:22,800
Just get started with the
language basics.

1244
00:57:23,000 --> 00:57:25,300
Understand the data types, 
understand, you know, how to 

1245
00:57:25,300 --> 00:57:28,800
write a for loop, how to write if-else
in whatever language

1246
00:57:28,800 --> 00:57:30,600
that you have picked.
I think, like you said, right,

1247
00:57:30,600 --> 00:57:33,100
these things come very handy, 
probably sometime later in your 

1248
00:57:33,100 --> 00:57:34,500
career. 
You don't want to be in your 

1249
00:57:34,500 --> 00:57:36,300
later, in your career at the 
time thinking. 

1250
00:57:36,700 --> 00:57:38,900
How do I write if-else?
How do I write if-else?

1251
00:57:39,500 --> 00:57:41,500
I mean, that is a very basic part
of the language, right? 

1252
00:57:41,500 --> 00:57:43,500
You're not just learning the 
language for data science.

1253
00:57:43,800 --> 00:57:44,900
You're actually learning the 
language. 

1254
00:57:44,900 --> 00:57:47,700
It's not like I'm asking you to 
do DevOps with Python, but at

1255
00:57:47,700 --> 00:57:49,900
least you should be able to read
Python code and make some

1256
00:57:49,900 --> 00:57:52,300
sense out of it. 
Like, what does it mean to learn

1257
00:57:52,300 --> 00:57:54,900
the basics of language? 
Once you learn the basics of 

1258
00:57:54,900 --> 00:57:58,500
language, get started with basic
data manipulation: in R, basic

1259
00:57:58,500 --> 00:58:01,800
tidyverse, basic ideas, especially,
especially dplyr and

1260
00:58:01,800 --> 00:58:04,300
tidyr, and in Python, pandas,
NumPy.

1261
00:58:04,300 --> 00:58:07,100
And after you do that, I would 
suggest immediately get started 

1262
00:58:07,100 --> 00:58:09,100
with data visualization, not 
machine learning. 

1263
00:58:09,100 --> 00:58:11,500
And the reason I'm saying, is 
from my teaching experience. 

1264
00:58:11,500 --> 00:58:13,400
I've seen, when people get
into

1265
00:58:13,900 --> 00:58:15,800
machine learning,
it just goes deeper and deeper.

1266
00:58:15,800 --> 00:58:18,000
It's like a rabbit hole. 
You can never get out of it 

1267
00:58:18,000 --> 00:58:20,100
because now you will try to 
learn

1268
00:58:20,100 --> 00:58:21,500
linear regression, logistic
regression.

1269
00:58:21,500 --> 00:58:22,800
Then you start going into 
decision tree. 

1270
00:58:22,800 --> 00:58:25,900
So it's like a big rabbit hole. 
We as human beings. 

1271
00:58:25,900 --> 00:58:28,800
We need some gratification. 
So visualizations are very good 

1272
00:58:28,800 --> 00:58:31,300
way to give you gratification 
when you are trying to learn. So

1273
00:58:31,300 --> 00:58:33,400
get started with a visualization
library.

1274
00:58:33,600 --> 00:58:36,200
Start from, you know, a static
visualization library, move into

1275
00:58:36,200 --> 00:58:37,700
an interactive
visualization library.

1276
00:58:38,000 --> 00:58:41,100
Now, switch back to machine 
learning. Start from, you know,

1277
00:58:41,100 --> 00:58:44,300
classical algorithms, go deep,
and then learn full-stack

1278
00:58:44,400 --> 00:58:47,500
development. 
Also, I think we are in 2021 and

1279
00:58:47,500 --> 00:58:49,200
we, as data scientists,
we should be,

1280
00:58:49,200 --> 00:58:52,500
we should, we should have the
ability to build at least an 

1281
00:58:52,500 --> 00:58:54,600
MVP. 
If you are in R, learn R

1282
00:58:54,600 --> 00:58:57,500
Markdown and R Shiny.
If you are in Python, learn 

1283
00:58:57,500 --> 00:59:00,500
either Dash or Streamlit. I have
a slightly strong preference for

1284
00:59:00,500 --> 00:59:02,400
Streamlit.
I think this would make you,

1285
00:59:02,500 --> 00:59:04,600
this would make you really
competitive in the market.

1286
00:59:04,800 --> 00:59:07,300
And this is, let's say, you get 
the job. 

1287
00:59:07,400 --> 00:59:09,500
And after you get the job. 
I think your primary duty is to 

1288
00:59:09,500 --> 00:59:12,100
go through documentation and 
understand how people have

1289
00:59:12,100 --> 00:59:14,700
written code. 
The internet is like an amazing

1290
00:59:14,700 --> 00:59:16,000
place.
You have a lot of open source

1291
00:59:16,000 --> 00:59:17,500
code.
I think at this point, we should

1292
00:59:17,500 --> 00:59:21,000
stop watching tutorials. I mean, I
make tutorials, but I still say, I

1293
00:59:21,008 --> 00:59:23,800
think, at this point, we should 
stop watching tutorials and 

1294
00:59:23,800 --> 00:59:26,000
start reading
somebody else's code. Spend more

1295
00:59:26,000 --> 00:59:28,600
time on Stack Overflow. I think that
will make you an intermediate or

1296
00:59:28,600 --> 00:59:53,300
better programmer from me. 
Thank you for listening to Data

1297
00:59:53,300 --> 00:59:55,400
Chatter.
If you like the show. 

1298
00:59:55,800 --> 00:59:59,100
Please leave a comment, share 
and subscribe to the podcast. 

1299
00:59:59,500 --> 01:00:03,300
You can find this podcast on 
Apple podcast Spotify or 

1300
01:00:03,300 --> 01:00:05,700
wherever else you go to get your
podcasts. 

1301
01:00:06,400 --> 01:00:09,500
Once again, this is Karthik
Shashidhar. Thank you.

