1
00:00:00,000 --> 00:00:04,200
Today we are discussing data at 
Goldman Sachs. 

2
00:00:04,600 --> 00:00:08,900
We're speaking with Neema
Raphael, the firm's head of data

3
00:00:08,900 --> 00:00:11,700
engineering and their chief data
officer. 

4
00:00:11,700 --> 00:00:15,300
I've been at Goldman about 20
years, always as a software 

5
00:00:15,300 --> 00:00:20,400
engineer, data engineer, or
quant, or as we call them,

6
00:00:20,400 --> 00:00:23,200
strats.
And now, as you mentioned, I run

7
00:00:23,200 --> 00:00:25,900
our data team, or data
engineering team, and am chief data

8
00:00:25,900 --> 00:00:27,800
officer for Goldman.
Data engineering at

9
00:00:27,800 --> 00:00:32,200
Goldman is probably a little
bit different than other

10
00:00:32,200 --> 00:00:36,400
places. 
I sort of organize my brain

11
00:00:36,400 --> 00:00:39,200
and my team in sort of three
buckets. 

12
00:00:39,200 --> 00:00:42,500
We have our platform team, which
hopefully, I'll have a little 

13
00:00:42,500 --> 00:00:46,300
more time to talk about and our 
curation, our content and 

14
00:00:46,300 --> 00:00:51,000
curation team. 
And also as a chief data 

15
00:00:51,000 --> 00:00:54,200
officer, with my chief data 
officer hat on sort of our 

16
00:00:54,500 --> 00:00:57,800
governance and quality teams. 
So that's sort of the

17
00:00:57,800 --> 00:01:03,400
background of the work we do. 
Data is so obviously important 

18
00:01:03,400 --> 00:01:08,100
to financial services but it 
would be really interesting to 

19
00:01:08,100 --> 00:01:14,500
hear your perspective on the 
role of data how data fits into 

20
00:01:14,500 --> 00:01:18,100
the world of Goldman Sachs.
Information is the lifeblood of

21
00:01:18,100 --> 00:01:20,700
financial services and people 
say well what is that? 

22
00:01:20,900 --> 00:01:24,700
And I think, you know, in
financial services, at Goldman

23
00:01:24,700 --> 00:01:30,400
of course, you know, information
is actually our currency. So

24
00:01:30,400 --> 00:01:34,900
a lot of decision
making, a lot of helping our

25
00:01:34,900 --> 00:01:40,600
clients, a lot of innovation is 
all based on what information we

26
00:01:40,600 --> 00:01:44,600
see out in the world and what 
information we see internally. 

27
00:01:46,400 --> 00:01:49,700
And so organizing that,
making sure it's readily 

28
00:01:49,700 --> 00:01:54,400
accessible, making sure people 
could get to that data quickly 

29
00:01:55,400 --> 00:01:59,500
to do their work or to
help their clients

30
00:02:00,200 --> 00:02:05,100
starts becoming sort of like a
core component of doing

31
00:02:05,100 --> 00:02:08,500
business at Goldman Sachs.
You mentioned the various

32
00:02:08,500 --> 00:02:11,600
aspects of your role, how you 
break things up. 

33
00:02:11,600 --> 00:02:13,800
Do you want to dive into that
and tell us a little bit more

34
00:02:13,800 --> 00:02:16,100
about that?
The first thing I think about 

35
00:02:16,100 --> 00:02:20,100
data is that it has to be a 
first-class asset,

36
00:02:20,500 --> 00:02:23,100
right? At Goldman,
people have to believe that it

37
00:02:23,100 --> 00:02:26,300
could help solve problems,
help our clients, help us

38
00:02:26,300 --> 00:02:31,800
innovate in financial services.
So the first thing we sort

39
00:02:31,800 --> 00:02:35,700
of did is, okay, how do we want
to work with data at Goldman? And

40
00:02:35,700 --> 00:02:39,600
again, as an engineer, that
always pushed me to sort of think

41
00:02:39,600 --> 00:02:44,200
about data as an engineer
would think about their code. 

42
00:02:44,200 --> 00:02:47,000
So the analogy I always use,
or maybe it's not even an analogy,

43
00:02:47,000 --> 00:02:51,400
but the way I always talk about 
data is like, you got to think 

44
00:02:51,400 --> 00:02:53,800
about data, just like you were 
thinking about your code and 

45
00:02:53,800 --> 00:02:57,500
that sort of has pushed us to
build a platform team

46
00:02:58,200 --> 00:03:02,000
that really thinks about that,
thinks about data engineering as

47
00:03:02,000 --> 00:03:05,200
an engineering function,
thinks about the workflows of

48
00:03:05,200 --> 00:03:10,200
data engineers, how we could
help give leverage and scale to

49
00:03:10,200 --> 00:03:13,000
our businesses so that they 
could start making decisions 

50
00:03:13,000 --> 00:03:17,200
better, faster, cheaper, easier.
And so the platform team on my

51
00:03:17,200 --> 00:03:20,900
team is really laser-focused on
that, right? 

52
00:03:21,000 --> 00:03:24,200
How do you make data engineering
workflows just seamless,

53
00:03:24,400 --> 00:03:29,600
beautiful, and all the things
that, you know, developers are

54
00:03:29,600 --> 00:03:33,800
used to, right?
IDEs, code completion,

55
00:03:35,300 --> 00:03:39,000
making sure that we have, like,
reproducibility of our data

56
00:03:39,000 --> 00:03:43,500
like we have for code, things like that. So
all of that sort of becomes

57
00:03:43,500 --> 00:03:48,500
their first-class, like, role:
that team is to only think

58
00:03:48,500 --> 00:03:53,400
about how do we make data
awesome, working with data awesome.

59
00:03:54,500 --> 00:03:58,800
And then we have a content team.
So again, like I said, a lot of

60
00:03:58,800 --> 00:04:03,100
our workflows at Goldman
run on data. 

61
00:04:03,100 --> 00:04:07,800
So you can imagine, you know, 
part of our team is a real-time 

62
00:04:07,800 --> 00:04:09,900
market data team.
So all the data

63
00:04:09,900 --> 00:04:12,700
streaming from exchanges or
different venues, things like

64
00:04:12,700 --> 00:04:15,600
that in real time, super low 
latency. 

65
00:04:15,600 --> 00:04:19,899
You know, we're talking about 
like millisecond latency here 

66
00:04:19,899 --> 00:04:23,600
right, to get to trader screens
or to get into the algos.

67
00:04:23,800 --> 00:04:26,800
And so there's a
whole team of mine which is in 

68
00:04:26,800 --> 00:04:30,400
the curation business,
the content business, whether

69
00:04:30,400 --> 00:04:32,800
it's real-time market data or
things

70
00:04:32,800 --> 00:04:36,800
like our reference data that we
use, you know, like who are our

71
00:04:36,800 --> 00:04:39,400
clients, or what products can we
trade, things like that.

72
00:04:39,400 --> 00:04:44,700
And so that team is really, I'd 
say, sort of a data engineering 

73
00:04:44,700 --> 00:04:47,800
bent team that uses our 
platforms to sort of do their 

74
00:04:47,800 --> 00:04:52,400
work, make sure the data is 
amazing quality-wise, accessible,

75
00:04:54,200 --> 00:04:58,000
and ready to go. 
And then the third bit is

76
00:04:58,008 --> 00:05:02,700
really about the framework that 
we do data governance and data 

77
00:05:02,700 --> 00:05:08,400
quality around, so a little
bit around policy and frameworks,

78
00:05:08,400 --> 00:05:11,700
but then how do we push our 
platform team to then build in 

79
00:05:11,700 --> 00:05:15,900
those mechanisms or controls on 
the platform. 

80
00:05:15,900 --> 00:05:18,500
So when you do your work at 
Goldman, you sort of get those 

81
00:05:18,500 --> 00:05:20,600
for free, I say, or
complimentary.

82
00:05:20,600 --> 00:05:24,800
So that's sort of a little
more about the three buckets we

83
00:05:24,800 --> 00:05:27,900
talked about here. 
What I find particularly 

84
00:05:28,000 --> 00:05:32,700
interesting is, unlike most
businesses, which of course

85
00:05:32,700 --> 00:05:39,000
have their business data,
you're dealing with huge volumes

86
00:05:39,000 --> 00:05:45,800
of real-time information, of
real-time data, on which there are

87
00:05:45,800 --> 00:05:50,700
enormous amounts of money and 
risk at stake. 

88
00:05:51,000 --> 00:05:56,700
I have to imagine that that 
places an additional kind of

89
00:05:56,700 --> 00:06:00,100
intense burden
on you, as you're thinking

90
00:06:00,100 --> 00:06:04,100
about this whole technology
landscape. I would sort of flip

91
00:06:04,100 --> 00:06:06,600
it into a positive.
I don't think it's actually, we 

92
00:06:06,600 --> 00:06:09,000
don't think of it as a burden. 
We think of it as a real 

93
00:06:09,000 --> 00:06:12,500
opportunity, right? 
That, I think, is the first sort

94
00:06:12,500 --> 00:06:14,600
of mindset shift
you have to think about when

95
00:06:14,600 --> 00:06:16,400
you're in these sort of 
high-stakes games. 

96
00:06:16,400 --> 00:06:17,700
Yeah, of course, it's high 
stakes. 

97
00:06:17,700 --> 00:06:21,000
Of course, things could go 
wrong, but really, it's more 

98
00:06:21,000 --> 00:06:24,100
about the opportunities to make 
sure that again, we're helping 

99
00:06:24,100 --> 00:06:27,300
our clients do the right thing,
we're helping the economy, we're

100
00:06:27,300 --> 00:06:29,700
helping the financial system do
the right thing. 

101
00:06:29,700 --> 00:06:34,200
So we sort of take pride in that
instead of a sort of burden

102
00:06:34,200 --> 00:06:37,000
thing. 
And the one other thing I'd say

103
00:06:37,000 --> 00:06:43,200
is, you
said huge amounts of data; I

104
00:06:43,200 --> 00:06:46,200
actually like to say at Goldman
Sachs we're in the sort of

105
00:06:46,200 --> 00:06:50,500
medium-sized data but very complex data
world.

106
00:06:50,500 --> 00:06:54,000
Our data is super complex: the
types of data are complex, the 

107
00:06:54,000 --> 00:06:57,700
speed is complex, there's 
everything from super

108
00:06:57,900 --> 00:07:01,300
real-time low latency all the way to,
you know, end-of-day batch

109
00:07:01,300 --> 00:07:05,100
processing. The relationships
between our data are complex,

110
00:07:06,300 --> 00:07:10,100
right?
The product catalog is complex, 

111
00:07:10,100 --> 00:07:13,400
right? 
It's not just shoes or whatever;

112
00:07:14,000 --> 00:07:17,700
it's actual financial instruments
that people have made up.

113
00:07:18,500 --> 00:07:22,300
So the complexity is really the 
really interesting and cool 

114
00:07:22,300 --> 00:07:29,100
challenge here versus I'd say 
the volume. What makes the data

115
00:07:29,100 --> 00:07:31,200
complex? You were just describing it a
little bit.

116
00:07:31,200 --> 00:07:34,500
Can you drill into that? 
The complexity again comes from 

117
00:07:34,500 --> 00:07:37,700
various angles. Like you
mentioned, the complexity comes

118
00:07:37,700 --> 00:07:41,700
from the speed as one
angle, one dimension, right?

119
00:07:41,700 --> 00:07:43,600
But also the relationships of 
the data. 

120
00:07:43,600 --> 00:07:49,500
So you could imagine, you know, 
there's data about the stocks 

121
00:07:49,500 --> 00:07:54,000
and bonds that our
clients want to trade for one 

122
00:07:54,000 --> 00:07:57,600
example, but then there's a 
whole infrastructure

123
00:07:57,900 --> 00:08:03,100
around those. Like, I don't know
how many people here are 

124
00:08:03,400 --> 00:08:05,500
familiar with financial 
services, but there's 

125
00:08:05,500 --> 00:08:07,600
derivatives on those products, 
right? 

126
00:08:07,600 --> 00:08:09,700
So now you're not just talking 
about stocks and bonds, but 

127
00:08:09,700 --> 00:08:12,300
you're talking about really 
complex

128
00:08:12,800 --> 00:08:19,900
algorithms plus complex
data elements that

129
00:08:19,900 --> 00:08:24,200
have a relationship, that have
now been made up in the world on

130
00:08:24,200 --> 00:08:27,500
those stocks and bonds. So the
layering of this data and the

131
00:08:27,900 --> 00:08:30,100
complexity,
now that people have come up

132
00:08:30,100 --> 00:08:33,200
with such creative ideas to help
our clients and things like that

133
00:08:34,200 --> 00:08:38,100
starts pushing the boundaries of
like how this data is related 

134
00:08:38,100 --> 00:08:43,100
and interconnected and curated. 
Hopefully, that gives a little 

135
00:08:43,100 --> 00:08:48,100
bit of a sense. 
You are managing these very 

136
00:08:48,100 --> 00:08:55,100
large volumes of real-time data.
Do you want to tell us at all 

137
00:08:55,100 --> 00:08:59,000
about your, your technology 
infrastructure? 

138
00:08:59,000 --> 00:09:02,800
It's not a question
I usually ask, but it seems that

139
00:09:02,800 --> 00:09:08,400
in this case, given the
equities, the derivatives,

140
00:09:08,400 --> 00:09:10,900
everything else that you had
just mentioned, it

141
00:09:11,100 --> 00:09:13,800
seems the technology
infrastructure has to play a 

142
00:09:13,800 --> 00:09:16,000
very important role in the 
structure. 

143
00:09:16,000 --> 00:09:20,200
Absolutely, it plays a huge role
here. We have an incredible

144
00:09:20,200 --> 00:09:24,100
infrastructure team and you 
could imagine again that this 

145
00:09:24,100 --> 00:09:27,600
goes all the way down to sort of
the hardware layers,

146
00:09:27,600 --> 00:09:34,800
like the networking stack, the 
computers you use the network 

147
00:09:34,800 --> 00:09:40,300
cards in the computers you use.
All this stuff sort of actually 

148
00:09:40,300 --> 00:09:44,800
matters at this sort of
latency and scale. But

149
00:09:45,000 --> 00:09:47,600
I don't want to super over-index
on the real-time stuff.

150
00:09:47,600 --> 00:09:51,800
The real-time stuff is a
very important, very critical

151
00:09:51,800 --> 00:09:57,000
and important part of our
world, for sure.

152
00:09:57,800 --> 00:10:00,700
And the team does an amazing job
all the way from the hardware 

153
00:10:00,700 --> 00:10:05,700
layer up. 
But we also have, you know, data

154
00:10:05,700 --> 00:10:09,200
challenges just after, you know,
the real-time data,

155
00:10:11,000 --> 00:10:14,100
as in, you know, when you do a
trade, right? That store also

156
00:10:14,100 --> 00:10:18,100
starts being complex. And so we
have built sort of a data 

157
00:10:18,100 --> 00:10:23,400
platform here we call Legend,
and we've actually open sourced

158
00:10:23,400 --> 00:10:25,000
that recently in the last couple
of years. 

159
00:10:25,000 --> 00:10:27,900
We've built this tech stack over
the last 10 years

160
00:10:27,900 --> 00:10:31,700
internally. About two
years ago, maybe three

161
00:10:31,700 --> 00:10:35,400
now, we open sourced it. It's
fully on GitHub, you can check 

162
00:10:35,400 --> 00:10:38,500
it out. 
We gave the code to a nonprofit,

163
00:10:38,500 --> 00:10:43,500
an open source nonprofit called
FINOS, on GitHub. Happy to

164
00:10:43,500 --> 00:10:45,600
talk more about that as
well. 

165
00:10:46,000 --> 00:10:49,900
Why did you open source it? At first
glance,

166
00:10:50,200 --> 00:10:53,100
it seems like an odd decision to
me,

167
00:10:53,100 --> 00:10:57,500
because one would think that a
firm like Goldman Sachs would 

168
00:10:57,500 --> 00:11:00,800
want to keep all of that 
infrastructure to yourself. 

169
00:11:01,200 --> 00:11:03,900
There's a couple of reasons. 
One is, as we were building 

170
00:11:03,900 --> 00:11:08,200
this, we talked to a lot of 
clients who were having exactly

171
00:11:08,200 --> 00:11:12,800
the same data challenges
we were. And, you know, we're

172
00:11:12,800 --> 00:11:17,500
obviously a very
client-centric firm. 

173
00:11:17,500 --> 00:11:20,800
So we thought look, this has 
helped us so much internally. 

174
00:11:20,800 --> 00:11:24,100
This platform has helped us so 
much internally, we'd love to 

175
00:11:24,100 --> 00:11:27,100
give it a chance to
help our clients, to help push

176
00:11:27,100 --> 00:11:31,200
the industry forward as well. 
And so you also have to

177
00:11:31,200 --> 00:11:34,400
separate a little bit the
platform itself versus the

178
00:11:34,400 --> 00:11:37,500
content that we curate and work on
in the platform. 

179
00:11:37,500 --> 00:11:42,700
So we haven't yet,
or maybe never will, open source

180
00:11:42,700 --> 00:11:47,300
the actual content, but we could
talk about that as well. 

181
00:11:47,900 --> 00:11:50,400
But the platform and the work 
we've done to sort of 

182
00:11:50,400 --> 00:11:53,100
standardize how we work with
data,

183
00:11:53,100 --> 00:11:58,500
we thought that was so powerful
that actually giving it out

184
00:11:58,500 --> 00:12:00,900
to the community and building a 
community around that could

185
00:12:00,900 --> 00:12:05,200
actually help the whole industry
and our clients and

186
00:12:05,200 --> 00:12:09,800
ourselves. You are wanting to
share best practices,

187
00:12:10,000 --> 00:12:14,100
approaches to working with
data, things like that, with the

188
00:12:14,100 --> 00:12:17,400
community and then of course, 
everybody's going to have their 

189
00:12:17,400 --> 00:12:20,500
own data content within
those constructs. 

190
00:12:20,900 --> 00:12:23,300
That's exactly right. 
The interesting part is,

191
00:12:23,500 --> 00:12:26,100
even if we don't give out the
content, right,

192
00:12:26,900 --> 00:12:31,800
we are working with industry
standards bodies to even

193
00:12:31,800 --> 00:12:35,100
describe the data and structure 
the data, and the linkages to 

194
00:12:35,100 --> 00:12:38,000
the data. 
So, it's a really cool and 

195
00:12:38,000 --> 00:12:41,600
powerful technique.
And, you know, working with 

196
00:12:41,600 --> 00:12:44,400
these standardization bodies and
you know like what does a 

197
00:12:44,408 --> 00:12:47,900
derivative look like? And
even if our data looks slightly

198
00:12:47,900 --> 00:12:52,700
different than our clients',
at least we could standardize on

199
00:12:52,700 --> 00:12:56,200
how we talk about those
things.

200
00:12:56,200 --> 00:12:59,300
And that's what the Legend
platform really excels at.

201
00:12:59,700 --> 00:13:04,100
So it's really a contribution to
the broader data science, data

202
00:13:04,100 --> 00:13:06,300
and analytics community,
essentially.

203
00:13:06,300 --> 00:13:08,300
Ultimately, absolutely,
absolutely. 

204
00:13:08,300 --> 00:13:12,300
I think the platform is a
general purpose platform for 

205
00:13:12,300 --> 00:13:15,700
working with data and then you 
know we specialize on some 

206
00:13:15,700 --> 00:13:19,100
standards and things like that,
which is, I think, another

207
00:13:19,100 --> 00:13:22,000
contribution to the
financial industry in general:

208
00:13:22,000 --> 00:13:24,900
just how do you get to that
interop and standardization of 

209
00:13:25,600 --> 00:13:28,700
at least data contracts or how 
we talk about different terms 

210
00:13:28,700 --> 00:13:32,000
and relationships of different 
pieces of information. 

211
00:13:32,400 --> 00:13:36,700
Please subscribe to our YouTube 
channel and hit the Subscribe 

212
00:13:36,700 --> 00:13:39,300
button at the top of our 
website. 

213
00:13:39,300 --> 00:13:43,300
We have a number of
questions on Twitter so why 

214
00:13:43,300 --> 00:13:45,700
don't we jump over to some of 
those? 

215
00:13:46,300 --> 00:13:49,900
And we have an interesting 
question from Arsalan Khan. Arsalan

216
00:13:49,900 --> 00:13:53,400
is a regular listener, and
so thank you, Arsalan,

217
00:13:53,400 --> 00:13:55,600
for listening, and for
this great question. 

218
00:13:56,100 --> 00:14:00,800
He says, given that data is such
an important asset,

219
00:14:01,000 --> 00:14:05,600
how do you assign the financial
value to that data?

220
00:14:05,600 --> 00:14:09,700
Who is it that
decides what data

221
00:14:09,900 --> 00:14:14,500
has financial value and how much
it's worth? And I'll just

222
00:14:14,500 --> 00:14:17,900
add to that:
I'm assuming that

223
00:14:18,200 --> 00:14:22,900
that attachment of value is one 
of the things that guides your 

224
00:14:22,900 --> 00:14:27,300
priorities in terms of where
your team focuses. What we

225
00:14:27,300 --> 00:14:31,200
do is we work with our business
lines. So I run our core data team.

226
00:14:31,200 --> 00:14:35,600
We work with our business lines
on what they need

227
00:14:36,000 --> 00:14:39,500
for their clients, to serve their
clients better what they need 

228
00:14:39,500 --> 00:14:44,100
for their business to scale, what
they need from us to help them

229
00:14:44,100 --> 00:14:48,200
sort of get an edge or innovate 
in the data realm. 

230
00:14:48,800 --> 00:14:54,700
And so, a lot of it is about the
value of the overall outcome 

231
00:14:55,100 --> 00:15:00,600
versus just like ascribing value
to some piece of data. 

232
00:15:00,700 --> 00:15:05,600
And so the way I
think about it, from a core team

233
00:15:05,600 --> 00:15:09,400
helping people, is really: what is
the outcome

234
00:15:09,400 --> 00:15:12,600
we're trying
to drive for our clients and for

235
00:15:12,700 --> 00:15:16,700
our businesses and for
innovation? And really look at

236
00:15:16,700 --> 00:15:19,100
that holistically. 
I think, again, it's a

237
00:15:19,108 --> 00:15:24,900
little bit of a
misnomer, a mistake, to sort of

238
00:15:24,900 --> 00:15:28,500
try to say, okay, this dataset 
creates this much value, this 

239
00:15:28,500 --> 00:15:31,900
creates that much value. It's like,
what did we do for that business

240
00:15:31,900 --> 00:15:35,200
and for our clients? And really
we take those wins

241
00:15:36,000 --> 00:15:40,600
as platform and data wins
collectively with those

242
00:15:40,600 --> 00:15:46,400
teams. That makes perfect sense
because the point is not some 

243
00:15:46,400 --> 00:15:49,500
piece of data or body
of data. 

244
00:15:49,500 --> 00:15:53,800
The point is, what are we doing 
with that data? 

245
00:15:54,200 --> 00:15:59,300
And as you said, what are the 
resulting business outcomes? And

246
00:15:59,300 --> 00:16:03,600
then you have a framework for
valuing the data,

247
00:16:03,900 --> 00:16:06,300
because we're trying to get to
the outcome.

248
00:16:06,400 --> 00:16:07,800
Sorry, I didn't mean to answer 
for you. 

249
00:16:07,900 --> 00:16:09,400
No, no, that's exactly right. 
That's it. 

250
00:16:09,400 --> 00:16:10,700
That's a great,
that's

251
00:16:10,700 --> 00:16:12,600
exactly
the right summary.

252
00:16:12,600 --> 00:16:15,900
And that's exactly how we do
our ROI, like our return on

253
00:16:15,908 --> 00:16:18,400
investment
on the work we do: we

254
00:16:18,400 --> 00:16:23,900
actually work hand in hand with 
those teams and say, okay, like,

255
00:16:24,200 --> 00:16:28,500
were we able to reduce
risk by X, or were we able to

256
00:16:28,500 --> 00:16:33,000
help our clients do
Y better or faster, or get them

257
00:16:33,000 --> 00:16:35,400
into a better position for Y, or
did we

258
00:16:35,800 --> 00:16:39,400
enable the business to do
some new thing that they were 

259
00:16:39,400 --> 00:16:43,100
not able to do?
And that's really, like, the ROI

260
00:16:43,100 --> 00:16:48,700
calc, at that level. We have
another question from Twitter. 

261
00:16:48,700 --> 00:16:53,900
And another great question, this
is from Natalie Bean who says, 

262
00:16:54,000 --> 00:16:58,900
how do you balance data quality 
and data quantity? 

263
00:16:58,900 --> 00:17:02,000
And we haven't even spoken about
data quality yet, so this is a 

264
00:17:02,000 --> 00:17:05,000
great question. 
When you think about data as a 

265
00:17:05,000 --> 00:17:07,800
first-class concept, when
you think about data,

266
00:17:08,400 --> 00:17:11,099
when you do the same things
with your data that you do

267
00:17:11,099 --> 00:17:14,599
with your code, like think about
your data architecture up front,

268
00:17:14,700 --> 00:17:17,900
think about how you structure 
your data, how it relates to 

269
00:17:17,900 --> 00:17:21,400
other pieces of data, right? 
And you do that work up front,

270
00:17:21,400 --> 00:17:24,599
we have seen that, yeah, the
upfront work takes a little bit

271
00:17:24,599 --> 00:17:29,800
longer, but the huge
benefits of that become apparent

272
00:17:29,900 --> 00:17:31,600
as you're trying to scale these 
things. 

273
00:17:31,600 --> 00:17:35,600
So we deal with it. 
We deal with the scale and the 

274
00:17:35,600 --> 00:17:40,300
volume and the complexity, like 
any engineering org:

275
00:17:40,300 --> 00:17:43,100
we build tools,
we build platforms, and then we

276
00:17:43,200 --> 00:17:45,900
make sure that those scale to 
those problems. 

277
00:17:46,900 --> 00:17:52,100
And so now when we do
data at Goldman Sachs, it's like we

278
00:17:52,100 --> 00:17:56,000
do it on the platform and 
everyone knows like okay now the

279
00:17:56,000 --> 00:17:59,000
quants are going to get their 
data, you know, in days instead

280
00:17:59,000 --> 00:18:03,100
of months because we have set up
the right platform constructs 

281
00:18:03,100 --> 00:18:05,900
and engineering constructs for
that. 

282
00:18:05,900 --> 00:18:11,300
So again it's not a perfect 
silver-bullet answer of, like, as

283
00:18:11,300 --> 00:18:14,900
things grow, you know,
there's some

284
00:18:14,900 --> 00:18:18,500
equation. But we have seen that
the investment in the platform

285
00:18:19,500 --> 00:18:22,500
and the tools and the workflows,
that's the thing that

286
00:18:22,500 --> 00:18:26,400
helps us scale, again.
I find it really fascinating 

287
00:18:26,400 --> 00:18:30,500
that the platform and, as you
said, the workflows play

288
00:18:30,900 --> 00:18:37,300
such a crucial role, but I 
suppose it's entirely logical 

289
00:18:37,500 --> 00:18:41,900
that when you need that data 
when that data is so important 

290
00:18:41,900 --> 00:18:44,900
and it needs to be right,
and it needs to be consistently

291
00:18:44,900 --> 00:18:48,100
right, that you need that
infrastructure and the

292
00:18:48,100 --> 00:18:51,900
automation to make it happen at
that level of quality.

293
00:18:52,400 --> 00:18:55,400
That's exactly right. 
This is my personal view coming 

294
00:18:55,400 --> 00:18:59,200
from an engineering background. 
I try to solve scale

295
00:18:59,200 --> 00:19:02,600
problems, volume problems,
and complexity problems by

296
00:19:02,600 --> 00:19:04,900
breaking them down into
engineering steps and

297
00:19:04,900 --> 00:19:07,700
engineering platforms.
And so that's sort of

298
00:19:07,700 --> 00:19:11,600
been our ethos. This whole
Legend platform is really about

299
00:19:11,600 --> 00:19:13,800
that:
how to attack that problem from

300
00:19:13,800 --> 00:19:16,500
an engineering lens.
We have another

301
00:19:16,500 --> 00:19:21,600
excellent question from Twitter.
This is from Lizabeth Shaw, who 

302
00:19:21,600 --> 00:19:26,100
is now alluding to governance 
which you mentioned earlier, but

303
00:19:26,100 --> 00:19:27,900
we definitely should talk about 
that. 

304
00:19:28,100 --> 00:19:32,700
And she says, how do you build 
in the mechanisms that support 

305
00:19:32,700 --> 00:19:36,100
data policies? 
The point of the platform isn't 

306
00:19:36,100 --> 00:19:38,000
that
it just magically makes these

307
00:19:38,000 --> 00:19:41,000
things go away. 
It's that it makes you think of 

308
00:19:41,000 --> 00:19:44,300
them up front; it makes you think
of these concepts

309
00:19:45,100 --> 00:19:51,000
as you're doing your data design
and your data work. And so really,

310
00:19:51,000 --> 00:19:54,100
when you're on the
Legend platform, thinking about

311
00:19:54,100 --> 00:19:56,800
your data, the first thing you 
think is what is your data 

312
00:19:56,800 --> 00:20:01,400
contract, right? 
That's actually the first bit of

313
00:20:01,400 --> 00:20:04,600
the workflow: describe your
data, describe how you want to 

314
00:20:04,600 --> 00:20:07,300
publish it, describe how you 
want other people to see it or 

315
00:20:07,300 --> 00:20:12,900
consume it, how you want to
track things like lineage.

316
00:20:14,800 --> 00:20:17,800
All of that is sort of built in,
baked in, as a first-class

317
00:20:17,800 --> 00:20:19,700
concept. 
So again, I don't want to 

318
00:20:19,700 --> 00:20:21,700
oversell the thing. 
You still have to think about 

319
00:20:21,700 --> 00:20:24,200
it, you still have to do it, but
the point is, bringing it up 

320
00:20:24,200 --> 00:20:29,000
front, instead of hiding it as 
some secondary thing. 

321
00:20:29,000 --> 00:20:32,300
And so you have to think about the
entitlements, the security, the

322
00:20:32,300 --> 00:20:35,400
encryption. All of that
you think of as sort of

323
00:20:35,400 --> 00:20:38,900
first-class assets, or
just first-class concepts, as

324
00:20:38,900 --> 00:20:42,600
you're designing your data flow
or your data workflow, or your

325
00:20:42,600 --> 00:20:47,700
data production or data
consumption patterns, and all

326
00:20:47,700 --> 00:20:51,200
that sort of comes together
in the platform.

327
00:20:51,500 --> 00:20:55,900
It's fascinating again to me 
because it seems that your 

328
00:20:56,300 --> 00:21:02,600
emphasis is definitely placed 
higher on that infrastructure 

329
00:21:02,600 --> 00:21:06,900
and platforms
than other chief data officers,

330
00:21:06,900 --> 00:21:13,100
others that I've spoken with,
but at the same time, you are

331
00:21:13,300 --> 00:21:16,800
dealing with a level of data
complexity combined with the 

332
00:21:16,800 --> 00:21:21,500
speed and the financial
consequences associated with it,

333
00:21:21,500 --> 00:21:25,600
that I think few other companies
would have that set of 

334
00:21:26,300 --> 00:21:30,200
combination of circumstances. 
Exactly. Like, the quality: the

335
00:21:30,200 --> 00:21:33,200
data has to be right.
It has to be consistently

336
00:21:33,200 --> 00:21:39,200
right, as you said, and the
ramifications of that are

337
00:21:39,300 --> 00:21:42,200
pretty big.
We have another question 

338
00:21:42,200 --> 00:21:46,700
relating to governance.
And this is, again, from

339
00:21:46,800 --> 00:21:51,100
Arsalan Khan.
And he says, how do you decide 

340
00:21:51,100 --> 00:21:55,000
what data is good to use? 
What is not? 

341
00:21:55,000 --> 00:21:58,900
How do you address biases
in the data you're

342
00:21:58,900 --> 00:22:03,500
collecting and using. 
And this is interesting, he 

343
00:22:03,500 --> 00:22:09,400
says, do the business lines
agree with your conclusions?

344
00:22:10,000 --> 00:22:12,800
So how do you also get everybody
on the same page around this 

345
00:22:12,800 --> 00:22:15,900
stuff? 
We are not some, like, isolated

346
00:22:15,900 --> 00:22:19,900
team in the corner with like 
pointy hats, right? 

347
00:22:19,900 --> 00:22:22,600
Like doing this in isolation. 
I think that's the first most 

348
00:22:22,600 --> 00:22:26,900
important thing to get. 
Like, we are hands on keyboard 

349
00:22:27,200 --> 00:22:31,200
together with the businesses, 
making sure that the data we're

350
00:22:31,200 --> 00:22:35,100
using, first, is right, right?
But right for them and right

351
00:22:35,100 --> 00:22:40,100
for the use case, and really is
solving their problem. 

352
00:22:40,100 --> 00:22:43,900
So, the first point I would make
is, you know, aligning ourselves

353
00:22:43,900 --> 00:22:47,400
with business outcomes and with 
the businesses is the first 

354
00:22:47,400 --> 00:22:49,000
thing. 
So like not being in some 

355
00:22:49,000 --> 00:22:53,200
isolated back room, like oh, we 
know best about everything, 

356
00:22:53,500 --> 00:22:57,400
every piece of data. 
And again, the financial

357
00:22:57,600 --> 00:23:01,600
domain is so complex that we 
would never even pretend to do 

358
00:23:01,600 --> 00:23:04,100
that, right? 
You know, we have people who are

359
00:23:04,600 --> 00:23:07,200
highly skilled in finance,
financial areas.

360
00:23:07,200 --> 00:23:11,700
So that we could make those 
decisions with our

361
00:23:11,700 --> 00:23:19,200
businesses, like, in sort of a 
joint venture fashion, but we 

362
00:23:19,200 --> 00:23:21,200
would never say we're the 
experts in everything. 

363
00:23:21,200 --> 00:23:24,500
So I think that's the first 
piece is like, you have to be 

364
00:23:24,500 --> 00:23:26,700
connected to the business and 
the business outcomes. 

365
00:23:26,700 --> 00:23:32,000
So that's the first piece.
Then again, the quality and

366
00:23:32,000 --> 00:23:36,400
the governance, right, then
becomes an aligning of incentives

367
00:23:36,600 --> 00:23:38,200
of a joint sort of venture,
right? 

368
00:23:38,200 --> 00:23:40,400
It becomes, there's a healthy
tension, right? 

369
00:23:40,500 --> 00:23:44,000
Of course we want to do things 
at scale and they want to sort 

370
00:23:44,000 --> 00:23:45,800
of solve the problem 
immediately. 

371
00:23:47,700 --> 00:23:50,600
But the point is again that, you
know, bringing these two teams 

372
00:23:50,600 --> 00:23:54,700
together really helps
accelerate that and then get to

373
00:23:54,700 --> 00:23:56,400
the right answers and the right 
data. 

374
00:23:56,700 --> 00:24:01,100
So this then leads to the 
question about the composition 

375
00:24:01,500 --> 00:24:06,500
of your team. 
It's obvious that you have very 

376
00:24:06,600 --> 00:24:12,800
deep technology and data
expertise, but as you just 

377
00:24:12,800 --> 00:24:18,200
alluded, in order to do your job,
again, I have to assume that you

378
00:24:18,200 --> 00:24:24,600
need equivalent financial depth
of expertise especially when you

379
00:24:24,600 --> 00:24:27,500
go into concepts like
derivatives, as you

380
00:24:27,500 --> 00:24:32,400
were describing earlier. We do
have a team of strats, or quants

381
00:24:32,400 --> 00:24:38,900
in the maybe more financial 
world, who come from being on

382
00:24:38,900 --> 00:24:42,100
the desk,
understanding how data is used

383
00:24:42,100 --> 00:24:47,300
on the desks and have sort of a 
STEM background, whether it's

384
00:24:47,300 --> 00:24:51,900
technology, math, physics,
whatever, who are equivalent

385
00:24:51,900 --> 00:24:54,700
counterparts
in my team that

386
00:24:54,700 --> 00:24:59,700
work with these groups and 
actually understand the 

387
00:24:59,700 --> 00:25:02,800
financial domain, but cross over
to the technical domain. 

388
00:25:02,800 --> 00:25:06,000
So that's what we call
our data design and curation 

389
00:25:06,000 --> 00:25:08,500
team. 
Pretty, pretty cool,

390
00:25:08,500 --> 00:25:12,500
high-powered
crew that really is bound

391
00:25:12,500 --> 00:25:16,600
to those businesses.
Those folks, which team are

392
00:25:16,600 --> 00:25:18,200
they part of? 
Are they part of your 

393
00:25:18,200 --> 00:25:23,600
organization or part of the 
finance or trading organization?

394
00:25:23,600 --> 00:25:26,500
Where do they actually fit in?
In our team,

395
00:25:26,500 --> 00:25:29,200
we have a subset of that team,
right? 

396
00:25:29,200 --> 00:25:34,400
But those sort of people
are also all embedded in the 

397
00:25:34,400 --> 00:25:37,400
business as well. 
So there's what we call

398
00:25:37,400 --> 00:25:41,000
embedded desk strats or
embedded quants in various 

399
00:25:41,000 --> 00:25:45,900
businesses. 
But then I have a sort of small 

400
00:25:45,900 --> 00:25:51,100
selection of that team that 
works specifically on data with 

401
00:25:51,100 --> 00:25:53,400
those other business teams, if 
that makes sense. 

402
00:25:53,600 --> 00:25:59,200
What are the elements that 
comprise a successful data team 

403
00:25:59,200 --> 00:26:02,800
such as you have? 
It really follows sort of the 

404
00:26:02,800 --> 00:26:06,100
structure of the team like the 
platform team is high-powered 

405
00:26:06,100 --> 00:26:07,900
software
engineers, right?

406
00:26:08,200 --> 00:26:10,000
They come from a software 
background. 

407
00:26:10,000 --> 00:26:13,500
Their goal is to make the 
software bulletproof and build 

408
00:26:13,500 --> 00:26:15,700
the right workflows for data 
engineers. 

409
00:26:16,500 --> 00:26:19,600
And the second bucket is 
basically data engineers,

410
00:26:19,600 --> 00:26:23,400
people who are content experts 
but can use the platform,

411
00:26:23,400 --> 00:26:26,100
configure
the platform, build data

412
00:26:26,100 --> 00:26:28,800
pipelines, build curation 
pipelines. 

413
00:26:30,600 --> 00:26:36,100
Build data models that then get 
shared out. Then the third

414
00:26:36,100 --> 00:26:39,900
bucket
is really this sort of hybrid,

415
00:26:40,000 --> 00:26:41,900
hybrid,
again, I'll use the word strat,

416
00:26:41,900 --> 00:26:43,500
because that's what we use 
internally. 

417
00:26:43,500 --> 00:26:48,600
This hybrid team that really
straddles sort of deep finance 

418
00:26:48,600 --> 00:26:53,000
and deep tech together.
And then the fourth is sort of 

419
00:26:53,000 --> 00:26:56,700
our governance, you know, 
framework team. 

420
00:26:56,700 --> 00:27:00,300
They're the people who sort of 
set the policies, set the

421
00:27:00,300 --> 00:27:05,600
framework, and work
with the divisions to sort of

422
00:27:05,600 --> 00:27:08,100
make sure
they're working in the bounds of

423
00:27:08,100 --> 00:27:11,600
our framework. 
One of the topics that we have, 

424
00:27:11,600 --> 00:27:17,400
not really touched upon is the 
notion of data interoperability 

425
00:27:18,500 --> 00:27:20,700
and I know that's important to 
you. 

426
00:27:20,700 --> 00:27:24,600
So can you tell us about that? 
We touched a little bit upon

427
00:27:24,600 --> 00:27:28,300
it. That's sort of why, a big, big
reason we open sourced

428
00:27:28,300 --> 00:27:33,400
our Legend platform was exactly
for that, because we felt

429
00:27:33,400 --> 00:27:38,300
that like if we could bring some
platform standardization in the

430
00:27:38,300 --> 00:27:44,200
industry for us, our clients, our
counterparties, that then at

431
00:27:44,200 --> 00:27:48,100
least the discussions about how 
data works or how it should be 

432
00:27:48,100 --> 00:27:52,200
connected or how it should be 
described in the industry that 

433
00:27:52,200 --> 00:27:55,100
we could help push that forward 
and we can help push that 

434
00:27:55,100 --> 00:28:00,000
forward with standards with 
other bodies, but they could all

435
00:28:00,000 --> 00:28:04,700
do that work in one sort of 
same way in our platform.

436
00:28:04,700 --> 00:28:08,400
Well, it's not even ours. I
mean, in the open source community

437
00:28:08,400 --> 00:28:13,500
platform, and that will
benefit everybody. 

438
00:28:13,500 --> 00:28:18,200
So it's absolutely a
big, big play. I'll

439
00:28:18,200 --> 00:28:22,000
mention, you know, one project 
we're doing in the FINOS

440
00:28:22,000 --> 00:28:27,500
community, which is the open
source nonprofit that we gave 

441
00:28:27,500 --> 00:28:30,300
Legend to.
We brought in ISDA, which is

442
00:28:30,300 --> 00:28:33,500
the derivatives standards
body.

443
00:28:34,600 --> 00:28:38,700
They have built a model
called CDM, Common Domain Model,

444
00:28:38,700 --> 00:28:44,300
for derivatives, and
that data model is now

445
00:28:44,300 --> 00:28:47,300
available in Legend for people
to collaborate on in a 

446
00:28:47,308 --> 00:28:50,400
collaborative environment out in
the wild. It has nothing to do

447
00:28:50,400 --> 00:28:52,400
with Goldman Sachs. They're just
using the platform. 

448
00:28:52,400 --> 00:28:57,600
So these are like pretty cool 
things that you know we're 

449
00:28:57,600 --> 00:29:01,700
trying to do to push the interop
and standardization out. How

450
00:29:01,700 --> 00:29:05,700
has the uptake been in the
broader community

451
00:29:06,200 --> 00:29:11,700
of this platform and data
interoperability? The cool part

452
00:29:11,700 --> 00:29:15,400
is that our code is out there. We
have clients actually deploying

453
00:29:15,400 --> 00:29:17,500
it, using it.
We have other counterparties 

454
00:29:17,500 --> 00:29:21,900
also looking at it, doing POCs
in their own environment, but 

455
00:29:21,900 --> 00:29:24,900
the flip side of that is, you 
know, building an open source 

456
00:29:24,900 --> 00:29:30,300
community is very difficult.
I mean, we vastly underestimated

457
00:29:30,400 --> 00:29:35,700
how much time and effort and 
energy goes into, you know, 

458
00:29:35,700 --> 00:29:38,100
making sure people know
how to use the thing, the 

459
00:29:38,100 --> 00:29:40,700
documentation. 
Do they understand the value? 

460
00:29:40,700 --> 00:29:45,600
Can they contribute code back? 
Like, all of these things, we 

461
00:29:45,600 --> 00:29:49,100
knew it would take time, but I 
think we sort of grossly 

462
00:29:49,100 --> 00:29:53,300
underestimated how much time to 
build sort of that big

463
00:29:53,300 --> 00:29:57,000
community around these
projects. What are the

464
00:29:57,000 --> 00:30:01,800
benefits that you have presented
to the community as to why they 

465
00:30:01,800 --> 00:30:07,000
should engage in this? It has
helped us do data better at

466
00:30:07,000 --> 00:30:10,100
Goldman. It has helped us organize
our data. 

467
00:30:10,100 --> 00:30:13,500
It has helped break down silos. 
It has helped the data quality 

468
00:30:13,500 --> 00:30:15,600
massively. 
It has helped data governance 

469
00:30:15,600 --> 00:30:20,700
aspects massively internally.
And so we point to sort of those

470
00:30:20,700 --> 00:30:24,600
success stories about how we do 
it internally that we could also

471
00:30:24,600 --> 00:30:30,500
help clients do the same thing. 
And you know if there's appetite

472
00:30:30,600 --> 00:30:33,600
of course that we could also 
help the broader industry break 

473
00:30:33,600 --> 00:30:36,800
down those silos as well. 
We have another couple of

474
00:30:36,800 --> 00:30:39,900
questions that are coming in 
from Twitter. 

475
00:30:39,900 --> 00:30:47,700
Do you think of your data or 
your platform as a product that 

476
00:30:47,700 --> 00:30:50,600
you can sell, given that you've 
built all of this 

477
00:30:50,600 --> 00:30:52,900
infrastructure? 
I'll say that's not the 

478
00:30:52,900 --> 00:30:57,100
intention of doing it. 
I think, though, again, we have 

479
00:30:57,100 --> 00:31:01,900
seen that as clients use it as 
other banks use it, that people

480
00:31:01,900 --> 00:31:06,100
sort of come back to us and ask,
okay, well, what

481
00:31:06,100 --> 00:31:10,600
could the support model be, or,
hey, can you host this as a SaaS

482
00:31:10,600 --> 00:31:12,300
for us,
so that we don't have to deal

483
00:31:12,300 --> 00:31:14,300
with setting up the 
infrastructure, running the 

484
00:31:14,300 --> 00:31:15,600
infrastructure or things like 
that. 

485
00:31:15,600 --> 00:31:19,400
So I think we're still again, 
just not to oversell anything. 

486
00:31:19,400 --> 00:31:23,800
I think we're still in the very
crawl stages of even thinking 

487
00:31:23,800 --> 00:31:26,800
about that community and how we 
could help do better and better 

488
00:31:26,800 --> 00:31:32,900
but our intention isn't, like,
to make money off of this

489
00:31:32,900 --> 00:31:34,100
thing. 
It's really about the 

490
00:31:34,100 --> 00:31:38,100
standardization parts. Now, if
people see a large adoption and 

491
00:31:38,100 --> 00:31:42,500
see a value, I think it could be
an exciting potential 

492
00:31:42,500 --> 00:31:46,400
opportunity. 
We have another question from 

493
00:31:46,400 --> 00:31:50,600
Twitter. 
Can you give any examples where 

494
00:31:50,700 --> 00:31:57,100
using the platform has resulted 
in innovative ways for users

495
00:31:57,100 --> 00:32:02,600
to use and/or combine
the data? We sort of took, again,

496
00:32:02,600 --> 00:32:08,400
a data-driven approach to 
helping our firm and our risk

497
00:32:08,400 --> 00:32:11,700
managers and our sales
people and our traders

498
00:32:11,700 --> 00:32:16,000
understand that risk from a 
data-driven perspective, right? 

499
00:32:16,000 --> 00:32:21,900
We have combined all that data 
and information into our 

500
00:32:21,900 --> 00:32:25,700
platform and have basically 
given, I hate to use this word,

501
00:32:25,700 --> 00:32:30,600
but, democratized the access to
that. Our team has linked a bunch

502
00:32:30,600 --> 00:32:34,900
of that data together, done
some interesting analysis, but

503
00:32:34,900 --> 00:32:38,300
more importantly, we have given
that, those base sets of data,

504
00:32:38,300 --> 00:32:40,900
those relationships, the 
relationships between those data

505
00:32:40,900 --> 00:32:44,900
that may not be obvious,
to our users' hands.

506
00:32:44,900 --> 00:32:49,200
Now they have the access
to that information and now they

507
00:32:49,200 --> 00:32:52,500
could come up with creative 
things or risk-mitigating

508
00:32:52,500 --> 00:32:57,900
things on top of that data.
So I think that's been, you 

509
00:32:57,900 --> 00:33:02,900
know, maybe one that resonates 
pretty recently here. It's kind

510
00:33:02,900 --> 00:33:09,700
of the reverse of shadow IT that
CIOs didn't like years ago,

511
00:33:09,708 --> 00:33:13,900
going back, maybe, five
years. You're really taking the

512
00:33:13,900 --> 00:33:18,400
data and I assume the tools and 
placing it in the hands of 

513
00:33:18,900 --> 00:33:21,600
users, so that they can be 
creative. 

514
00:33:21,600 --> 00:33:24,400
They can innovate with that
data. 

515
00:33:24,700 --> 00:33:28,000
Exactly. 
But in a fully governed quality 

516
00:33:28,400 --> 00:33:33,400
way, right, where that shadow
IT problem isn't really a

517
00:33:33,400 --> 00:33:35,300
problem. 
We have put all the right guard 

518
00:33:35,300 --> 00:33:38,900
rails and all the right 
governance around that. 

519
00:33:38,900 --> 00:33:43,000
So now people can actually 
innovate in a safe space. 

520
00:33:43,200 --> 00:33:47,000
In a safe way, but it's not just
locked to our team.

521
00:33:47,400 --> 00:33:52,700
Can you give us some insight 
into what kind of governance or 

522
00:33:52,700 --> 00:33:58,600
how do you balance the need for 
security for privacy against 

523
00:33:58,600 --> 00:34:02,100
making that data available,
accessible and easy to use for 

524
00:34:02,100 --> 00:34:04,900
folks? Before we do anything with
data. 

525
00:34:05,500 --> 00:34:08,100
Right? 
We have, first of all, very strict

526
00:34:08,100 --> 00:34:11,900
policies and frameworks about,
you know, understanding who's 

527
00:34:11,900 --> 00:34:13,400
allowed to see what
data.

528
00:34:14,500 --> 00:34:17,000
And even if you're allowed to 
see it, should you see it? 

529
00:34:17,000 --> 00:34:18,600
You know, we call it need to 
know, right? 

530
00:34:18,600 --> 00:34:20,400
Like just because you're allowed
to see it,

531
00:34:20,400 --> 00:34:23,000
maybe you don't actually need to
know that client information. 

532
00:34:23,000 --> 00:34:28,300
So first of all, as an 
overarching theme, I'd say like 

533
00:34:28,300 --> 00:34:32,699
one of the most
critical pieces is, you know,

534
00:34:32,699 --> 00:34:37,000
writing those policies down and 
actually then sort of enforcing 

535
00:34:37,000 --> 00:34:40,000
them in the platform. 
But before we do anything like 

536
00:34:40,000 --> 00:34:44,199
that is a clear thing. And then
we have very clear rules

537
00:34:44,400 --> 00:34:46,600
about who can get access to
what,

538
00:34:46,600 --> 00:34:51,900
how. And then again, to be a
little broken record, encode

539
00:34:51,900 --> 00:34:55,600
those rules in the platform and 
make sure that that is like a 

540
00:34:55,600 --> 00:34:57,500
base layer.
Like if we get that wrong, 

541
00:34:57,600 --> 00:35:01,000
nothing matters, right? 
So the first thing is to make 

542
00:35:01,000 --> 00:35:07,100
sure that that baseline works
before giving access to

543
00:35:07,100 --> 00:35:10,500
them.
So in other words, as long as 

544
00:35:10,500 --> 00:35:17,300
the foundation of governance,
risk control, compliance with 

545
00:35:17,300 --> 00:35:19,400
regulations, whatever is 
necessary. 

546
00:35:19,700 --> 00:35:26,300
As long as those elements are in
place, then you're able to share

547
00:35:26,300 --> 00:35:30,300
the data
and let people, I was going to

548
00:35:30,308 --> 00:35:33,800
say have free rein, obviously
that's not the case, but have 

549
00:35:33,800 --> 00:35:36,100
enough rein where they can use
it,

550
00:35:36,200 --> 00:35:38,900
use that data creatively in the
service of whatever their 

551
00:35:38,900 --> 00:35:42,700
business goals happen to be. Much
better said than

552
00:35:42,700 --> 00:35:45,900
the way I said it, but
exactly, exactly.

553
00:35:45,900 --> 00:35:51,500
The right mental model. 
What advice do you have for 

554
00:35:51,800 --> 00:35:58,000
folks who are building a data 
strategy, based on what you've 

555
00:35:58,000 --> 00:36:00,900
learned at and done at Goldman 
Sachs? 

556
00:36:01,300 --> 00:36:05,100
Make sure you attach yourself to
business outcomes. 

557
00:36:06,300 --> 00:36:09,600
The things that people care 
about at your company, right? 

558
00:36:09,600 --> 00:36:14,800
That, I think, as a technologist,
you know, sometimes as

559
00:36:14,800 --> 00:36:17,600
technologists in general maybe 
I'll make a broad generalization

560
00:36:17,600 --> 00:36:19,700
sometimes,
you know, we care a lot more

561
00:36:19,700 --> 00:36:25,500
about the tech than the outcome 
but specifically for data 

562
00:36:25,500 --> 00:36:28,400
strategy, I think it's even more
important to over-index on the 

563
00:36:28,400 --> 00:36:32,900
outcomes because data sort of 
becomes this nebulous thing 

564
00:36:33,100 --> 00:36:35,800
where now you know like well 
what does it mean to have a data

565
00:36:35,800 --> 00:36:37,900
strategy? 
What is that? Like, what does data

566
00:36:37,900 --> 00:36:39,400
even mean? 
Why do I need that? 

567
00:36:39,400 --> 00:36:43,000
Like, in the abstract,
I get why I want information, 

568
00:36:43,000 --> 00:36:45,100
but like what are you talking 
about? 

569
00:36:45,100 --> 00:36:48,100
So the first thing, the
first thing I talk about

570
00:36:48,100 --> 00:36:52,600
always is attach yourself to the
business outcomes and show how 

571
00:36:53,000 --> 00:36:58,300
the data and the data strategy 
actually makes those cheaper,

572
00:36:58,300 --> 00:37:02,000
faster, better, easier, makes
money for your clients. 

573
00:37:02,700 --> 00:37:06,600
Helps reduce risk, helps
save money. So

574
00:37:06,600 --> 00:37:11,500
that's the first sort of
baseline advice

575
00:37:11,500 --> 00:37:17,300
I give to everybody. 
And then I think, you know,

576
00:37:17,300 --> 00:37:22,600
but then below that, it becomes,
sort of, okay, set out:

577
00:37:22,600 --> 00:37:25,800
What is the platform strategy 
going to be? 

578
00:37:25,800 --> 00:37:28,800
How are you going to actually 
make the engineering work? 

579
00:37:28,800 --> 00:37:33,800
Think about the business 
workflows and the Pro workflows.

580
00:37:35,100 --> 00:37:37,900
And then, you know, we talked a 
little bit about the org 

581
00:37:37,900 --> 00:37:40,700
structure and the framework. 
So I think those are also key 

582
00:37:40,700 --> 00:37:44,400
pieces, right? 
Make sure that, you know, if 

583
00:37:44,400 --> 00:37:46,800
you're going to do it in an 
engineering strategy, you have a

584
00:37:46,800 --> 00:37:51,000
strong platform team. If you're
going to, you know, have a 

585
00:37:51,000 --> 00:37:52,600
content team,
make sure they actually

586
00:37:52,600 --> 00:37:56,400
understand the domain that 
you're working with. 

587
00:37:59,500 --> 00:38:03,600
And so bringing those, like, org
pieces together. 

588
00:38:04,300 --> 00:38:09,300
But, but the number one key is 
like, make sure you're driving 

589
00:38:09,300 --> 00:38:11,600
outcomes. 
What's the relationship between 

590
00:38:11,600 --> 00:38:15,200
your data strategy and cloud?
Where does cloud fit into all of

591
00:38:15,207 --> 00:38:16,700
this? 
I can't believe we haven't 

592
00:38:16,700 --> 00:38:19,700
spoken about that. 
I think of the cloud again, as a

593
00:38:19,700 --> 00:38:24,900
tool in the toolbox, right? 
It's not a means to an end, it's

594
00:38:25,300 --> 00:38:28,800
in my arsenal for how I want to
build my platform. 

595
00:38:28,800 --> 00:38:31,100
Yeah. 
I want infinite scalability, I 

596
00:38:31,100 --> 00:38:35,100
want it. I want, you know,
other people to have done the hard

597
00:38:35,100 --> 00:38:38,800
work on the infrastructure 
front. I want great databases

598
00:38:39,100 --> 00:38:41,900
that people have built to not 
rebuild that on my own. 

599
00:38:42,200 --> 00:38:46,700
And so to me, the cloud is a 
great enabler, it's a great 

600
00:38:46,700 --> 00:38:51,000
tool in my toolbox, it's a great
way to get the scale. 

601
00:38:51,600 --> 00:38:56,200
So it's definitely a big part of
the overall data strategy. 

602
00:38:56,200 --> 00:39:01,700
But again, just to be clear,
I just make sure, like, it's not

603
00:39:01,700 --> 00:39:05,000
a thing for the sake of a thing,
it's like, okay,

604
00:39:05,000 --> 00:39:08,600
these are great components,
great pieces of infrastructure 

605
00:39:08,600 --> 00:39:12,100
that other people that I can now
stand on the shoulders of giants

606
00:39:12,600 --> 00:39:15,300
and execute on.
And that's the way I think about

607
00:39:15,300 --> 00:39:16,800
it. 
It's a super important part of

608
00:39:16,808 --> 00:39:21,300
the strategy just to be clear. 
I can see that you're

609
00:39:21,300 --> 00:39:25,500
very purpose-driven, you are 
always coming back to the 

610
00:39:25,500 --> 00:39:26,800
reference point. 
Why are we doing it? 

611
00:39:26,800 --> 00:39:28,100
What are we doing?
What are we getting out of it 

612
00:39:28,107 --> 00:39:30,000
from the business
outcome standpoint?

613
00:39:30,000 --> 00:39:36,000
That's very, very, very clear.
And finally, one last question, 

614
00:39:36,700 --> 00:39:40,700
what is the relationship between
data and building business 

615
00:39:40,700 --> 00:39:44,000
models, economic models, financial
models? 

616
00:39:44,000 --> 00:39:47,200
How do those pieces fit
together? In technology,

617
00:39:47,200 --> 00:39:51,100
it comes down to algorithms and
data and so like those are the 

618
00:39:51,100 --> 00:39:54,300
two big inputs, right? 
Like, you need the data to be 

619
00:39:54,300 --> 00:39:56,000
great. 
You need it to be clean and 

620
00:39:56,000 --> 00:39:59,100
need it to be organized.
You need to make sure that like 

621
00:39:59,200 --> 00:40:00,900
all those pieces are set: it's
easily

622
00:40:01,100 --> 00:40:04,900
accessible,
it's findable, it's governed. And then

623
00:40:05,200 --> 00:40:10,400
that becomes a major input into 
the Algos, whether you're doing 

624
00:40:10,400 --> 00:40:14,400
forecasting or algorithmic 
trading or helping your client 

625
00:40:14,400 --> 00:40:19,400
with something that those pieces
just have to like work together 

626
00:40:19,700 --> 00:40:23,700
as a team to get the job
done. 

627
00:40:24,200 --> 00:40:26,800
Again, I'm putting words in your
mouth, but this is just 

628
00:40:26,800 --> 00:40:29,500
summarizing what you've just 
been talking about is the 

629
00:40:29,500 --> 00:40:32,400
linkage to the
outcome and being clear about

630
00:40:32,400 --> 00:40:34,600
what's the data going in? 
What's the expected result at 

631
00:40:34,600 --> 00:40:37,800
the other end, and ensuring that
the two match up.

632
00:40:38,100 --> 00:40:42,400
Yep, yep, exactly. 
And with that, unfortunately, 

633
00:40:42,400 --> 00:40:45,400
we're out of time. Neema,
I just want to say a huge

634
00:40:45,400 --> 00:40:47,800
thank you for spending time with
us. 

635
00:40:47,800 --> 00:40:50,800
I really, really appreciate it. 
Thank you, thanks for the great 

636
00:40:50,800 --> 00:40:53,100
questions. 
Thanks audience, amazing 

637
00:40:53,100 --> 00:40:55,700
questions. 
I really had a really fun time 

638
00:40:55,700 --> 00:40:57,700
talking to you,
Michael.

639
00:40:57,700 --> 00:41:00,000
So thanks for having me, 
everybody. 

640
00:41:00,100 --> 00:41:03,300
Thank you
for watching. I just want to say

641
00:41:03,300 --> 00:41:09,700
a huge thank you to Neema Raphael.
He is the head of global data 

642
00:41:09,700 --> 00:41:14,800
engineering and the chief data 
officer of Goldman Sachs. 

643
00:41:15,300 --> 00:41:20,000
Now before you go, please 
subscribe to our YouTube channel

644
00:41:20,000 --> 00:41:23,700
and hit the Subscribe button at 
the top of our website. 

645
00:41:24,600 --> 00:41:26,900
Actually, you know, the 
Subscribe button has moved to 

646
00:41:26,900 --> 00:41:29,700
the bottom of our website, so 
hit the Subscribe button at the 

647
00:41:29,700 --> 00:41:31,700
bottom of our website. 
so we can send you our

648
00:41:31,700 --> 00:41:34,700
newsletter and keep you 
up-to-date on our upcoming live 

649
00:41:34,700 --> 00:41:36,000
shows. 
Thanks so much, everybody. 

650
00:41:36,300 --> 00:41:38,900
Hope you have a great day and 
we'll see you next time.

