Data Science/Machine Learning roles finance NYC/Chicago

Any updates on the market or thoughts on general projections for Data Scientists and Machine Learning Engineers in Quantitative Finance in New York?

Questions:
  1. How large will this industry grow?
  2. Are these mainly operations roles? Has anyone seen roles on the buy-side across investment banking and asset management?
  3. Compensation packages: I have seen a number of Data Science roles ranked as Assistant Vice President or Vice President at bulge-bracket banks.
  4. Are Data Scientists/ML Engineers replacing quants who specialise in pricing with PDEs etc.? What are the general thoughts here?

Thanks!

Remark: Naturally this will be a tough set of questions to answer due to the diversity in firms and general strategies, and preferred technologies but would be great to get an idea.
 
4. I have seen some ML papers purporting to be able to solve 100-dimensional PDEs.

1. It will either grow until it becomes mainstream or remain niche.
 
1. Difficult to know at this point. Ten years back "big data" was the buzzword, now it's "data science". This time around, with the promise of machine learning and AI, it has gained more traction at banks, among university students and at technology companies, but for banks it still means the same thing: trying to get data sorted. Then maybe run linear regression on it. Maybe in the future it'll be something else. There's potential there, so I see this area growing for years to come.

3. AVP, VP etc are just titles of seniority and only mean that they have a few years of work experience. I'm sure whoever's leading the effort at each bank is an MD.

4. No, and that's not even the purpose really. All the PDEs are is to an extent a fancy way of interpolating prices given some constraints (namely, arbitrage freeness). A machine learning algorithm in this space would probably either generate the input to these PDEs from market data, or process the output.
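To make the "PDE in, price out" picture concrete, here is a minimal sketch, not anyone's production method: an explicit finite-difference solve of the Black-Scholes PDE for a European call, checked against the closed-form price. The function names, grid sizes and market parameters are assumptions picked for illustration; a real desk would use an implicit or Crank-Nicolson scheme and a calibrated model.

```python
import math


def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(s, k, r, sigma, t):
    """Closed-form Black-Scholes price of a European call."""
    d1 = (math.log(s / k) + (r + 0.5 * sigma ** 2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    return s * norm_cdf(d1) - k * math.exp(-r * t) * norm_cdf(d2)


def fdm_call(s0, k, r, sigma, t, s_max=300.0, m=150, n=2000):
    """Explicit finite differences on a uniform grid in S, stepping in time-to-expiry.

    The explicit scheme is only stable if dt is small enough, roughly
    dt <= 1 / (sigma^2 * m^2 + r); the defaults satisfy this for sigma = 0.2.
    """
    ds = s_max / m
    dt = t / n
    # Terminal condition: the call payoff at expiry.
    v = [max(i * ds - k, 0.0) for i in range(m + 1)]
    for step in range(1, n + 1):
        tau = step * dt  # time to expiry after this step
        new = v[:]
        for i in range(1, m):
            diffusion = 0.5 * sigma ** 2 * i ** 2 * (v[i + 1] - 2.0 * v[i] + v[i - 1])
            drift = 0.5 * r * i * (v[i + 1] - v[i - 1])
            new[i] = v[i] + dt * (diffusion + drift - r * v[i])
        # Boundary conditions: worthless at S = 0, deep in the money at S = s_max.
        new[0] = 0.0
        new[m] = s_max - k * math.exp(-r * tau)
        v = new
    # Linear interpolation of the grid values at the spot.
    x = s0 / ds
    i = min(int(x), m - 1)
    w = x - i
    return (1.0 - w) * v[i] + w * v[i + 1]


fdm_price = fdm_call(100.0, 100.0, 0.05, 0.2, 1.0)
exact_price = bs_call(100.0, 100.0, 0.05, 0.2, 1.0)
print(fdm_price, exact_price)
```

The two numbers should agree to within the discretisation error of the grid. In this picture an ML model would sit on either side of the solver, for example producing the volatility input from market data, rather than replacing it.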

Also, and I guess this is now interpreting your question in a different way, namely organizationally rather than skill/methodwise: Math is never the hard part, and what I think is more realistic than a group coming from tech to replace quants is that the quants learn the new methods (and hire people who know them) and adapt as appropriate. There are many reasons, but I think the primary one is the way communication lines are typically set up, and it'll be difficult to dislodge decades of tradition there. Dislodging the current pricing methodologies, if there are superior ways of doing things, won't be as difficult: models get upgraded as a matter of course all the time anyway.
 
4. No, and that's not even the purpose really. All the PDEs are is to an extent a fancy way of interpolating prices given some constraints (namely, arbitrage freeness). A machine learning algorithm in this space would probably either generate the input to these PDEs from market data, or process the output.

But you can do the I/O stuff without ML.

This paper addresses the PDE problem. Over my head (too much hand waving) but I must be missing something.

https://arxiv.org/pdf/1706.04702.pdf

IMO ML is not optimal for PDE for several reasons. But it is being applied to many things these days ..

The hype is reminiscent of the irrational exuberance of the late 80's with OOP.
 

Though interesting (and the paper in particular looks to be well written; thanks for it, I'll try to read it at some point), I see a myriad of problems with the approach of using neural networks to solve PDEs in finance. To list a couple: the NN solution does not, as far as I understand it, guarantee arbitrage-free prices. And what if I find out that my Greeks are not stable under some specific conditions that only come to light at some point in the future? I'd have to retrain my entire network, introducing noise everywhere and, worse, possibly introducing an issue someplace else. That's not going to fly too well.
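As a rough illustration of what "arbitrage free" means for a price curve, here is a minimal sketch of two standard static no-arbitrage checks on call prices across strikes: prices must not increase with strike, and they must be convex in strike (no negative butterfly). The function name and the sample prices are made up for illustration, and a real check would also cover slope bounds and calendar spreads.

```python
def call_curve_violations(strikes, prices, tol=1e-12):
    """Return (kind, strike) tuples for simple static-arbitrage breaches.

    Assumes strikes are sorted in increasing order and prices are for
    calls with the same expiry.
    """
    violations = []
    slopes = []
    for j in range(len(strikes) - 1):
        slope = (prices[j + 1] - prices[j]) / (strikes[j + 1] - strikes[j])
        slopes.append(slope)
        if slope > tol:
            # A call with a higher strike cannot be worth more.
            violations.append(("increasing in strike", strikes[j + 1]))
    for j in range(len(slopes) - 1):
        if slopes[j + 1] < slopes[j] - tol:
            # Slopes must be non-decreasing, otherwise a butterfly has negative cost.
            violations.append(("not convex in strike", strikes[j + 1]))
    return violations


strikes = [90.0, 100.0, 110.0]
clean = call_curve_violations(strikes, [12.0, 6.0, 2.5])   # hypothetical prices
bumped = call_curve_violations(strikes, [12.0, 8.0, 2.5])  # middle price bumped up
print(clean, bumped)
```

A PDE solver built on a consistent model tends to satisfy checks like these by construction, whereas a fitted network would have to be tested for them after every retraining.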

Also just to replace a FDM solver by NN is probably not the kind of triumph of ML techniques over the models rooted in stochastic calculus the OP was looking for.
 
The approach in that paper is flawed. I wrote to the authors. No response. (Just for the record, I did an MSc and PhD in PDE/FDM/FEM, plus industry work etc., so it is a topic that I am familiar with.)

Also just to replace a FDM solver by NN is probably not
Can't be done IMO. Unless it involves Automatic Differentiation (AD) in some way.
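For anyone unfamiliar with AD, here is a minimal sketch of the forward-mode variant using dual numbers, applied to the Black-Scholes closed form to get delta. It is only meant to show what AD does; it is not a PDE solver and not the method of the paper linked above. The `Dual` class, helper names and parameters are all assumptions for illustration.

```python
import math


class Dual:
    """Number of the form val + der * eps with eps^2 = 0; der carries the derivative."""

    def __init__(self, val, der=0.0):
        self.val = val
        self.der = der

    @staticmethod
    def lift(x):
        return x if isinstance(x, Dual) else Dual(float(x))

    def __add__(self, other):
        o = Dual.lift(other)
        return Dual(self.val + o.val, self.der + o.der)

    __radd__ = __add__

    def __sub__(self, other):
        o = Dual.lift(other)
        return Dual(self.val - o.val, self.der - o.der)

    def __mul__(self, other):
        o = Dual.lift(other)
        return Dual(self.val * o.val, self.der * o.val + self.val * o.der)

    __rmul__ = __mul__

    def __truediv__(self, other):
        o = Dual.lift(other)
        return Dual(self.val / o.val,
                    (self.der * o.val - self.val * o.der) / (o.val * o.val))


def dual_log(x):
    return Dual(math.log(x.val), x.der / x.val)


def dual_norm_cdf(x):
    cdf = 0.5 * (1.0 + math.erf(x.val / math.sqrt(2.0)))
    pdf = math.exp(-0.5 * x.val ** 2) / math.sqrt(2.0 * math.pi)
    return Dual(cdf, pdf * x.der)


def bs_call_dual(spot, k, r, sigma, t):
    """Black-Scholes call where spot is a Dual; the result's der is dPrice/dSpot."""
    vol_sqrt_t = sigma * math.sqrt(t)
    d1 = (dual_log(spot / k) + (r + 0.5 * sigma ** 2) * t) / vol_sqrt_t
    d2 = d1 - vol_sqrt_t
    return spot * dual_norm_cdf(d1) - k * math.exp(-r * t) * dual_norm_cdf(d2)


# Seed der = 1.0 on the spot so the derivative is taken with respect to it.
result = bs_call_dual(Dual(100.0, 1.0), 100.0, 0.05, 0.2, 1.0)
print(result.val, result.der)  # price and delta
```

The derivative comes out exact to rounding, with no bump-and-reprice step, and should match the textbook delta N(d1).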
 

Excellent response thanks!

Your assessment that quants will adapt to new methods seems fairly accurate and realistic.

Someone smart with a postgraduate background in Theoretical Physics/Engineering/Applied Mathematics should be able to learn most of the new algorithms and techniques.

Good article on the requirements. It's a bit over the top, but coming from a similar background and now working predominantly on ML solutions in my job, I think it gives a broad overview: The Mathematics of Machine Learning – Towards Data Science

Great take in my opinion.


Much appreciated as always, Daniel. You always have great answers.
 
Thank you. In this case I wish I could have been more positive. But ML is not all things for all problems.

One more rant :)
I miss quite a bit of mathematical rigour in the articles that are coming off the 'ML printing press' these days.
 

I honestly place 99% of the blame on Machine Learning MOOCs + Andrew Ng. Everyone is claiming to be a Data Scientist or Machine Learning expert after taking 3-4 months' worth of MOOCs.

They then publish basic-level articles for other beginners to read, and the vicious cycle continues.

There is nothing more dangerous than people claiming expertise or using predictive models without knowing how the underlying algorithms work.
 

What is very bad, both academically and in practice, about the recent spate of articles on NN is that many of the issues and challenges were already addressed and solved during the golden age of numerical analysis.
 

By NN I'm assuming you mean Neural Networks?

Whilst many of the new architectures in Machine Learning/AI take inspiration from computational neuroscience, most of the algorithms and issues faced are actually numerical analysis problems (in supervised learning anyway), which is why I personally was able to transition over so easily. In terms of mathematical complexity most of the ML research papers are like reading Numerical Analysis/Scientific computing papers (probably easier as CS papers are usually very light on rigorous mathematics - no offence to any computer science PhD/Masters students on the forum).

I am of the belief that a good Engineer/Mathematician at PhD/Masters level can learn all the Machine Learning algorithms and will be a far more effective Data Scientist/Quant than someone who has taken an MBA-style Analytics course and learnt the algorithms with no mathematical background.

What frustrates me even more is that I see famous Professors from US schools retweeting "we can teach you Machine Learning without Mathematics" articles, and the Financial Times even recently did an article on how business schools are now teaching courses on Machine Learning/Artificial Intelligence (i.e. introductory-level Python courses).

I think the general Data Science industry needs some barrier to entry. From experience, hiring self-taught Data Scientists can lead to projects taking months rather than a few weeks. They have only a basic understanding of what can and can't be achieved with these techniques, and they underestimate the importance of good old mathematical modelling.

Data science is the big draw in business schools
 
In terms of mathematical complexity most of the ML research papers are like reading Numerical Analysis/Scientific computing papers (probably easier as CS papers are usually very light on rigorous mathematics).

This was my feeling exactly ... a CS maths background falls short for this kind of problem. So having only that background and hoping that numerical recipes will save the day is wishful thinking. But what do I know? :)

A bit like ju jitsu guys trying to do judo throws. They are very similar but the former is more rough and ready, more robotic while the latter teaches the nuances, variants and combinations.


CS papers are usually very light on rigorous mathematics - no offence any computer science PhDs/Masters students on the forum.
This is good advice and being direct is important.
 