Is Stochastic Processes still a relevant course to take for a prospective quant?

Dude, I'm talking about the interview questions new grads get, going by some of our new hires who didn't make it to the buy side.

Everyone knows the interview process/questions aren't necessarily the actual day-to-day job. It's just what the current state is. If you can't get hired, what's the point of talking about the usefulness of ML?
 
It is extremely idiotic and downright unproductive to "program my own ml library". "Understanding the fundamentals" is generic advice that is applicable to all fields and endeavors in life. Why on earth would you duplicate all the hard work that has gone into making scikit-learn fast, modular and uniform across so many regression and classification algorithms?

I am pretty sure you have never worked in any capacity as an ML researcher/data scientist and are just making banal and amateurish comments to people actually asking for help. Yeah, go ahead and make your own deep learning library while we all try to make some real progress with free frameworks designed by Google and Facebook.

This comment is ridiculous. Then why does any reputable BB build its in-house pricing and risk libraries? They must be stupid and should just outsource to some vendor like BlackRock.
 
1. I am surprised that you don't know how pervasive Aladdin has become in the industry. A lot of firms are using Aladdin for risk management and portfolio analytics. For example, DWS now runs their portfolio risk and analytics almost entirely on Aladdin.
2. Comparing machine learning frameworks with "in-house pricing and risk libraries" is another ill-informed statement in itself. No regulator goes around auditing machine learning frameworks in large banks, and banks don't measure their capital adequacy or leverage ratios based on TensorFlow. Those are two different things and have nothing in common.

Again, my point: it is super important to learn the ML and DL algorithms, understand the methods to measure performance and keep tuning your model once it is in production. No sane or productive organization today should re-create scikit-learn or TensorFlow from scratch. Knowing ML algorithms inside out, knowing when to use each one and knowing its time complexity is one thing, but programming an SVM from scratch is a waste of everyone's time.
 
Writing a library with bells and whistles takes knowledge and experience. On the other hand, I think it is useful if you can code up simple algorithms to 1) double check others' work, 2) reverse engineer to a certain extent what is going on in ML libraries, 3) avoid becoming deskilled, e.g. when you need to tweak parameters when the algorithms break down. Worst case is trial-and-error testing.

A good example IMO is to program your own simple SGD and see what the challenges are; then move to a production version.
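
For what it's worth, here is roughly what such a "simple SGD" exercise could look like. This is only a minimal sketch in plain Python, assuming a one-feature linear model with squared-error loss; the function name `sgd_linear_fit`, the toy data and the default learning rate are all made up for illustration and are not taken from any library.

```python
import random

def sgd_linear_fit(xs, ys, lr=0.05, epochs=200, seed=0):
    """Fit y ~ w*x + b by plain SGD on squared error, one sample per step."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    order = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(order)                 # visit samples in a fresh random order
        for i in order:
            err = (w * xs[i] + b) - ys[i]  # prediction error on one sample
            w -= lr * err * xs[i]          # gradient of 0.5*err^2 w.r.t. w
            b -= lr * err                  # gradient of 0.5*err^2 w.r.t. b
    return w, b

# Toy data (hypothetical): y = 2x + 1 plus small Gaussian noise.
data_rng = random.Random(42)
xs = [i / 10 for i in range(-20, 21)]
ys = [2.0 * x + 1.0 + data_rng.gauss(0.0, 0.1) for x in xs]

w, b = sgd_linear_fit(xs, ys)
print(f"w = {w:.3f}, b = {b:.3f}")
```

The challenges show up quickly once you play with it: the result depends on the step size `lr`, a constant step never fully settles on noisy data, and a step that is too large can make the iterates blow up. Those are the things a production optimizer handles for you.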
 
From the ensuing discussions I have inferred that it would be more valuable to take the ML and probability-based courses from this degree and (maybe) learn the stochastic stuff on the side.
 
Sure, it's like when the Quantnet C++ course made us write our own implementation of resizable arrays in the exercises. It was challenging and gave a glimpse of how it's implemented in the real world without getting into the weeds with amortized analysis etc. But we all use std::vector in real life, as we know it is part of the STL and just works out of the box.
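
To make the resizable-array exercise concrete, here is a toy version. It is a sketch in Python rather than C++, and it assumes the usual capacity-doubling strategy; the class `DynamicArray` and its `copies` counter are hypothetical names for illustration, not how any particular std::vector implementation is written.

```python
class DynamicArray:
    """Toy resizable array: fixed-capacity backing store that doubles when full."""

    def __init__(self):
        self._capacity = 1
        self._size = 0
        self._data = [None] * self._capacity
        self.copies = 0                     # elements moved during all resizes

    def __len__(self):
        return self._size

    def __getitem__(self, index):
        if not 0 <= index < self._size:
            raise IndexError("index out of range")
        return self._data[index]

    def append(self, value):
        if self._size == self._capacity:    # full: grow before writing
            self._grow(2 * self._capacity)
        self._data[self._size] = value
        self._size += 1

    def _grow(self, new_capacity):
        new_data = [None] * new_capacity
        for i in range(self._size):         # copy existing elements across
            new_data[i] = self._data[i]
            self.copies += 1
        self._data = new_data
        self._capacity = new_capacity

arr = DynamicArray()
n = 1000
for k in range(n):
    arr.append(k)
print(len(arr), arr[n - 1], arr.copies)
```

With doubling, the total number of element copies over n appends stays below 2n, which is the amortized O(1) append result that the exercise lets you see without doing the formal analysis.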

Actually, a lot of parameter tuning is trial-and-error :) But it's largely automated in ML libraries with grid search and pipelines, which streamline your workflow a lot.
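
As a rough illustration of that automation, this is what a grid search over a pipeline can look like in scikit-learn. It is only a sketch: the synthetic dataset, the choice of logistic regression and the grid of `C` values are assumptions picked for the example, not a recommendation.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data, made up for illustration only.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Scaling and the classifier are chained so that the scaler is refit
# on the training part of every cross-validation fold.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])

# "clf__C" means: parameter C of the pipeline step named "clf".
param_grid = {"clf__C": [0.01, 0.1, 1.0, 10.0]}

search = GridSearchCV(pipe, param_grid, cv=5)
search.fit(X, y)

print(search.best_params_, round(search.best_score_, 3))
```

The trial-and-error is still there, it is just the library looping over the grid and the folds for you.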
 
Perhaps we should consider this in two dimensions - the user and the application. For the ML user, potentially trying to get a job on the buyside, maybe there are two main groups - "Betty Crocker" (a famous cookbook in the US) ML, and expert.

The "Betty Crocker" recipe users maybe went through a superficial course where they learned the libraries, calls, and general applications without much of the underlying math. These are the ones who might not know that you need to normalize with a specific approach for one model, or what impact dropping the ones column is going to have for another.
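
A small sketch of the ones-column point, assuming it refers to the column of ones that gives a linear regression its intercept. The two helper functions and the toy data are hypothetical and use the closed-form least-squares formulas for a single feature.

```python
def fit_with_intercept(xs, ys):
    """Least squares for y ~ slope*x + intercept (ones column kept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def fit_through_origin(xs, ys):
    """Least squares for y ~ slope*x (ones column dropped)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

# Toy data (made up): an exact line with slope 2 and intercept 5.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [5.0 + 2.0 * x for x in xs]

slope, intercept = fit_with_intercept(xs, ys)
slope_no_ones = fit_through_origin(xs, ys)
print(slope, intercept, slope_no_ones)
```

With the ones column the fit recovers slope 2 and intercept 5. Without it the line is forced through the origin, so the slope has to absorb the missing intercept and comes out noticeably larger than 2.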

The expert users are able to approach the libraries with an understanding of the underlying math and the implemented algorithms in the library. Maybe it's rare to have to build your own library, but there definitely are firms willing to absorb the high fixed cost given the potential upside. Two Sigma, for example.

Similarly, for many applications the popular libraries are sufficient. Specialized applications will undoubtedly require some adjustment - to take Aladdin as a (non-ML) example, it's a useful product, especially for a non-quantitative firm - DWS, for example. I don't think DRW would use Aladdin - at least not as their primary risk tool, since their use case is wildly different from the application's specification.

The original question regarding StoCal - I'm both miserable and thrilled I'm doing it. On the buyside, you can get quite far with a simple linear factor model, but then you're going to have a very rough time with implementation if you're not careful. StoCal is critical for trading and pricing in a risk neutral framework. In this way, there is kind of a convergence of traditional sell side math and prediction techniques from the buy side. Not a lot of (larger, more traditional) buyside shops are excited about that, since it's expensive and risky to set up and maintain a trading operation.

That said, almost all the research roles are looking more for data science / ML kind of skills. There is some window dressing bullshit aspect to this, and there's another element of an emerging technology that no one has really figured out how to use but there's a lot of intuition that it could be quite useful based on success in other domains. How relevant those use cases are to investing is definitely up for debate, but the relevance of those skills for an early stage quant interested in the buyside seems to be a settled question from my point of view.
 
Writing your own resizable array is a concrete case, and it gets you to learn the syntax as well. At a certain stage the library is replaced by a production version.
In my work I develop stable and accurate FDM methods, so in the initial stages the choice of matrix library is not (yet) on the critical path.
(On ML, it is essentially a bunch of algorithms (nothing wrong with that, just saying)).

My ruleZ are:

1. Get it working

then

2. Get it right

then and only then

3. Get it optimized

A lot of folk start with 2 (or even 3 LOL).

4. Write a Proof-of-Concept (POC) ASAP
5. Implement the most important (for client) features first. Remember, the Rembrandt painting is the product, not the frame.
6. Get project done, on time and within bu$get.

Anyhoo, that's the way I do it. What do I know :alien:

// These days, writing your own (mini) SGD is very instructive and is the 2020 equivalent of "My Little Matrix".
// There was a time when people wrote their own string, date and list classes. Here's a stone-age example. It's a Swiss army knife.
 

I mean, sometimes there's no off-the-shelf solution from sklearn, and/or you need to do ML in the company's in-house programming language. Like if you need some special optimization function or least squares scheme or neural network architecture that is fit for a particular situation. I doubt PDT or TGS or Renaissance actually use those.
Companies like to make their own models and use open source libraries as "second opinion".
 