Project ideas for students going into the quant field

Whenever anyone asks how to get into a field involving #programming or #machinelearning, the advice is always "do projects and upload them to GitHub." The same applies to #quant investing, but what #quant projects would impress a prospective employer?

Well, I can tell you what would impress me. Other quants can comment on what would be good for them.

1) Code in a commonly used language like Python, C++, Java, Julia, or C#. Do not use Excel for this. Excel is not how production-level systems are deployed, and the difference between implementing in Excel and implementing in code is massive. You CAN use Jupyter notebooks as a way to integrate code and presentation of results, but keep in mind that isn't where production code lives.

2) Ideally, you want to work off of real world datasets. If you don't already have access to them, you can find them at QuantConnect or Kaggle.

3) Build some sensible features. They don't need to be mind-blowing but put some thought into it. If you are using futures, make sure carry and momentum are in there somewhere. If you are looking at stock data, make sure you get a decent sample of the McLean and Pontiff (2016) signals (https://buff.ly/3OX2KvD). It is better to have too many features here than too few. If you don't know what to put, just make stationary (!) ratios, diffs, and percentage diffs where appropriate. Even using historical returns and share turnover over various periods as predictors is reasonable.

4) Implement various return prediction models using a linear model (e.g., linear ridge), a tree-based model (e.g., gradient boosting), and a neural network model (here, feed forward is fine). Don't worry about getting a large dataset. Of course, we know that feed forward neural networks need a large dataset to be effective, but the point isn't to get an incredible model. It is to demonstrate skill.

5) Build a portfolio and calculate backtested returns. You can use mean-variance optimization (meaning you need to calculate covariance) or take a Kelly criterion approach if you're sizing bets, whether or not the portfolio is levered. I personally don't care whether the model is any good. I just want to know that you know how to put the pieces together.

6) Create a pretty graph of growth of a dollar and performance table compared to a benchmark (if appropriate).
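
To make steps 3 to 6 concrete, here is a minimal sketch of how the pieces might fit together. It is not the original poster's code. Everything in it is an assumption made for illustration: the data is synthetic random noise standing in for a real dataset, the only feature is a trailing-mean momentum signal, the model is a closed-form ridge regression written in NumPy, and the portfolio is an unconstrained mean-variance solution scaled to gross exposure of 1. All function names are hypothetical.

```python
import numpy as np

def make_synthetic_returns(n_days=600, n_assets=5, seed=0):
    """Stand-in for a real dataset: i.i.d. normal daily returns (assumption)."""
    rng = np.random.default_rng(seed)
    return rng.normal(0.0003, 0.01, size=(n_days, n_assets))

def momentum(returns, lookback=20):
    """Trailing mean return; row t only uses days t-lookback .. t-1 (no look-ahead)."""
    feat = np.full(returns.shape, np.nan)
    for t in range(lookback, len(returns)):
        feat[t] = returns[t - lookback:t].mean(axis=0)
    return feat

def ridge_fit(x, y, alpha=1.0):
    """Closed-form ridge with one feature: returns [intercept, slope]."""
    X = np.column_stack([np.ones_like(x), x])
    penalty = alpha * np.eye(2)
    penalty[0, 0] = 0.0  # leave the intercept unpenalised
    return np.linalg.solve(X.T @ X + penalty, X.T @ y)

def mean_variance_weights(mu, cov):
    """Unconstrained mean-variance direction cov^-1 mu, scaled to sum(|w|) = 1."""
    raw = np.linalg.solve(cov, mu)
    gross = np.abs(raw).sum()
    return raw / gross if gross > 0 else np.zeros_like(raw)

def perf_row(daily, periods=252):
    """Annualised return, annualised volatility, and their ratio."""
    ann_ret = daily.mean() * periods
    ann_vol = daily.std(ddof=1) * np.sqrt(periods)
    return ann_ret, ann_vol, ann_ret / ann_vol

def run_backtest(returns, lookback=20, split=0.5, alpha=1.0):
    """Fit on the first part of the sample, trade on the rest."""
    feat = momentum(returns, lookback)
    cut = int(len(returns) * split)
    train = slice(lookback, cut)
    beta = ridge_fit(feat[train].ravel(), returns[train].ravel(), alpha)
    cov = np.cov(returns[train], rowvar=False)
    strat = []
    for t in range(cut, len(returns)):
        mu = beta[0] + beta[1] * feat[t]      # predicted return for day t
        w = mean_variance_weights(mu, cov)    # weights set before day t's return
        strat.append(w @ returns[t])
    bench = returns[cut:].mean(axis=1)        # equal-weight benchmark
    return np.array(strat), bench

if __name__ == "__main__":
    strat, bench = run_backtest(make_synthetic_returns())
    growth = np.cumprod(1 + strat)            # growth of a dollar, ready to plot
    print(f"final value of $1: {growth[-1]:.3f}")
    print(f"{'':10s}{'ann_ret':>10s}{'ann_vol':>10s}{'ratio':>10s}")
    for name, series in [("strategy", strat), ("benchmark", bench)]:
        r, v, s = perf_row(series)
        print(f"{name:10s}{r:10.3f}{v:10.3f}{s:10.3f}")
```

On random data the strategy should show no real edge, which is consistent with the point above: the goal is showing the pieces connected correctly, not a good model. A tree-based model or a neural network would slot in where `ridge_fit` is called.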

Make sure the code is clean and elegant. Input, model, and output should be separated and should each be runnable as one line of code. No absolute paths should exist in the input or output--only in a config file, stored as a variable. In fact, it should be the only thing a user would have to change to run the code (as long as they have the data).
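
One way the separation described above might look, as a rough sketch rather than a prescribed layout. The `Config` class, the file names, the `ret` column, and the three stage functions are all hypothetical; the only point is that paths live in one object and each stage runs as a single call.

```python
import csv
import statistics
import tempfile
from dataclasses import dataclass
from pathlib import Path

@dataclass(frozen=True)
class Config:
    """The only place paths are written down; a user edits this and nothing else."""
    data_dir: Path
    output_dir: Path
    returns_file: str = "returns.csv"  # hypothetical file name
    report_file: str = "report.csv"    # hypothetical file name

def load_input(cfg):
    """Input stage: read a 'ret' column (assumed layout) into a list of floats."""
    with open(cfg.data_dir / cfg.returns_file, newline="") as fh:
        return [float(row["ret"]) for row in csv.DictReader(fh)]

def run_model(returns):
    """Model stage: a placeholder 'model' that just summarises the returns."""
    return {"mean": statistics.fmean(returns), "n": len(returns)}

def write_output(cfg, result):
    """Output stage: write the result as a one-row CSV and return its path."""
    cfg.output_dir.mkdir(parents=True, exist_ok=True)
    out = cfg.output_dir / cfg.report_file
    with open(out, "w", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(result.keys())
        writer.writerow(result.values())
    return out

def demo():
    """Run all three stages against a throwaway directory."""
    with tempfile.TemporaryDirectory() as tmp:
        cfg = Config(data_dir=Path(tmp) / "data", output_dir=Path(tmp) / "out")
        cfg.data_dir.mkdir()
        (cfg.data_dir / cfg.returns_file).write_text("ret\n0.01\n-0.02\n0.03\n")
        data = load_input(cfg)           # input: one line
        result = run_model(data)         # model: one line
        out = write_output(cfg, result)  # output: one line
        return result, out.read_text()

if __name__ == "__main__":
    print(demo())
```

In a real project the `Config` instance would sit in its own file, and pointing `data_dir` at the local copy of the data would be the only edit needed to run everything.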

Source: Vivek Viswanathan
 
These days it's almost a red flag if a company heavily uses Excel for trading. Maybe some banks still do. The world moved to Python 5+ years ago for more ad-hoc analysis and scripting.
I mean, Python is used for rapid prototyping and ad-hoc analysis, but there are still places where you need a GUI and interactivity for a variety of things, and those could be written in .NET, Java, or in some cases just good ol' Excel.
 
The point is about Excel, not .NET or Java. It's pretty easy to build and deploy simple Python (web) GUIs. I haven't seen Excel heavily used in trading for 5+ years and would probably be very hesitant to join a shop that still does.
 
Excel in finance has a long history. Let's put things in perspective. I'm not talking about "programming in Excel" with macros but about Excel interop with VBA, C++ (xll etc.), and DLLs, which you have to deal with in legacy systems. I am neutral on which to use, but AFAIK traders do use it.

How would you move a C++ pricing library that is used from Excel over to Python? Many stakeholders use and want Excel, or are stuck with Excel, because no way will you rewrite a legacy system just to interface it with Python, I reckon.

Your market making niche is different, I agree. Horses for courses, I suppose.
 
And most sell-side shops are anything but monolithic in how their software and quantitative models are consumed. It could be legacy Excel tools or newer web tools. In short, C++ <-> Excel interop is just a nice-to-know thing.
 
Why is it a red flag? Just curious.
It shows they are using legacy tech, are probably fairly manual, ... A bit like you wouldn't want to work for a company that is still on Python 2.7, C++11 (98?) or Java 7. I can see how banks have more legacy systems to maintain and that Excel is a good tool for less technical people like sales, structurers, ... For trading companies it's not a great signal, as the market is very competitive and the best people typically want to work with the newest tech.
 
Good sales pitch. Not everyone wants new tech.
I'm sure most quants can make up their own minds. Many have skills beyond programming. Many are interested in math and finance as well.

Today's newest tech is tomorrow's legacy, unless software systems have a short shelf life. C++11 is 90% of what is needed. 20/80 Pareto rule.

// When I was a developer at Comprimo Amsterdam (where the BNR building is now) 45 years ago, I wrote an enterprise P&L manhour control system on an Apple II in Pascal. It lived for 19 years after having been ported to a minicomputer. No one cared what the tech was. It was all about the business. There was no Excel in those days.
 

On "new tech": most of the cool features in C++20 were developed 30-40 years ago:

generics (ML and Ada)
futures/promises/tasks
parallel/async programming
coroutines
tuples
lambdas and functional programming (1930s)
etc.
 
Just saw a post on LinkedIn

My dream as a 30-year-old: to work as a trader in a room packed with people, feel the adrenaline of the noise and be in front of 6 mega screens.
My dream as a 40-year-old: to run my business anywhere in the world, preferably in a quiet place and in front of a small-screen laptop.
 
Good sales pitch. Not everyone wants new tech.
I'm sure most quants can make up their own minds. Many have skills beyond programming. Many are interested in math and finance as well.

Today's newest tech is tomorrow's legacy, unless software systems have a short shelf life. C++11 is 90% of what is needed. 20/80 Pareto rule.

// When I was a developer at Comprimo Amsterdam (where the BNR building is now) 45 years ago, I wrote an enterprise P&L manhour control system on an Apple II in Pascal. It lived for 19 years after having been ported to a minicomputer. No one cared what the tech was. It was all about the business. There was no Excel in those days.
How is it a sales pitch if I have no product to sell (as opposed to others here *cough*)?
 
It feels a bit like preaching, speaking for myself. It might be a cultural/language thing.
It's the way I read it.
 
Download QuantLib.

Look at all the open issues (Issues · lballabio/QuantLib) and see if you can fix/add something.

Or, make reference implementations for calculations that should be there but aren't yet, e.g. post-LIBOR RFR curves for various currencies, correct cash flows and price/yield (i.e. matching Bloomberg) for various types of bonds...

Try to get your contributions accepted. They would certainly look good on your CV.
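
For a sense of what a price/yield reference calculation involves at its very simplest, here is a textbook sketch in Python. It does not use the QuantLib API and would not match Bloomberg: it assumes a plain fixed-coupon bond valued on a coupon date with a whole number of periods left, so there are no day-count conventions, no accrued interest, and no settlement lag. The function names and the bisection bounds are illustrative choices.

```python
def bond_price(face, coupon_rate, yield_rate, years, freq=2):
    """Present value of coupons plus redemption, discounted at yield_rate / freq per period."""
    n = years * freq                  # number of remaining coupon periods
    coupon = face * coupon_rate / freq
    y = yield_rate / freq
    pv_coupons = sum(coupon / (1 + y) ** k for k in range(1, n + 1))
    return pv_coupons + face / (1 + y) ** n

def bond_yield(price, face, coupon_rate, years, freq=2, lo=-0.5, hi=1.0, tol=1e-10):
    """Solve bond_price(yield) = price by bisection; price falls as yield rises."""
    for _ in range(200):
        mid = (lo + hi) / 2
        if bond_price(face, coupon_rate, mid, years, freq) > price:
            lo = mid                  # model price too high, so yield must be higher
        else:
            hi = mid
        if hi - lo < tol:
            break
    return (lo + hi) / 2

if __name__ == "__main__":
    p = bond_price(100, 0.05, 0.04, 10)
    print(round(p, 4), round(bond_yield(p, 100, 0.05, 10), 6))
```

The gap between this and a usable reference implementation (schedules, calendars, day counts, ex-dividend rules, market-specific yield conventions) is exactly where the contribution work lies.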
 
In terms of GitHub, the projects are useless without decent READMEs. The code you write is proof of what you explain in an interview or write in a Readme.md. Therefore, if your README is 1-2 paragraphs, you've "shot yourself in the foot" with the project.
 
Developers have a dismal reputation for documenting their code. Most people are not interested in reading someone else's code.
There should be a story:

Fin model -> math model -> numerical model -> C++/Python -> Conclusions
Make sure you understand it A-Z.

At least, if you want to win friends and influence people.


This had 55K views and 25 reposts on LinkedIn, recently.


// On my LI entry I have posted about 30 MSc theses I supervised.
 