Whenever anyone asks how to get into a field involving #programming or #machinelearning, the advice is always "do projects and upload them to GitHub." The same applies to #quant investing, but what #quant projects would impress a prospective employer?
Well, I can tell you what would impress me. Other quants can comment on what would be good for them.
1) Code in a commonly used language like Python, C++, Java, Julia, or C#. Do not use Excel for this. Excel is not how production-level systems are deployed, and the difference between implementing in Excel and implementing in code is massive. You CAN use Jupyter notebooks as a way to integrate code and presentation of results, but keep in mind that isn't where production code lives.
2) Ideally, you want to work off of real-world datasets. If you don't already have access to them, you can find them at QuantConnect or Kaggle.
3) Build some sensible features. They don't need to be mind-blowing but put some thought into it. If you are using futures, make sure carry and momentum are in there somewhere. If you are looking at stock data, make sure you get a decent sample of the McLean and Pontiff (2016) signals (https://buff.ly/3OX2KvD). It is better to have too many features here than too few. If you don't know what to put, just make stationary (!) ratios, diffs, and percentage diffs where appropriate. Even using historical returns and share turnover over various periods as predictors is reasonable.
4) Implement various return prediction models using a linear model (e.g., linear ridge), a tree-based model (e.g., gradient boosting), and a neural network model (here, feed forward is fine). Don't worry about getting a large dataset. Of course, feed-forward neural networks need a large dataset to be effective, but the point isn't to get an incredible model. It is to demonstrate skill.
5) Build a portfolio and calculate backtested returns. You can use mean-variance optimization (meaning you need to calculate covariance) or take a Kelly criterion approach if you're sizing bets, with or without leverage. I personally don't care whether the model is any good. I just want to know that you know how to put the pieces together.
6) Create a pretty graph of the growth of a dollar and a performance table compared to a benchmark (if appropriate).
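To make steps 3 through 6 a bit more concrete, here is a minimal sketch of how the pieces might fit together. Everything in it is an assumption for illustration: the prices are simulated random walks rather than a real dataset, the function names are hypothetical, only the ridge model is shown (a tree-based model and a neural network would slot in at the same place), the Sharpe ratio ignores the risk-free rate, and there are no transaction costs. The graph itself is left out; the cumulative product at the end is the series you would plot.

```python
import numpy as np


def make_features(prices, lookbacks=(5, 21, 63)):
    """Trailing log returns over several windows (stationary, unlike price levels)."""
    log_p = np.log(prices)
    m = max(lookbacks)
    # Row i is date t = m + i; each feature uses only prices up to and including t.
    X = np.stack([log_p[m:-1] - log_p[m - lb:-1 - lb] for lb in lookbacks], axis=-1)
    y = np.expm1(log_p[m + 1:] - log_p[m:-1])  # next-period simple return (the target)
    return X, y  # shapes (T, N, K) and (T, N)


def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge without intercept: beta = (X'X + alpha*I)^-1 X'y."""
    return np.linalg.solve(X.T @ X + alpha * np.eye(X.shape[1]), X.T @ y)


def mean_variance_weights(mu, cov, gross=1.0):
    """Weights proportional to inv(cov) @ mu, rescaled to a fixed gross exposure."""
    raw = np.linalg.solve(cov, mu)
    total = np.abs(raw).sum()
    return gross * raw / total if total > 0 else np.zeros_like(raw)


def performance(returns, periods_per_year=252):
    """Summary statistics for a series of periodic simple returns."""
    growth = np.cumprod(1.0 + returns)
    vol = returns.std(ddof=1)
    drawdown = growth / np.maximum.accumulate(growth) - 1.0
    return {
        "ann_return": growth[-1] ** (periods_per_year / len(returns)) - 1.0,
        "ann_vol": vol * np.sqrt(periods_per_year),
        "sharpe": returns.mean() / vol * np.sqrt(periods_per_year),  # no risk-free rate
        "max_drawdown": drawdown.min(),
    }


def run_backtest(prices, split=0.5, alpha=1.0):
    """Fit on the first part of the sample, trade the rest out of sample."""
    X, y = make_features(prices)
    T, N, K = X.shape
    cut = int(T * split)
    # Pool all assets into one regression, using in-sample rows only.
    beta = fit_ridge(X[:cut].reshape(-1, K), y[:cut].reshape(-1), alpha)
    # Covariance from in-sample returns, shrunk halfway toward its diagonal.
    cov = np.cov(y[:cut], rowvar=False)
    cov = 0.5 * cov + 0.5 * np.diag(np.diag(cov))
    strat = np.array([mean_variance_weights(X[t] @ beta, cov) @ y[t]
                      for t in range(cut, T)])
    bench = y[cut:].mean(axis=1)  # equal-weight benchmark
    return strat, bench


def simulate_prices(n_days=600, n_assets=8, seed=0):
    """Synthetic stand-in for a real dataset: geometric random walks."""
    rng = np.random.default_rng(seed)
    steps = rng.normal(0.0002, 0.01, (n_days, n_assets))
    return 100.0 * np.exp(np.cumsum(steps, axis=0))


if __name__ == "__main__":
    strat, bench = run_backtest(simulate_prices())
    print("growth of a dollar:", np.cumprod(1.0 + strat)[-1])
    for name, r in (("strategy", strat), ("benchmark", bench)):
        print(name, {k: round(float(v), 3) for k, v in performance(r).items()})
```

On random-walk data the model should have no real edge, which is fine for this purpose: the point is that features, model, portfolio construction, and reporting are separate functions that chain together without look-ahead.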
Make sure the code is clean and elegant. Input, model, and output should be separated and should each be runnable as one line of code. No absolute paths should appear in the input or output code; they belong only in a config file, stored as a variable. In fact, that variable should be the only thing a user would have to change to run the code (as long as they have the data).
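One way this layout could look, as a rough sketch rather than a prescription: a single config object owns the data location, and the input, model, and output stages each run as one call that takes it. The file names, the `close` column, and the placeholder "model" (the sign of the last price change) are all made up for illustration.

```python
import csv
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class Config:
    """Every location derives from one root; this is the only thing a user edits."""
    root: Path

    @property
    def prices(self) -> Path:
        return self.root / "prices.csv"

    @property
    def signals(self) -> Path:
        return self.root / "signals.csv"

    @property
    def report(self) -> Path:
        return self.root / "report.txt"


# Hypothetical default; an absolute path, if needed, would appear on this line only.
CONFIG = Config(root=Path("data"))


def run_input(cfg: Config = CONFIG) -> list:
    """Input stage: read closing prices from the configured location."""
    with cfg.prices.open(newline="") as f:
        return [float(row["close"]) for row in csv.DictReader(f)]


def run_model(cfg: Config = CONFIG) -> None:
    """Model stage: placeholder signal (+1 after an up move, -1 otherwise)."""
    prices = run_input(cfg)
    with cfg.signals.open("w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["signal"])
        for prev, cur in zip(prices, prices[1:]):
            writer.writerow([1 if cur > prev else -1])


def run_output(cfg: Config = CONFIG) -> None:
    """Output stage: summarize the signals file into a short text report."""
    with cfg.signals.open(newline="") as f:
        signals = [int(row["signal"]) for row in csv.DictReader(f)]
    cfg.report.write_text(f"n_signals={len(signals)} net_long={sum(signals)}\n")
```

With this shape, running the project is `run_model()` followed by `run_output()`, and someone cloning the repo changes only the `root` value.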
Source: Vivek Viswanathan