Hi there,
I am an applied math PhD student studying theoretical ML. It is theoretical because we mainly work with various computational models and assume various things about data (distributions, sample sizes, flows, etc.).
Anyway, I am now interested in getting into quant, but I find myself unprepared for the "engineering" parts of quant. Specifically, coding and data engineering.
Coding: I am familiar with standard ML frameworks (PyTorch etc.), so I can implement basic things in Python. I am also familiar with the basics of algorithms and data structures at the undergrad level. But I know nothing about C++ and have very little experience using computer programs to automate mundane tasks.
Data engineering: I took an intro course in the MMF program at my institution, and did some quant trading on my own. I might be wrong, but I find that, just as in "practical" ML, data engineering is very important. By data engineering, I mean things like "what kind of data to look at", "how to collect large amounts of data", and "how to structure the collected data".
For me, data engineering is an even higher barrier to entry than other quantitative problems. I have only interviewed with one firm so far. The technical questions were all interesting, but I found myself very unprepared for the questions related to data engineering. Perhaps they had higher expectations because of my background? But I am really more of a math person than a CS/ML person. I would have thought this would be the responsibility of quant developers.
So my question is: how much data engineering do real QRs do? What can an individual (not belonging to a group) do to become good at it? This really seems to be a new topic for interviews (at least there aren't many such questions in the green book).
A related question: did you build alternative datasets for your own trades (before becoming a quant)? How did you do it?