Backtesting software Ideas

Hey folks,

I posted this question to another forum and didn't get much of an informed response. I realize QuantNet isn't too big on algo trading, but maybe some of you can steer me the right way here.

For a long time I've been fascinated with the idea of writing my own backtesting software. I've seen quite a few of the retail offerings, and they don't give me the transparency or flexibility that I want. So far the best retail solution I've seen is Deltix, but I still don't think a retail solution is really what I'm looking for. I want to understand, and be able to customize, every element of the backtesting process. I've done quite a few vectorized backtests in MATLAB, which is great, but I really want to come up with an event-driven solution in C++ (and maybe build an event-driven MATLAB version or plug-in later). My basic structure so far is pretty much just a for loop over time and my data, plus a header file of functions modeled closely on the ones in the IB TWS API. I've also made the commissions and some of the account management rules scriptable.
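
To make the "for loop over time" structure concrete, here is a minimal sketch of an event-driven loop. All names (`Tick`, `SimBroker`, `Strategy`, `runBacktest`) are hypothetical and not taken from the IB TWS API, and the fill model (market orders filled at the next tick's price, flat per-share commission) is an assumption chosen for simplicity, not a realistic execution model.

```cpp
#include <cassert>
#include <vector>

// Hypothetical types for illustration; not taken from the IB TWS API.
struct Tick {
    long long timestampMs;  // event time in milliseconds
    double price;           // last traded price
};

struct Order {
    int quantity;  // positive = buy, negative = sell
};

// Simulated account with a flat per-share commission (an assumption).
class SimBroker {
public:
    SimBroker(double cash, double commissionPerShare)
        : cash_(cash), commissionPerShare_(commissionPerShare) {}

    void placeOrder(const Order& order) { pending_.push_back(order); }

    // Fill orders queued on earlier ticks at this tick's price, so the
    // strategy never trades at the price it has just observed.
    void onTick(const Tick& tick) {
        for (const Order& o : pending_) {
            int absQty = o.quantity < 0 ? -o.quantity : o.quantity;
            cash_ -= o.quantity * tick.price;
            cash_ -= absQty * commissionPerShare_;
            position_ += o.quantity;
        }
        pending_.clear();
        lastPrice_ = tick.price;
    }

    int position() const { return position_; }
    double equity() const { return cash_ + position_ * lastPrice_; }

private:
    double cash_;
    double commissionPerShare_;
    int position_ = 0;
    double lastPrice_ = 0.0;
    std::vector<Order> pending_;
};

class Strategy {
public:
    virtual ~Strategy() = default;
    virtual void onTick(const Tick& tick, SimBroker& broker) = 0;
};

// Toy strategy: buy one share on the first tick, sell it on the third.
class BuyThenSell : public Strategy {
public:
    void onTick(const Tick&, SimBroker& broker) override {
        if (count_ == 0) broker.placeOrder({+1});
        if (count_ == 2) broker.placeOrder({-1});
        ++count_;
    }

private:
    int count_ = 0;
};

// Event loop: the broker processes fills first, then the strategy reacts.
double runBacktest(const std::vector<Tick>& ticks, Strategy& strategy,
                   SimBroker& broker) {
    long long previousMs = ticks.empty() ? 0 : ticks.front().timestampMs;
    for (const Tick& tick : ticks) {
        assert(tick.timestampMs >= previousMs);  // events must be in time order
        previousMs = tick.timestampMs;
        broker.onTick(tick);
        strategy.onTick(tick, broker);
    }
    return broker.equity();
}
```

Calling the broker before the strategy on each tick is one simple way to avoid look-ahead bias: an order placed while handling tick N can only be filled at tick N+1 or later.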

My question to you folks is: what can I do to make this better? How do the big prop firms, funds, etc. backtest? What can I add to make my tests more accurate? What omissions have I made?

In terms of optimization: Currently I just use a brute-force parameter sweep. I split my data into a training set and a testing set. I bet there's a better way to split it than just first 75% training, last 25% testing, because that obviously opens me up to the possibility of regime shifts, etc. I've been reading a lot about genetic optimization and am working on including something like that instead of just a big parameter sweep.
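
One common alternative to a single 75/25 split is a rolling (walk-forward) scheme, where the training window slides forward and each test window immediately follows it. The sketch below only generates index ranges; `Fold` and `walkForwardFolds` are hypothetical names, and stepping by the test length is just one possible choice.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Half-open index ranges into a time-ordered data set.
struct Fold {
    std::size_t trainBegin, trainEnd;  // [trainBegin, trainEnd)
    std::size_t testBegin, testEnd;    // [testBegin, testEnd)
};

// Slide a fixed-length training window across n observations. Each test
// window starts where its training window ends, and the whole pair then
// advances by testLen, so the test windows do not overlap.
std::vector<Fold> walkForwardFolds(std::size_t n, std::size_t trainLen,
                                   std::size_t testLen) {
    assert(trainLen > 0 && testLen > 0);
    std::vector<Fold> folds;
    for (std::size_t start = 0; start + trainLen + testLen <= n;
         start += testLen) {
        std::size_t split = start + trainLen;
        folds.push_back({start, split, split, split + testLen});
    }
    return folds;
}
```

With 100 observations, a training length of 60, and a test length of 10, this yields four folds, the last one training on indices 30 to 89 and testing on 90 to 99. Parameters are re-optimized on each training window, so a regime shift hurts one fold rather than the whole out-of-sample result.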

In terms of data and data storage: I'm lucky to have very good millisecond-level data on certain assets made available to me by my school, which I'm learning how to store in the free version of kdb+. Up to now my database has been MySQL, but it's obviously not optimal.

In terms of the next steps I'm considering: I want to make it multithreaded to run multiple tests at a time or improve the performance of my future genetic algorithms. I've been looking into the possibility of using CUDA on my GPU as well but maybe that isn't so well suited to this task. I also want to add portfolio level functionality (that I don't have to code up each time).
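
Since each backtest in a parameter sweep is independent of the others, the sweep is a natural first target for multithreading. Here is a minimal sketch using `std::async` from the standard library; `parallelSweep` and `bestIndex` are hypothetical names, and `evaluate` stands in for whatever runs one backtest and returns a score. It assumes `evaluate` is safe to call concurrently (no shared mutable state).

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <future>
#include <vector>

// Run evaluate(p) for every parameter value as a separate asynchronous
// task and return the scores in the same order as params.
std::vector<double> parallelSweep(
    const std::vector<int>& params,
    const std::function<double(int)>& evaluate) {
    assert(evaluate);  // must hold a callable
    std::vector<std::future<double>> futures;
    futures.reserve(params.size());
    for (int p : params) {
        futures.push_back(std::async(std::launch::async, evaluate, p));
    }
    std::vector<double> scores;
    scores.reserve(futures.size());
    for (auto& f : futures) {
        scores.push_back(f.get());  // blocks until that task has finished
    }
    return scores;
}

// Index of the highest score (first one wins on ties).
std::size_t bestIndex(const std::vector<double>& scores) {
    assert(!scores.empty());
    std::size_t best = 0;
    for (std::size_t i = 1; i < scores.size(); ++i) {
        if (scores[i] > scores[best]) best = i;
    }
    return best;
}
```

This launches one task per parameter value, which is fine for a small grid. For a large sweep you would probably want to cap the number of concurrent tasks near `std::thread::hardware_concurrency()`. On some toolchains this needs to be built with `-pthread`.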

So let's hear about some other ideas, pointers, and gotchas you've experienced that might help an aspiring algo trader out!
 
Thanks for the reply, Dale. I shot him a message on LinkedIn. I've actually been a big fan of his work for a while now.
 
There's another guy, Ernie Chan, I think, who has a book he promotes here and there. I don't know; my interest isn't in algo trading on tick data. I want a long-term fundamentals approach, enforced by algo software. Sounds like you already know C++, but if not, C# might be an excellent substitute, and you could leverage some F# stuff as well for good parallelism. C# has a bunch of OpenCL/CUDA-type wrappers as well.
 
Yeah, I'm quite familiar with Ernie Chan and his book. Overall it was really beneficial for me when I was just starting out. You've definitely suggested some cool languages, but my reason for choosing C++ (besides just knowing and liking it, as you noted) is actually one Ernie Chan espouses. By developing a library in C++ that stays as close to the Interactive Brokers API as I reasonably can, and running my backtests in that C++ environment, I can essentially just switch a few lines of code and keep the same program logic for live trading that I would in my demos. Obviously, there are other things at play that you can't really simulate with even the best tick data, like order routing, order rejection logic, bad fills, etc. Ignoring that stuff, though, the idea behind the C++ is to minimize recoding for live trading, and so minimize the difference in logic between my backtest code and my live code as I go from backtesting to forward testing and, eventually, hopefully, live trading.
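
A minimal sketch of that "switch a few lines" idea: the strategy talks only to an abstract interface, and the backtest supplies one implementation while live trading would supply another. The names here (`OrderGateway`, `BacktestGateway`, `onPrice`) are hypothetical and do not mirror the actual IB API; the live implementation is deliberately left out.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical interface; the strategy depends on this and nothing else.
class OrderGateway {
public:
    virtual ~OrderGateway() = default;
    virtual void placeMarketOrder(const std::string& symbol, int quantity) = 0;
};

// Backtest implementation: records orders instead of sending them anywhere.
// A live implementation would forward the same call to the broker's API.
class BacktestGateway : public OrderGateway {
public:
    void placeMarketOrder(const std::string& symbol, int quantity) override {
        assert(quantity != 0);  // a zero-size order is a logic error
        log_.push_back(symbol + ":" + std::to_string(quantity));
    }
    const std::vector<std::string>& log() const { return log_; }

private:
    std::vector<std::string> log_;
};

// Toy strategy logic: buy 100 shares whenever the price is below a threshold.
// It is identical in backtest and live; only the gateway object differs.
void onPrice(OrderGateway& gateway, const std::string& symbol, double price,
             double threshold) {
    if (price < threshold) gateway.placeMarketOrder(symbol, 100);
}
```

Swapping from backtest to live then comes down to which `OrderGateway` object gets constructed and passed in, while `onPrice` stays untouched.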

Like I said, I'm also interested in using MATLAB's engine API: keep my same C++ backtesting backend, but have it call a MATLAB program to evaluate responses to new events instead of doing it all in C++. I know this approach will probably be painfully slow computationally, but it will also let me take advantage of MATLAB's super-high-level yet still kinda-sorta C-like programming experience. The goal here is a faster prototyping environment for ideas, like the one C# or F# would give me over C++. I'm still just sketching this whole thing out. I know you aren't particularly interested in tick data. Before I started this project, all my backtests were done in MATLAB using discrete time and calculating vectorized signals, so maybe that would interest you for your purposes. It's so fast and easy to code up and test (not super thoroughly) new discrete-time ideas using MATLAB's vectorization. Here's a really cool webinar on it.

http://www.mathworks.com/videos/alg...inancial-applications-81775.html?refresh=true
 