
MSc Theses on Machine Learning and Computational Finance

Daniel Duffy

C++ author, trainer
Background
I have been an external supervisor at the University of Birmingham since 2014 for MSc students who research and produce a thesis in the summer months June-September. The focus is on analysing a financial model, approximating it using numerical methods and then producing working code in C++ (or Python). In short, students have three months to produce useful results.
We take two theses which we hope are interesting to those working in computational finance. The results in these theses can be generalised to other related problems such as:
  • The Heston model (analytical solution, Yanenko Splitting and Alternating Direction Explicit (ADE) methods).
  • Using (Gaussian) Radial Basis Functions instead of traditional Backpropagation to compute neural network weights.
  • A mathematical, numerical and computational analysis of the Continuous Sensitivity Equation (CSE) method.
  • Parallel software design for ML/PDE applications.
A number of these methods would be at PhD level. If you have any queries, please do not hesitate to contact us.

Daniel J. Duffy is a mathematician, software designer and coach/trainer. He has a PhD from the University of Dublin, Ireland (Trinity College).


DALVIR MANDARA
This thesis is concerned with the application of artificial neural networks (ANN) to price options under the Black Scholes (BS) model and an ANN-based framework to predict implied volatility generated by the SABR stochastic volatility model. The experiments show that the ANN architecture is able to predict the price of call options under BS as well as to predict individual implied volatility and implied volatility surfaces under the normal and log-normal SABR models.
An important part in the thesis is the application of the promising image-based implicit method (Horvath et al 2019) which uses a grid of input values in contrast to traditional neural network setups in which input vectors map to a single output. The thesis discusses the benefits of this new method, learning non-linear relationships being one of them.
This thesis can be seen as a serious work on applying and integrating Machine Learning and computational finance. It is very well written and all of the supervisor’s suggestions (cross-validation, K-folds, Figure 0.1) were taken on board in a timely fashion. Some of the results are original and future research is a possibility. The programming language used is Python.
The main result is to show how ML can be used for option pricing and implied volatility calculation. It is of interest to quantitative analysts and developers.
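As a rough illustration of the K-fold cross-validation mentioned above, here is a minimal Python sketch. It is not taken from the thesis: the function names (`bs_call`, `make_samples`, `k_fold_indices`), the parameter ranges and the choice of the Black-Scholes formula as the source of training labels are all assumptions made for the example.

```python
import math
import random


def bs_call(S, K, r, sigma, T):
    """Black-Scholes price of a European call (no dividends)."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    N = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return S * N(d1) - K * math.exp(-r * T) * N(d2)


def make_samples(n, seed=0):
    """Hypothetical training set: random inputs labelled with the exact price."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        S = rng.uniform(50.0, 150.0)
        K = rng.uniform(50.0, 150.0)
        r = rng.uniform(0.0, 0.1)
        sigma = rng.uniform(0.1, 0.5)
        T = rng.uniform(0.1, 2.0)
        samples.append(((S, K, r, sigma, T), bs_call(S, K, r, sigma, T)))
    return samples


def k_fold_indices(n, k, seed=0):
    """Yield (train, validation) index lists for k-fold cross-validation."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint folds covering all indices
    for i in range(k):
        val = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, val


if __name__ == "__main__":
    data = make_samples(100)
    for fold, (train, val) in enumerate(k_fold_indices(len(data), 5)):
        # A model would be fitted on `train` and scored on `val` here.
        print(f"fold {fold}: {len(train)} training, {len(val)} validation samples")
```

Each sample is used for validation exactly once, so the averaged validation error gives a less optimistic estimate than a single train/test split.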


MATT ROBINSON
This thesis introduces and elaborates on how to approximate option and bond price sensitivities (Black Scholes and Cox-Ingersoll-Ross (CIR) models) in a variety of ways. In general, the option price depends on time and on the underlying stock variables as well as on a number of parameters such as volatility, interest rate and strike. The rate of change of the option price with respect to these quantities is computed (in the main, the first and second derivatives).
The thesis discusses a wide range of techniques to compute sensitivities. For example, if an analytic expression for the option is known then we can differentiate the formula or we can apply the Complex Step Method (CSM) to compute the sensitivity. Another popular method is Automatic Differentiation (AD). Continuing, it is possible to discretize the PDE to compute an approximate option price as an array and from there compute option delta and gamma using divided differences or cubic splines. For other sensitivities (such as vega, for example) this approach does not work and then the Continuous Sensitivity Equation (CSE) method is used which allows us to write the sensitivity as the solution of an initial boundary value problem for a Black-Scholes type PDE. The student also discovered new research topics as the project progressed such as well-posedness of the PDEs resulting from CSE and cases in which a PDE can have multiple solutions.
The thesis is well written; the topics have been properly researched and documented. The programming language used is C++11 and the design patterns and state-of-the-art methods for PDE/FDM in the book Financial Instrument Pricing using C++, 2nd edition, 2018 (John Wiley) are applied and extended.
The main result is to show how PDE models can be used to calculate option sensitivities using a range of robust and accurate numerical methods.
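To make two of the techniques above concrete, here is a small Python sketch (the thesis itself uses C++11) that computes the Black-Scholes call delta with the Complex Step Method and delta and gamma with central divided differences. It is an illustration only, not the thesis code: the function names and step sizes are assumptions, and the normal CDF is evaluated with a truncated Maclaurin series of erf so that it accepts complex arguments, which is only accurate for moderate |d1|, |d2| (roughly below 3).

```python
import cmath
import math


def norm_cdf(x, terms=60):
    """Standard normal CDF for real or complex x (Maclaurin series of erf)."""
    z = x / math.sqrt(2.0)
    term = z    # holds (-1)^n z^(2n+1) / n!
    total = z   # running sum of term / (2n + 1)
    for n in range(1, terms):
        term *= -z * z / n
        total += term / (2 * n + 1)
    return 0.5 * (1.0 + 2.0 / math.sqrt(math.pi) * total)


def bs_call(S, K, r, sigma, T):
    """Black-Scholes call price; S may be complex for the complex step."""
    vol = sigma * math.sqrt(T)
    d1 = (cmath.log(S / K) + (r + 0.5 * sigma ** 2) * T) / vol
    d2 = d1 - vol
    return S * norm_cdf(d1) - K * math.exp(-r * T) * norm_cdf(d2)


def delta_complex_step(S, K, r, sigma, T, h=1e-20):
    """Delta = Im(C(S + ih)) / h; no subtractive cancellation, so h can be tiny."""
    return bs_call(complex(S, h), K, r, sigma, T).imag / h


def delta_gamma_central(S, K, r, sigma, T, h=0.01):
    """Delta and gamma from central divided differences in S."""
    up = bs_call(S + h, K, r, sigma, T).real
    mid = bs_call(S, K, r, sigma, T).real
    down = bs_call(S - h, K, r, sigma, T).real
    return (up - down) / (2.0 * h), (up - 2.0 * mid + down) / (h * h)


if __name__ == "__main__":
    args = (100.0, 100.0, 0.05, 0.2, 1.0)  # S, K, r, sigma, T
    print("complex-step delta:", delta_complex_step(*args))
    print("central delta, gamma:", delta_gamma_central(*args))
```

The complex step needs a single function evaluation and is not sensitive to the choice of h, whereas the divided differences trade truncation error against round-off error as h shrinks.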

(A third is concerned with the application of artificial neural networks (ANN) to price options under the Black Scholes (BS) model and the Heston stochastic volatility model. In both models the analytical solution is used to produce the training data. It will be published elsewhere.)
See below on where to download these two theses.

Dalvir Mandara Artificial Neural Networks for Black-Scholes Option Pricing and Prediction of Implied Volatility for the SABR Stochastic Volatility Model

Matt Robinson Sensitivities: A Numerical Approach

This full text etc. is also to be found here
 
Those are really well-done theses, especially considering they are just three-month master's theses. I saw lots of carefully designed analysis. Those students will benefit a lot whether they decide to work in industry or continue on to a Ph.D. I wish I could have done such a master's thesis when I started.

It's so interesting to see a comparison between neural networks and the classic numerical approach for option pricing. My personal take on ML is that it will only produce acceptable results when the data set is very large and approximation has almost no effect on the results. So IMHO the backbone of applied math is still classic numerical methods and statistics.
 

Daniel Duffy

C++ author, trainer
Thank you, Lynette
These were the students who really wanted to work with me. They want to work hard, I keep them fit and I do it with a smile :)
And all the honour goes to the students!

JohnLeM,
Thanks. Glad you enjoyed it. The thesis was a joy to me as well. There's a real story in there with a beginning, middle and end. I will use it as a template structure for future MSc theses, and even industry quants who write notes/articles can learn from the style
;)


Now, the goal was in essence a proof-of-concept (POC) to compare against the small number of unclear articles floating around on the net. Remember, it is a 3-month project and the academic approach cannot be ignored. I motivated Dalvir to take a modular approach (Figure 0.1) to demystify that whole ML/PDE discussion.
So "get it working, then get it right" and avoid premature optimisation because it is not yet on the critical path.

Some feedback
1. Used Python because OpenCV C++ was not up to the job. Lack of time to investigate in 3 months. Ideally, I would prefer to do everything in C++ from the ground up.
2. That this can be done in traditional ways does not concern me just yet. It's not the point. The goal is that AI can get similar results. Having said that, it may turn out that your approach is much better. Time will tell.
3. I agree, the maths behind the ML is somewhat flaky, as evidenced by the quality of the discussion on the "UAT" thread (for me, it is the equivalent of deus ex machina/not even wrong/Holy Hand Grenade of Antioch, whatever, LOL). CS and maths are sometimes like oil and water. These days I'm very much in nitty-gritty mode.
Maybe it is better to take more time before publishing?
4. Performance via parallel design, TBD. We did not investigate as it is not yet on the critical path.
5. I am not a fan of MLP in the sense that it is not the only kid on the block. This is OK because of modular decomposition (see Figure 0.1 again).
6. "but nobody can tell if the resulting algorithm is performant or not". Not sure if I completely agree --> cross-validation and 5-folds were used.

In a sense, I see it as the first stake in the ground for further discussion. Your points can now be addressed, one by one, i.e. requirements for the next round of the 'spiral'.
 
This is a really good start to investigating AI in finance. Look forward to seeing more results from you and your team. The thesis was written in a very clear and coherent way; I believe both professionals and students can benefit from reading it. btw: Can you indicate what the discussion on the "UAT" is? What is "UAT"?

btw: I'm also very uncomfortable about the "flaky" math behind AI. I'm reading Strang's book "learning from data" to see how he addresses it.
 

Daniel Duffy

C++ author, trainer
MSc thesis Chun Kiat Ong University of Birmingham UK, 2020.


Abstract

This is one of the first MSc theses to address the full software lifecycle of analysis (maths), design (Structured Analysis/top-down decomposition) and implementation (C++, Python, ANN, Keras, TensorFlow) for computing option prices and implied volatility under the rough Heston model. This new model resolves a number of issues surrounding the original Heston model.

We compare the solutions based on ANNs with more traditional computational solutions; based on our level playing field analysis (that is, we compare “apples with apples”), for this problem the performance of the ANN solution is 7 times slower for option pricing and 17 times slower for implied volatility modelling than traditional methods. Of course, this is only one example but it is hard evidence nonetheless.


There are few articles that discuss the application of ANNs to computational finance and the ones that have been published claim outlandish performance improvements (10,000 times faster) or claim that they can solve 100-factor partial differential equations (PDEs) with Deep Learning techniques.

The full text and PDF of the thesis can be found here.


If you have queries please don't hesitate to contact me.
[Attached figures: DFD.jpg, flow.jpg]
 

Daniel Duffy

C++ author, trainer
Some background on this talk

Hilbert Space Kernel Methods for Machine Learning: Background and Foundations



Daniel J. Duffy, dduffy@datasim.nl
Datasim Education BV www.datasim.nl
Jean-Marc Mercier, jean-marc.mercier@mpg-partners.com
MPG-Partners




Abstract

In the first part of this talk, Daniel will give an overview of RKHS (Reproducing Kernel Hilbert Space) methods and some of their applications to statistics and machine learning. They have several attractive properties such as solid mathematical foundations, computational efficiency and versatility when compared to earlier machine learning methods (for example, artificial neural networks (ANNs)). We can draw on the full power of (applied) Functional Analysis to give sharper, a priori error estimates for classification and regression problems, and we have access to any approach driven by partial differential equations. We discuss how RKHS methods subsume and improve traditional machine learning methods and we discuss their advantages for the two-sample problem for distributions and for Support Vector Estimation and Regression Estimation.

Jean-Marc will then present and discuss a Python library called codpy (curse of dimensionality - for Python), an application-oriented library supporting Support Vector Machines (SVM) and implementing RKHS methods, providing tools for machine learning, statistical learning and numerical simulations. This library has been used over the last five years for the internal algorithmic needs of his company, as the main tool and ingredient of proof-of-concept projects for institutional clients. He will also present a benchmark of this library against a more traditional neural network approach, for two important, sometimes critical, classes of applications: the first one is classification methods, illustrated with the benchmark MNIST pattern recognition problem. The second one is statistical learning, for which he will compare both approaches with methods computing conditional expectations.
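For readers new to the two-sample problem mentioned in the abstract, the following is a minimal Python sketch of a kernel-based discrepancy between two samples: the (biased) squared maximum mean discrepancy with a Gaussian kernel. It does not use codpy or reflect its API; the function names, the bandwidth of 1.0 and the test distributions are assumptions chosen purely for illustration.

```python
import math
import random


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel on the real line."""
    return math.exp(-((x - y) ** 2) / (2.0 * bandwidth ** 2))


def mean_kernel(xs, ys, bandwidth=1.0):
    """Average kernel value over all pairs (x, y)."""
    total = sum(gaussian_kernel(x, y, bandwidth) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased estimate of the squared maximum mean discrepancy."""
    return (mean_kernel(xs, xs, bandwidth)
            + mean_kernel(ys, ys, bandwidth)
            - 2.0 * mean_kernel(xs, ys, bandwidth))


if __name__ == "__main__":
    rng = random.Random(42)
    x = [rng.gauss(0.0, 1.0) for _ in range(200)]
    same = [rng.gauss(0.0, 1.0) for _ in range(200)]     # same distribution as x
    shifted = [rng.gauss(2.0, 1.0) for _ in range(200)]  # mean shifted by 2
    print("MMD^2, same distribution:   ", mmd_squared(x, same))
    print("MMD^2, shifted distribution:", mmd_squared(x, shifted))
```

A value near zero suggests the two samples could come from the same distribution, while a clearly larger value points to a difference; a full test would also need a significance threshold, which is omitted here.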



Daniel J. Duffy is a mathematician, software designer, trainer and mentor. He has been working since 1988 with C++ and its applications to computational finance, process control, Computer-Aided Design (CAD) and holography (optical technology). His company Datasim (www.datasim.nl) was the first to promote C++ and object-oriented technology in the Netherlands. He has trained thousands of practitioners and MSc/MFE degree students in the areas of requirements analysis, design, programming and advanced applied and numerical mathematics, and has been an MSc supervisor for several top US and UK universities. He is the originator of two very popular C++ courses in cooperation with www.quantnet.com and Baruch College NYC and is the author of ten books on mathematics, software design, C++ and C#. Daniel J. Duffy has BA (Mod), MSc and PhD degrees from the University of Dublin (Trinity College), all in mathematics.



Jean-Marc Mercier is head of R&D of MPG-Partners, a French, mid-sized consulting firm operating in the industrial finance sector, specialized in risk management. He is a mathematician, software developer and business analyst consultant with more than 20 years of R&D experience as a quantitative analyst, and earned a PhD in Applied Mathematics from Bordeaux University. After his PhD, Jean-Marc first started a career as a public researcher (European Research Program), before turning to private R&D in mathematical finance. In 2005 he also started a research program of his own, concerning the curse of dimensionality. This led him to develop a framework using Reproducing Kernel Hilbert Space methods (Support Vector Machines) that is used today in his company as the foundation of several applications in mathematical finance.
 