
What exactly is AAD (Adjoint Algorithmic Differentiation)?

Hi all,

Today, I visited the NAG website and was cursorily glancing around. On their website, I saw a link on Algorithmic differentiation. It says -

Algorithmic Differentiation (AD) is a Mathematical/Computer Science technique for computing accurate sensitivities quickly. For many models, Adjoint AD (AAD) can compute sensitivities 10s, 100s or even 1000s of times faster than finite differences. NAG are pioneers in providing AD technologies.

It sounds cool! Out of inquisitiveness, what exactly is AAD? If someone here can point me to any links/books - I would like to read up on it - would be very useful.

Daniel Duffy

C++ author, trainer
It computes derivatives for Greeks, Optimisation, Gradient Descent, Neural Networks, etc.


This is a good intro by Cristian Homescu

Adjoints and Automatic (Algorithmic) Differentiation in Computational Finance by Cristian Homescu :: SSRN

The best way to start is to do AD 'by hand', e.g.

f(x,y) = exp(x^2 + y^2)

df/dx = 2x f(x,y)
df/dy = 2y f(x,y)

After that the magic disappears, and then you can just use a C++ or C# AD library.

There is also semi-automatic differentiation that not many people know about.

See the book by Nocedal and Wright.

Daniel Duffy

C++ author, trainer
Here is a closely related method ("semi-automatic differentiation" using complex analysis)

Computing derivatives without the pain of 1) differentiation and 2) catastrophic round-off errors. Sample code below.
The scalar case is almost a one-liner; I leave the vector case as an exercise.
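For readers who have not seen the complex-step trick before, it rests on a one-line Taylor expansion. For a real-analytic f and a small real step h,

```latex
f(x + ih) = f(x) + ih\,f'(x) - \frac{h^2}{2}f''(x) - i\,\frac{h^3}{6}f'''(x) + \cdots
\qquad\Longrightarrow\qquad
\frac{\operatorname{Im} f(x + ih)}{h} = f'(x) - \frac{h^2}{6}f'''(x) + O(h^4)
```

No difference of nearly equal numbers ever appears, so there is no catastrophic cancellation, which is why h can be driven down to something like 1.0e-300 without destroying accuracy.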


// TestComplexStep.cpp
// Complex-step method to compute approximate derivatives.
// Example is a scalar-valued function of a scalar argument.
// https://pdfs.semanticscholar.org/3de7/e8ae217a4214507b9abdac66503f057aaae9.pdf
// http://mdolab.engin.umich.edu/sites/default/files/Martins2003CSD.pdf
// (C) Datasim Education BV 2018

#include <functional>
#include <complex>
#include <iostream>
#include <iomanip>
#include <cmath>

// Notation and function spaces
using value_type = double;

template <typename T>
    using FunctionType = std::function<T(const T& c)>;
using CFunctionType = FunctionType<std::complex<value_type>>;

// Test case from Squire & Trapp 1998
template <typename T> T func(const T& t)
{
    T n1 = std::exp(t);
    T d1 = std::sin(t);
    T d2 = std::cos(t);

    return n1 / (d1*d1*d1 + d2*d2*d2);
}

template <typename T> T func2(const T& t)
{ // Derivative of e^t, sanity check

    return std::exp(std::pow(t, 1));
//    return std::exp(std::pow(t, 5));
}

value_type Derivative(const CFunctionType& f, value_type x, value_type h)
{ // df/dx at x using the complex-step method

    std::complex<value_type> z(x, h); // x + ih, i = sqrt(-1)
    return std::imag(f(z)) / h;
}

int main()
{
    // Squire & Trapp
    double x = 1.5;
    double h = 0.1;
    do
    {
        std::cout << std::setprecision(12) << Derivative(func<std::complex<value_type>>, x, h) << '\n';
        h *= 0.1;
    } while (h > 1.0e-300);

    // Exponential function (101 sanity check)
    x = 5.0;
    h = 1.0e-10;
    std::cout << "Exponential 1: " << std::setprecision(12) << Derivative(func2<std::complex<value_type>>, x, h) << '\n';

    return 0;
}

At the risk of shameless advertising :) I have just published a book with Wiley on AAD in finance, which you may find helpful. The book can be found on Wiley's page:

Modern Computational Finance: AAD and Parallel Simulations

in hardcover and ebook format, and on Amazon:


in hardcover format.

I also posted a short preview, including Leif Andersen's preface, on SSRN:

Modern Computational Finance: AAD and Parallel Simulations by Antoine Savine :: SSRN

The book addresses AAD in the context of (parallel) Monte-Carlo simulations in finance and deals with professional C++ code. If you want to start with something lighter (and free), you may want to consult the slides of my half-day workshop, which I made available on my GitHub repo:


Here, I show how to implement AAD with simplistic code and address it from the point of view of machine learning (where it is called back-propagation) and finance.

I hope it helps.

Antoine Savine

Daniel Duffy

C++ author, trainer

I am having difficulty printing the pdf of the short preview... it either stops printing or gives NaN at page 24.
Hello Daniel, thank you. We haven't met but I learned Boost in your books many years ago, and it is nice to talk to you.
For some reason, all my papers on SSRN have vanished today and I get a 404 error when trying to access them.
I alerted SSRN and hopefully this is resolved soon.
In the meantime, I posted the preview on ResearchGate:
Apologies for the inconvenience.
Kind regards,
Ah yes GitHub is not great at previewing pdfs. Best is download it and open the downloaded pdf in a pdf reader or browser, then print if you must :)
I prefer to post it in one place; otherwise I will end up with different versions all over the internet. Are you having any trouble downloading the pdf from GitHub?
I presented AAD on Bloomberg Tech Talks in November, where I was asked to explain adjoint differentiation, backpropagation and how it all works in finance in just 15 min. My talk was recorded and posted on YouTube; hopefully it helps people get started with these groundbreaking technologies.
Thank you Daniel, I will have a look at your first link.
Boost autodiff is not AAD. It does not implement adjoint differentiation (also called reverse-mode AD), with its magic constant-time speed, but the so-called forward-mode AD (here, the A is for automatic, not adjoint), which is trivial to understand and implement but rather useless, since it computes differentials in linear time, just like bumping. I submitted an implementation of proper adjoint AD to Boost a few years ago but never heard back. I guess they are not interested in the technology that powers deep learning, among many other things...

Daniel Duffy

C++ author, trainer
I googled "AAD" and it did not give any hits. My feeling is that AD has been rebranded to emphasise the adjoint (reverse) aspect of the two modes of AD. I am confused by the rationale.
Making up new names confuses people no end (it's rampant in ML; those guys make up names for things that already exist).

1. forward mode
2. reverse mode


For 2, it is a graph model, and for large problems it will demand huge memory storage, as the entire graph must be in memory. There are other ways, e.g. see the link, Matt Robinson's thesis and McGhee's papers (the one with the infamous "10,000 times faster" claim), where splines are used to compute sensitivities, even though it is not accurate.

We are interested in all the different ways to compute sensitivities and to compare/contrast them, and in Robinson's thesis based on this link

Finally, in fairness to Boost, it is the best C++ on the planet. They specifically state that they do forward-mode AD. They haven't done adjoint mode, yet. They built a car; they did not say it could fly.
The Boost maths guys were quite accessible whenever I approached them.