What exactly is AAD (Adjoint Algorithmic Differentiation)?

Quasar Chunawala · 6/23/18

Hi all,

Today, I visited the NAG website and was cursorily glancing around. On their website, I saw a link on Algorithmic differentiation. It says -

Algorithmic Differentiation (AD) is a Mathematical/Computer Science technique for computing accurate sensitivities quickly. For many models, Adjoint AD (AAD) can compute sensitivities 10s, 100s or even 1000s of times faster than finite differences. NAG are pioneers in providing AD technologies.

It sounds cool! Out of curiosity, what exactly is AAD? If someone here could point me to any links/books, that would be very useful; I would like to read up on it.

Daniel Duffy · 6/24/18

It computes derivatives for Greeks, Optimisation, Gradient Descent, Neural Networks etc. etc.

Neural Networks

This is a good intro by Cristian Homescu

Adjoints and Automatic (Algorithmic) Differentiation in Computational Finance by Cristian Homescu :: SSRN

The best way to start is to do AD 'by hand', e.g.

f(x,y) = exp(x^2 + y^2)

df/dx = 2x f(x,y)
df/dy = 2y f(x,y)
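
As a rough illustration of what 'by hand' can mean in code, here is a minimal sketch of the same example differentiated in adjoint (reverse) order: one forward sweep that keeps the intermediates, then one backward sweep that produces both partial derivatives. The names ValueAndGradient and ExpSumSquares are made up for this sketch and do not come from any library.

```cpp
#include <cmath>

// Hypothetical helper type: result of one forward + one backward sweep.
struct ValueAndGradient
{
    double f;     // f(x, y)
    double dfdx;  // df/dx
    double dfdy;  // df/dy
};

// f(x, y) = exp(x^2 + y^2), differentiated 'by hand' in adjoint (reverse) order.
inline ValueAndGradient ExpSumSquares(double x, double y)
{
    // Forward sweep: evaluate and keep the intermediate results.
    double a = x * x;
    double b = y * y;
    double c = a + b;
    double f = std::exp(c);

    // Backward sweep: propagate adjoints (df/d<variable>) from output to inputs.
    double f_bar = 1.0;              // df/df
    double c_bar = f_bar * f;        // d exp(c)/dc = exp(c) = f
    double a_bar = c_bar;            // dc/da = 1
    double b_bar = c_bar;            // dc/db = 1
    double x_bar = a_bar * 2.0 * x;  // da/dx = 2x
    double y_bar = b_bar * 2.0 * y;  // db/dy = 2y

    return { f, x_bar, y_bar };
}
```

The backward sweep reproduces df/dx = 2x f and df/dy = 2y f from a single pass, which is the pattern an AD library automates.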

After that the magic disappears and then just use a C++ or C# AD library.

There is also semi-automatic differentiation that not many people know about.
http://mdolab.engin.umich.edu/sites/default/files/Martins2003CSD.pdf

See book by Nocedal and Wright.

Daniel Duffy · 9/23/18

Here is a closely related method ("semi-automatic differentiation" using complex analysis)

Computing derivatives without the pain of 1) differentiation, 2) catastrophic round-off errors. Sample code.
The scalar (almost a 1-liner). I leave the vector case as an exercise.

https://pdfs.semanticscholar.org/3de7/e8ae217a4214507b9abdac66503f057aaae9.pdf

C++:

// TestComplexStep.cpp
//
// Complex-step method to compute approximate derivatives.
// Example is scalar-valued function of a scalar argument.
//
// https://pdfs.semanticscholar.org/3de7/e8ae217a4214507b9abdac66503f057aaae9.pdf
//
// http://mdolab.engin.umich.edu/sites/default/files/Martins2003CSD.pdf
//
// (C) Datasim Education BV 2018
//

#include <functional>
#include <complex>
#include <iostream>
#include <iomanip>
#include <cmath>

// Notation and function spaces
using value_type = double;

template <typename T>
    using FunctionType = std::function < T(const T& c)>;
using CFunctionType = FunctionType<std::complex<value_type>>;


// Test case from Squire&Trapp 1998
template <typename T> T func(const T& t)
{
    T n1 = std::exp(t);
    T d1 = std::sin(t);
    T d2 = std::cos(t);

    return n1 / (d1*d1*d1 + d2*d2*d2);
}

template <typename T> T func2(const T& t)
{ // Derivative of e^t, sanity check

    return std::exp(std::pow(t,1));
//    return std::exp(std::pow(t, 5));

}
 
value_type Derivative(const CFunctionType& f, value_type x, value_type h)
{ // df/dx at x using the Complex step method

    std::complex<value_type> z(x, h); // x + ih, i = sqrt(-1)
    return std::imag(f(z)) / h;
}

int main()
{
    // Squire Trapp
    double x = 1.5;    double h = 0.1;
    do
    {
        std::cout << std::setprecision(12) << Derivative(func<std::complex<value_type>>, x, h) << '\n';
        h *= 0.1;

    } while (h > 1.0e-300);

    // Exponential function (101 sanity check)
    x = 5.0;
    h = 1.0e-10;
    std::cout << "Exponential 1: " << std::setprecision(12) << Derivative(func2<std::complex<value_type>>, x, h) << '\n';

    return 0;
}
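
For the vector case left as an exercise above, one possible sketch (not the original author's solution) is to perturb one coordinate at a time along the imaginary axis. The names CVectorFunction, Gradient and ExpSumSquaresC, and the default step h = 1e-20, are assumptions made for this illustration.

```cpp
#include <complex>
#include <cstddef>
#include <functional>
#include <vector>

using Complex = std::complex<double>;

// Hypothetical alias: scalar-valued function of a (complexified) vector argument.
using CVectorFunction = std::function<Complex(const std::vector<Complex>&)>;

// Gradient of f at x by the complex-step method, one coordinate at a time.
// Cost: one function evaluation per input, i.e. linear in the dimension.
inline std::vector<double> Gradient(const CVectorFunction& f,
                                    const std::vector<double>& x,
                                    double h = 1.0e-20)
{
    std::vector<Complex> z(x.begin(), x.end());  // real parts x, imaginary parts 0
    std::vector<double> grad(x.size());

    for (std::size_t j = 0; j < x.size(); ++j)
    {
        z[j] = Complex(x[j], h);         // x_j + ih
        grad[j] = std::imag(f(z)) / h;   // df/dx_j
        z[j] = Complex(x[j], 0.0);       // restore before the next coordinate
    }
    return grad;
}

// Example function: f(x, y) = exp(x^2 + y^2)
inline Complex ExpSumSquaresC(const std::vector<Complex>& v)
{
    return std::exp(v[0] * v[0] + v[1] * v[1]);
}
```

A call such as Gradient(ExpSumSquaresC, {1.0, 0.5}) should then approximate (2x f, 2y f) at that point. Because there is no subtraction of nearly equal numbers, h can be taken very small, but the work still grows with the number of inputs.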

antoinesavine · 12/6/18

At the risk of shameless advertising

I have just published a book with Wiley on AAD in finance, which you may find helpful. The book is found on Wiley's page:

Modern Computational Finance: AAD and Parallel Simulations

in hardcover and ebook format, and on Amazon:

https://www.amazon.com/gp/product/1119539455

in hardcover format.

I also posted a short preview, including Leif Andersen's preface, on SSRN:

Modern Computational Finance: AAD and Parallel Simulations by Antoine Savine :: SSRN

The book addresses AAD in the context of (parallel) Monte-Carlo simulations in finance and deals with professional C++ code. If you want to start with something lighter (and free), you may want to consult the slides of my half day workshop, which I made available on my GitHub repo:

asavine/CompFinance

Here, I show how to implement AAD with simplistic code and address it from the point of view of machine learning (where it is called back-propagation) and finance.

I hope it helps.

Antoine Savine

Daniel Duffy · 12/7/18

Congratulations!

I am having difficulties printing the pdf of the short review: it either stops printing or gives NAN at page 24.

antoinesavine · 12/7/18

Hello Daniel, thank you. We haven't met but I learned Boost in your books many years ago, and it is nice to talk to you.
For some reason, all my papers on SSRN have vanished today and I get a 404 error when trying to access them.
I alerted SSRN and hopefully this is resolved soon.
In the meantime, I posted the preview on ResearchGate:
https://www.researchgate.net/public...tational_Finance_AAD_and_Parallel_Simulations
Apologies for the inconvenience.
Kind regards,
Antoine

antoinesavine · 12/7/18

SSRN fixed

Daniel Duffy · 12/7/18

asavine/CompFinance

Hi Antoine,
Thanks. It's the above pdf slide show that I wish to print. I can view it OK.

I'll have a read of the contents.

antoinesavine · 12/7/18

Ah yes, GitHub is not great at previewing pdfs. Best is to download it and open the downloaded pdf in a pdf reader or browser, then print if you must.

Daniel Duffy · 12/7/18

Would it be possible to post

Intro2AADinMachineLearningAndFinance.pdf

here?

antoinesavine · 12/7/18

I am concerned to post in one place, otherwise I will end up with different versions all over the internet. Are you having any trouble downloading the pdf from GitHub?

Daniel Duffy · 12/7/18

antoinesavine said:
I am concerned to post in one place, otherwise I will end up with different versions all over the internet. Are you having any trouble downloading the pdf from GitHub?

I can download it and read on screen no problem!
I just can't print it.

antoinesavine · 12/7/18

I will try to help you print it. Please send me an email at antoine@asavine.com so we can plan a chat over the weekend.

Daniel Duffy · 12/7/18

Problem solved. Thanks very much.

antoinesavine · 12/14/19

I presented AAD at Bloomberg Tech Talks in November, where I was asked to explain adjoint differentiation, backpropagation and how it all works in finance in just 15 minutes. My talk was recorded and posted on YouTube; hopefully it helps people get started with these groundbreaking technologies.

Daniel Duffy · 12/14/19

Antoine,
You might be interested in two excellent theses on ML and finance that discuss a number of related methods.

Blogs - MSc Theses on Machine Learning and Computational Finance :: Datasim

MSc Theses at University of Birmingham 2019 MSc Mathematical Finance Programme Supervisor: Dr. Daniel J. Duffy, dduffy@datasim.nl Course Director: Dr. Colin Rowat,..

www.datasim.nl

Daniel Duffy · 12/14/19

Boost C++ library has autodiff for AD as well (header-only).

https://www.boost.org/doc/libs/master/libs/math/doc/html/math_toolkit/autodiff.html

antoinesavine · 12/14/19

Thank you Daniel, I will have a look at your first link.
Boost autodiff is not AAD. It does not implement adjoint differentiation (also called reverse-mode AD), with its magic constant-time speed, but the so-called forward-mode AD (here, the A is for automatic, not adjoint), which is trivial to understand and implement but rather useless, since it computes differentials in time linear in the number of inputs, just like bumping. I submitted an implementation of proper AAD to Boost a few years ago but never heard back. Guess they are not interested in the technology that powers deep learning, among many other things...
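
To make the linear-time point concrete, here is a minimal forward-mode sketch with dual numbers. It is not Boost's implementation; Dual, f, DfDx and DfDy are names invented for this illustration. Each pass seeds one input, so n inputs need n passes.

```cpp
#include <cmath>

// Minimal dual number: a value plus its derivative w.r.t. ONE seeded input.
struct Dual
{
    double val;  // function value
    double dot;  // tangent: derivative with respect to the seeded input
};

inline Dual operator+(const Dual& a, const Dual& b)
{
    return { a.val + b.val, a.dot + b.dot };
}

inline Dual operator*(const Dual& a, const Dual& b)
{
    return { a.val * b.val, a.dot * b.val + a.val * b.dot };  // product rule
}

inline Dual exp(const Dual& a)
{
    double e = std::exp(a.val);
    return { e, e * a.dot };  // chain rule
}

// f(x, y) = exp(x^2 + y^2), written once for a type supporting +, * and exp.
template <typename T> T f(const T& x, const T& y)
{
    return exp(x * x + y * y);
}

// One full forward pass per input: seed dot = 1 on the input of interest.
inline double DfDx(double x, double y) { return f(Dual{ x, 1.0 }, Dual{ y, 0.0 }).dot; }
inline double DfDy(double x, double y) { return f(Dual{ x, 0.0 }, Dual{ y, 1.0 }).dot; }
```

Getting the whole gradient this way costs one evaluation of f per input, which is the same scaling as bumping, only without the truncation error.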

Daniel Duffy · 12/16/19

I googled "AAD" and it did not give any hits. My feeling is AD has been promoted to emphasise the adjoint (reverse) aspects of the 2 modes of AD. I am confused by the rationale.
Making up new names confuses people no end (it's rampant in ML, those guys make up names for things that already exist.)

1. forward
2. reverse mode

See

https://arxiv.org/pdf/1107.1831.pdf

For 2, it is a graph model, and for large problems it will demand yuge memory storage, as the entire graph must be in memory. There are other ways, e.g. see the link, Matt Robinson's thesis and McGhee's papers (the one with the infamous "10,000 faster" claim), where splines are used to compute sensitivities even though it is not accurate.
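
The memory point can be seen in a toy reverse-mode sketch: every operation is recorded on a tape, and the backward sweep needs the whole tape. This is a deliberately simplistic illustration, not any library's design; Node, Tape, Var and MakeInput are hypothetical names.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// One recorded operation: up to two arguments and their local partial derivatives.
struct Node
{
    std::size_t arg[2];
    double partial[2];
};

// Hypothetical minimal tape: it grows by one node per input and per operation.
struct Tape
{
    std::vector<Node> nodes;

    std::size_t Push(std::size_t a0, double p0, std::size_t a1, double p1)
    {
        Node n{ { a0, a1 }, { p0, p1 } };
        nodes.push_back(n);
        return nodes.size() - 1;
    }

    // Inputs have no arguments; zero partials make them inert in the sweep.
    std::size_t Input() { return Push(0, 0.0, 0, 0.0); }

    // Backward sweep: adjoints of every node with respect to 'output'.
    std::vector<double> Adjoints(std::size_t output) const
    {
        std::vector<double> bar(nodes.size(), 0.0);
        bar[output] = 1.0;
        for (std::size_t i = output + 1; i-- > 0;)
        {
            bar[nodes[i].arg[0]] += nodes[i].partial[0] * bar[i];
            bar[nodes[i].arg[1]] += nodes[i].partial[1] * bar[i];
        }
        return bar;
    }
};

// Active variable: a value plus its position on the tape.
struct Var
{
    Tape* tape;
    std::size_t index;
    double val;
};

inline Var MakeInput(Tape& t, double v) { return { &t, t.Input(), v }; }

inline Var operator+(const Var& a, const Var& b)
{
    return { a.tape, a.tape->Push(a.index, 1.0, b.index, 1.0), a.val + b.val };
}

inline Var operator*(const Var& a, const Var& b)
{
    return { a.tape, a.tape->Push(a.index, b.val, b.index, a.val), a.val * b.val };
}

inline Var exp(const Var& a)
{
    double e = std::exp(a.val);
    return { a.tape, a.tape->Push(a.index, e, a.index, 0.0), e };
}
```

Recording exp(x * x + y * y) for two inputs puts six nodes on the tape, and a single call to Adjoints then yields the derivatives with respect to both inputs. For a long simulation the tape grows with the number of operations, which is where the storage concern comes from.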

We are interested in all the different ways to do sensitivities and to compare/contrast them, as in Robinson's thesis, based on this link

Computing Gradients and Derivatives of Functions in Finance, Optimisation and Machine Learning.

How many Ways are there to compute Derivatives of a Function? Differentiation, Sensitivities (greeks), gradients, Jacobians etc.

www.linkedin.com

Finally, in fairness to Boost, it is the best C++ library on the planet. They specifically state that they do AD forward mode. They haven't done adjoint mode, yet. They built a car, they did not say it could fly.
The Boost maths guys were quite accessible whenever I approached them.
