
Machine Learning, Deep Learning and AI: Perspectives



Renowned software engineer Grady Booch, one of the trio that developed the Unified Modeling Language, also had concerns. He described Galactica as “little more than statistical nonsense at scale.”

Prof. Emily M. Bender, director of the University of Washington’s Computational Linguistics Laboratory, said it's "not surprising in the least" that Galactica generates text that is both “fluent and … wrong.”
 

He Spent $140 Billion on AI With Little to Show. Now He Is Trying Again.

Billionaire Masayoshi Son said he would make SoftBank ‘the investment company for the AI revolution,’ but he missed out on the most recent frenzy



Cheap at half the price..
 
About the Archive
This is a digitized version of an article from The Times’s print archive, before the start of online publication in 1996. To preserve these articles as they originally appeared, The Times does not alter, edit or update them.
Occasionally the digitization process introduces transcription errors or other problems; we are continuing to work to improve these archived versions.
COMPUTER scientists, taking clues from how the brain works, are developing new kinds of computers that seem to have the uncanny ability to learn by themselves.
The new systems offer hope of being able to perform tasks such as recognizing objects and understanding speech that have so far stymied conventional computers. Moreover, with the ability to learn by themselves, such machines would not require the laborious programming of rules and procedures that is now required to allow computers to work.
The new computers are called neural networks because they contain units that function roughly like the intricate network of neurons in the brain. Early experimental systems, some of them eerily human-like, are inspiring predictions of amazing advances.
''I'm convinced that this will be the next large-scale computer revolution,'' said Pentti Kanerva of the Research Institute for Advanced Computer Science in Mountain View, Calif., run by a consortium of universities. But he and other experts note that the technology is still in its infancy and there are many obstacles to surmount.

Among the recent developments are these:
* At Johns Hopkins University Terrence Sejnowski developed a program that teaches itself to read out loud. The system is given no rules about how letters are to be pronounced; its errors are merely corrected. At first, the talk is mere gibberish. After a while it begins to utter some baby-like sounds as it learns to distinguish between consonants and vowels. After a night of computing, it reads with few mistakes.
* At Avco Financial Services in Irvine, Calif., a neural network learned how to evaluate loan applications after being fed data on 10,000 past loans. One test showed that, had the neural system been used in place of the company's existing computerized evaluation system, it would have increased profits 27 percent.
* At Los Alamos National Laboratory, researchers used a neural network to predict whether particular DNA sequences represented genetic codes for the manufacture of proteins. The network seemed to work with greater than 80 percent accuracy, better than conventional statistical techniques.
In each of these cases, a computer was ''trained'' with a set of tasks or problems and a set of correct answers. As it completed each task, it compared its results with the correct answers. When it was wrong, it altered its own program and tried again. Gradually, it ''learned'' the right approach.
Up to a point, the more the computer is ''trained'' the more accurate it gets. But no one can tell at the outset how long the training may take.


Lured by the promise of this technology, universities and electronics companies like TRW, I.B.M. and A.T.&T. are pursuing work on such computer systems. And more than two dozen small neural network companies have been formed, mainly in the last two years, according to Intelligence, a New York newsletter that follows the field.
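As an aside, the "trained with a set of tasks and correct answers ... when it was wrong, it altered its own program and tried again" loop the article describes is essentially the classic perceptron update. Here is a minimal, self-contained C++ sketch of that idea; the tiny AND data set, learning rate and epoch count are made up purely for illustration and are not from the article.

C++:
#include <array>
#include <cstddef>
#include <iostream>
#include <vector>

int main()
{
    // Toy training set: inputs and the "correct answers" (logical AND)
    std::vector<std::array<double, 2>> inputs  = { {0.0, 0.0}, {0.0, 1.0}, {1.0, 0.0}, {1.0, 1.0} };
    std::vector<double>                targets = {    0.0,        0.0,        0.0,        1.0     };

    std::array<double, 2> w = { 0.0, 0.0 };   // the weights are "its own program"
    double bias = 0.0;
    const double rate = 0.1;                  // how strongly an error corrects the weights

    for (int epoch = 0; epoch < 100; ++epoch) // repeated passes over the training set
    {
        for (std::size_t i = 0; i < inputs.size(); ++i)
        {
            double sum    = w[0] * inputs[i][0] + w[1] * inputs[i][1] + bias;
            double output = (sum > 0.0) ? 1.0 : 0.0;   // the machine's attempt
            double error  = targets[i] - output;       // compare with the correct answer

            w[0] += rate * error * inputs[i][0];       // when wrong, alter the "program"
            w[1] += rate * error * inputs[i][1];
            bias += rate * error;
        }
    }

    for (std::size_t i = 0; i < inputs.size(); ++i)    // the trained result
    {
        double sum = w[0] * inputs[i][0] + w[1] * inputs[i][1] + bias;
        std::cout << inputs[i][0] << " AND " << inputs[i][1]
                  << " -> " << ((sum > 0.0) ? 1 : 0) << '\n';
    }
}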
 
It seems like the individual quoted is expressing a perspective on the practicality and relevance of deep learning, particularly in the context of machine learning and data science within the industry.
 
Not sure what you are saying. Who is the "individual" in this case?
 
I'm 99% sure its AI generated, look at the other posts from that account

Edit: I ran some of the posts and the pfp through AI checkers and they come back as AI generated content detected
 
??
It is from the NYT, yes?? (paywall)


They published similar hype in 1958 (and that's a fact)

// Electronic 'Brain' Teaches Itself (Published 1958)

nytimes.com/1958/07/13/archives/electronic-brain-teaches-itself.html
 
BTW, 1987 was the dawn of a new AI revolution. It fizzled out. I dabbled a bit in Prolog (must be good if the Japanese were using it ...), but no. I saw no future ... a lot of hype, unfortunately.
Microsoft, Oracle won.
And C++.
 
JP Morgan pulls plug on deep learning model for FX algos
US bank turns to less complex models that are easier to explain to clients




 
Lovely article! It's funny because I'm making heavy use of all things deep learning and neural networks at school and at work. It's only for binary classification so far, and we boost and bag everything. If you can build things right on the metal in something like C, it's fast and somewhat useful as a starting point of the analysis.

I think for folks who just pull in a Python library and go heavy into algo trading or something more complicated, where nobody really understands what is going on under the hood, it is going to be way too slow and unpredictable. That's a no-no when real loot is on the line!

Prof Duffy, can't wait to read a future book where you address this field from a rigorous math angle! 🥸🥩🍺🫡
 
My prediction in this area is that C++20 will be needed.
And linear algebra is not enough.

(the dolphins told me!)
 
Here's a work in progress


C++:
#ifndef kernels_hpp
#define kernels_hpp

#include <vector>
#include <functional>
#include <type_traits>
#include <cmath>
#include "NestedMatrix.hpp"

// K:R(n) X R(n) -> R (or complex)

template <typename T>
    using VectorType = std::vector<T>;

template <typename T>
    using MatrixType = NestedMatrix<T>;

// Kernels with vector arguments
template <typename T>
    using KernelType = std::function<T(const VectorType<T>& a, const VectorType<T>& b)>;

template <typename T1, typename T2, template <typename> class S >
    using KernelTypeGeneral = std::function<T1 (const S<T2>& a, const S<T2>& b)>;

template <typename T1, typename T2>
    using KernelVectorType = KernelTypeGeneral<T1, T2, VectorType>;

template <typename T1, typename T2>
    using KernelMatrixType = KernelTypeGeneral<T1, T2, MatrixType>;

// Evaluate a kernel function k(x,y)
// Uses concepts + template template parameters
template <typename T, template <typename> class K >
    concept KernelClient = requires (K<T> k, const VectorType<T>& a, const VectorType<T>& b)
{ // Evaluate kernel at two vector points a and b
    k(a, b);
};

// Evaluate a similarity function for kernels
template <typename T, template <typename> class K >
    concept KernelClientII = requires (K<T> k, MatrixType<T>& p, MatrixType<T>& q)
{ // Evaluate similarity function for two point sets p and q

    k.similarity(p, q);
};

// Evaluate a distance function for kernels
template <typename T, template <typename> class K >
    concept KernelDistance = requires (K<T> k, MatrixType<T>& p, MatrixType<T>& q)
{ // Evaluate distance function for two point sets p and q

        k.distance(p, q);
};

template <typename T, template <typename> class K>
    concept KernelProtocol = KernelClient<T, K> && KernelClientII<T, K> && KernelDistance<T, K>;

// Kernel classes
template <typename T, typename D>
    class BaseKernel
{ // Using CRTP and Template Method Pattern
private:
    
public:
    BaseKernel() {}

    // K(x,y) for derived classes

    T operator() (const VectorType<T>& x, const VectorType<T>& y)
    {
        return static_cast<D*>(this)->kernel(x,y);
    }
    
    T kernel(const VectorType<T>& x, const VectorType<T>& y)
    {
        return (*this)(x, y);
    }

    
    NestedMatrix<T> kernelMatrix(const NestedMatrix<T>& x, const NestedMatrix<T>& y)
    { // x[0], x[1], ...,x[n] and y[0], y[1], ...,y[m] are vectors of size D

        std::size_t n = x.size1();
        std::size_t m = y.size1();

        NestedMatrix<T> mat(n, m);
        for (std::size_t i = 0; i < n; ++i)
        {
            for (std::size_t j = 0; j < m; ++j)
            {
                mat(i, j) = kernel(x.row(i), y.row(j));
            }
        }

        return mat;
    }

    NestedMatrix<T> distanceMatrix(const NestedMatrix<T>& x, const NestedMatrix<T>& y)
    { // x[0], x[1], ...,x[n] and y[0], y[1], ...,y[m] are vectors of size D

        std::size_t n = x.size1();
        std::size_t m = y.size1();

        NestedMatrix<T> mat(n, m);
        for (std::size_t i = 0; i < n; ++i)
        {
            for (std::size_t j = 0; j < m; ++j)
            {
                // Pairwise kernel-induced distances (cf. kernelMatrix above)
                mat(i, j) = distance(x.row(i), y.row(j));
            }
        }

        return mat;
    }

    T similarity (const MatrixType<T>& p, const MatrixType<T>& q)
    { // Sum of kernel evaluations over all pairs of rows of p and q
      // (const references so callers holding const matrices can use it)

        T result = 0.0;

        for (std::size_t i = 0; i < p.size1(); ++i)
        {
            for (std::size_t j = 0; j < q.size1(); ++j)
            {
                // Kick down to the derived class
                result += static_cast<D*>(this)->kernel(p.row(i), q.row(j));
            }
        }

        return result;
    }

    T distance (const MatrixType<T>& p, const MatrixType<T>& q)
    { // aka sqrt(discrepancy error) between two point sets

        return std::sqrt(similarity(p, p) + similarity(q, q) - 2.0 * similarity(p, q));
    }

    T distance(const VectorType<T>& p, const VectorType<T>& q)
    { // Kernel-induced distance between two points

        return std::sqrt(kernel(p, p) + kernel(q, q) - 2.0 * kernel(p, q));
    }


};
    template <typename T>
        class LinearKernel : public BaseKernel<T, LinearKernel<T>>
    {
    private:
        T c;
    public:
        LinearKernel(T offset = 0.0) : c(offset) {}

        // K(x,y)
        T operator() (const VectorType<T>& x, const VectorType<T>& y)
        {

            T result = x[0] * y[0] + c;

            for (std::size_t i = 1; i < x.size(); ++i)
            {
                result += x[i] * y[i];
            }

            return result;
        }

        T kernel(const VectorType<T>& x, const VectorType<T>& y)
        {
            return (*this)(x, y);
        }
        

    };

    template <typename T>
        class RELUKernel : public BaseKernel<T, RELUKernel<T>>
    {
    private:
            T c;
    public:
            RELUKernel(T offset = 0.0) : c(offset) {}

            // K(x,y)
            T operator() (const VectorType<T>& x, const VectorType<T>& y)
            {

                T result = x[0] * y[0] + c;

                for (std::size_t i = 1; i < x.size(); ++i)
                {
                    result += std::abs(x[i] -  y[i]);
                }

                return 1.0 - result;
            }

            T kernel(const VectorType<T>& x, const VectorType<T>& y)
            {
                return (*this)(x, y);
            }


    };

    template <typename T>
        class MaternKernel : public BaseKernel<T, MaternKernel<T>>
    {
    private:
            T c;
    public:
            MaternKernel(T offset = 0.0) : c(offset) {}

            // K(x,y)
            T operator() (const VectorType<T>& x, const VectorType<T>& y)
            {

                T result = x[0] * y[0] + c;

                for (std::size_t i = 1; i < x.size(); ++i)
                {
                    result += std::abs(x[i] - y[i]);
                }

                return 1.0 - result;
            }

            T kernel(const VectorType<T>& x, const VectorType<T>& y)
            {
                return (*this)(x, y);
            }


    };

template <typename T>
    class GaussianKernel : public BaseKernel<T, GaussianKernel<T>>
{
private:
        T sig;
public:
        GaussianKernel(T sigma) : sig(sigma) {}

        // K(x,y)
        T kernel (const VectorType<T>& x, const VectorType<T>& y) const
        {

            T diff = 0.0;
            T tmp;

            for (std::size_t i = 0; i < x.size(); ++i) // start at 0: include every coordinate
            {
                tmp = x[i] - y[i];
                diff += tmp*tmp;
            }

            return std::exp(-diff / (2.0 * sig * sig));
        }

};

template <typename T>
    class LaplacianKernel : public BaseKernel<T, LaplacianKernel<T>>
{
private:
        T sig;
public:
        LaplacianKernel(T sigma) : sig(sigma) {}

        // K(x,y)
        T kernel(const VectorType<T>& x, const VectorType<T>& y) const
        {

            T tmp = 0.0;

            for (std::size_t i = 0; i < x.size(); ++i) // start at 0: include every coordinate
            {
                tmp += std::abs(x[i] - y[i]);
            }

            return std::exp(-tmp / sig);
        }

};

template <typename T>
    class MLPKernel : public BaseKernel<T, MLPKernel<T>>
{ //A multilayer perceptron(MLP) is a class of feedforward artificial neural network(ANN).
private:
        // K(x,y) = tanh(a x*y + c)
        T c;
        T a;
public:
        MLPKernel(T intercept, T slope) : a(slope), c (intercept) {}

        // K(x,y) = tanh(a x*y + c)
        T kernel (const VectorType<T>& x, const VectorType<T>& y) const
        {

            T sum = 0.0;
        
            for (std::size_t i = 0; i < x.size(); ++i)
            {
                sum += a * x[i] * y[i];   // offset c is added once, below
            }

            return std::tanh(sum + c);
        }
};

template <typename T>
    class PolynomialKernel : public BaseKernel<T, PolynomialKernel<T>>
{ // Polynomial kernel of degree d
private:
        // K(x,y) = (a x*y + c)^d
        T c;
        T a;
        T d;
public:
        PolynomialKernel(T slope, T intercept, T degree) : a(slope), c(intercept), d(degree) {}

        // K(x,y) = (a x*y + c)^d
        T kernel (const VectorType<T>& x, const VectorType<T>& y) const
        {

            T sum = 0.0;

            for (std::size_t i = 0; i < x.size(); ++i)
            {
                sum += a * x[i] * y[i];   // offset c is added once, below
            }

            return std::pow(sum + c, d);
        }
};

template <typename T>
    class ANOVAKernel : public BaseKernel<T, ANOVAKernel<T>>
{ // Radial basis function kernel, useful in multidimensional regression problems
private:
        T sig;
        T d;
public:
        ANOVAKernel(T sigma, T degree) : sig(sigma), d(degree) {}

        // K(x,y)
        T kernel(const VectorType<T>& x, const VectorType<T>& y) const
        {

            T sum = 0.0;
            T tmp;

            for (std::size_t i = 0; i < x.size(); ++i)
            {
                tmp = x[i] - y[i];
                sum += std::pow(std::exp(-sig * tmp * tmp), d);
            }
            }

            return sum;
        }
};

#endif
 
C++:
template <typename T, template <typename> class K> requires KernelProtocol<T, K>
    class SomeAlgorithm
{ // An example of Composition, testing conjunctional requirements

private:
    K<T> ker;
public:
    SomeAlgorithm(const K<T>& kernel) : ker(kernel) {}
    
    // Wrapper functions
    T computeKernel(const VectorType<T>& x, const VectorType<T>& y)
    {
        return ker(x, y);
    }

    T computeSimilarity(const MatrixType<T>& x, const MatrixType<T>& y)
    {
        return ker.similarity(x, y);
    }

    T computeDistance(const MatrixType<T>& x, const MatrixType<T>& y)
    {
        return ker.distance(x, y);
    }
};
 
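For completeness, a minimal usage sketch of the classes above. It assumes the header is saved as kernels.hpp and that NestedMatrix<T> has the interface the header already relies on (a (rows, cols) constructor, size1(), row(i) returning a row vector, and a writable operator()(i, j)); the numeric values are purely illustrative.

C++:
#include <cstddef>
#include <iostream>
#include "kernels.hpp"   // the header above (which itself includes NestedMatrix.hpp)

int main()
{
    // Compile-time check that GaussianKernel models the kernel protocol
    static_assert(KernelProtocol<double, GaussianKernel>);

    // Two points in R^3 (values chosen arbitrarily)
    VectorType<double> x = { 1.0, 2.0, 3.0 };
    VectorType<double> y = { 1.5, 1.0, 2.5 };

    // Direct use of a kernel
    GaussianKernel<double> gauss(2.0);          // sigma = 2
    std::cout << "K(x,y) = " << gauss.kernel(x, y) << '\n';
    std::cout << "d(x,y) = " << gauss.distance(x, y) << '\n';

    // Two small point sets, 2 points each in R^3
    MatrixType<double> p(2, 3);
    MatrixType<double> q(2, 3);
    for (std::size_t j = 0; j < 3; ++j)
    {   // illustrative values only
        p(0, j) = x[j]; p(1, j) = y[j];
        q(0, j) = y[j]; q(1, j) = x[j];
    }

    auto km = gauss.kernelMatrix(p, q);         // 2 x 2 matrix of pairwise kernel values
    std::cout << "K(p0,q0) = " << km(0, 0) << '\n';

    // Composition via the concept-constrained algorithm class
    SomeAlgorithm<double, GaussianKernel> alg(gauss);
    std::cout << "similarity(p,q) = " << alg.computeSimilarity(p, q) << '\n';
    std::cout << "distance(p,q)   = " << alg.computeDistance(p, q) << '\n';

    return 0;
}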