Wanna try to beat classical trading algorithms with neural networks?

I have a wild, speculative guess as to what the service MIGHT be:

The service allows you to use many combinations of technical indicators, thresholds and triggers, which can in turn generate millions of different trading algorithms. Now inevitably when millions of algorithms are generated, it's not hard for many to perform exceptionally well when backtested, especially given the high volatility and short history of crypto-gambling. Now you take those algorithms that performed really well, and there may very well be many thousands, and then you perform a "forward test".

Now out of THOSE thousands, inevitably perhaps a few dozen, or maybe between 5 and 15, are ones that REALLY stand out from the pack in the "FORWARD test". So now you have a short list of algorithms that did really well in a back test, and then really well in a "forward test". Never mind that such results are statistically inevitable if you have millions of algorithms being tested. Someone who is (understandably) not TOO familiar with probability and statistics might look at that and say, "WOW! A foolproof algorithm! License to print money!" Now a user of this service might generate millions of algos, pick one which performs exceptionally well in the back test and forward test, use it to trade, and then have a shocked pikachu face when it doesn't give him a trading edge.

This is a classic case of not understanding what overfitting is. If you chop, slice, dice and filter the data in millions of different ways, inevitably some of those cuts will generate patterns which, taken by themselves, very strongly suggest statistical significance. That gives a sense of certainty and predictive power that is just not there.
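The selection effect described above can be reproduced with a toy simulation. This is only a minimal sketch under stated assumptions, not a model of the actual service: every "strategy" is pure noise with zero true edge, and the names and numbers (`simulate_selection`, 5,000 strategies, 60-day periods, a 1% daily standard deviation, keeping the top 5% at each stage) are hypothetical choices made for illustration.

```python
import random
import statistics

def simulate_selection(n_strategies=5000, n_days=60, keep_fraction=0.05, seed=0):
    """Select zero-edge strategies by backtest, then by forward test."""
    rng = random.Random(seed)

    def mean_return():
        # Daily returns are pure noise: mean 0, standard deviation 1% (assumed).
        return statistics.fmean(rng.gauss(0.0, 0.01) for _ in range(n_days))

    # Each strategy gets three independent periods: backtest, forward test, live.
    results = [(mean_return(), mean_return(), mean_return())
               for _ in range(n_strategies)]

    # Step 1: keep the best performers in the backtest.
    n_keep = max(1, int(n_strategies * keep_fraction))
    survivors = sorted(results, key=lambda r: r[0], reverse=True)[:n_keep]

    # Step 2: of those, keep the best performers in the forward test.
    n_final = max(1, int(len(survivors) * keep_fraction))
    finalists = sorted(survivors, key=lambda r: r[1], reverse=True)[:n_final]

    return {
        "finalists": len(finalists),
        "avg_backtest": statistics.fmean(r[0] for r in finalists),
        "avg_forward": statistics.fmean(r[1] for r in finalists),
        "avg_live": statistics.fmean(r[2] for r in finalists),
    }

summary = simulate_selection()
print(summary)
```

With these settings a dozen finalists survive both filters, and by construction they look strong in both the backtest and the forward test. Their average "live" return is drawn from the same zero-mean noise as everything else, so whatever edge the two tests appeared to show was produced by the selection itself.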

Don't try to get rich quick; investing responsibly is supposed to be boring.
 
I've followed this robust thread with considerable interest. Full disclosure: I am associated (as an advisor) with Rockstart (rockstart.com). Rockstart runs programs supporting startups. Deepcrypto was recently selected as one of a number of AI-related startups which Rockstart supports (ROCKSTART ANNOUNCES THE TOP 10 STARTUPS FROM THE 3RD EDITION OF THE AI 2019 PROGRAM - Rockstart).

I want to thank Daniel Duffy for making the excellent thesis report available here. While I am neither quant nor AI expert I do have both math and computer science degrees so I can somewhat appreciate (if by no means fully understand) the quality - and relevance - of the thesis he made available to this discussion.

I do have a question to ApolloChariot to help my understanding: I am familiar (or at least I like to think I am familiar) with the concept of overfitting. My understanding (or guess) is that the concept of overfitting likely ties closely to the representativity of the data one has sampled with respect to the full data universe. If it isn't representative then my feeling is that you would have (by definition?) an overfit. Conversely, if it is representative then (by definition?) you wouldn't. If that is true then would that mean that you can't a priori say that any algorithm is an overfit or not? (Although you may suspect it of course). And if all the above is true then it seems to me the discussion becomes about whether any set of sample trading data in a (crypto)trading environment can ever be considered representative and, if so, what amount and type of data that would need to be.

Statistical significance is (by definition I would say) a statement about the representativity of the sampled data for the full data universe. If (and that is a big if) the sample data is representative then the algorithms found would (in my view) be (by definition?) statistically significant.

Possibly the question itself is too binary. One may want to actually look for a "sufficiently representative" sample which would mean that it produces "some" overfitting but "not too much". I have no idea how that would work but it would seem to me that it would tie in closely with the concept of how statistically significant a certain result is.

To sum up, it appears to me that the concepts of statistical significance, data representativity and overfitting are intimately linked or perhaps they are even simply alternative definitions of the same underlying concept.

P.S. I have traveled forward in time (in real-time) from 1954. I am strongly hoping that no one proves me wrong :^).
 
Why are you so aggressive? Did I offend you somehow?

Well, I am offended. 100 %.

I have used QuantNet since high school and I love the idea of having experts give me advice and of reading about people’s thoughts and doubts. When some jerks take advantage of the community and try to push something that looks like a tricky scheme, it PISSES me off. Thanks to Duffy and ApolloChariot for not letting you get away with it.
 

Daniel Duffy

C++ author, trainer
Thanks, Aber
I have probably developed a nose for bluffers down the years. I show no mercy.
And this one is full of it.

I want to thank Daniel Duffy for making the excellent thesis report available here. While I am neither quant nor AI expert I do have both math and computer science degrees so I can somewhat appreciate (if by no means fully understand) the quality - and relevance - of the thesis he made available to this discussion.
This was very good indeed. The student did a great job. Something that Lao san can learn from, maybe.
 

Daniel Duffy

C++ author, trainer
Got it.
I see that Rockstart is based in Amsterdam and seems to be associated with Tilburg and Eindhoven. It just so happens I have lived in Amsterdam/NL for years and years. So, your setup is a campus company. You need a) a product and b) good marketing IMHO.

My 2 cents,
My first impression is that you are not going about this in the right way, maybe because you are not very familiar with quant finance. Most quants want to see facts, not rambling descriptions or being lectured on what an NN is.

You could improve your presentation skills and show a bit more humility (e.g. don't begin a thread with 'Wanna' (poopie joopie) and don't take things personally ;)). Your 'stubborn' posters still make people more annoyed by their puerile posts.

This was my direct Dutch response which I hope may help you out of this impasse. Are you looking for alpha-testers for your software? Using QN to help find the requirements for the application? Not clear.

Met vriendelijke groet

DD
 
Daniel,

Thanks for your remarks. I agree with your 2 cents (in fact, I would put the value far higher based on a Black-Scholes model :^). I wasn't involved in that initial messaging/posting and I think the original poster learned a (hard) lesson in communication :^). I do know the poster personally and he is at heart a researcher struggling with the correct approach and is by no means a slick salesman, let alone a snake oil peddler. If that were the case Rockstart would never have selected the company to support. To put it mildly, the wording of the post(s) was perhaps somewhat unfortunate :^).

Be that as it may, I don't want to throw the baby out with the bathwater. We really do think that there may be "something" there (which of course is the reason why Rockstart is supporting this effort in the first place). In fact, the paper you kindly released actually bolsters me in that thinking in that it shows that neural networks can provide accurate predictions on certain trading behaviors under certain conditions. Now, I am not naive enough to think that those networks can directly transfer over to the cryptocurrency market. That market is very different as ApolloChariot clearly explained. But it is a market and there are identifiable trading behaviors which are, of course, very likely very different from the mainstream stock market.

One difference that most people (I believe) would agree on is that the level of manipulation in the cryptocurrency market is much higher, as it is far less regulated. I think someone advanced the suggestion that this manipulation might be a reason that neural networks or machine learning, in general, may not work in the cryptomarkets. When you break it down, however, manipulation is the (intentional) generation of real or perceived events to move the market. It occurred to me that society is "manipulating" the stock market all the time by (unintentionally) generating real (or perceived) events that move the market. I would contend (or at least hypothesize) that from a machine learning standpoint whether the manipulation is intentional or unintentional is not important. It is simply a case of having the right amount and type of data (if you, of course, can get the right amount and type of data - that is by no means a given) to discover relevant patterns (and I would say there is a fair chance that intentional manipulation actually does follow certain patterns).

As to your question: we are currently primarily interested in seeing whether QN can help us understand the requirements of traders (and trading) and help us validate to what extent (if any) neural networks (as expressed in this software) can provide consistently useful (i.e. profitable) predictions, and also whether those neural networks can be provided and configured in a fairly simple, user-friendly way.

To address another question asked by someone in this discussion which (rightly) caused a lot of skepticism: if you have found a license to print money why offer it here - and for free? If we were really printing money then we would (naturally :)) be having cocktails in the Bahamas (at least before the hurricane hit). This is something that appears promising - no more, no less. Even if these tools do work, that by no means disintermediates the (skilled) trader: you still need a lot of trading skills to select and test the right portfolio of strategies. This startup wants to make meaning by developing AI software (initially for trading) and not by doing trading themselves. Not everyone wants to be (or has the skillset to be) a trader.

Finally, we are also interested in the broader question of better understanding whether the complex and currently very expensive process of developing a user-friendly AI platform can be disintermediated i.e. can you have a ready-made user-friendly AI platform that can be fairly easily configured to your specific domain needs and is user-friendly in its usage.

Hope this helps clarify our intentions.

regards roelof
 

Daniel Duffy

C++ author, trainer
All software projects are risky. In your case I think you have more questions than answers and I see it as a high-risk undertaking. I suspect also that other competitors are several years ahead.

Out of curiosity, what is the expertise of your team? CS, maths, physics, web design? Lao was unable/unwilling to tell me which algorithm is being used. Quants don't like black boxes and they want to know the inner workings.

Anyways, good luck. I'm out.

"Finally, we are also interested in the broader question of better understanding whether the complex and currently very expensive process of developing a user-friendly AI platform can be disintermediated i.e. can you have a ready-made user-friendly AI platform "

This is a low-risk item; it is not the main requirement. The essential difficulties are usually the ones that developers address last, as the deadline is reached and budgets get stretched.
As Fred Brooks said: "No Silver Bullet".
 
Our company is an early-stage startup working on deep learning for algo trading. We have been developing it for over a year now.

DeepCrypto.AI is a tool that mimics the process of developing trading algorithms but uses neural networks instead of classical algos. NN training replaces the algo creation and backtesting phase, and then exhaustive forward testing can be done.

The trained neural network is just an advanced algorithm. It can split incoming data into 100 000 or more features and therefore can find patterns which humans will miss.

We have run over 1 000 000 back and forward tests and have received promising results.
However, we need professionals to prove us wrong, if that's possible.

Feel free to do it at: www.deepcrypto.ai

Linear regression time! :D
JK. Sounds cool.
 
All software projects are risky. In your case I think you have more questions than answers and I see it as a high-risk undertaking. I suspect also that other competitors are several years ahead.

No doubt that this is a high-risk undertaking but that usually tends to be the nature of startups anyway...
I see you're quoting Fred Brooks: that betrays your (and my) age, Daniel :^).

Out of curiosity, what is the expertise of your team? CS, maths, physics, web design.

The cast of characters involved includes the following profiles:
  • Data scientist, Ph.D., with 16 years of research experience and 4 years of experience running a startup
  • Business development, with 19 years of experience running an IT company
  • Trader and investor, with 10 years of trading experience
  • Operations, with 19 years of experience running an IT company
  • IT architect, with 18 years of software development experience
  • Lead developer, with 13 years of software development experience
That is separate from the support from Rockstart, so I think it is fair to say that there is a fair amount of expertise involved.

Lao was unable/unwilling to tell me which algorithm is being used.. quants don't like black boxes and they want to know the inner workings.

Do you mean specifically which machine learning algorithm/paradigm is being employed?

Anyways, good luck. I'm out.

Thanks, we'll need it :).

regards roelof
 

Daniel Duffy

C++ author, trainer
Do you mean specifically which machine learning algorithm/paradigm is being employed?

Yes, BPN, SGD, momentum, LSTM, etc.

More generally, the design blueprint/top level data flow diagram.
Do organizations do white papers these days so that you can see what the product is/does? Other industries (e.g. auto glossies) do.
 
This is why I have a deep distrust of industry practitioners. Running a million backtests on a fixed data set without a rigorous explanation of why it makes sense is a recipe for disaster. Researchers have already covered this widely in the past (e.g. White 2003; Harvey et al. 2016).
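To put rough numbers on the "million backtests" concern, here is a back-of-the-envelope sketch. It is a generic Bonferroni-style calculation, not the specific procedure of either paper cited above, and it assumes the tests are independent and that each strategy is judged by a one-sided z-test. The function name `multiple_testing_summary` and the 5% significance level are hypothetical choices.

```python
from statistics import NormalDist

def multiple_testing_summary(n_tests, alpha=0.05):
    """Rough multiple-testing arithmetic for n_tests independent backtests."""
    # Expected number of zero-edge strategies that still pass at level alpha.
    expected_false_positives = n_tests * alpha
    # Probability that at least one zero-edge strategy passes.
    prob_at_least_one = 1.0 - (1.0 - alpha) ** n_tests
    # Bonferroni: test each strategy at alpha / n_tests so that the
    # family-wise error rate stays at or below alpha.
    bonferroni_alpha = alpha / n_tests
    # z-score hurdles implied by the single-test and Bonferroni levels.
    z_single = -NormalDist().inv_cdf(alpha)
    z_bonferroni = -NormalDist().inv_cdf(bonferroni_alpha)
    return {
        "expected_false_positives": expected_false_positives,
        "prob_at_least_one": prob_at_least_one,
        "bonferroni_alpha": bonferroni_alpha,
        "z_single": z_single,
        "z_bonferroni": z_bonferroni,
    }

report = multiple_testing_summary(n_tests=1_000_000)
for name, value in report.items():
    print(f"{name}: {value:.6g}")
```

Under these assumptions, a million tests at the 5% level are expected to pass about 50,000 strategies that have no edge at all, and the z-score a single strategy must clear to remain credible after correction rises from roughly 1.6 to above 5. Real backtests on a shared data set are correlated, so Bonferroni is conservative here, but the direction of the adjustment is the point.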

This reminds me of an article I read earlier this year on Bloomberg.
 