Why are you so aggressive? Did I offend you somehow?
Got it. I've followed this robust thread with considerable interest. Full disclosure: I am associated (as an advisor) with Rockstart (rockstart.com). Rockstart runs programs supporting startups. Deepcrypto was recently selected as one of a number of AI-related startups that Rockstart supports ("Rockstart announces the top 10 startups from the 3rd edition of the AI 2019 program" - Rockstart).
I want to thank Daniel Duffy for making the excellent thesis report available here. While I am neither a quant nor an AI expert, I do have both math and computer science degrees, so I can somewhat appreciate (if by no means fully understand) the quality, and relevance, of the thesis he made available to this discussion.
I do have a question for ApolloChariot to help my understanding. I am familiar (or at least I like to think I am familiar) with the concept of overfitting. My understanding (or guess) is that overfitting ties closely to how representative the data one has sampled is of the full data universe. If the sample isn't representative, then my feeling is that you would have (by definition?) an overfit; conversely, if it is representative, then (by definition?) you wouldn't. If that is true, would that mean you can't say a priori whether any algorithm is overfit or not? (Although you may suspect it, of course.) And if all of the above is true, then it seems to me the discussion becomes about whether any set of sample trading data in a (crypto)trading environment can ever be considered representative and, if so, what amount and type of data that would require.
Statistical significance is (by definition I would say) a statement about the representativity of the sampled data for the full data universe. If (and that is a big if) the sample data is representative then the algorithms found would (in my view) be (by definition?) statistically significant.
Possibly the question itself is too binary. One may want to actually look for a "sufficiently representative" sample which would mean that it produces "some" overfitting but "not too much". I have no idea how that would work but it would seem to me that it would tie in closely with the concept of how statistically significant a certain result is.
To sum up, it appears to me that the concepts of statistical significance, data representativity and overfitting are intimately linked or perhaps they are even simply alternative definitions of the same underlying concept.
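The link between representativity and overfitting can be seen in a toy experiment (my own sketch, not something from this thread): fit polynomials of increasing degree to a small, noisy sample and compare the in-sample error with the error on a much larger draw standing in for the "full data universe".

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_sine(n):
    """Draw n points from y = sin(x) + noise on [0, 2*pi]."""
    x = rng.uniform(0, 2 * np.pi, n)
    return x, np.sin(x) + rng.normal(0, 0.3, n)

x_small, y_small = noisy_sine(15)    # small, possibly unrepresentative sample
x_big, y_big = noisy_sine(2000)      # stand-in for the full data universe

results = {}
for degree in (1, 3, 9):
    coeffs = np.polyfit(x_small, y_small, degree)
    mse_in = np.mean((np.polyval(coeffs, x_small) - y_small) ** 2)
    mse_out = np.mean((np.polyval(coeffs, x_big) - y_big) ** 2)
    results[degree] = (mse_in, mse_out)
    print(f"degree {degree}: in-sample MSE {mse_in:.3f}, "
          f"out-of-sample MSE {mse_out:.3f}")
```

The flexible degree-9 fit always looks best on the 15 training points, but its error on the large draw exposes the overfit; significance testing asks essentially the same question without needing access to the large sample.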
P.S. I have traveled forward in time (in real-time) from 1954. I am strongly hoping that no one proves me wrong :^).
Our company is an early-stage startup working on deep learning for algo trading. We have been developing it for over a year now.
DeepCrypto.AI is a tool that mimics the process of developing trading algorithms but uses neural networks instead of classical algos. NN training replaces the algo-creation and backtesting phases, after which exhaustive forward testing can be done.
The trained neural network is just an advanced algorithm. It can split incoming data into 100,000 or more features and can therefore find patterns that humans would miss.
We have run over 1,000,000 back and forward tests and have received promising results.
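For readers unfamiliar with the backtest/forward-test distinction this relies on, here is a minimal sketch (my own illustration with a synthetic price series and a placeholder momentum rule, not DeepCrypto's actual pipeline): the data is split strictly chronologically, and the rule is evaluated first on the earlier slice (backtest) and then on the later, unseen slice (forward test).

```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic random-walk price series; a real pipeline would use market data.
prices = 100 + np.cumsum(rng.normal(0, 1, 500))

split = int(len(prices) * 0.7)
back, forward = prices[:split], prices[split:]   # strictly chronological split

def momentum_pnl(series):
    """P&L of a toy rule: hold the sign of the previous move for one step."""
    moves = np.diff(series)
    positions = np.sign(moves[:-1])   # position entered after each observed move
    return float(np.sum(positions * moves[1:]))

backtest_pnl = momentum_pnl(back)
forward_pnl = momentum_pnl(forward)
print(f"backtest P&L: {backtest_pnl:.2f}, forward-test P&L: {forward_pnl:.2f}")
```

The value of the forward slice is that nothing about the rule (or, in the NN case, the trained weights) was chosen using it, so a large gap between the two P&Ls is exactly the overfitting signal discussed earlier in the thread.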
However, we need professionals to prove us wrong, if that's possible.
Feel free to do it at: www.deepcrypto.ai
All software projects are risky. In your case I think you have more questions than answers, and I see it as a high-risk undertaking. I suspect also that other competitors are several years ahead.
Out of curiosity, what is the expertise of your team? CS, maths, physics, web design?
Lao was unable/unwilling to tell me which algorithm is being used. Quants don't like black boxes; they want to know the inner workings.
Anyway, good luck. I'm out.