Out-of-sample trading strategy with neural networks

ssp729 · Nov 18, 2020

Hello everyone. I tried to predict the close price of the S&P 500 using neural networks (there are a lot of papers on the web) with NeuroshellTrader. The in-sample results were amazing; I was ready to leave my job and live on the beach.
When I tried to predict out-of-sample (the result that matters!), the disappointment was huge.

Do you think it's impossible to get at least good results?

MarkBrown · Nov 18, 2020

I experienced what you just went through in 1994 with the same product, lol.

Amazing: decades later, the lure still catches fish.

stochastix · Nov 18, 2020

If you had used the neural network in your head, you would have realized that this is a fool's errand.

GaryBtrader · Nov 18, 2020

What kind of time frame are you talking about? From the research I've done into machine learning, neural networks, etc. it may be possible to come up with a strategy that predicts 1 or 2 days out and has a positive expectancy and acceptable drawdown.

ssp729 · Nov 18, 2020

I was using daily timeframes, 2 years in-sample, predicting the close price 1 day ahead as well as 2 days ahead.

temnik · Nov 18, 2020

ssp729 said:
When I tried to predict out-of-sample (the result that matters!), the disappointment was huge.
Do you think it's impossible to get at least good results?

I am a trained Ph.D., so take the following with a pinch of salt.

Neural nets and all other machine-learning tools have enormous flexibility and a large number of "coefficients" (in a linear regression sense) that you don't need to explicitly specify. This makes them deceptively easy to apply to almost any problem.

Because of that under-the-hood flexibility, there's always a danger of overfitting - as in finding signals in transient noise - especially when your dataset has hidden regime changes, drift etc. Which is why, in order to feel somewhat secure in your trading, you have to hire more trained Ph.D.'s and pay them more $$$.

But, of course, there are no guarantees.
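
A minimal sketch of that overfitting danger, assuming a deliberately extreme setup: the target is pure noise and the "model" is an ordinary least-squares fit with almost as many random features as training rows. The function name and all sizes are made up for illustration; nothing here comes from real market data.

```python
import numpy as np

def fit_and_score(n_train=500, n_test=500, n_features=400, seed=0):
    """Fit OLS with many random features to a pure-noise target.

    Returns (in-sample correlation, out-of-sample correlation) between
    the fitted values and the target.
    """
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n_train + n_test, n_features))
    y = rng.standard_normal(n_train + n_test)  # noise, independent of X

    X_tr, X_te = X[:n_train], X[n_train:]
    y_tr, y_te = y[:n_train], y[n_train:]

    # Least-squares coefficients estimated on the training rows only.
    coef, *_ = np.linalg.lstsq(X_tr, y_tr, rcond=None)

    is_corr = np.corrcoef(X_tr @ coef, y_tr)[0, 1]
    oos_corr = np.corrcoef(X_te @ coef, y_te)[0, 1]
    return is_corr, oos_corr

is_corr, oos_corr = fit_and_score()
print(f"in-sample corr: {is_corr:.2f}, out-of-sample corr: {oos_corr:.2f}")
```

With this many free coefficients the in-sample correlation comes out high even though there is nothing to learn, while the out-of-sample correlation stays near zero.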

Here's a little exercise even a non-Ph.D. can do - it assumes that your system generates frequent signals (almost always trading). Compare your PnL streams in-sample (IS) vs. out-of-sample (OOS). Report back to us:

How many of the largest IS wins do you need to turn into losses to make the IS and OOS Sharpe Ratios equal?

How many of the largest OOS losses do you need to turn into wins to make the IS and OOS Sharpe Ratios equal?
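
One possible way to run that exercise, as a sketch under a few assumptions that are mine and not stated in the post: "turning a win into a loss" means flipping the sign of that PnL entry, the Sharpe ratio is the plain unannualized mean over standard deviation, and since the two ratios will almost never be exactly equal, the count stops as soon as the gap closes or crosses. `flips_to_match` is a hypothetical helper name.

```python
import numpy as np

def sharpe(pnl):
    """Unannualized Sharpe ratio: mean PnL over its sample standard deviation."""
    pnl = np.asarray(pnl, dtype=float)
    mean, sd = pnl.mean(), pnl.std(ddof=1)
    if sd == 0:
        return 0.0 if mean == 0 else float(np.sign(mean)) * np.inf
    return mean / sd

def flips_to_match(is_pnl, oos_pnl, mode):
    """Count sign flips needed to close the IS/OOS Sharpe gap.

    mode="is_wins":    flip the largest IS wins until IS Sharpe <= OOS Sharpe.
    mode="oos_losses": flip the largest OOS losses until OOS Sharpe >= IS Sharpe.
    Returns None if the gap cannot be closed.
    """
    is_pnl = np.array(is_pnl, dtype=float)    # copies, inputs stay untouched
    oos_pnl = np.array(oos_pnl, dtype=float)
    if mode == "is_wins":
        pnl, target = is_pnl, sharpe(oos_pnl)
        candidates = [i for i in np.argsort(-pnl) if pnl[i] > 0]  # biggest wins first
        matched = lambda: sharpe(pnl) <= target
    elif mode == "oos_losses":
        pnl, target = oos_pnl, sharpe(is_pnl)
        candidates = [i for i in np.argsort(pnl) if pnl[i] < 0]   # biggest losses first
        matched = lambda: sharpe(pnl) >= target
    else:
        raise ValueError("mode must be 'is_wins' or 'oos_losses'")

    flips = 0
    for i in candidates:
        if matched():
            break
        pnl[i] = -pnl[i]
        flips += 1
    return flips if matched() else None

# Toy numbers, not real trading results.
is_pnl = [5, 1, 1, 1, 1, -1]
oos_pnl = [1, -1, 1, -1, 1, -1]
print(flips_to_match(is_pnl, oos_pnl, "is_wins"))     # 1
print(flips_to_match(is_pnl, oos_pnl, "oos_losses"))  # 2
```

If only a handful of flips closes the gap, the in-sample edge was resting on a few trades.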

shatteredx · Nov 18, 2020

ssp729 said:
Hello everyone. I tried to predict the close price of the S&P 500 using neural networks (there are a lot of papers on the web) with NeuroshellTrader. The in-sample results were amazing; I was ready to leave my job and live on the beach.
When I tried to predict out-of-sample (the result that matters!), the disappointment was huge.

Do you think it's impossible to get at least good results?

Neural networks are not good at predicting the market directly. Neither is ML.

Linear regression easily beats both with less than 1/10th of the required CPU time.

Once you have a robust LinReg estimator, you can feed the out-of-sample results into ML to estimate the probability of profit, which is what Ernie Chan is doing at his latest venture.
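
A rough sketch of the first half of that idea, assuming a rolling-window ordinary least-squares fit where every prediction uses only data from before the predicted day. The function name, the window length, and the alignment (row `t` of `X` is known before `y[t]` is realized) are assumptions for illustration. The second-stage model that estimates the probability of profit is not shown.

```python
import numpy as np

def walk_forward_ols(X, y, train_size):
    """Rolling OLS: predict y[t] from X[t] using only the previous train_size rows.

    Assumes X[t] is known before y[t] is realized (e.g. features from the
    prior close, target = next-day return). The first train_size
    predictions are NaN because there is no history to fit on yet.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    X1 = np.column_stack([np.ones(len(X)), X])  # prepend an intercept column
    preds = np.full(len(y), np.nan)
    for t in range(train_size, len(y)):
        window = slice(t - train_size, t)       # strictly before day t
        coef, *_ = np.linalg.lstsq(X1[window], y[window], rcond=None)
        preds[t] = X1[t] @ coef
    return preds

# Synthetic example with a weak linear signal buried in noise.
rng = np.random.default_rng(0)
x = rng.standard_normal(600)
y = 0.1 * x + rng.standard_normal(600)
preds = walk_forward_ols(x, y, train_size=250)
oos = ~np.isnan(preds)
print("OOS correlation:", np.corrcoef(preds[oos], y[oos])[0, 1])
```

The non-NaN part of `preds`, together with the realized outcomes, would be the out-of-sample record that a second-stage classifier is trained on.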

ssp729 · Nov 18, 2020

temnik said:
I am a trained Ph.D., so take the following with a pinch of salt.

Neural nets and all other machine-learning tools have enormous flexibility and a large number of "coefficients" (in a linear regression sense) that you don't need to explicitly specify. This makes them deceptively easy to apply to almost any problem.

Because of that under-the-hood flexibility, there's always a danger of overfitting - as in finding signals in transient noise - especially when your dataset has hidden regime changes, drift etc. Which is why, in order to feel somewhat secure in your trading, you have to hire more trained Ph.D.'s and pay them more $$$.

But, of course, there are no guarantees.

Here's a little exercise even a non-Ph.D. can do - it assumes that your system generates frequent signals (almost always trading). Compare your PnL streams in-sample (IS) vs. out-of-sample (OOS). Report back to us:

How many of the largest IS wins do you need to turn into losses to make the IS and OOS Sharpe Ratios equal?

How many of the largest OOS losses do you need to turn into wins to make the IS and OOS Sharpe Ratios equal?


I was basing this on this book:

He claimed 95% accuracy in predicting the direction (up/down) of the S&P 500 close price up to 10 days into the future.

In-sample, the prediction was almost perfect (correlation of 0.98 and 99% accuracy). Out-of-sample, the correlation was 0.1 and the accuracy 20%.

ssp729 · Nov 18, 2020

shatteredx said:
Neural networks are not good at predicting the market directly. Neither is ML.

Linear regression easily beats both with less than 1/10th of the required CPU time.

Once you have a robust LinReg estimator, you can feed the out-of-sample results into ML to estimate the probability of profit, which is what Ernie Chan is doing at his latest venture.

Can you show some results? Now I think it's impossible to predict financial series.

Snuskpelle · Nov 18, 2020

ssp729 said:
Hello everyone. I tried to predict the close price of the S&P 500 using neural networks (there are a lot of papers on the web) with NeuroshellTrader. The in-sample results were amazing; I was ready to leave my job and live on the beach.
When I tried to predict out-of-sample (the result that matters!), the disappointment was huge.

Do you think it's impossible to get at least good results?

Edit2: Maybe your post is mostly ironic and you're already fully aware of the issue, but I'll leave my post up for any other readers who are not.

You can configure basically any kind of algorithm to output a nice-looking return in-sample. You could literally optimize the seed of a pseudorandom number generator or the input of a hashing algorithm to do that. It does not imply that it works out-of-sample. Not working in-sample does suggest not working out-of-sample (though it is not a strict implication, since it depends on what you have in-sample).
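
To make the seed example concrete, here is a small sketch under made-up assumptions: the "market" is synthetic Gaussian noise, the "strategy" is a random long/short position per day drawn from a seeded generator, and the only thing being optimized is the seed. All names and sizes are illustrative.

```python
import numpy as np

def seeded_strategy_pnl(returns, seed):
    """PnL of a 'strategy' whose daily +1/-1 position is just an RNG stream."""
    rng = np.random.default_rng(seed)
    positions = rng.choice([-1.0, 1.0], size=len(returns))
    return positions * returns

def sharpe(pnl):
    """Unannualized Sharpe ratio: mean over sample standard deviation."""
    return pnl.mean() / pnl.std(ddof=1)

def best_seed_demo(n_days=1000, split=500, n_seeds=2000, data_seed=12345):
    """Pick the seed with the best in-sample Sharpe, then score it out-of-sample."""
    # Synthetic daily returns: pure noise, so no strategy has a real edge.
    market = 0.01 * np.random.default_rng(data_seed).standard_normal(n_days)

    is_sharpes = [sharpe(seeded_strategy_pnl(market, s)[:split])
                  for s in range(n_seeds)]
    best = int(np.argmax(is_sharpes))

    pnl = seeded_strategy_pnl(market, best)
    return best, sharpe(pnl[:split]), sharpe(pnl[split:])

best, is_sr, oos_sr = best_seed_demo()
print(f"best seed {best}: IS Sharpe {is_sr:.3f}, OOS Sharpe {oos_sr:.3f}")
```

The winning seed looks like a real edge on the first half of the data purely because it was selected on it; on the second half it is just another coin flip.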

An interesting case I observed was on a site where people submitted Python bots to play Rock-Paper-Scissors. One of the best-performing bots was only a few characters long: it just computed a particular MD5 hash of the sequence of moves observed thus far. It turned out that the guy who wrote it had downloaded all the bots and checked which particular MD5 hash output (when transformed into an output move) would win against all the deterministic bots. Against the best bots incorporating randomness it still loses almost every time, though (and has a 50% win rate against lesser ones).
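
I don't have the original bot, so the following is only a guess at the general idea: hash the move history together with a tunable salt, map the digest to a move, and brute-force the salt against a known set of deterministic opponents. The two opponents, the salt scheme, and every function name are hypothetical.

```python
import hashlib

MOVES = "RPS"
BEATS = {"R": "S", "P": "R", "S": "P"}  # key beats value

def md5_bot(history, salt=""):
    """Deterministic move: first digest byte of MD5(salt + history), mod 3."""
    digest = hashlib.md5((salt + history).encode()).digest()
    return MOVES[digest[0] % 3]

# Two toy deterministic opponents. Each sees its own history as
# alternating (own move, opponent move) characters.
def always_rock(history):
    return "R"

def copy_last(history):
    return history[-1] if history else "P"  # repeat opponent's last move

def play(bot_a, bot_b, rounds=30):
    """Net score (wins minus losses) of bot_a against bot_b."""
    hist_a, hist_b, score = "", "", 0
    for _ in range(rounds):
        a, b = bot_a(hist_a), bot_b(hist_b)
        score += (BEATS[a] == b) - (BEATS[b] == a)
        hist_a += a + b
        hist_b += b + a
    return score

def tune_salt(opponents, n_salts=500):
    """Brute-force the salt with the best total score against known opponents."""
    def total(salt):
        return sum(play(lambda h: md5_bot(h, salt), opp) for opp in opponents)
    best = max((str(s) for s in range(n_salts)), key=total)
    return best, total(best)

salt, score = tune_salt([always_rock, copy_last])
print(f"salt {salt!r} scores {score} against the known bots")
```

The tuned salt only "knows" the opponents it was fitted against, which is the in-sample versus out-of-sample problem again.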