Out-of-sample trading strategy with neural networks

Discussion in 'Strategy Building' started by ssp729, Nov 18, 2020.

  1. ssp729

    Hello everyone. I tried to predict the closing price of the S&P 500 using neural networks (there are a lot of papers on the web) with NeuroshellTrader. The in-sample results were amazing; I was ready to quit my job and go live on the beach.
    When I tried to predict out-of-sample (the result that matters!), the disappointment was huge.

    Do you think it's impossible to get at least good results?
     
  2. MarkBrown

    I experienced what you just went through in 1994, with the same product lol.

    Amazing: decades later, the lure still catches fish.
     
    guowei58, El Trado, cvds16 and 4 others like this.
  3. If you had used the neural network in your head, you would have realized that this is a fool's errand.
     
    guowei58, ssp729 and MarkBrown like this.
  4. What kind of time frame are you talking about? From the research I've done into machine learning, neural networks, etc., it may be possible to come up with a strategy that predicts 1 or 2 days out and has a positive expectancy and an acceptable drawdown.
     
  5. ssp729

    I was using daily bars, with 2 years in-sample, predicting the close 1 day ahead as well as 2 days ahead.
     
  6. temnik

    I am a trained Ph.D., so take the following with a pinch of salt.

    Neural nets and all other machine-learning tools have enormous flexibility and a large number of "coefficients" (in a linear regression sense) that you don't need to explicitly specify. This makes them deceptively easy to apply to almost any problem.

    Because of that under-the-hood flexibility, there's always a danger of overfitting - as in finding signals in transient noise - especially when your dataset has hidden regime changes, drift etc. Which is why, in order to feel somewhat secure in your trading, you have to hire more trained Ph.D.'s and pay them more $$$.

    But, of course, there are no guarantees.
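
    To make the overfitting point concrete, here is a minimal sketch (not anything NeuroshellTrader actually does). It assumes a made-up dataset where the feature is pure noise, and uses a 1-nearest-neighbour "model" as a stand-in for any very flexible learner. It memorizes the in-sample labels perfectly and has nothing to offer on new data.

```python
import random

def nearest_label(train, x):
    """1-nearest-neighbour: return the label of the closest training point."""
    return min(train, key=lambda row: abs(row[0] - x))[1]

def accuracy(train, data):
    """Share of points in `data` whose label the memorizing model gets right."""
    hits = sum(nearest_label(train, x) == y for x, y in data)
    return hits / len(data)

rng = random.Random(0)

def make_noise(n):
    # Assumption for illustration: the feature says nothing about the label.
    return [(rng.random(), rng.choice([-1, 1])) for _ in range(n)]

in_sample, out_of_sample = make_noise(500), make_noise(500)

is_acc = accuracy(in_sample, in_sample)       # the model has seen these points
oos_acc = accuracy(in_sample, out_of_sample)  # it has never seen these
print(f"IS accuracy:  {is_acc:.2f}")
print(f"OOS accuracy: {oos_acc:.2f}")
```

    In-sample, every point is its own nearest neighbour, so the fit looks perfect; out-of-sample it is a coin flip, because there was never any signal to find.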

    Here's a little exercise even a non-Ph.D. can do - it assumes that your system generates frequent signals (almost always trading). Compare your PnL streams in-sample (IS) vs. out-of-sample (OOS). Report back to us:
    1. How many of the largest IS wins do you need to turn into losses to make the IS and OOS Sharpe ratios equal?
    2. How many of the largest OOS losses do you need to turn into wins to make the IS and OOS Sharpe ratios equal?
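
    The two questions above can be sketched roughly as follows. This is one possible reading of the exercise, not a definitive one: `sharpe` is a plain per-period mean/std ratio with no annualisation, `flips_to_match` is a hypothetical helper name, and the two PnL lists are invented numbers you would replace with your own IS and OOS PnL streams.

```python
import statistics

def sharpe(pnl):
    """Per-period Sharpe ratio: mean / sample std (zero risk-free rate, not annualised)."""
    sd = statistics.stdev(pnl)
    return statistics.mean(pnl) / sd if sd > 0 else 0.0

def flips_to_match(pnl, target, flip_wins):
    """Count how many of the largest wins (flip_wins=True) or largest losses
    (flip_wins=False) must change sign before the Sharpe ratio reaches `target`."""
    pnl = list(pnl)
    reached = (lambda s: s <= target) if flip_wins else (lambda s: s >= target)
    if reached(sharpe(pnl)):
        return 0
    if flip_wins:
        # Winning periods, biggest win first.
        order = sorted((i for i, p in enumerate(pnl) if p > 0), key=lambda i: -pnl[i])
    else:
        # Losing periods, biggest loss first.
        order = sorted((i for i, p in enumerate(pnl) if p < 0), key=lambda i: pnl[i])
    for count, i in enumerate(order, start=1):
        pnl[i] = -pnl[i]
        if reached(sharpe(pnl)):
            return count
    return None  # target cannot be reached by flipping signs alone

# Made-up PnL streams, for illustration only.
is_pnl = [3.0, 2.5, 2.0, 1.5, 1.0, -0.5, 1.2, 0.8, -0.3, 1.1]
oos_pnl = [0.5, -1.0, 0.7, -0.8, 0.3, -1.2, 0.6, -0.4, 0.2, -0.9]

n_is = flips_to_match(is_pnl, sharpe(oos_pnl), flip_wins=True)    # question 1
n_oos = flips_to_match(oos_pnl, sharpe(is_pnl), flip_wins=False)  # question 2
print("IS wins to flip:", n_is)
print("OOS losses to flip:", n_oos)
```

    If only a handful of flips closes the gap, the IS/OOS difference rests on a few trades and could be luck. If it takes a large share of all trades, the two periods really behave differently.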
     
    yc47ib, ssp729 and beginner66 like this.
  7. shatteredx

    Neural networks are not good at predicting the market directly. Neither is ML.

    Linear regression easily beats both with less than 1/10th of the required CPU time.

    Once you have a robust LinReg estimator, you can feed the out-of-sample results into ML to estimate the probability of profit, which is what Ernie Chan is doing at his latest venture.
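
    A rough sketch of that two-stage idea follows. It is only an illustration under stated assumptions, not Ernie Chan's actual method: the data is synthetic, stage 1 is a one-feature least-squares fit, and the "ML" stage is replaced by a plain bucketed hit rate so that the example stays dependency-free.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x with a single feature."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return my - b * mx, b

rng = random.Random(1)
# Synthetic data (assumption): next return = weak linear signal + noise.
xs = [rng.gauss(0, 1) for _ in range(3000)]
ys = [0.1 * x + rng.gauss(0, 1) for x in xs]

# Stage 1: fit on the first block, then forecast on data the fit never saw.
a, b = fit_line(xs[:1000], ys[:1000])
oos = [(a + b * x, y) for x, y in zip(xs[1000:], ys[1000:])]

def hit_rate(records):
    """Share of trades where the forecast and the realized return have the same sign."""
    wins = sum(1 for pred, y in records if pred * y > 0)
    return wins / len(records)

# Stage 2 (stand-in for the ML step): estimate P(profit) from the OOS trades,
# split by whether the forecast was weaker or stronger than the median.
cutoff = sorted(abs(pred) for pred, _ in oos)[len(oos) // 2]
weak = [r for r in oos if abs(r[0]) < cutoff]
strong = [r for r in oos if abs(r[0]) >= cutoff]
p_weak, p_strong = hit_rate(weak), hit_rate(strong)
print(f"P(profit | weak forecast):   {p_weak:.2f}")
print(f"P(profit | strong forecast): {p_strong:.2f}")
```

    Note that the stage-2 estimate is itself fitted on the OOS block, so in practice it would need its own holdout before you trust it.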
     
    yc47ib, guowei58 and ssp729 like this.
  8. ssp729

    I was basing this on this book:




    The author claims 95% accuracy in predicting the direction (up/down) of the S&P 500 close up to 10 days into the future.

    In-sample (IS), the prediction was almost perfect (correlation of 0.98 and 99% accuracy). Out-of-sample (OOS), the correlation was 0.1 and the accuracy was 20%.
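
    The thread does not say whether those correlations were measured on price levels or on price changes. If it was on levels (an assumption), a high number is easy to get without any skill, because today's close is already close to tomorrow's. A minimal sketch on a simulated random walk, where the forecast is just yesterday's close plus a random guess:

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

def direction_accuracy(pred_changes, actual_changes):
    """Share of periods where predicted and actual change agree on up vs. down."""
    hits = sum(1 for p, a in zip(pred_changes, actual_changes) if (p > 0) == (a > 0))
    return hits / len(actual_changes)

rng = random.Random(42)
closes = [100.0]
for _ in range(1000):
    closes.append(closes[-1] + rng.gauss(0, 1))  # simulated random walk, no signal

prev, actual = closes[:-1], closes[1:]
guess = [rng.gauss(0, 1) for _ in prev]          # forecast change: pure noise
forecast = [p + g for p, g in zip(prev, guess)]  # forecast level

actual_change = [a - p for a, p in zip(actual, prev)]
level_corr = pearson(forecast, actual)
change_corr = pearson(guess, actual_change)
hit = direction_accuracy(guess, actual_change)
print(f"correlation on levels:  {level_corr:.2f}")
print(f"correlation on changes: {change_corr:.2f}")
print(f"up/down accuracy:       {hit:.2f}")
```

    Measuring on changes (or on up/down hits) rather than on levels is what tells you whether the model knows anything.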
     
  9. ssp729

    Can you show some results? Now I think it's impossible to predict financial time series.
     
  10. Snuskpelle

    Edit2: Maybe your post is mostly ironic and you're already fully aware of the issue, but I'll leave my post up for any other readers who are not.

    You can configure basically any kind of algorithm to output a nice-looking return in-sample. Literally, you can optimize the seed of a pseudo-random number generator or a hashing algorithm to do that. It does not imply that it works out-of-sample. Not working in-sample does suggest not working out-of-sample (though that is not a strict implication, since it depends on what you have in-sample).

    An interesting case I observed was on a site where people submitted Python code/bots to play Rock-Paper-Scissors. One of the best-performing bots was only a few characters long: it just took a particular MD5 hash of the sequence of moves observed thus far. It turned out the guy who wrote it had downloaded all the bots and checked which particular MD5 hash output (when transformed into an output move) would win against all the deterministic bots. Against the best bots incorporating randomness it still loses almost every time though (and has a 50% win rate against lesser ones).
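
    A minimal sketch of the seed trick, using made-up Gaussian "returns" and a strategy that is nothing but coin flips. The only thing being "optimized" is the seed, picked for its in-sample PnL:

```python
import random

def positions(seed, n):
    """A 'strategy' that is long (+1) or short (-1) each day by coin flip."""
    rng = random.Random(seed)
    return [rng.choice([-1, 1]) for _ in range(n)]

def pnl(pos, rets):
    return sum(p * r for p, r in zip(pos, rets))

data_rng = random.Random(123)
returns = [data_rng.gauss(0, 0.01) for _ in range(500)]  # synthetic daily returns
split = 250  # first half in-sample, second half out-of-sample

def split_pnl(seed):
    pos = positions(seed, len(returns))
    return pnl(pos[:split], returns[:split]), pnl(pos[split:], returns[split:])

# "Optimize" the seed on in-sample PnL only.
best_seed = max(range(1000), key=lambda s: split_pnl(s)[0])
is_pnl, oos_pnl = split_pnl(best_seed)
print(f"best seed: {best_seed}")
print(f"IS PnL:  {is_pnl:+.3f}")
print(f"OOS PnL: {oos_pnl:+.3f}")
```

    The winning seed shows a great in-sample PnL by construction, while its out-of-sample PnL is just another coin-flip outcome.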
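
    I don't have the original bot, so the following is only a hypothetical reconstruction of the idea. It assumes a single deterministic opponent (`cycle_bot`, invented here) and searches over a salt prepended to the move history, which stands in for "picking the MD5 output that wins".

```python
import hashlib
import random

MOVES = "RPS"
BEATS = {"R": "S", "P": "R", "S": "P"}  # key beats value

def md5_bot(salt, history):
    """Pick a move from the first byte of MD5(salt + moves so far)."""
    digest = hashlib.md5((salt + history).encode()).digest()
    return MOVES[digest[0] % 3]

def cycle_bot(history):
    """Deterministic opponent (hypothetical): plays R, P, S, R, P, S, ..."""
    return MOVES[(len(history) // 2) % 3]

def play(salt, opponent, rounds=20):
    """Return wins minus losses for md5_bot over `rounds` rounds."""
    history, score = "", 0
    for _ in range(rounds):
        mine, theirs = md5_bot(salt, history), opponent(history)
        score += (BEATS[mine] == theirs) - (BEATS[theirs] == mine)
        history += mine + theirs  # two characters per round
    return score

# "Train": keep whichever salt happens to beat the known deterministic bot.
best_salt = max((str(i) for i in range(5000)), key=lambda s: play(s, cycle_bot))
score_vs_cycle = play(best_salt, cycle_bot)

# "Test": the same salt against an opponent it was not fitted to.
rng = random.Random(7)
score_vs_random = play(best_salt, lambda history: rng.choice(MOVES))
print("score vs. the bot it was fitted to:", score_vs_cycle)
print("score vs. a random bot:", score_vs_random)
```

    The salt is fitted to one fixed opponent, so the high score says nothing about opponents it was not fitted to, which is the same in-sample vs. out-of-sample gap as in trading.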
     
    Last edited: Nov 18, 2020
    #10     Nov 18, 2020