Your Thoughts On Walk-forward Testing To Assess Robustness

Discussion in 'Strategy Development' started by tommo, Jan 5, 2020.

  1. tommo

    tommo

    Hi,

    Just wondered what your thoughts were on walk forward optimization as a way to test the robustness of a system?

    I have done conventional walk-forwards: for example, build a system on 4 years of data (a sample of around 350 trades) and then test that system on the most recent two years of data. I have systems that work on this basis, which is a good sign of robustness in the strategy. However, if I try a walk-forward optimization (for example, optimize every 30 days, then retest 10 days out of sample) over the same period, it destroys the backtest.

    Part of me thinks this means the system is not robust: the last 6 weeks or so of optimized settings should have some predictive value for the next 2 weeks.

    However, the same system is profitable on a basic 4-year in-sample versus 2-year out-of-sample test with no parameter changes, which is a positive sign.

    I find it very hard to get any walk-forward optimization backtests to produce profitable systems. I don't want to cheat myself by trading non-robust systems, but at the same time I don't want to set an unrealistic bar whereby I throw out good systems.

    What are your thoughts?
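
    For reference, the 30/10 schedule described above can be sketched as a rolling window over bar indices. This is only a minimal illustration: the function name `walk_forward_windows` and its parameters are hypothetical, and it assumes one bar per trading day with exclusive end indices.

```python
def walk_forward_windows(n_bars, train_len, test_len):
    """Return (train_start, train_end, test_start, test_end) index tuples.

    End indices are exclusive. The window rolls forward by test_len bars
    each step, so the out-of-sample segments never overlap.
    """
    windows = []
    start = 0
    while start + train_len + test_len <= n_bars:
        train_end = start + train_len
        windows.append((start, train_end, train_end, train_end + test_len))
        start += test_len
    return windows


# Assumed example: 100 daily bars, optimize on 30, test on the next 10.
windows = walk_forward_windows(n_bars=100, train_len=30, test_len=10)
for train_start, train_end, test_start, test_end in windows:
    print(f"fit [{train_start}, {train_end})  test [{test_start}, {test_end})")
```

    Each test segment starts exactly where its training segment ends, so no parameter choice ever sees the data it is scored on.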
     
  2. richDude

    richDude

    It’s as useless as a backtest
     
  3. tommo

    tommo

    So you're against all forms of systematic trading/research?
     
  4. It would seem wise to "back off" a bit and consider what you are attempting from a higher perspective. Ensure your system is robust by ensuring you have adequate knowledge of the edge you are exploiting and how that edge may change over time. On the surface, it seems like you are optimizing on a moving market and expecting the fitted behavior to project itself into the future. Unless well thought out, that does not seem robust.
     
  5. tommo

    tommo

    Hmm, walk-forward optimization is considered a standard methodology in strategy development. Plain optimization is likely to lead to curve fitting, which damages robustness. But walk-forward optimization is actually considered a way to eliminate curve fitting, as you are constantly testing parameters fitted on sample data against later, unseen data.

    I was just asking for advice on how best to use it really.
     
  6. richDude

    richDude

    Generally speaking, yes.

    These are attempts to find the holy grail that doesn’t exist. What it may find is some temporary market behavior skewed away from randomness. But these skews will be discovered and arb’ed away quickly anyway.

    You need a true edge like PTJ, who can see the order flow faster than anyone else. Edges that cannot be negated by others.
     
  7. tommo

    tommo


    I partly agree and partly don't. I've been a prop trader for 11 years and had massive edges that lasted for years; sometimes I went months with barely a losing trade. But I have been discretionary, and I now need to automate more approaches. In theory I could automate what I'm doing now, thus proving system trading works (however, it is done at the order book level, which is hard to backtest). Also, the largest hedge funds in the world are made up of people analysing data and backtesting it to build models. So there definitely is an edge in systematic trading.

    But on the other hand, it's ridiculously easy to end up just curve fitting something with no value.
     
  8. Walk forward is good. But fitting using small amounts of data is crazy.

    GAT
     
  9. gaussian

    gaussian

    Why do people like @richDude come in here with this pseudo-philosophical BS instead of answering the question?


    To OP - yes, walk forward testing is a fine methodology to ensure the robustness of the system. You are looking at it too "systematically", however. The purpose of WFT is to keep you from overfitting: the algorithm is repeatedly refit and then scored only on new data it has not seen, which also guards against lookahead bias. If your retraining period is 2 weeks, you should fit for 2 weeks and test for the next 2 weeks. If you find the system is not robust across this period, abandon the system and start a new hypothesis - DO NOT adjust the hypothesis to explain the data the previous one failed on.

    The problem you are experiencing (30/10 destroying your backtest) is because you are over-optimizing. You could try to fit to a larger period and test along a similarly large period, assuming your 4 year/2 year test results are correct.
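
    One way to compare a short and a long fitting period side by side is to run the same walk-forward loop with both. The sketch below is a toy under stated assumptions: the momentum rule in `strategy_pnl`, the lookback grid and the synthetic Gaussian returns are all made up for illustration and are not anyone's actual system. On random data like this there is no edge to find, so it only shows the mechanics.

```python
import random


def strategy_pnl(returns, lookback, start, end):
    """PnL over bars [start, end) of a toy momentum rule.

    Long if the sum of the previous `lookback` returns is positive, short
    otherwise. Bars without enough history for the signal are skipped.
    """
    pnl = 0.0
    for t in range(max(start, lookback), end):
        trailing = sum(returns[t - lookback:t])
        position = 1 if trailing > 0 else -1
        pnl += position * returns[t]
    return pnl


def walk_forward(returns, lookbacks, train_len, test_len):
    """Refit the lookback on each training window, score it on the next test window."""
    results = []
    start = 0
    while start + train_len + test_len <= len(returns):
        train_end = start + train_len
        test_end = train_end + test_len
        # "Optimization": pick the lookback with the best in-sample PnL.
        best = max(lookbacks, key=lambda lb: strategy_pnl(returns, lb, start, train_end))
        # Score that choice strictly on the following, unseen bars.
        results.append((best, strategy_pnl(returns, best, train_end, test_end)))
        start += test_len
    return results


random.seed(0)
returns = [random.gauss(0.0, 0.01) for _ in range(500)]  # synthetic data
lookbacks = [2, 5, 10, 20]

short_windows = walk_forward(returns, lookbacks, train_len=30, test_len=10)
long_windows = walk_forward(returns, lookbacks, train_len=250, test_len=125)

for name, res in (("30/10", short_windows), ("250/125", long_windows)):
    total = sum(oos for _, oos in res)
    print(f"{name}: {len(res)} refits, stitched out-of-sample PnL {total:.4f}")
```

    With real data, how often the chosen lookback flips between refits is itself informative: frequent flipping on short windows suggests the optimizer is chasing noise.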

    Backtesting is more art than science in a lot of ways. It borrows from statistical robustness testing but has its own spin on things. You will also want to include ruin analysis and/or Monte Carlo testing in your backtest routine. Most importantly, remember everything is flexible. Did your backtest really "fail"? Did you define your failure parameters before running the test? Can you explain the failure with an anomaly in the data that is getting captured with a quicker fitting period and not decaying out fast enough? A backtest "failing" usually gives you useful information about why it failed - and you may only need to try a larger fitting period to smooth out variation rather than throwing your entire hypothesis in the trash. For example, if you fit a system on 2007-2009 alone, your system going forward would be deeply pessimistic. However, if you fit from 1984-2009, your system would (hand-waving here) likely have less exposure to crazy temporal variation.
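
    As a rough illustration of the ruin analysis / Monte Carlo idea, one common approach is to resample the backtest's per-trade PnLs with replacement and count how often the equity curve hits a ruin level. This is a sketch, not a definitive method: `risk_of_ruin`, the trade list and the account numbers are hypothetical, and it assumes trades are independent, which real trades often are not.

```python
import random


def risk_of_ruin(trade_pnls, start_equity, ruin_level, n_paths=2000, seed=0):
    """Estimate the fraction of resampled equity paths that touch ruin_level.

    Each path draws len(trade_pnls) trades with replacement from the
    observed trades and stops early once equity falls to ruin_level or below.
    """
    rng = random.Random(seed)
    ruined = 0
    for _ in range(n_paths):
        equity = start_equity
        for _ in range(len(trade_pnls)):
            equity += rng.choice(trade_pnls)
            if equity <= ruin_level:
                ruined += 1
                break
    return ruined / n_paths


# Made-up per-trade PnLs standing in for a backtest's trade list.
trade_pnls = [120, -80, 60, -100, 150, -90, 40, -70]
p = risk_of_ruin(trade_pnls, start_equity=10_000, ruin_level=9_000)
print(f"estimated probability of hitting the ruin level: {p:.3f}")
```

    Fixing the ruin level and the acceptable probability before running the test is one concrete way to define failure parameters up front.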
     
  10. tommo

    tommo


    Hi,

    Thanks for that, good response. Yeah I tested from 2014-2017 to build the system (averaging about 1 trade every other day so lots of samples) and then put the exact same parameters on the same market from 2017 to present and results were actually slightly better.

    I wanted to use Optimized Walkforward Testing (OWT) to keep doing mini walk-forwards to get more confidence, but I think optimizing over just a 4 to 6 week period and then running live for 2 weeks is probably optimizing too frequently.
     
    #10     Jan 5, 2020