Is Walk-Forward (out of sample) testing simply an illusion?

Discussion in 'Strategy Development' started by pursuit, Oct 17, 2017.

  1. tommcginnis


    Not only has no one written that the domain across sub-segments must be the same, but I have repeatedly referenced its variability as a relevant factor.

    That said, great exhibits (labeling aside).
    #31     Oct 23, 2017
  2. Simples


    Simple models, and simple changes to them, can yield vastly different results. One tool that may help: how many reasons do you have to believe your solutions are not overfit? It doesn't matter what the reasons are, but how you establish them matters greatly. These reasons may even be superior to out-of-sample and forward testing, because if they're right, the solutions should work regardless of those tests, though the tests could still act as a tool for model validation.

    Complex models, on the other hand, may be overfit already, simply because of how they became so complex in the first place (in order to fit the data, perhaps?). They're often characterized by a lack of robustness and fickle dependencies (e.g. bad data quality).

    It's a mindbender and topic of exploration that may take lifetimes.
    #32     Oct 23, 2017
  3. userque


    Lol...yeah...coulda did a much better job with just a little bit more effort.
    #33     Oct 23, 2017
  4. Macca1


    I'm here for trading related entertainment.

    -Segment 1 (70% of data) turns out to be based on a strong bull market,
    -Segment 2 (30% of data) turns out to be based on a rapid decline.
    -Segment 3 (100% of data)

    *we have a long only strategy
    *we are blinded and have no idea what the data in segment 2 looks like

    A) If we tested strategies based only on segment 1, then the equity curves could significantly underperform on segment 2, making the strategies no longer viable. If some still performed as expected (even after a regime change), then we know what to investigate further.

    B) If we were unblinded and tested strategies across all the data, 1+2 (Segment 3), our strategy design could have already compensated for the decline seen in segment 2 (in fact, we might have decided that a long-only strategy was no longer a viable option). Either way, we have opened ourselves up to curve fitting, or at least increased its likelihood.

    When Segment 1 has vastly different characteristics from Segment 2, the strategies we arrive at in (B) are going to be different from the strategies we arrive at in (A). Even though the strategies that performed well in (A) will still perform the same in (B), they could easily get overlooked in favor of better-performing strategies derived only from (B). Therefore we will not arrive at the same choice of strategies in both cases.
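
    A minimal sketch of the (A) vs (B) hypothetical, on made-up data. Everything here is an assumption for illustration: the constant daily returns per regime, the 700/300 day split, and the four toy long-only candidates (buy-and-hold plus three moving-average filters). It only shows that ranking on segment 1 and ranking on the whole sample can pick different strategies.

```python
# Hypothetical, deterministic toy data: not real market returns.
N_SEG1, N_SEG2 = 700, 300          # 70% / 30% split from the hypothetical
UP, DOWN = 0.001, -0.003           # assumed constant daily return in each regime

returns = [UP] * N_SEG1 + [DOWN] * N_SEG2
total = N_SEG1 + N_SEG2

def prices_from_returns(rets):
    """prices[t] is the price known before rets[t] is earned."""
    price, prices = 1.0, [1.0]
    for r in rets:
        price *= 1.0 + r
        prices.append(price)
    return prices

def buy_and_hold(rets):
    return [1] * len(rets)

def ma_filter(lookback):
    """Long-only: hold when price is above the mean of the prior `lookback` prices."""
    def positions(rets):
        prices = prices_from_returns(rets)
        pos = []
        for t in range(len(rets)):
            if t < lookback:
                pos.append(0)  # warm-up: stay flat until there is enough history
            else:
                window = prices[t - lookback:t]
                pos.append(1 if prices[t] > sum(window) / lookback else 0)
        return pos
    return positions

strategies = {"buy_and_hold": buy_and_hold}
for n in (5, 20, 100):
    strategies[f"ma_{n}"] = ma_filter(n)

def pnl(name, start, end):
    """Sum of simple returns earned on days start..end-1.

    Positions on day t only use prices up to day t, so slicing the
    full-history positions does not leak segment 2 into segment 1.
    """
    pos = strategies[name](returns)
    return sum(p * r for p, r in zip(pos[start:end], returns[start:end]))

best_a = max(strategies, key=lambda s: pnl(s, 0, N_SEG1))  # (A) blinded: rank on segment 1
best_b = max(strategies, key=lambda s: pnl(s, 0, total))   # (B) unblinded: rank on segment 3

print("(A) pick:", best_a, "| segment 2 P&L:", round(pnl(best_a, N_SEG1, total), 3))
print("(B) pick:", best_b, "| segment 2 P&L:", round(pnl(best_b, N_SEG1, total), 3))
```

    On this toy data, buy-and-hold ranks first on segment 1 and then loses money on segment 2, while ranking on the whole sample picks the shortest moving-average filter instead. With noisy real data the gap would be far less clean.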
    Last edited: Oct 23, 2017
    #34     Oct 23, 2017
  5. userque


    You are missing that the x-axis values are different in segment 1 vs segment 2.
    Hypothetically, the best strategy for segment 1 can also be the best strategy for segment 2.

    #35     Oct 24, 2017
  6. sle


    As a matter of fact, if
    - a strategy is built on a good prior hypothesis
    - the effect has good statistical significance
    - and the number of free parameters is low (preferably none)
    it's a perfectly OK thing to do. In fact, you would be better served building a collection of simple strategies this way vs going in circles optimizing something complex.
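
    One rough way to put a number on "good statistical significance" is a t-statistic on per-trade returns. The sketch below is only that: the function name and the sample numbers are made up, and it assumes trades are independent and roughly normal, which real trade returns often are not.

```python
import math
import statistics

def t_stat(trade_returns):
    """t-statistic of the mean return against zero (needs at least 2 trades)."""
    n = len(trade_returns)
    mean = statistics.mean(trade_returns)
    stdev = statistics.stdev(trade_returns)  # sample standard deviation
    return mean / (stdev / math.sqrt(n))

# Hypothetical per-trade returns in percent, purely for illustration.
example = [1.0, 2.0, 3.0]
print(round(t_stat(example), 3))
```

    A strategy with no free parameters only has to clear this bar once. Every extra parameter tried is another draw, so the same t-statistic means less.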
    #36     Oct 24, 2017
  7. userque


    Oh, of course. I agree. I was merely pointing out that that's not the conclusion that can be drawn from this particular hypothetical.
    #37     Oct 24, 2017
  8. Macca1


    What are you talking about? Hypothetically, sure, the best strategy for segment 1 can also be the best strategy for segment 2. However, it may also not be the best strategy.
    #38     Oct 24, 2017
  9. userque


    I wanted to expound, but had to stop my analysis of your post. (See below).

    I know right.


    This is not what the OP says. The OP says that we pick one of the available strategies that does well in segment 2 as well as segment 1. So I guess I must stop here since your hypothetical requires something different.

    #39     Oct 24, 2017
  10. pursuit


    Optimizing on seg1 and then picking only strats that look pretty on seg2 will result in a similar selection of strats as optimizing on the whole seg3. If we are testing a non-optimized strat - same thing. We end up with a similar selection regardless of whether we explicitly optimize some parameters or not. By selecting only pretty equity curves we are "optimizing".

    It's really not that hard to understand (or I guess it is for some people judging from some of the replies on the thread). The out of sample thing is a fallacy and great for marketing, especially to retail traders.

    It proves nothing and does nothing to increase the likelihood of success live. Other tests of robustness must be implemented.
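
    A minimal simulation sketch of this argument, under a strong assumption: every candidate strategy is pure noise with zero true edge. The counts, segment lengths, volatility, and the "top 10% then positive on segment 2" rule are all made up for illustration. It compares picking on seg1 and filtering on seg2 against ranking on the whole seg3, then looks at a third, unseen "live" stretch. It says nothing about candidates that do have a real edge.

```python
import random

rng = random.Random(42)            # fixed seed so the run is repeatable
N_STRATS = 1000                    # hypothetical number of candidate strategies
SEG1, SEG2, LIVE = 350, 150, 250   # in-sample, out-of-sample, unseen "live" days

def segment_pnl(n_days):
    # Zero-mean daily returns: the candidate has no true edge by construction.
    return sum(rng.gauss(0.0, 0.01) for _ in range(n_days))

candidates = [
    {"seg1": segment_pnl(SEG1), "seg2": segment_pnl(SEG2), "live": segment_pnl(LIVE)}
    for _ in range(N_STRATS)
]

# Walk-forward style: top 10% on segment 1, then keep only those positive on segment 2.
by_seg1 = sorted(range(N_STRATS), key=lambda i: candidates[i]["seg1"], reverse=True)
walk_forward = {i for i in by_seg1[: N_STRATS // 10] if candidates[i]["seg2"] > 0}

# Whole-sample style: the same number of picks, ranked on segment 1 + segment 2.
by_seg3 = sorted(
    range(N_STRATS),
    key=lambda i: candidates[i]["seg1"] + candidates[i]["seg2"],
    reverse=True,
)
whole_sample = set(by_seg3[: len(walk_forward)])

def mean_live(picks):
    return sum(candidates[i]["live"] for i in picks) / len(picks)

# Jaccard overlap: shared picks divided by all picks made by either method.
overlap = len(walk_forward & whole_sample) / len(walk_forward | whole_sample)

print("walk-forward picks:", len(walk_forward))
print("overlap (Jaccard):", round(overlap, 2))
print("mean live P&L, walk-forward:", round(mean_live(walk_forward), 4))
print("mean live P&L, whole-sample:", round(mean_live(whole_sample), 4))
```

    How much the two selections overlap is printed rather than assumed. Since no candidate has an edge, the live averages for both selections should hover around zero however pretty the seg1 and seg2 curves looked.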
    Last edited: Oct 28, 2017
    #40     Oct 28, 2017