How do you avoid overfitting or over-optimization in your backtest?

Discussion in 'Strategy Building' started by mizhael, Feb 24, 2010.

  1. Sisko7


    Only with adequate testing.
     
    #11     Feb 25, 2010
  2. I'd like to understand your point but I can't tell if you're saying that optimization is necessary and unavoidable or if system development is impossible. (i.e. all systems have to have at least one parameter)
     
    #12     Feb 25, 2010
  3. How?

    If you split the historical data into two parts, train the system on one part and test it on the other, then the test data may not be representative of the training data...
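One common way to soften the single-split problem raised here is to roll the split forward through time, so the system is tested on several different out-of-sample periods instead of one. The sketch below is only an illustration of that idea; the function name `walk_forward_splits` and the window sizes are assumptions, not something anyone in this thread proposed.

```python
def walk_forward_splits(n_bars, train_size, test_size):
    """Yield (train, test) index ranges that roll forward in time.

    Each test range starts immediately after its train range, so the
    test data is always later than the data used for fitting.
    """
    start = 0
    while start + train_size + test_size <= n_bars:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        start += test_size  # slide the whole window forward by one test block


# Hypothetical sizes: 1000 bars, fit on 500, test on the next 100.
splits = list(walk_forward_splits(n_bars=1000, train_size=500, test_size=100))
for train, test in splits:
    print(f"train {train.start}-{train.stop - 1}  test {test.start}-{test.stop - 1}")
```

If results differ wildly from one test block to the next, that is itself evidence that a single train/test split would have been misleading.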
     
    #13     Feb 25, 2010
  4. Good questions. My answers are as follows:

    (1) All systems that rely on some function of price (not price itself) that involves one or more variables are essentially optimized with respect to some arbitrary objective function. That does not automatically make optimization a bad thing, but experience shows that it is.

    (2) Not all systems have to have at least one parameter. Example: close > open. This silly system does not have any parameters. The same holds for a large class of similar systems that are based solely on price data and have no functions of price. The problem here is how you choose the systems, which in theory introduces selection bias. Selection bias does not automatically translate into bad practice, but experience has shown it can be a problem, though a much smaller one than optimization.

    Optimization is potentially orders of magnitude more damaging than selection bias. How many orders depends on the number of parameters involved.
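The point about parameter count can be illustrated with a small experiment on pure noise. The sketch below is an assumption-laden toy, not anyone's actual system: it builds a seeded random walk (so no real edge exists), then picks the best SMA crossover in-sample from a one-point grid and from a 72-point grid. All names (`sma`, `crossover_pnl`, `optimize`) and grid values are hypothetical.

```python
import random


def sma(prices, n, i):
    """Simple moving average of the n prices ending at index i (inclusive)."""
    return sum(prices[i - n + 1:i + 1]) / n


def crossover_pnl(prices, fast, slow, start, end):
    """Sum of next-bar price changes while SMA(fast) > SMA(slow), bars [start, end)."""
    pnl = 0.0
    for i in range(max(start, slow - 1), end - 1):
        if sma(prices, fast, i) > sma(prices, slow, i):
            pnl += prices[i + 1] - prices[i]
    return pnl


def optimize(prices, fasts, slows, split):
    """Pick the (fast, slow) pair with the best in-sample PnL, then score it out-of-sample."""
    grid = [(f, s) for f in fasts for s in slows if f < s]
    best = max(grid, key=lambda p: crossover_pnl(prices, p[0], p[1], 0, split))
    in_pnl = crossover_pnl(prices, best[0], best[1], 0, split)
    out_pnl = crossover_pnl(prices, best[0], best[1], split, len(prices))
    return best, len(grid), in_pnl, out_pnl


rng = random.Random(0)
prices = [100.0]
for _ in range(999):
    prices.append(prices[-1] + rng.gauss(0, 1))  # pure noise: there is no edge to find

small = optimize(prices, [10], [20], split=500)
large = optimize(prices, range(2, 20, 2), range(20, 100, 10), split=500)
for best, size, in_pnl, out_pnl in (small, large):
    print(f"grid={size:3d} best={best} in-sample={in_pnl:8.2f} out-of-sample={out_pnl:8.2f}")
```

Because the larger grid contains the (10, 20) pair, its best in-sample result can never be worse than the single-pair result, even though the data has no structure at all. Whatever improvement shows up in-sample is fit to noise, which is the damage being described.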

    Furthermore, nothing said above implies that system development is impossible.
     
    #14     Feb 25, 2010
  5. Maybe I'm missing something as your logic seems to be circular. C>O is a rule, just like EMA(10)>EMA(20) is a rule.

    Rules are either statistically significant or they are not...

    The fitting problem isn't a result of optimization, it's a result of not developing fundamentally sound and robust rules, i.e. the trading concept is critical.
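For readers wondering what "statistically significant" could mean for a rule like C>O, one simple check is an exact one-sided binomial test on the win count. This is a minimal sketch under strong assumptions: trades are treated as independent coin flips with a 50% win chance under the null, which real trades often are not. The function name `binomial_p_value` and the win counts are made up for illustration.

```python
from math import comb


def binomial_p_value(wins, trades, p_null=0.5):
    """One-sided exact p-value: P(X >= wins) if each trade wins with probability p_null."""
    return sum(
        comb(trades, k) * p_null**k * (1 - p_null) ** (trades - k)
        for k in range(wins, trades + 1)
    )


# Hypothetical results for two rules, each traded 100 times.
for wins in (55, 60):
    print(f"{wins}/100 wins: p = {binomial_p_value(wins, 100):.4f}")
```

A test like this says nothing about whether the trading concept is sound; it only measures how surprising the win count would be if the rule had no edge.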
     
    #15     Feb 25, 2010
  6. Dacamic (Guest)

    Not only is C>O a rule, it also has parameters defining which bar's close and open prices to use (in this case, most likely from the current bar).
     
    #16     Feb 25, 2010
  7. I have told you before... whatever you say... I didn't talk about rules. Where did you see the word "rule"? What is a rule anyway?

    You have never in the past presented a sound argument against what I am saying other than the irrelevant claim that my logic is circular based on you assigning the common label "rule" to the examples I gave.

    Well, these examples refer to totally different trading systems. By asserting that they are both rules, it is your logic that is circular, not mine. It is like saying that 2 and 3 are the same because they are both numbers. It is a trivial argument that you are making based on semantics rather than intrinsic qualities.
     
    #17     Feb 25, 2010
  8. intradaybill,

    I've followed some of your arguments regarding curve fitting with rules.
    Although I follow your reasoning, I think if you view it from a classification perspective, you should see that you are always curve fitting. For instance, not only is C>O a curve fit (C-O>0), but there is an implicit parameter, which is the value zero. However you arrived at the strategy C>O, there could have been an infinite number of alternative strategies, i.e. C-O>x, where x is any real number. Who's to say that C-O>0 was the most robust strategy? C-O>2% might be far more robust. Perhaps even a non-linear relationship would be more robust. Would you agree that there is a bias other than selection present here?
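To make the implicit parameter visible, the family C-O>x can be swept over several values of x and compared side by side. The sketch below assumes bars are `(open, close)` tuples, reads "2%" as a move relative to the open, and scores each threshold by the mean next-bar return after a signal. The bar values and the names `signal_returns` and `sweep_thresholds` are invented for illustration.

```python
def signal_returns(bars, x):
    """Next-bar open-to-close returns following bars where (close - open) / open > x."""
    out = []
    for (o, c), (o_next, c_next) in zip(bars, bars[1:]):
        if (c - o) / o > x:
            out.append((c_next - o_next) / o_next)
    return out


def sweep_thresholds(bars, thresholds):
    """Map each threshold x to (number of signals, mean next-bar return)."""
    results = {}
    for x in thresholds:
        r = signal_returns(bars, x)
        results[x] = (len(r), sum(r) / len(r) if r else float("nan"))
    return results


# Hypothetical (open, close) bars, oldest first.
bars = [(100, 101), (101, 104), (104, 103), (103, 103.5), (103.5, 106)]
for x, (n, mean_ret) in sweep_thresholds(bars, [0.0, 0.01, 0.02]).items():
    print(f"x={x:.2f}  signals={n}  mean next-bar return={mean_ret:.4f}")
```

Seen this way, x = 0 is just one point on the sweep. If performance changes sharply between neighbouring values of x, the rule is fragile whichever value was picked.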
     
    #18     Feb 25, 2010
  9. A rule is any condition.

    "If Close > Open then ..." is a rule.

    "If F(Close) > F(Open) then ... " where F is some function is also a rule.

    "If Mars is red = true then ..." is a rule.

    I'm not arguing with you, I'm presenting a thought process that doesn't distinguish between the types of rules/parameters/variables used (whatever you choose to call it).

    Rules are what one alters when performing an optimization. All rules have parameters and all rules can be weighted (strengthened or removed entirely). The issue here is the validity of the core/initial rule set and I'll say it again: is the fundamental concept one seeks to trade based on a proven market dynamic?

    I don't see how you make the assertion that these examples refer to totally different trading systems and your example does not make your point.

    If we posit that a trading system attempts to quantify a market behavior by whatever means necessary then "If C>O AND EMA(10)>EMA(20) THEN etc" is no different structurally or logically from "If Mars is red AND I'm sad THEN etc". Again, the "whatever means necessary" is subjective and often fundamentally flawed but it does not make a distinction between price and its derivative, i.e. a rule or a parameter is no less a rule or parameter because it is optimizable or a function of price.
     
    #19     Feb 25, 2010
  10. dloyer


    The way I see it, curve fitting is good.

    OVER fitting is bad.

    The problem is that there is no fine line between them, but a big, wide, gray fuzzy line.

    Any testing of rules or parameters will result in some degree of selection bias. Selection bias cannot be avoided, only managed, just like risk must be managed.
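One rough way to put a number on that selection bias: if you test N rules that have no edge, each at significance level alpha, the chance that at least one passes by luck is 1 - (1 - alpha)^N. The sketch below assumes the tests are independent, which tests run on the same price history usually are not, so treat it as a back-of-the-envelope guide only. The function names are made up for illustration; the Bonferroni correction is a standard, conservative way to compensate.

```python
def chance_of_false_positive(n_rules, alpha=0.05):
    """Probability that at least one of n_rules no-edge rules passes at level alpha."""
    return 1 - (1 - alpha) ** n_rules


def bonferroni_alpha(n_rules, alpha=0.05):
    """Per-rule significance level that keeps the overall false-positive chance near alpha."""
    return alpha / n_rules


for n in (1, 14, 100):
    print(
        f"{n:3d} rules tested: chance of a lucky pass = {chance_of_false_positive(n):.3f}, "
        f"Bonferroni per-rule alpha = {bonferroni_alpha(n):.5f}"
    )
```

Keeping a count of everything you tried, not just the rule you kept, is what makes this kind of adjustment possible.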
     
    #20     Feb 25, 2010