How do you avoid overfitting or over-optimization in your backtest?

Discussion in 'Strategy Building' started by mizhael, Feb 24, 2010.

Sisko7
- 74
  Posts
- 0
  Likes
Only with adequate testing.

#11 Feb 25, 2010

stevegee58
- 3,580
  Posts
- 474
  Likes
Quote from intradaybill:

Well, the point is that regardless of the choice of parameters, such a system is always optimized. You can always find an objective function that any such system maximizes. The lesson is that if you have any parameters at all in your system, it is optimized, whether you actually optimized it or not.

I tell you, very few understand what I wrote above but I know a few do.

I'd like to understand your point but I can't tell if you're saying that optimization is necessary and unavoidable or if system development is impossible. (i.e. all systems have to have at least one parameter)

#12 Feb 25, 2010

mizhael
- 1,386
  Posts
- 0
  Likes
Quote from Sisko7:

Only with adequate testing.

How?

If you split the historical data into two parts, train the system on one part and test it on the other, then the test data may not be representative of the training data...
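One common response to this concern is to use several rolling train/test windows instead of a single split, so that no one test period decides the outcome. The following is only a minimal sketch of that idea; the function name `walk_forward_splits` and the window sizes are assumptions made for illustration, not something proposed in this thread.

```python
from typing import Iterator, Tuple


def walk_forward_splits(
    n_bars: int, train_size: int, test_size: int
) -> Iterator[Tuple[range, range]]:
    """Yield (train, test) index ranges that roll forward through the data."""
    start = 0
    while start + train_size + test_size <= n_bars:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        # Advance by one test window so the test sets never overlap.
        start += test_size


# Hypothetical sizes: 1000 bars, 500 for training, 100 for each test window.
splits = list(walk_forward_splits(n_bars=1000, train_size=500, test_size=100))
for train, test in splits:
    print(f"train {train.start}-{train.stop - 1}, test {test.start}-{test.stop - 1}")
```

Each test window always lies after its training window, and results would be judged across all windows rather than on one. This does not remove the representativeness problem; it only spreads it over more periods.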

#13 Feb 25, 2010

intradaybill
- 2,961
  Posts
- 11
  Likes
Quote from stevegee58:

I'd like to understand your point but I can't tell if you're saying that optimization is necessary and unavoidable or if system development is impossible. (i.e. all systems have to have at least one parameter)

Good questions. My answers are as follows:

(1) All systems that rely on some function of price (not price itself) that involves variable(s) are essentially optimized with respect to some arbitrary objective function. This does not mean automatically that optimization is a bad thing but experience shows it is a bad thing.

(2) Not all systems have to have at least one parameter. Example: close > open. This silly system does not have any parameters. The same holds for a large class of similar systems that are based solely on price data and have no functions of price. The problem here is how you choose the systems, and this theoretically introduces selection bias. Selection bias does not automatically translate to bad practice, but experience has shown it can be a problem, though a much smaller one than optimization.

Optimization is orders of magnitude more potentially damaging than selection bias. The order depends on the number of parameters involved.

Furthermore, nothing said above implies that system development is impossible.

#14 Feb 25, 2010

Mike805
- 1,796
  Posts
- 4
  Likes
Quote from intradaybill:

Good questions. My answers are as follows:

(1) All systems that rely on some function of price (not price itself) that involves variable(s) are essentially optimized with respect to some arbitrary objective function. This does not mean automatically that optimization is a bad thing but experience shows it is a bad thing.

(2) Not all systems have to have at least one parameter. Example: close > open. This silly system does not have any parameters. The same holds for a large class of similar systems that are based solely on price data and have no functions of price. The problem here is how you choose the systems, and this theoretically introduces selection bias. Selection bias does not automatically translate to bad practice, but experience has shown it can be a problem, though a much smaller one than optimization.

Optimization is orders of magnitude more potentially damaging than selection bias. The order depends on the number of parameters involved.

Furthermore, nothing said above implies that system development is impossible.

Maybe I'm missing something as your logic seems to be circular. C>O is a rule, just like EMA(10)>EMA(20) is a rule.

Rules are either statistically significant or they are not...

The fitting problem isn't a result of optimization, it's a result of not developing fundamentally sound and robust rules, i.e. the trading concept is critical.

#15 Feb 25, 2010

Dacamic Guest
- 30
  Posts
- 0
  Likes
Quote from Mike805:

Maybe I'm missing something as your logic seems to be circular. C>O is a rule, just like EMA(10)>EMA(20) is a rule.

Not only is C>O a rule, it also has parameters defining which bar's close and open prices to use (in this case, most likely from the current bar).

#16 Feb 25, 2010

intradaybill
- 2,961
  Posts
- 11
  Likes
Quote from Mike805:

Maybe I'm missing something as your logic seems to be circular. C>O is a rule, just like EMA(10)>EMA(20) is a rule.

Rules are either statistically significant or they are not...

The fitting problem isn't a result of optimization, it's a result of not developing fundamentally sound and robust rules, i.e. the trading concept is critical.

I have told you before...whatever you say...I didn't talk about rules...where did you see the word "rule"? What is a rule anyway?

You have never in the past presented a sound argument against what I am saying other than the irrelevant claim that my logic is circular based on you assigning the common label "rule" to the examples I gave.

Well, these examples refer to totally different trading systems. By asserting that they are both rules, it is your logic that is circular, not mine. It is like saying that 2 and 3 are the same because they are both numbers. It is a trivial argument that you are making based on semantics rather than intrinsic qualities.

#17 Feb 25, 2010

dtrader98
- 1,927
  Posts
- 69
  Likes
Quote from intradaybill:

I have told you before...whatever you say...I didn't talk about rules...where did you see the word "rule"? What is a rule anyway?

You have never in the past presented a sound argument against what I am saying other than the irrelevant claim that my logic is circular based on you assigning the common label "rule" to the examples I gave.

Well, these examples refer to totally different trading systems. By asserting that they are both rules, it is your logic that is circular, not mine. It is like saying that 2 and 3 are the same because they are both numbers. It is a trivial argument that you are making based on semantics rather than intrinsic qualities.

intradaybill,

I've followed some of your arguments regarding curve fitting with rules.
Although I follow your reasoning, I think if you view it from a classification perspective, you should see that you are always curve fitting. For instance, not only is C>O a curve fit (C-O>0), but there is an implicit parameter, which is the value zero. However you arrived at the strategy C>O, there could have been an infinite number of alternative strategies, i.e. C-O>x, where x is any real number. Who's to say that C-O>0 was the most robust strategy? C-O>2% might be far more robust. Perhaps even a non-linear relationship would be more robust. Would you agree that there is a bias other than selection present here?
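To make the implicit parameter concrete, here is a rough sketch that treats the zero in C-O>0 as one value of a threshold x and evaluates a few alternatives. Everything in it is an assumption for illustration: the bars are a synthetic random walk with no real edge, the threshold is applied to the relative move (C-O)/O so that a value like 2% makes sense, and the helper names `make_bars` and `mean_next_return` are hypothetical.

```python
import random


def make_bars(n, seed=0):
    """Synthetic (open, close) pairs from a random walk; hypothetical data."""
    rng = random.Random(seed)
    bars, price = [], 100.0
    for _ in range(n):
        open_ = price
        close = open_ * (1 + rng.gauss(0, 0.01))
        bars.append((open_, close))
        price = close
    return bars


def mean_next_return(bars, x):
    """Average next-bar return after bars where (C - O) / O > x.

    Returns None if the condition never fires.
    """
    rets = []
    for (o, c), (o2, c2) in zip(bars, bars[1:]):
        if (c - o) / o > x:
            rets.append((c2 - o2) / o2)
    return sum(rets) / len(rets) if rets else None


bars = make_bars(2000)
thresholds = [0.0, 0.005, 0.01, 0.02]  # x = 0.0 is the plain C > O system
results = {x: mean_next_return(bars, x) for x in thresholds}
for x, r in results.items():
    print(f"x = {x:.3f}: mean next-bar return = {r}")
```

Whichever x looks best on one data set, picking it after the fact is itself a fit to that data, which is the bias being asked about here.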

#18 Feb 25, 2010

Mike805
- 1,796
  Posts
- 4
  Likes
Quote from intradaybill:

Example: close > open. This silly system does not have any parameters. The same holds for a large class of similar systems that are based solely on price data and have no functions of price.

A rule is any condition.

"If Close > Open then ..." is a rule.

"If F(Close) > F(Open) then ... " where F is some function is also a rule.

"If Mars is red = true then ..." is a rule.

Quote from intradaybill:

I have told you before...whatever you say...I didn't talk about rules...where did you see the word "rule"? What is a rule anyway?

You have never in the past presented a sound argument against what I am saying other than the irrelevant claim that my logic is circular based on you assigning the common label "rule" to the examples I gave.


I'm not arguing with you, I'm presenting a thought process that doesn't distinguish between the types of rules/parameters/variables used (whatever you choose to call it).

Rules are what one alters when they perform an optimization. All rules have parameters and all rules can be weighted (strengthened or removed entirely). The issue here is the validity of the core/initial rule set and I'll say it again - is the fundamental concept one seeks to trade based off a proven market dynamic?

Quote from intradaybill:
Well, these examples refer to totally different trading systems. By asserting that they are both rules, it is your logic that is circular, not mine. It is like saying that 2 and 3 are the same because they are both numbers. It is a trivial argument that you are making based on semantics rather than intrinsic qualities.

I don't see how you make the assertion that these examples refer to totally different trading systems and your example does not make your point.

If we posit that a trading system attempts to quantify a market behavior by whatever means necessary, then "If C>O AND EMA(10)>EMA(20) THEN etc" is no different structurally or logically from "If Mars is red AND I'm sad THEN etc". Again, the "whatever means necessary" is subjective and often fundamentally flawed, but it does not make a distinction between price and its derivative, i.e. a rule or a parameter is no less a rule or parameter because it is optimizable or a function of price.

#19 Feb 25, 2010

dloyer
- 107
  Posts
- 2
  Likes
The way I see it, curve fitting is good.

OVER fitting is bad.

The problem is that there is no fine line between them, but a big, wide, gray fuzzy line.

Any testing of rules or parameters will result in some degree of selection bias. Selection bias cannot be avoided, only managed, just like risk must be managed.
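A small simulation can illustrate why testing many candidates creates selection bias. In this sketch, which is an assumption-laden toy rather than anyone's actual method, every "strategy" is pure noise with zero expected return. The best one is picked by its in-sample average and then checked on a separate out-of-sample draw. The function name and all sizes are hypothetical.

```python
import random


def simulate_selection_bias(n_strategies=200, n_days=250, seed=1):
    """Pick the best of many zero-edge strategies and report how it was chosen.

    Returns (best in-sample mean, that strategy's out-of-sample mean,
    average in-sample mean across all strategies).
    """
    rng = random.Random(seed)
    in_sample, out_sample = [], []
    for _ in range(n_strategies):
        # Each strategy is noise: zero-mean daily returns, no real edge.
        ins = [rng.gauss(0, 0.01) for _ in range(n_days)]
        outs = [rng.gauss(0, 0.01) for _ in range(n_days)]
        in_sample.append(sum(ins) / n_days)
        out_sample.append(sum(outs) / n_days)
    best = max(range(n_strategies), key=lambda i: in_sample[i])
    return in_sample[best], out_sample[best], sum(in_sample) / n_strategies


best_in, best_out, avg_in = simulate_selection_bias()
print(f"best strategy, in-sample mean daily return:  {best_in:.5f}")
print(f"same strategy, out-of-sample mean:           {best_out:.5f}")
print(f"average in-sample mean over all strategies:  {avg_in:.5f}")
```

By construction the selected strategy's in-sample figure is at least as high as the average, even though none of the candidates has an edge. The more rules or parameter sets are tried, the larger that gap tends to be, which is one way to read "managed, not avoided".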

#20 Feb 25, 2010
