Backtesting & curve fitting

eusdaiki · Apr 25, 2012

Hi,

I started learning to trade with algorithms a few months ago. Since then I've been reading a lot, and I've come to understand that curve fitting is not a good idea, and that back-testing results that get too close to perfect on a particular symbol or time range are most likely a sign of curve fitting.

So I'm wondering, what's considered a good back-testing result? Is it better to have a system that scores consistently across many symbols, with a decent percentage of profitable trades and a healthy difference between winning and losing trades, than one that scores very high on only a few symbols?

jcl · Apr 26, 2012

Curve fitting is not the problem, because that's the very purpose of a strategy optimization. The problem is _overfitting_ - meaning that the strategy parameters are not adapted to the general characteristics of the price curve, but to curve details that are not repeated in future trading.

Therefore, backtest results are meaningless if you don't know to which degree the optimization has overfit the parameters. Using few parameters reduces the danger of overfitting, but does not eliminate it.

The usual way to find out if the strategy is overfit or not is an out-of-sample test, especially with the WFO method.
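A minimal sketch of such an out-of-sample test, in Python. Everything in it is an assumption made for illustration: the synthetic random-walk prices, the toy moving-average rule, the 70/30 split, and the names `sma_rule_return` and `best_lookback`. It is not a recommended strategy, only the mechanics of fitting on one part of the data and checking on another.

```python
import random

def sma_rule_return(prices, lookback):
    """Sum of one-bar returns of a toy rule: long for one bar
    whenever the price is above its simple moving average."""
    total = 0.0
    for t in range(lookback, len(prices) - 1):
        sma = sum(prices[t - lookback:t]) / lookback
        if prices[t] > sma:
            total += prices[t + 1] / prices[t] - 1.0
    return total

def best_lookback(prices, candidates):
    """Pick the lookback with the highest return on the given data."""
    return max(candidates, key=lambda n: sma_rule_return(prices, n))

# Synthetic prices (assumption): a random walk with 1% daily noise.
random.seed(1)
prices = [100.0]
for _ in range(999):
    prices.append(prices[-1] * (1.0 + random.gauss(0.0, 0.01)))

# Optimize on the first 70% only, then check on the untouched 30%.
split = int(len(prices) * 0.7)
in_sample, out_of_sample = prices[:split], prices[split:]
candidates = range(5, 55, 5)

chosen = best_lookback(in_sample, candidates)
is_ret = sma_rule_return(in_sample, chosen)
oos_ret = sma_rule_return(out_of_sample, chosen)
print(f"lookback={chosen} in-sample={is_ret:.4f} out-of-sample={oos_ret:.4f}")
```

A large gap between the in-sample and out-of-sample figures is the usual hint that the parameter was fit to details of the first segment rather than to anything that persists.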

eusdaiki · Apr 27, 2012

Quote from jcl:

Curve fitting is not the problem, because that's the very purpose of a strategy optimization. The problem is _overfitting_ - meaning that the strategy parameters are not adapted to the general characteristics of the price curve, but to curve details that are not repeated in future trading.

Therefore, backtest results are meaningless if you don't know to which degree the optimization has overfit the parameters. Using few parameters reduces the danger of overfitting, but does not eliminate it.

The usual way to find out if the strategy is overfit or not is an out-of-sample test, especially with the WFO method.

Thank you for your reply, it clarifies my previous misconception.

Regarding WFO, I did some reading on it, and it is basically re-optimizing continuously at discrete time increments. E.g. optimizing every hour so that the previous hour's data is included, then running with those parameters for the next hour... once the hour is over, rinse and repeat?

alexvnew · Apr 28, 2012

Quote from jcl:

Curve fitting is not the problem - The problem is _overfitting_ -

Like drinking and driving is not the problem - the problem is drinking too much and driving. Right?

jcl · Apr 29, 2012

Quote from eusdaiki:

Regarding WFO, I did some reading on it, and it is basically re-optimizing continuously at discrete time increments. E.g. optimizing every hour so that the previous hour's data is included, then running with those parameters for the next hour... once the hour is over, rinse and repeat?

That's the basic idea, but normally you re-optimize not every hour, but every 2 months or so. The main advantage of WFO is that the optimization method itself is included in the test: every period is traded with parameters that were fitted only to earlier data. This produces test results that are closer to what can be achieved in real trading.
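The rolling scheme can be sketched roughly as follows in Python. The window lengths (250 bars to optimize, 50 bars to trade), the synthetic prices, the toy moving-average rule, and the names `rule_return` and `walk_forward` are all assumptions for illustration, not any particular platform's WFO implementation.

```python
import random

def rule_return(prices, lookback):
    """Sum of one-bar returns of a toy rule: long for one bar
    whenever the price is above its simple moving average."""
    total = 0.0
    for t in range(lookback, len(prices) - 1):
        sma = sum(prices[t - lookback:t]) / lookback
        if prices[t] > sma:
            total += prices[t + 1] / prices[t] - 1.0
    return total

def walk_forward(prices, candidates, train_len, test_len):
    """Re-optimize on each training window, then trade the next
    test window with the parameter found. Returns a list of
    (train_end, chosen_lookback, out_of_sample_return)."""
    results = []
    start = 0
    while start + train_len + test_len <= len(prices):
        train_end = start + train_len
        test_end = train_end + test_len
        train = prices[start:train_end]
        best = max(candidates, key=lambda n: rule_return(train, n))
        # The test slice starts `best` bars early only to warm up the
        # moving average; the first trade is entered at bar train_end.
        oos = rule_return(prices[train_end - best:test_end], best)
        results.append((train_end, best, oos))
        start += test_len  # slide both windows forward
    return results

# Synthetic prices (assumption): a random walk with 1% noise per bar.
random.seed(2)
prices = [100.0]
for _ in range(999):
    prices.append(prices[-1] * (1.0 + random.gauss(0.0, 0.01)))

candidates = range(5, 55, 5)
results = walk_forward(prices, candidates, train_len=250, test_len=50)
total_oos = sum(r[2] for r in results)
print(f"{len(results)} windows, summed out-of-sample return {total_oos:.4f}")
```

Only the out-of-sample pieces are added up, so the reported figure never includes a bar that the optimizer had already seen when it picked the parameter for that window.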

jcl · Apr 29, 2012

Quote from FringeAlgo:

curve fitting is bad bad bad in any amount

Well, you're right and wrong. Curve fitting is just a mathematical procedure that is neither bad nor good -

http://en.wikipedia.org/wiki/Curve_fitting

but it can have bad or good consequences, depending on which properties of the price curve your parameters are fit to. This is a result of your choice of strategy parameters and of your optimization method.

Every strategy optimization constitutes a curve fit to some degree.
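A rough way to see this on data that has nothing to fit, sketched in Python. The random-walk generator, the toy moving-average rule, and the two candidate sets are assumptions chosen for illustration. Because the prices are pure noise, whatever the best lookback earns in-sample is fitted to noise by construction, and a wider search can only make that figure look better.

```python
import random

def rule_return(prices, lookback):
    """Sum of one-bar returns of a toy rule: long for one bar
    whenever the price is above its simple moving average."""
    total = 0.0
    for t in range(lookback, len(prices) - 1):
        sma = sum(prices[t - lookback:t]) / lookback
        if prices[t] > sma:
            total += prices[t + 1] / prices[t] - 1.0
    return total

def random_walk(n, rng):
    """n bars of driftless noise: no exploitable structure."""
    prices = [100.0]
    for _ in range(n - 1):
        prices.append(prices[-1] * (1.0 + rng.gauss(0.0, 0.01)))
    return prices

rng = random.Random(7)
few = [10, 20, 30]            # a narrow search
many = list(range(2, 61))     # a wide search that contains the narrow one
trials = 30

sum_few = 0.0
sum_many = 0.0
for _ in range(trials):
    prices = random_walk(250, rng)
    sum_few += max(rule_return(prices, n) for n in few)
    sum_many += max(rule_return(prices, n) for n in many)

avg_best_few = sum_few / trials
avg_best_many = sum_many / trials
print(f"average best in-sample return, 3 candidates:  {avg_best_few:.4f}")
print(f"average best in-sample return, 59 candidates: {avg_best_many:.4f}")
```

The second figure can never be lower than the first, since the wide search includes every candidate of the narrow one. The difference between the two is produced by the search alone, not by any property of the price curve.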

Albert Cibiades · Apr 29, 2012

Take a step backward. The worst problem is getting a positive expectation on a random result before you even start to optimize.

eusdaiki · Apr 30, 2012

@FringeAlgo & alexvnew

What approach to backtesting/optimization do you use to avoid curve fitting altogether?

Or, back to my original question:

What constitutes a good backtest?
One that scores a high Sharpe ratio across multiple symbols?

logic_man · Apr 30, 2012

Quote from jcl:

Well, you're right and wrong. Curve fitting is just a mathematical procedure that is neither bad nor good -

http://en.wikipedia.org/wiki/Curve_fitting

but it can have bad or good consequences, depending on which properties of the price curve your parameters are fit to. This is a result of your choice of strategy parameters and of your optimization method.

Every strategy optimization constitutes a curve fit to some degree.

One of the things I keep thinking about is the "why" of the improvements in profitability that an optimization brings. Some optimizations seem almost completely arbitrary, such as finding the "best" MA crossover, whereas others seem to at least bolster some real-world hypothesis, e.g. optimizing on a minimum amount of volume to generate a signal. There, the underlying "story" is that increases in volume indicate likely new buyers coming in who will support price going forward. I don't think anyone can explain (other than by reference to the "self-fulfilling prophecy" rationale) why one MA crossover would be better than another.

When you work with the latter kind of parameters, I think you end up in a better place at the end of your optimization and with something more likely to work over the long term. There are two primary parameters in my model and I can explain in words why I think both are important and why the optimizations are reflective of something intuitive as opposed to something arbitrary.

Of course, I could be way off, but this distinction seems to make sense.

dom993 · Apr 30, 2012

Quote from logic_man:

One of the things I keep thinking about is the "why" of the improvements in profitability that an optimization brings. Some optimizations seem almost completely arbitrary, such as finding the "best" MA crossover, whereas others seem to at least bolster some real-world hypothesis, e.g. optimizing on a minimum amount of volume to generate a signal. There, the underlying "story" is that increases in volume indicate likely new buyers coming in who will support price going forward. I don't think anyone can explain (other than by reference to the "self-fulfilling prophecy" rationale) why one MA crossover would be better than another.

When you work with the latter kind of parameters, I think you end up in a better place at the end of your optimization and with something more likely to work over the long term. There are two primary parameters in my model and I can explain in words why I think both are important and why the optimizations are reflective of something intuitive as opposed to something arbitrary.

Of course, I could be way off, but this distinction seems to make sense.

+1