An example of curve fitting - How can I avoid it?

Discussion in 'Automated Trading' started by cunparis, Jan 3, 2009.

  1. I just found a good example of what happens with curve fitting.

    Before my vacation last month I optimized my strategy on data up to 12/4/2008 and got values for the indicators I use. I copied them to Excel for future reference.

    So today, one month later, I re-optimized through Jan 2 and got different values. Very different. I couldn't see how this was possible, so I re-optimized through 12/4 and got the old values again. This confirmed that I hadn't copied them wrong when I put them in Excel.

    Then I ran the backtest for 12/4/2008 - 1/2/2009 using the values obtained on 12/4 and guess what? There was an $800 loss during this time! So when I optimized through Jan 2, other parameters were determined to be optimal, which eliminated this bad trade!!

    I'm kind of stuck here because I don't know what to do. If I use the new values, it's possible to have a bad trade come up that wouldn't have been chosen had I kept the original values. If I use the original, the reverse is possible.

    My thinking is to use the new values. I think we can't know in advance which set is better, and only forward testing will tell. So my idea is to always use the values obtained from the most comprehensive data (longest timespan).

    But if one optimizes every month then won't the system be more and more curve fitted? Would that be a bad thing?

    Just for discussion, imagine a system that we can backtest for the past 10 years. It would seem that after 10 years we have the "optimal" values (let's assume that the market conditions don't change!). So it would seem that optimizing every month during the 10 years would move the system more and more towards the optimal values.

    However the market conditions change and what worked before may stop working.

    This is all very confusing to me. I'm afraid that my strategy won't work as well in the future as it did in the past. That is to be expected, I think. If it works similarly then I'll be happy. If it doesn't work at all I'll be very disappointed.

    Final question: How many trades should one have in the backtest in order to have confidence that it isn't just curve fitted? Is 100 enough?

    Thanks in advance for sharing your point of view.
     
  2. slacker

    Divide your data into two groups: a backtest sample and 'out of sample' data. Train and optimize on the backtest sample, then run the same parameters on the 'out of sample' set. The results from both sets should be almost identical; otherwise you are curve fitting the data.

    You can also randomly select blocks of data to be your 'out of sample' reality check.
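
    As a rough illustration of the split described above, here is a minimal Python sketch. Everything in it is an assumption made for the example: the random-walk prices from make_prices, the toy moving-average rule in backtest, and the 70/30 split are hypothetical stand-ins, not the original poster's strategy or any particular platform's optimizer.

```python
import random


def make_prices(n, seed=0):
    """Synthetic random-walk prices; a stand-in for real market data."""
    rng = random.Random(seed)
    prices = [100.0]
    for _ in range(n - 1):
        prices.append(prices[-1] + rng.gauss(0, 1))
    return prices


def backtest(prices, lookback):
    """Toy rule (hypothetical): hold one unit long over the next bar whenever
    the current price is above the average of the previous `lookback` bars.
    Returns total P&L in price points."""
    pnl = 0.0
    for i in range(lookback, len(prices) - 1):
        avg = sum(prices[i - lookback:i]) / lookback
        if prices[i] > avg:
            pnl += prices[i + 1] - prices[i]
    return pnl


def optimize(prices, candidates):
    """Pick the lookback with the highest P&L on the given data."""
    return max(candidates, key=lambda lb: backtest(prices, lb))


def split_test(prices, candidates, in_sample_fraction=0.7):
    """Optimize on the first part of the data only, then score the chosen
    parameter on the remaining, unseen part."""
    cut = int(len(prices) * in_sample_fraction)
    in_sample, out_of_sample = prices[:cut], prices[cut:]
    best = optimize(in_sample, candidates)
    return best, backtest(in_sample, best), backtest(out_of_sample, best)


if __name__ == "__main__":
    prices = make_prices(1000)
    best, is_pnl, oos_pnl = split_test(prices, range(5, 55, 5))
    print(f"best lookback: {best}")
    print(f"in-sample P&L: {is_pnl:.2f}")
    print(f"out-of-sample P&L: {oos_pnl:.2f}")
```

    Because the prices here are pure noise, any in-sample edge the optimizer finds is curve fitting by construction, so a large gap between the two P&L figures is the warning sign to look for.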

    A simple presentation on another technique to reduce the impact of curve fitting can be found at:

    http://www.amibroker.org/userkb/2007/08/13/5-io-out-of-sample-and-walk-forward-testing/
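
    For readers who want the general shape of walk-forward testing without a specific platform, a minimal Python sketch follows. It is not taken from the linked page; the synthetic prices, the toy moving-average rule, and the names walk_forward, train_len and test_len are all assumptions made for illustration.

```python
import random


def make_prices(n, seed=0):
    """Synthetic random-walk prices; a stand-in for real market data."""
    rng = random.Random(seed)
    prices = [100.0]
    for _ in range(n - 1):
        prices.append(prices[-1] + rng.gauss(0, 1))
    return prices


def backtest(prices, lookback):
    """Toy rule (hypothetical): hold one unit long over the next bar whenever
    the current price is above the average of the previous `lookback` bars."""
    pnl = 0.0
    for i in range(lookback, len(prices) - 1):
        avg = sum(prices[i - lookback:i]) / lookback
        if prices[i] > avg:
            pnl += prices[i + 1] - prices[i]
    return pnl


def walk_forward(prices, candidates, train_len, test_len):
    """Re-optimize on each training window, then trade the next unseen window
    with the chosen lookback. Assumes train_len >= max(candidates).
    Returns a list of (start, best_lookback, train_pnl, test_pnl) tuples."""
    results = []
    start = 0
    while start + train_len + test_len <= len(prices):
        test_start = start + train_len
        test_end = test_start + test_len
        train = prices[start:test_start]
        best = max(candidates, key=lambda lb: backtest(train, lb))
        # Prepend `best` bars of history so the average is defined on the
        # first test bar; trades are still only taken inside the test window.
        test = prices[test_start - best:test_end]
        results.append((start, best, backtest(train, best), backtest(test, best)))
        start += test_len  # slide both windows forward by one test period
    return results


if __name__ == "__main__":
    prices = make_prices(1000)
    steps = walk_forward(prices, range(5, 55, 5), train_len=250, test_len=50)
    for start, best, train_pnl, test_pnl in steps:
        print(f"start={start} lookback={best} "
              f"train={train_pnl:.2f} test={test_pnl:.2f}")
    print(f"total out-of-sample P&L: {sum(s[3] for s in steps):.2f}")
```

    The chosen lookback can change from step to step, just as the re-optimized values did in the first post. The summed test P&L is the figure to compare against the optimized backtest.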

    A basic article on the subject at:

    http://findarticles.com/p/articles/mi_qa5282/is_/ai_n24296459

    Good trading.
     
  3. MGJ

    Remember that your task is to predict the future.

    Which parameter values will work well in the future?

    Backtesting and optimizing can help you discover some parameter values that worked well in the past.
     
  4. That's why backtesting mechanically is a waste of time.

    The best way to test something is to take an idea and backtest it manually by going through each and every bar. You will see things manually that you will never notice from the summary of a mechanical test.

    So take an idea, plug it in and test it mechanically to see just the basic results. If you like the idea regardless of the results, then go back and do it manually. It takes a long time, but you will have a much better understanding of how the idea works.

    John
     
  5. janus007

    Backtesting never works, it's something invented by some smart sales department to deliver black boxing to the public :D
    A system should be created once and then work. You should of course pay attention to which equities you use; for instance, small stocks need different observation frames than big stocks. Forget about optimizing MACD, RSI and so forth. Instead, focus on what is clear, namely what moves the stock up or down, and then develop your system based on these observations, i.e. time, tick etc.

    It depends on your timeframe, i.e. ticks, minutes, days etc.
    It would be a bit of overkill to backtest a tick-based system 5 years back. Be aware that a stock can and will change personality over time, so create some simple rules along with the different systems to cover different behaviors regarding volume by day/week according to your own timeframe!