How do you judge the degree of data-mining/over-fitting in a strategy?

Discussion in 'Data Sets and Feeds' started by mizhael, Jan 19, 2011.

  1. Hi all,

    Suppose somebody comes to you with a strategy and with backtest results. The Sharpe ratio is good and the cumulative PNL curve looks great!

    But you are concerned that there might be too much data-mining/overfitting...

    How do you evaluate the degree of data-mining/overfitting?

    How do you evaluate the quality of the strategy?
  2. buckoboy


    My most successful strategies are not overworked. A stock screener works best when the number of variables that go into it is kept to a minimum. It's easy to create a successful backtested strategy when you tailor it with 20 conditions. That is a sure-fire sign the strategy was created for a particular market condition and may not work once conditions change.
  3. nLepwa


    What's the fundamental reason that makes your strategy work?
    In other words, what market behaviour is your strategy based on?

    If you can't answer that question, you are most likely overfitting.

    (Quantitatively speaking, varying the parameters and checking whether the results hold up should give you a rough idea of robustness. Also look at the distribution of your pf. If the distribution isn't stationary you can get back to work.)
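
A rough sketch of the parameter-variation check described above, in Python. Everything here is an assumption for illustration: the prices are synthetic, and `backtest_sharpe` is a hypothetical toy moving-average rule standing in for whatever strategy is being evaluated. The idea is only to rerun the backtest over a neighbourhood of the "best" parameter and see how much the Sharpe ratio moves.

```python
import math
import random
import statistics

def make_prices(n=1500, seed=7):
    """Synthetic random-walk prices (an assumption; stands in for real data)."""
    rng = random.Random(seed)
    prices = [100.0]
    for _ in range(n):
        prices.append(prices[-1] * math.exp(rng.gauss(0.0002, 0.01)))
    return prices

def backtest_sharpe(prices, lookback):
    """Hypothetical toy strategy: long when price > its moving average, else flat.
    Returns the annualised Sharpe ratio of daily returns (0.0 if no variance)."""
    rets = []
    for t in range(lookback, len(prices) - 1):
        avg = sum(prices[t - lookback:t]) / lookback
        position = 1.0 if prices[t] > avg else 0.0
        rets.append(position * (prices[t + 1] / prices[t] - 1.0))
    sd = statistics.pstdev(rets)
    return 0.0 if sd == 0 else statistics.mean(rets) / sd * math.sqrt(252)

def sensitivity(prices, best, width=0.3, steps=7):
    """Sharpe for lookbacks spread over best*(1 - width) .. best*(1 + width)."""
    out = {}
    for i in range(steps):
        frac = 1 - width + 2 * width * i / (steps - 1)
        lb = max(2, round(best * frac))
        out[lb] = backtest_sharpe(prices, lb)
    return out

prices = make_prices()
table = sensitivity(prices, best=50)
for lb, s in sorted(table.items()):
    print(f"lookback={lb:3d}  sharpe={s:+.2f}")
spread = max(table.values()) - min(table.values())
print(f"spread across neighbourhood: {spread:.2f}")
```

If the Sharpe ratio collapses as soon as the lookback moves a few steps away from the chosen value, that is the kind of fragility the post is warning about. A smooth plateau is more reassuring, though it proves nothing on its own.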

  4. I would ask which school they went to.
  5. You can tell if a system is overoptimized by looking at the type of trading rule involved in the system.

    If the system is optimized to one specific value, then the system is likely overoptimized. If you have complex queries that generate binary results (0 or 1), then look at the queries themselves: what values are preset in the strategy, and how would varying those values alter the backtest results?

    Note that some people only create strategies that work on multiple instruments, to avoid the historical bias inherent in single-instrument strategies. While these strategies are more volatile and may have larger drawdowns, they tend to be more robust over the long haul.
  6. Of course they all claim that their strategies are based on fundamental reasons.

    And why is this:

    "Also look at the distribution of your pf. If the distribution isn't stationary you can get back to work"?

    Could you please elaborate?
  7. Sometimes their strategies don't have obvious numerical parameters.

    For example, they take the NASDAQ 100 index and change the weights of the index constituents, based on some sort of fundamental reasoning... and the result is much better than the NASDAQ 100 index itself.

    Such a strategy doesn't really have a numerical parameter such as a moving average or RSI setting.

    Is there a way to quantitatively evaluate how dangerous such a strategy is?
  8. nLepwa



    You consider your pf as a random variable. At the end of every trading period you get a realization.
    You then analyze the characteristics of the distribution of that random variable.
    If the distribution is stationary your strategy is robust.
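
One rough way to put the stationarity check above into practice, sketched in Python. The assumptions are mine, not the poster's: "pf" is read as a series of per-period returns, the data are synthetic, and a two-sample Kolmogorov-Smirnov comparison of the first and second halves is just one possible test. The critical value also assumes independent observations, which real returns may violate.

```python
import bisect
import math
import random
import statistics

def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    d = 0.0
    for x in a + b:
        fa = bisect.bisect_right(a, x) / len(a)
        fb = bisect.bisect_right(b, x) / len(b)
        d = max(d, abs(fa - fb))
    return d

def ks_critical(n, m, c_alpha=1.358):
    """Approximate large-sample 5% critical value (c_alpha = 1.358 for alpha = 0.05)."""
    return c_alpha * math.sqrt((n + m) / (n * m))

def compare_halves(pnl):
    """Compare the return distribution in the first half of the sample with the second half."""
    half = len(pnl) // 2
    first, second = pnl[:half], pnl[half:]
    return {
        "mean_first": statistics.mean(first),
        "mean_second": statistics.mean(second),
        "std_first": statistics.stdev(first),
        "std_second": statistics.stdev(second),
        "ks": ks_statistic(first, second),
        "ks_crit_5pct": ks_critical(len(first), len(second)),
    }

rng = random.Random(1)
# Synthetic per-period returns (assumptions for illustration only).
stable = [rng.gauss(0.001, 0.01) for _ in range(1000)]
# The edge decays and volatility doubles in the second half: a regime change.
shifted = ([rng.gauss(0.001, 0.01) for _ in range(500)]
           + [rng.gauss(-0.001, 0.02) for _ in range(500)])

for name, series in (("stable", stable), ("shifted", shifted)):
    r = compare_halves(series)
    flag = "distribution changed" if r["ks"] > r["ks_crit_5pct"] else "no change detected"
    print(f"{name}: KS={r['ks']:.3f} crit={r['ks_crit_5pct']:.3f} -> {flag}")
```

A single split is crude. Comparing several rolling windows, or in-sample against out-of-sample periods, follows the same pattern.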

  9. nLepwa


    Fixed or dynamic weights?

    For fixed weights the risk of over-fitting is huge and I wouldn't trade it.

    For dynamic weights you can test the algorithm on different asset classes and come up with something quite robust.

    Actually, some algorithms even have theoretical guarantees: you can reach the performance of the best fixed-weight portfolio in hindsight, up to a constant.
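
The post does not say which algorithm it has in mind. As an illustration only, here is a sketch of the exponentiated-gradient update from the online portfolio selection literature, which is one dynamic-weight scheme usually analysed against the best constant-rebalanced (fixed-weight) portfolio in hindsight. The two-asset synthetic data, the learning rate `eta`, and the grid-search benchmark are all assumptions made for the example.

```python
import math
import random

def eg_update(weights, relatives, eta=0.05):
    """One exponentiated-gradient step.
    relatives[i] = price_today / price_yesterday for asset i."""
    port = sum(w * x for w, x in zip(weights, relatives))
    raw = [w * math.exp(eta * x / port) for w, x in zip(weights, relatives)]
    total = sum(raw)
    return [r / total for r in raw]

def log_wealth(weights_by_day, relatives_by_day):
    """Cumulative log growth of a portfolio rebalanced to the given weights each day."""
    return sum(math.log(sum(w * x for w, x in zip(ws, xs)))
               for ws, xs in zip(weights_by_day, relatives_by_day))

def run_eg(relatives_by_day, eta=0.05):
    """Start from equal weights and update after each day's price relatives are observed."""
    n = len(relatives_by_day[0])
    w = [1.0 / n] * n
    used = []
    for xs in relatives_by_day:
        used.append(w)  # today's weights are fixed before today's prices are seen
        w = eg_update(w, xs, eta)
    return used

def best_fixed_two_asset(relatives_by_day, grid=101):
    """Grid search over constant-rebalanced weights for two assets (hindsight benchmark)."""
    best = (-math.inf, None)
    for k in range(grid):
        a = k / (grid - 1)
        lw = log_wealth([[a, 1 - a]] * len(relatives_by_day), relatives_by_day)
        best = max(best, (lw, a))
    return best

rng = random.Random(3)
# Synthetic daily price relatives for two assets (an assumption for illustration).
days = [[math.exp(rng.gauss(0.0003, 0.01)), math.exp(rng.gauss(0.0001, 0.02))]
        for _ in range(750)]

weights = run_eg(days)
eg_lw = log_wealth(weights, days)
best_lw, best_a = best_fixed_two_asset(days)
print(f"EG log-wealth:         {eg_lw:+.4f}")
print(f"best fixed log-wealth: {best_lw:+.4f} (weight on asset 0 = {best_a:.2f})")
print(f"gap (hindsight - EG):  {best_lw - eg_lw:+.4f}")
```

The quantity to watch is the gap in log-wealth against the hindsight benchmark. Running the same update on several asset classes, as suggested above, shows whether that gap stays small outside the data the rule was designed on.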


  10. This.

    You must have enough trading experience and market expertise...
    To answer this question quite precisely.

    Your trading strategy MUST give you a Competitive Advantage...
    And if you cannot explain it in 30 seconds or 30 words...
    Then you do not have one.

    There is a basic reason why mechanical strategies do not work:

    You are trading against experts and insiders...
    And mechanical strategies...
    Usually put you on the opposite side of the trade versus the Pros.

    This is even more pronounced in sports betting...
    Backtested strategies that perform beautifully on paper...
    Have you betting against the "wise guys" in real life.
    #10     Jan 19, 2011