Backtest results vs real life

Discussion in 'Strategy Development' started by markd01, Jul 29, 2011.

  1. markd01



    Could someone share their long-term backtest results versus what they get in real life, plus ways to make the two match more closely?

    For example:
    A) backtested over 20 years: CAGR = 550%; DD = 20%; avg p/l = 2%; profit factor = 2.0; exposure = 55%. The CAGR is skewed, as the Internet bubble and bust years would see returns of a few thousand percent per year. Worst year return = 40%; 2nd worst year return = 130%.
    B) traded real life for 6 months: CAGR = 20%; DD = 12%, as I have not yet hit crazy times and did not trade this system live during the bear market at the end of 2008; avg p/l = 0.5% and profit factor = 1.4, as avg losers are bigger and win rate is lower than in backtests; exposure = 45%, as there was relatively less volatility, which translated to fewer trades.
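
    For anyone comparing numbers like the ones in A) and B), here is a minimal Python sketch of how CAGR, drawdown and profit factor are commonly computed. The function names and sample values are made up for illustration; AmiBroker or any other package may define these metrics slightly differently.

```python
def cagr(start_equity, end_equity, years):
    """Compound annual growth rate as a fraction (0.20 means 20%)."""
    return (end_equity / start_equity) ** (1.0 / years) - 1.0


def max_drawdown(equity_curve):
    """Largest peak-to-trough decline, as a fraction of the running peak."""
    peak = equity_curve[0]
    worst = 0.0
    for value in equity_curve:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst


def profit_factor(trade_pnls):
    """Gross profit divided by gross loss (infinite if there are no losers)."""
    gross_profit = sum(p for p in trade_pnls if p > 0)
    gross_loss = -sum(p for p in trade_pnls if p < 0)
    return float("inf") if gross_loss == 0 else gross_profit / gross_loss


# Hypothetical sample values, not taken from any real account.
example_curve = [100.0, 120.0, 90.0, 135.0]
example_dd = max_drawdown(example_curve)          # drop from 120 to 90
example_pf = profit_factor([2.0, -1.0, 4.0, -2.0])
# A 6-month live record has to be annualized with years=0.5.
example_cagr_6m = cagr(100.0, 110.0, 0.5)
```

    Note that annualizing a 6-month record compounds the half-year return twice, so a short live sample is much noisier as a CAGR than a 20-year backtest.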

    About my system:
    I have an end of day trading system, where all orders are placed in the evening, outside of market hours. I place many limit orders to enter positions, and only a small percentage of them fill. I trade mean reversion both long and short, staying in positions at least one day and at most several days. It's all rules based, non-discretionary.

    Here are some of the things I look at in order to make real life closer to backtest results:
    1) Ranking: my end of day swing trading system backtests will differ from live trading, because in real life orders fill in time order, while on daily bars I can't know which of my limit orders would have filled first. My backtests take the highest ranked trades for the day. Does anyone have a similar trading system to mine and has used intraday quotes to test?
    2) Slippage: Almost all of my entries and exits are done with limit orders. But I also have code to add slippage in case prices gap and the open is a better price than my limit order.
    3) Liquidity: I take out a tiny percentage of daily money flow. Partial fills are not common at all.
    4) Commissions: I plug in the exact commission from my broker.
    5) Manually excluding signals (due to news) while backtests take all trades no matter what. Most of the time I'd have different fills in place of the excluded as my exposure to the market is pretty high.
    6) Survivorship bias: I backtest using the whole universe of stocks including delisted.
    7) Significant sample size: My backtests go back 20 years, and average 800 trades a year.
    8) Not getting best price of the day: If my limit order is to buy at $9.50, low of the day needs to be $9.49 or lower for the backtest to take the trade.
    9) No shares available to short: I'm investigating having multiple brokerage accounts to have a higher chance of getting borrowable shares.
    10) Following my rules: it's rare, but if I did something that's not in my rules, I track differences to see if I added or subtracted value -- over a few months it's a wash. I also compare live trading and backtesting results at the end of each week as I go, and overall results are pretty close over the last 6 months, even if sometimes different trades are taken.
    11) Quality of data: I paid for my end of day quote database, and it's used by many other system traders, with good reviews. I have seen data spikes in free data which were not real or tradeable.
    12) More limit orders than available buying power: I have a workaround where I can place a larger number of limit orders even if they exceed my buying power. Once orders are filled and buying power goes to $0, then outstanding limit orders would get rejected.
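
    Points 2) and 8) above can be sketched as a daily-bar fill rule. This is a rough Python illustration, not my actual AmiBroker code: the function name, the one-cent tick and the choice to fill at the open on a gap are all assumptions. A more conservative variant would fill at the limit price even when the open is better.

```python
def simulate_limit_fill(side, limit, bar_open, bar_high, bar_low, tick=0.01):
    """Return the assumed fill price of a day limit order on one daily bar,
    or None if the backtest should treat the order as unfilled.

    Price must trade at least one tick THROUGH the limit; touching it
    is not enough, since that could be the best price of the day.
    """
    eps = tick / 1000.0  # tolerance for floating point rounding
    if side == "buy":
        through = limit - tick + eps
        if bar_open <= through:
            return bar_open      # gapped below the limit: filled at the open
        if bar_low <= through:
            return limit         # traded through intraday: filled at the limit
        return None
    if side == "sell":
        through = limit + tick - eps
        if bar_open >= through:
            return bar_open      # gapped above the limit: filled at the open
        if bar_high >= through:
            return limit
        return None
    raise ValueError("side must be 'buy' or 'sell'")


# Buy limit at $9.50: a low of $9.50 only touches the limit, $9.49 trades through.
touch_only = simulate_limit_fill("buy", 9.50, 9.60, 9.70, 9.50)
traded_through = simulate_limit_fill("buy", 9.50, 9.60, 9.70, 9.49)
gap_down = simulate_limit_fill("buy", 9.50, 9.30, 9.45, 9.20)
```

    With this rule `touch_only` is `None`, `traded_through` fills at the limit, and `gap_down` fills at the open.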

    Thank you,

  2. 550%? You will own the world in 20 years.
  3. markd01


    I'd obviously run up against liquidity issues, and would have to start trading only the most liquid issues at some point... :) I don't have that "problem" yet, and can afford to trade some small caps. Currently, my backtests have a static minimum liquidity requirement.

    Any suggestions on how else to effectively lower backtest results and raise live trading results, by making both more realistic?
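
    One possible way to stress the ranking assumption (point 1 in the first post) is to replace "take the highest ranked fills" with random draws from the day's fillable orders and look at the spread of outcomes. The Python sketch below is only an illustration of that idea, not something I run today; the candidate records, field names and slot count are hypothetical.

```python
import random


def pick_top_ranked(candidates, slots):
    """Backtest-style choice: keep the highest-scored fills."""
    ranked = sorted(candidates, key=lambda c: c["score"], reverse=True)
    return ranked[:slots]


def pick_random(candidates, slots, rng):
    """Proxy for live trading: fill order is unknown on daily bars,
    so sample the available slots without replacement."""
    return rng.sample(candidates, min(slots, len(candidates)))


def mean_pnl(trades):
    return sum(t["pnl"] for t in trades) / len(trades) if trades else 0.0


def random_selection_distribution(candidates, slots, runs=1000, seed=0):
    """Sorted mean P/L over many random selections for one day."""
    rng = random.Random(seed)
    return sorted(mean_pnl(pick_random(candidates, slots, rng))
                  for _ in range(runs))


# Hypothetical day: four limit orders would have filled, but buying
# power only allows two positions.
day = [
    {"score": 0.9, "pnl": 3.0},
    {"score": 0.8, "pnl": 2.0},
    {"score": 0.5, "pnl": -1.0},
    {"score": 0.2, "pnl": -2.0},
]
ranked_result = mean_pnl(pick_top_ranked(day, 2))
random_results = random_selection_distribution(day, 2, runs=200)
```

    If the ranked result sits far above most of the random results, the backtest is probably leaning on a selection the live account cannot reproduce.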
  4. ronblack


    Hopes and dreams.
  5. I track a small number of automated intraday systematic strategies (ETFs and futures), and compare 12+ months of live results with backtests over the same period.

    Yes, there are differences, as you suggest. I have managed to pin down the sources of a few of these:

    - Data: if the origin of your backtest data is not the same as the origin of the data you trade “live” from, there will be differences; i.e. the summarized, consolidated, after-the-fact historical data provided by the vendor will differ from what would have been available to you in realtime from your broker. If your executions occur at extreme prices for the day, this may be a factor… (and are you backtesting just off daily bars, or do your test bars have a finer resolution?)

    - Order queue position: for my own intraday trading, order queue position is important (is it also important for your swing style trading?). A simplistic tick based backtest can’t give you information about the probability of your limit order actually being executed at a given price level within a given time. Again, if your executions occur at extreme prices for the day, this may be a factor…(to get around this, you need the data to simulate the full history of all orders for all relevant venues, and an order matching simulator)

    Finally, you haven’t specified whether you are using a commercial testing package, or have built your own. In the latter case, you might want to check that your backtester is not accidentally allowing you to “data snoop” (by which I mean “accidentally use data that would in real life only be available in the future to make trading decisions in the present”), which would certainly make backtest results appear better than anything you might achieve in “real life”…
  6. markd01


    abattia, thank you for your reply, elaborating on what makes backtests always better/different than live trading...

    a) Do you also do any comparisons of the in sample backtested data (older data your strategy was developed on before you started live trading) and out of sample or live data?
    b) Is 12 or more months the minimum you would compare (as less would not be as meaningful), or does it just happen to be the period when you started live trading those particular strategies?
    c) What's an acceptable degree of difference between backtest and real life for you? I suppose that for in sample data you'd be more lenient and for example accept a 30% CAGR live vs a 60% backtested CAGR, or a factor of 2, but with out of sample live vs out of sample backtested over the same period they would have to be much closer.

    I use daily or EOD bars to test with.
    I remove some extremes by making sure I did not get the best price of the day.
    There are other extremes where in real life I may not get a live trade while the backtester would have thought that I did. My order isn't live until the current price gets close to my limit price. I've seen sudden powerful moves in one tick only, usually at the open, where a large seller dumps a large chunk of shares at market prices. The next tick, when my order becomes live, is at a much higher price as there are no other sellers willing to sell at those extreme prices.

    This would be done at a much smaller granularity than my EOD system.
    When it comes to avoiding extremes, how much do you think it helps that I
    a) take only a small percentage of daily liquidity
    b) make sure that I did not get the absolute best price of the day
    as opposed to going all out and using an order matching simulator?

    This is a very important point that I omitted to mention in my original post. Thank you for bringing it up.
    I did in fact make sure that I don't have future leaks in my backtesting code. All of my references to quotes are offset by one day/bar, so for example if I needed to get a 20 day MA, I'd say get me the 20 day moving average based on the close as of 1 day ago. If I used the current day's close, I'd be looking into the future.
    I use AmiBroker as my backtesting platform.
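
    To show what I mean by the one bar offset, here is a small sketch in plain Python rather than AmiBroker's own language. The function name and the sample closes are made up for illustration.

```python
def lagged_sma(closes, window, lag=1):
    """Simple moving average of `window` closes, as of `lag` bars ago.

    Entry i only uses closes up to and including bar i - lag, so with
    lag=1 the value for today never depends on today's close.
    Returns None where there is not enough history.
    """
    out = []
    for i in range(len(closes)):
        end = i - lag + 1        # exclusive end index; last bar used is i - lag
        start = end - window
        if start < 0:
            out.append(None)
        else:
            out.append(sum(closes[start:end]) / window)
    return out


closes = [10.0, 11.0, 12.0, 13.0, 14.0]
signal = lagged_sma(closes, window=3)

# Changing only the last close must not change the last signal value.
spiked = lagged_sma([10.0, 11.0, 12.0, 13.0, 999.0], window=3)
```

    Here `signal[4]` and `spiked[4]` are the same, since both average bars 1 to 3 only. Setting `lag=0` would pull the current close into the average, which is the future leak I am avoiding.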

  7. minmike


    I'm not a stock guy, but I believe that when posting trades outside of market hours, the NBBO rules are off. So the prices might even be available somewhere, but you might not be seeing/getting them.
  8. markd01


    I only trade within market hours, as the majority of the issues I trade don't have much pre or post market activity. I don't know about the availability or reliability of historical extended hours quotes for US equities.
  9. minmike


    I guess I am not understanding your description very well. Has your forward testing alongside your trades shown a decrease in profits also?
  10. markd01


    I have an end of day trading system, where all limit orders are placed in the evening, outside of market hours, to possibly be executed next day after market opens.
    #10     Aug 1, 2011