Understanding the Limitations of Backtesting in Trading Systems

1. Assumptions About Intrabar Price Movement
Backtesting software often relies on assumptions about the sequence of price movements within a single bar (e.g., open, high, low, close). These assumptions can lead to discrepancies between backtested results and real-time trading outcomes, because the actual order of price movements within the bar is unknown.

2. Unrealistic Fill Assumptions for Limit Orders
Backtests may assume that limit orders are filled whenever the price touches the limit level, without considering market liquidity or order queue priority. This can result in overly optimistic performance metrics that are unlikely to be replicated in live trading.

3. Execution Timing Discrepancies
Strategies that enter or exit positions at the close of a bar may not account for the fact that, in live trading, orders are executed at the open of the next bar. This timing difference can lead to variations between backtested and actual performance.

4. Challenges with Tick-Level Backtesting
While tick-level data can provide more granular insights, it is often limited in historical depth and can be resource-intensive to process. Moreover, even with tick data, accurately simulating order execution remains complex due to factors like slippage and latency.

5. Importance of Realistic Order Entry Strategies
To enhance the reliability of backtesting, it's advisable to use order entry methods that reflect realistic trading scenarios. For instance, basing entries on the open of the next bar, or using breakout strategies that trigger on previous highs or lows, can provide more accurate assessments of a strategy's viability.
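To make the intrabar problem from point 1 concrete, here is a minimal Python sketch. It is not taken from any backtesting platform; the `Bar` class and the function names are hypothetical, and it only covers a long position with one stop and one target. It flags bars where both levels fall inside the bar's range, which is exactly the case where OHLC data cannot tell you which was hit first.

```python
from dataclasses import dataclass


@dataclass
class Bar:
    """A single OHLC bar (hypothetical container for this sketch)."""
    open: float
    high: float
    low: float
    close: float


def resolve_long_exit(bar: Bar, stop: float, target: float) -> str:
    """Classify what one bar implies for a long position.

    Returns "stop", "target", "ambiguous", or "none".
    """
    hit_stop = bar.low <= stop
    hit_target = bar.high >= target
    if hit_stop and hit_target:
        # Both levels lie inside the bar's range; the bar alone
        # cannot say which one traded first.
        return "ambiguous"
    if hit_stop:
        return "stop"
    if hit_target:
        return "target"
    return "none"


def pessimistic_exit(bar: Bar, stop: float, target: float) -> str:
    """Assumption for this sketch: treat every ambiguous bar as a loss."""
    outcome = resolve_long_exit(bar, stop, target)
    return "stop" if outcome == "ambiguous" else outcome


if __name__ == "__main__":
    wide_bar = Bar(open=100.0, high=105.0, low=95.0, close=101.0)
    print(resolve_long_exit(wide_bar, stop=96.0, target=104.0))
    print(pessimistic_exit(wide_bar, stop=96.0, target=104.0))
```

Counting how many trades in a backtest come back "ambiguous" gives a rough idea of how much the result depends on the software's intrabar assumption. Resolving them pessimistically is one conservative choice, not the only one.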
Good topic. For those interested in learning more about backtesting and its many pitfalls, just look up the Darwinex channel on YouTube.
One of the worst things you can do with backtesting is to use Limit Orders. If you look at backtests that use Limit Orders, you will often see fills at the exact tick of the high or low of a bar in the results. It is totally unrealistic to expect that in real trading.

Optimizing Backtest Results for Realistic Trading Performance

Traders often gravitate toward backtest configurations that yield the highest net returns. However, these optimal settings may not be sustainable in live trading environments due to factors like overfitting and unrealistic assumptions. To enhance the reliability of a trading system, it's advisable to focus on configurations that perform within the 75th to 90th percentile range of backtest results. This approach helps in identifying robust parameters that are more likely to withstand the variability of real market conditions.
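As a rough illustration of the 75th to 90th percentile idea, the Python sketch below filters a set of optimization results down to that band instead of taking the single best run. Everything here is an assumption made for the example: the `results` mapping from parameter sets to net return, the function names, and the use of linear interpolation for the percentile cutoffs.

```python
def percentile(sorted_values, pct):
    """Percentile (0..100) of an ascending list, using linear interpolation."""
    if not sorted_values:
        raise ValueError("need at least one value")
    rank = (len(sorted_values) - 1) * pct / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = rank - lower
    return sorted_values[lower] + (sorted_values[upper] - sorted_values[lower]) * fraction


def select_percentile_band(results, low_pct=75.0, high_pct=90.0):
    """Keep parameter sets whose net return falls inside the percentile band.

    `results` maps a parameter set (any hashable) to its backtested net return.
    """
    returns = sorted(results.values())
    low_cut = percentile(returns, low_pct)
    high_cut = percentile(returns, high_pct)
    return {params: r for params, r in results.items() if low_cut <= r <= high_cut}


if __name__ == "__main__":
    # Hypothetical optimization output: (fast, slow) periods -> net return.
    results = {
        (5, 20): 1200.0, (5, 50): 4100.0, (10, 20): -300.0,
        (10, 50): 2600.0, (20, 50): 3300.0, (20, 100): 900.0,
    }
    for params, net in sorted(select_percentile_band(results).items()):
        print(params, net)
```

The band deliberately leaves out the top few runs, since those are the ones most likely to owe their results to overfitting.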
If you use time bars, it is best to build them from tick data that covers each bar from start to finish. And yes, it takes a lot of time to backtest enough to get large sample sizes, and good data is not free. For limit orders, I have always required price to go at least one tick beyond the limit before counting a fill. Sometimes I use more complicated entries: price touches a Buy Stop, but I place limit orders a few ticks below the Buy Stop, and then a couple of ticks below that price, to average down a bit. Many will say backtesting is worthless, but I view it as a step in forming stats. All my systems have had extensive backtesting. I am much more conscious of risk management than of profits. Using MFE and MAE (maximum favorable and maximum adverse excursion) works well, and I try to get 90%, but different markets produce different percentages.
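The "price has to go beyond one tick" rule for limit orders can be sketched like this in Python. The function names and parameters are made up for illustration and do not come from any particular platform. Setting `ticks_through=0` reproduces the optimistic fill-on-touch assumption for comparison.

```python
def buy_limit_filled(bar_low, limit_price, tick_size, ticks_through=1):
    """True if price traded at least `ticks_through` ticks below a buy limit."""
    required_low = limit_price - ticks_through * tick_size
    # Tiny tolerance so floating-point rounding does not reject an exact match.
    return bar_low <= required_low + tick_size * 1e-9


def sell_limit_filled(bar_high, limit_price, tick_size, ticks_through=1):
    """True if price traded at least `ticks_through` ticks above a sell limit."""
    required_high = limit_price + ticks_through * tick_size
    return bar_high >= required_high - tick_size * 1e-9


if __name__ == "__main__":
    tick = 0.25
    # The bar's low only touches the limit: filled on touch, not with one tick through.
    print(buy_limit_filled(100.00, 100.00, tick, ticks_through=0))
    print(buy_limit_filled(100.00, 100.00, tick, ticks_through=1))
    # The bar trades one tick below the limit: counted as filled.
    print(buy_limit_filled(99.75, 100.00, tick, ticks_through=1))
```

Running the same backtest with `ticks_through=0` and `ticks_through=1` and comparing the trade counts shows how many fills depended on price merely touching the limit.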
When you reach a certain level of confidence through live trading, you become more of a statistician than a systems builder. Basically, probabilities and statistics are at the core of any successful system. And it does take an extraordinary amount of testing and data to gain that confidence.
The whole purpose of a backtest is to check whether an idea/strategy at least worked historically. If it fails even the backtest, game over. But, and it is a BIG BUT, most backtests will far outperform real results (even with 0 bugs and no looking into the future, which I obviously consider a bug as well). We programmed and tested an insane number of strategies, and most did not live up to the expected results.
If you cannot explain to yourself why it should work that way, then you are usually only testing noise, or the strategy is very mediocre at best. I find this very important: reasoning. It must make sense in a deeper way. Statistics alone are not enough. And you cannot backtest reasoning. That is why AI is just wasted computational power unless it has first learned to ask (and answer) WHY a strategy should work.