Can you please describe what you mean by overfitting? I think I understand, but I want to make sure I get it right.

When I think of overfitting, I think of an absurdly narrow "signal" window with a very small sample size. Something like this: the signal is a narrow range of the 200-day MA of the USD, with a small number of trades concentrated on a few lucky spikes in VIX. It seems to me that such a strategy was probably just a lucky one, since I'm scanning thousands of potential strategies.

I'm actually not doing any cross-validation. Half because I'm lazy ;-) and half because I think I can visually observe stationarity in the profit curve. Here's a profit curve that looks much better: over 1000 historical trades and reasonable performance in all market conditions.

As a measure of stationarity, I compute the variance of the profit about a straight line connecting (t=0, p=0) to (t=tmax, p=p(tmax)) and use it in the strategy screening process. In other words, if the profit curve is "blocky" or has lots of drawdowns, it gets penalized relative to a nice, steadily marching profit curve.
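That stationarity penalty can be sketched in a few lines. This is a minimal interpretation of the description, not the poster's actual screening code: mean squared deviation of the equity curve from the straight line through (t=0, p=0) and (t=tmax, p=p(tmax)).

```python
import numpy as np

def stationarity_penalty(profit):
    """Variance of a profit curve about the straight line connecting
    (t=0, p=0) to (t=tmax, p=p(tmax)); lower means a steadier march."""
    profit = np.asarray(profit, dtype=float)
    t = np.linspace(0.0, 1.0, len(profit))
    line = t * profit[-1]  # straight line from (0, 0) to (tmax, p(tmax))
    return float(np.mean((profit - line) ** 2))
```

A steadily climbing curve scores near zero; a "blocky" curve with long flat stretches and jumps scores high, so it gets penalized in screening.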

Firm definition of RSI: https://stockcharts.com/school/doku...hnical_indicators:relative_strength_index_rsi

Calculation: to simplify the explanation, RSI is broken down into its basic components: RS, Average Gain, and Average Loss. This RSI calculation is based on 14 periods, which is the default suggested by Wilder in his book. Losses are expressed as positive values, not negative values.

The very first calculations for average gain and average loss are simple 14-period averages:

First Average Gain = Sum of Gains over the past 14 periods / 14
First Average Loss = Sum of Losses over the past 14 periods / 14

The second, and subsequent, calculations are based on the prior averages and the current gain/loss:

Average Gain = [(previous Average Gain) x 13 + current Gain] / 14
Average Loss = [(previous Average Loss) x 13 + current Loss] / 14
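That definition translates directly to code. A sketch follows; note the final step (RS = Average Gain / Average Loss, RSI = 100 - 100/(1 + RS)) is Wilder's standard formula, which the excerpt above stops short of spelling out.

```python
def wilder_rsi(closes, period=14):
    """RSI per the definition quoted above (14 periods by default)."""
    gains, losses = [], []
    for prev, cur in zip(closes, closes[1:]):
        change = cur - prev
        gains.append(max(change, 0.0))
        losses.append(max(-change, 0.0))  # losses kept as positive values
    # First averages: simple 14-period means.
    avg_gain = sum(gains[:period]) / period
    avg_loss = sum(losses[:period]) / period
    # Subsequent averages: [(prev avg) x 13 + current] / 14 (Wilder smoothing).
    for g, l in zip(gains[period:], losses[period:]):
        avg_gain = (avg_gain * (period - 1) + g) / period
        avg_loss = (avg_loss * (period - 1) + l) / period
    if avg_loss == 0:
        return 100.0  # no losses in the window -> RSI pegs at 100
    rs = avg_gain / avg_loss
    return 100.0 - 100.0 / (1.0 + rs)
```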

More or less, if you have a training data set and a separate test data set you should be fine. So you mine, say, years 10 through 2 of your history for rules and then run them over the most recent year. Or you constantly update the rules as you go, based on past info only. I'm sure I missed your architecture somewhere and you're doing the sensible thing. An extreme example of overfitting would be: you train a random forest on every event up to yesterday and run it today to predict the movement. What you've got will fit the past really well but not the future.
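The walk-forward scheme in that paragraph, in sketch form (a hypothetical helper, not anyone's actual pipeline): rules are always mined on strictly earlier data than they are tested on.

```python
def walk_forward_splits(years, min_train=1):
    """Yield (training_years, test_year) pairs for walk-forward validation.

    The test year is always strictly after every training year, so each
    evaluation is genuinely out of sample.
    """
    for i in range(min_train, len(years)):
        yield years[:i], years[i]
```

For example, with years 2014-2017 you would mine rules on 2014 and test on 2015, then mine on 2014-2015 and test on 2016, and so on: "constantly update the rules as you go based on past info only."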

Is anything predictive? There is only "when ABC happens, there is an X% probability that Y happens at most N days afterwards". The market can stay irrational... etc... Not at all mad.

IF that was a problem (and I don't think it is, much), why risk a margin call? Panic sellers never win. As you partly implied, morgan B, when you have a loss, that's part of it. Leverage is not your friend; the trend is your friend. My cash position in DVY did get hit by a flash crash once, but I seldom watch intraday; I use 10-year paper charts and all the data much more than any flash crash, which could just be a bad quote. I know BlackRock called GS out in public on that flash crash as some kind of error. Planned sellers can win; panic sellers never win. Thanks,

Another example of overfitting: you have an EMA crossover strategy. You data-mine the best fast/slow window and use that as your rule. In backtests, this is going to be b-e-a-youtiful. But in real trading it's likely going to fail after some time if you're lucky, or immediately if you're unlucky. It is quite possible to re-optimize these parameters on a daily basis and pick the best rule, but at that point I think you're statistically doing no better than picking random rules. To convince yourself, make sure you are running backtests that avoid look-ahead (forward) bias.
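You can watch this failure mode happen on pure noise. The toy backtest below (an assumed sketch, nothing from the thread) data-mines the best-looking fast/slow EMA pair on the first half of a random walk, where there is no edge by construction, then re-scores that same pair on the second half. The signal computed at bar t is only acted on at bar t+1, which is the look-ahead-bias hygiene mentioned above.

```python
import random

def ema(prices, span):
    """Exponential moving average with smoothing factor 2/(span+1)."""
    alpha = 2.0 / (span + 1)
    out = [prices[0]]
    for p in prices[1:]:
        out.append(alpha * p + (1 - alpha) * out[-1])
    return out

def crossover_pnl(prices, fast, slow):
    """Long one unit when fast EMA > slow EMA, flat otherwise."""
    f, s = ema(prices, fast), ema(prices, slow)
    pnl = 0.0
    for t in range(len(prices) - 1):
        if f[t] > s[t]:                       # decide using bar t only...
            pnl += prices[t + 1] - prices[t]  # ...get paid on bar t+1
    return pnl

random.seed(1)
prices = [100.0]
for _ in range(1000):
    prices.append(prices[-1] + random.gauss(0.0, 1.0))  # random walk: no real edge

half = len(prices) // 2
# Data-mine the best-looking (fast, slow) pair in-sample...
in_sample_pnl, fast, slow = max(
    (crossover_pnl(prices[:half], f, s), f, s)
    for f in range(2, 15) for s in range(16, 50)
)
# ...then see what the same rule does out of sample.
out_of_sample_pnl = crossover_pnl(prices[half:], fast, slow)
```

In-sample, the winner of the search looks great; out of sample it usually reverts to noise, which is exactly the beautiful-backtest-then-fail pattern described above.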

1) Nerdy way to avoid overfitting: https://seekingalpha.com/article/4126281-optimizing-trading-strategies-without-overfitting
2) Normie way to overfit less-ish: https://blog.quantopian.com/parameter-optimization/