It is well known that you have to test on out-of-sample data, otherwise the test produces far too optimistic, meaningless results. But as I learned the hard way, it also matters how you select the out-of-sample data.

Previously I split the historic price data into 3-week intervals, used the first 2 weeks of each interval for optimizing the strategy, and every third week for testing. This way I got very good results for some strategies, with Sharpe ratios up to 5 and annual returns of 700%. When I tested the same strategies in a different way - the first 70% of the data for optimizing, the remaining 30% for testing - the Sharpe ratios dropped to about 2. In real trading, the results are comparable to the 70/30 split method, not to the 3-week method.

I'm sharing this here in case someone runs into the same problem. But I'm not sure why the 3-week data split gives much more unrealistic results than the 70/30 method. Maybe someone has an idea?
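To make the two schemes concrete, here is a minimal sketch of how the index sets differ. The function names and the 14/7-day block sizes are my own illustration of the scheme described above, not code from any actual backtest:

```python
def interleaved_split(n_days, train_days=14, test_days=7):
    """Repeating blocks: train_days for optimization, then
    test_days held out for testing (the 2-weeks/1-week scheme)."""
    train_idx, test_idx = [], []
    i = 0
    while i < n_days:
        train_idx.extend(range(i, min(i + train_days, n_days)))
        i += train_days
        test_idx.extend(range(i, min(i + test_days, n_days)))
        i += test_days
    return train_idx, test_idx

def chronological_split(n_days, train_frac=0.7):
    """Single split: first 70% for optimization, last 30% for testing."""
    cut = int(n_days * train_frac)
    return list(range(cut)), list(range(cut, n_days))

# With the chronological split, every test day comes after every
# training day. With the interleaved split, later training blocks
# come after earlier test weeks, so the optimizer has seen data
# from both sides of each test period.
train_a, test_a = interleaved_split(42)
train_b, test_b = chronological_split(42)
print(max(train_a) > min(test_a))   # interleaved: training data after test data
print(max(train_b) < min(test_b))   # chronological: strictly past-to-future
```

Note that in the interleaved scheme the optimization windows surround each test week in time, which is one structural difference between the two methods.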