Discussion in 'Strategy Development' started by pursuit, Oct 17, 2017.
Thank you for your deeply illuminating contribution to the discussion.
Sorry, I'm not here to educate. There are many resources online that explain why your initial premise is wrong and why testing on segment 3 is more prone to overfitting.
The only time you would test over segment 3 is if you were planning to then isolate each segment (1 & 2) and run the likes of a linear regression through each to test for correlation. This could actually be a more effective technique than testing on segment 1 (in sample) and then segment 2 (out of sample).
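A minimal sketch of that idea, with entirely illustrative data: fit a linear regression through each segment separately and compare the fitted slopes. If the per-segment fits agree, the relationship looks stable across time; a single fit over the concatenated segment 3 cannot reveal a disagreement. All names and numbers here are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical price series: a small upward drift plus noise (illustrative only).
t = np.arange(200, dtype=float)
prices = 100 + 0.05 * t + rng.normal(0, 1, size=200)

# Segment 1 = first half, segment 2 = second half, segment 3 = the full series.
seg1_t, seg1_p = t[:100], prices[:100]
seg2_t, seg2_p = t[100:], prices[100:]

# Fit a linear regression through each segment separately ...
slope1, _ = np.polyfit(seg1_t, seg1_p, 1)
slope2, _ = np.polyfit(seg2_t, seg2_p, 1)

# ... and through the concatenated segment 3 for comparison.
slope3, _ = np.polyfit(t, prices, 1)

# Agreement between the per-segment slopes suggests a stable relationship;
# the segment-3 fit alone gives one number and hides any disagreement.
print(f"seg1 slope={slope1:.3f}, seg2 slope={slope2:.3f}, seg3 slope={slope3:.3f}")
```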
Is Walk-Forward (out of sample) testing simply an illusion?
If price movements evolve as a random walk, then yes, it is an illusion. It seems the best models of market price movement suggest a random walk is the best fit. However, all models are wrong to one level of precision or another. How wrong is always up for debate, so TA (and out-of-sample testing) may have some benefit after all.
Just because the random walk model is a good fit is no proof that price changes are random (whatever that even means).
It just means price can vary a lot and is not constrained too much!
So if you're not here to discuss and explain the reasoning for your opinions, I'm just curious what are you here for?
It's been proven that most financial time series are not a random walk. Anyway if we believed that I don't think any of us would be on this forum.
Walk-forward testing and segmenting folds of data are immensely useful if you try to understand the modelling process from a complexity/generalization perspective. Just imagine that you have a very finely optimized model that performs spectacularly on data segment 1, and then the same fitted model performs terribly on segment 2. That tells you something right there that a (single) concatenated set will not.
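A quick illustration of that point, under assumed toy data: over-fit a high-degree polynomial to pure noise on segment 1, then score it on segment 2. The in-sample error looks great, the out-of-sample error blows up, and a single error number computed over the concatenated data sits in between and masks the gap. Everything here is a sketch, not anyone's actual strategy.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative data: pure noise with no signal at all.
x = np.linspace(0, 2, 120)
y = rng.normal(0, 1, size=120)

seg1_x, seg1_y = x[:60], y[:60]   # "in sample"
seg2_x, seg2_y = x[60:], y[60:]   # "out of sample"

# "Finely optimized" model: a high-degree polynomial fit only on segment 1.
model = np.poly1d(np.polyfit(seg1_x, seg1_y, deg=9))

mse_seg1 = np.mean((model(seg1_x) - seg1_y) ** 2)  # spectacular in sample
mse_seg2 = np.mean((model(seg2_x) - seg2_y) ** 2)  # terrible out of sample

# One score over the concatenated data averages the two and hides the gap.
mse_all = np.mean((model(x) - y) ** 2)

print(f"seg1 MSE={mse_seg1:.2f}, seg2 MSE={mse_seg2:.2f}, concatenated MSE={mse_all:.2f}")
```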
Suppose we have 1000 models. Two classrooms in two different rooms get the same models and the same data.
The first classroom is full of geniuses who know wassup and will test the "clever way", meaning they test all models on seg1, then test the pretty ones on seg2 and keep the ones that also test pretty on seg2. That is their end selection.
The second classroom is full of idiots who test the "dumb way". They run all models on the combined seg3, which is a simple continuous concatenation of seg1 and seg2, and keep the pretty ones as their end selection.
Question: don't you think the classrooms will end up with the same end selection of models?
Bonus question: don't you think at least some of the models in the end selection are there by random chance (because we tested a lot of models) and contain no alpha?
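The thought experiment above is easy to simulate. In this sketch, 1000 "models" are random position signals applied to random-walk returns, so by construction none has any alpha. One filter keeps models that look "pretty" on seg1 and again on seg2; the other filters once on the combined seg3. The threshold and all names are arbitrary assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(7)

# 1000 models = random +/-1 daily position signals; the market is a random walk,
# so no model can have genuine alpha.
n_models, n_days = 1000, 500
returns = rng.normal(0, 0.01, size=n_days)
signals = rng.choice([-1, 1], size=(n_models, n_days))

pnl = signals * returns                   # each model's daily P&L
seg1, seg2 = pnl[:, :250], pnl[:, 250:]   # two halves of the history

def daily_sharpe(x):
    return x.mean(axis=1) / x.std(axis=1)

threshold = 0.05  # arbitrary "pretty" cutoff for this sketch

# Classroom 1: filter on seg1, then re-test the survivors on seg2.
clever = set(np.flatnonzero((daily_sharpe(seg1) > threshold)
                            & (daily_sharpe(seg2) > threshold)))

# Classroom 2: filter once on the combined seg3.
dumb = set(np.flatnonzero(daily_sharpe(pnl) > threshold))

print(f"clever keeps {len(clever)}, dumb keeps {len(dumb)}, "
      f"overlap {len(clever & dumb)}")
```

Since every model is noise, anything either classroom keeps is there by chance, and the two end selections generally differ: passing two shorter hurdles is not the same event as passing one longer one.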
Of course you're right that optimization can produce overfitted models with no alpha.
However, WFA is much more than simply dividing your data into two segments.
You optimize your strategy on segment 1 and then test on segment 2. You are not allowed to optimize using data from segment 2. That is the difference between the two situations that you are describing.
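The rule described above can be sketched as a rolling walk-forward loop. This is a toy version with an assumed momentum rule and made-up window sizes, not Pardo's or anyone else's actual method: the only parameter is chosen using the in-sample window, and it is then scored strictly on the following out-of-sample window.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative return series (assumption: 1000 days of noisy returns).
returns = rng.normal(0.0002, 0.01, size=1000)

def strategy_pnl(rets, lookback):
    """Toy momentum rule: hold the sign of the trailing mean return."""
    pnl = np.zeros_like(rets)
    for t in range(lookback, len(rets)):
        pnl[t] = np.sign(rets[t - lookback:t].mean()) * rets[t]
    return pnl

in_sample, out_sample = 200, 100
oos_pnl = []
for start in range(0, len(returns) - in_sample - out_sample + 1, out_sample):
    train = returns[start:start + in_sample]
    test = returns[start + in_sample:start + in_sample + out_sample]
    # Optimization touches only the in-sample window ...
    best = max(range(5, 50, 5), key=lambda lb: strategy_pnl(train, lb).sum())
    # ... and the chosen parameter is then evaluated strictly out of sample.
    oos_pnl.append(strategy_pnl(test, best).sum())

print(f"stitched out-of-sample P&L over {len(oos_pnl)} windows: {sum(oos_pnl):.4f}")
```

The stitched out-of-sample P&L is what WFA judges; note it still carries a selection bias, since you will inevitably repeat the whole procedure for many candidate strategies.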
Obviously, you will have a data mining bias after you perform WFA. Algo traders like Pardo and Eckhardt consider their WFA methods to be major trade secrets that they have refused to elaborate on in interviews.