Out of sample testing of patterns

ronblack · Apr 21, 2010

I need your opinion on this one. I used 10,000 bars of 60-minute QQQQ data for the in_sample and 1,800 bars for the out_of_sample. I had APS (Automatic Pattern Search) find patterns in the in_sample with profit factor (pf) > 1 and a 4% target and stop.

APS generated 84 patterns. I then had the program test them out_of_sample, and it turned out that 40 of them remained profitable with pf > 1. Out of those 40, 15 patterns had a success rate of 100% (no losers).

My questions are:

A. Should I consider using only the patterns with 100% success rate in out_of_sample?

B. Should I assume that the patterns with a 100% success rate will soon revert toward the mean, and use other patterns instead?

C. Should I consider using only the patterns with the highest number of out_of_sample trades, or those with the highest number of trades in the in_sample and with pf > 3, for example?

D. Any combination of the above or even none of the above?

Thanks

MGJ · Apr 21, 2010

Why not do a "certification test" on ALL of the patterns?

1. Wait 1,800 bars. Don't trade any of the patterns; do nothing.

2. After 1,800 bars, test ALL of the patterns on this fresh new set of 1,800 bars.

3. Identify the patterns whose performance in the certification test pleases you the most. Call them MY LITTLE BABIES.

4. Formulate a rule that describes MY LITTLE BABIES; it might be "the ones that had 100% success rate out-of-sample"; or it might instead be "the ones with the largest number of out-of-sample trades"; or perhaps "the ones that DIDN'T have 100% success rate out-of-sample".

5. Run APS again, using the new 1,800 bars of data.

6. Use your selection rule, found in step 4, to choose a new set of your little babies. These are the ones you will actually trade, with real money.

7. Put both hands over your nads and hope that history repeats itself. It might not, you know.

goodgoing · Apr 22, 2010

I think you are both doing the same thing. I would by the way trade only those patterns with the highest number of past trades and a high enough hit rate.

RedRat · Apr 23, 2010

I suggest you run these profitable patterns on other markets, similar to QQQQ, like CL, ER2, ES and GC. If they are profitable on other markets, chances are high they will work in real time.

As for QQQQ, you need to run the test yourself. You can divide your 12000 bars of historical data into several InSample and OutOfSample periods. What you did uses only the first 10000 for InSample and the remaining 2000 for OOS.

Divide your data as 8000 InSample, 2000 OutOfSample, 2000 Test. Then take the patterns that are still profitable after OOS and run them on the Test set. I think that at least 50% of them may become non-profitable.

You can recombine your data sets in many ways, say 2000 OOS, 2000-10000 OOS, 10000-12000 Test. Or 0-1000 + 11000-12000 Test, and so on... Do not forget to report your results here.
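The 8000 / 2000 / 2000 split above can be sketched in a few lines of Python. This is only an illustration: the function name `split_bars` and its parameters are made up for this post, and it assumes the bars are stored oldest first so that plain index ranges are enough.

```python
def split_bars(n_bars, in_sample=8000, out_of_sample=2000, test=2000):
    """Return (start, end) index ranges for three consecutive,
    non-overlapping segments. End indices are exclusive."""
    if in_sample + out_of_sample + test > n_bars:
        raise ValueError("segments exceed the available bars")
    oos_start = in_sample
    test_start = in_sample + out_of_sample
    return (
        (0, oos_start),                  # search for patterns here
        (oos_start, test_start),         # first check of the patterns
        (test_start, test_start + test)  # untouched until the very end
    )


in_sample_range, oos_range, test_range = split_bars(12000)
print(in_sample_range, oos_range, test_range)
```

The point of the third range is that nothing, neither the pattern search nor any selection rule, gets to see it before the final run.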

Hugin · Apr 24, 2010

Quote from MGJ:

Why not do a "certification test" on ALL of the patterns?

I'm with MGJ here. Maybe it's my cautious nature, but I would be very careful about using out-of-sample test results to select patterns for real trading. What you really want to achieve with the out-of-sample testing is to see whether the APS system gives you patterns that really have an edge. In my experience selection bias is always there to bite you, so be careful.

From the figures, about 50% of the patterns lose their edge in out-of-sample. To me that is a big warning sign. Obviously other aspects are just as important. For example, how does the trade frequency change for patterns that continue to work vs. those that stop working? How much overlap do you have in the trades in the in-sample? What is the distribution of trades over time?

This is how I would go forward. Take all patterns generated in the in-sample and evaluate how a trading system using all these rules would work using both in-sample and out-of-sample. See if the edge (and other characteristics) remains in out-of-sample. If it does not then I would not use any of the patterns for real trading.

If, on the other hand, a system based on all rules has an edge out-of-sample, then I might start using it, but I would never use out-of-sample results to select patterns.

If you have data to create a third set then this could be used to see the difference in results (adjusted for trading frequency and exposure) between a system using all patterns and one using patterns selected from out-of-sample results.
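A toy simulation can show why the third set matters. Everything here is an assumption made for illustration, not something from APS: the patterns are pure coin flips with no edge, each trade ends at +4% or -4% (the target and stop from the original post), every pattern gets 10 trades per data set, and the names `profit_factor`, `pooled` and `simulate_selection` are hypothetical.

```python
import math
import random


def profit_factor(trades):
    """Gross profit / gross loss; inf with no losers, nan with no trades."""
    if not trades:
        return math.nan
    gross_profit = sum(t for t in trades if t > 0)
    gross_loss = -sum(t for t in trades if t < 0)
    return math.inf if gross_loss == 0 else gross_profit / gross_loss


def pooled(trade_sets, indices):
    """Merge the trades of the chosen patterns into one list."""
    return [t for i in indices for t in trade_sets[i]]


def simulate_selection(n_patterns=84, trades_per_set=10, seed=0):
    rng = random.Random(seed)

    def random_trades():
        # No edge by construction: win or lose 4% with equal probability.
        return [rng.choice((4.0, -4.0)) for _ in range(trades_per_set)]

    oos = [random_trades() for _ in range(n_patterns)]
    third = [random_trades() for _ in range(n_patterns)]
    # Selection step: keep only the patterns that look good out-of-sample.
    selected = [i for i in range(n_patterns) if profit_factor(oos[i]) > 1]
    everyone = range(n_patterns)
    return {
        "n_selected": len(selected),
        "pf_selected_on_oos": profit_factor(pooled(oos, selected)),
        "pf_selected_on_third": profit_factor(pooled(third, selected)),
        "pf_all_on_third": profit_factor(pooled(third, everyone)),
    }


print(simulate_selection())
```

On the data used for the selection, the chosen subset has pf > 1 by construction, even though no pattern has any edge. Only the two third-set figures are an honest comparison between "selected patterns" and "all patterns".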

Hugin

intradaybill · Apr 24, 2010

Some good suggestions here but I have some questions and I would appreciate some answers. First of all, it should make a difference what exactly each of the patterns found in the in_sample is. For simplicity, assume that APS finds two patterns, one looks like a double bottom and the other like a flag. Then you run a test in the out_of_sample and you see no edge. Some have concluded here that you drop all patterns/system but I disagree. The two patterns are very different and manifest different market situations. Selection makes sense in this case.

I think selection would not make any sense if the trade distributions of the two patterns coincide a lot in time. If the coincidence is minimal then selecting profitable patterns based on out_of_sample tests should make sense.

I still think that trading price patterns is not a very well understood subject. Some well known traders started considering these patterns long ago, Larry Williams and Linda Raschke, to name two. Michael Harris with his APS program moved significantly ahead in the area of pattern discovery, but details on whether selection is allowed or not, and under which conditions, cannot be easily found.

Another example that comes to mind is candlesticks, which appear to be special cases or a mathematical subset of OHLC price patterns. Suppose one claims that when a set of price patterns discovered by APS is profitable in the in_sample but fails to provide an edge in the out_of_sample, all of the in_sample patterns should be dropped and not used. Then no candlesticks should be used either, because one can develop a program similar to APS that identifies only candlesticks (I think some have been out there for some time). The same should hold true for all TA patterns.
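One simple way to put a number on how much two patterns' trades coincide in time is the share of bars on which both are in a trade. This is just a sketch of one possible measure (the Jaccard ratio), not something APS reports; the function name `trade_overlap` and the bar numbers are made up.

```python
def trade_overlap(bars_a, bars_b):
    """Jaccard overlap of two sets of bar indices:
    bars where both patterns are in a trade / bars where either is.
    0.0 means the trades never coincide, 1.0 means they always do."""
    a, b = set(bars_a), set(bars_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical bar indices on which each pattern holds a position.
double_bottom_bars = [10, 11, 12, 40, 41]
flag_bars = [12, 13, 14, 90, 91]
print(trade_overlap(double_bottom_bars, flag_bars))
```

A value near 0 would support treating the two patterns as separate market situations; a value near 1 would suggest they are really the same trade found twice.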

I think selection is justified if it can be applied and the trade distribution is a good test but the subject becomes highly mathematical and maybe this is a reason that there are no easy answers.

Hugin · Apr 24, 2010

Quote from intradaybill:

Some good suggestions here but I have some questions and I would appreciate some answers. First of all, it should make a difference what exactly each of the patterns found in the in_sample is. For simplicity, assume that APS finds two patterns, one looks like a double bottom and the other like a flag. Then you run a test in the out_of_sample and you see no edge. Some have concluded here that you drop all patterns/system but I disagree. The two patterns are very different and manifest different market situations. Selection makes sense in this case.

I think selection would not make any sense if the trade distributions of the two patterns coincide a lot in time. If the coincidence is minimal then selecting profitable patterns based on out_of_sample tests should make sense.

I think selection is justified if it can be applied and the trade distribution is a good test but the subject becomes highly mathematical and maybe this is a reason that there are no easy answers.

OK, I can understand that in some cases you could argue that selection is allowed using out-of-sample trades, but I would require a lot more information on the individual patterns to do that.

But wouldn't you agree that if an out-of-sample test for all patterns shows that the edge disappears, there is a big question mark around the system's ability to generate profitable pattern detectors (at least given the settings used when creating them)?

Still, there is a question mark around the process of eliminating some patterns and keeping others based on the out-of-sample results. To me there's a big risk this selection will cause live trading results to become disappointing.

Hugin

maxpi · Apr 24, 2010

If I have an utterly random system, like almost everybody does whether they know it or not, it has perhaps a 25% chance of looking good on a randomly selected data set.. if I take those "encouraging" results and test them on out of sample data, again there is a 25% chance it will look good. That gives a 6% chance that a random system will look good and I'll go live and my broker will make money off of me while I deplete my account... doing all these tests in such a sterile environment is futile I'd say unless one finds something that looks good first. Using the backtester to find out what is random or not is more than just a little futile...
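The arithmetic behind that 6% is just 0.25 × 0.25 = 0.0625, which only holds if the two tests are independent. Here is a small sketch; the 25% is the rough guess from the post, not a measured value, and the function names are made up. The second function applies the same guess to the 84 in_sample patterns from the first post, purely as an illustration.

```python
def chance_random_system_passes(p_pass, n_tests):
    """Probability that a system with no edge looks good on n independent tests."""
    return p_pass ** n_tests


def expected_lucky_survivors(n_candidates, p_pass):
    """Expected number of no-edge candidates that pass one more test by luck."""
    return n_candidates * p_pass


# 25% per test is an assumed figure taken from the post above.
print(chance_random_system_passes(0.25, 2))
# If all 84 in_sample patterns were random, this many would still pass OOS.
print(expected_lucky_survivors(84, 0.25))
```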

The best computer anybody is ever going to own is between their ears. Live trading is indispensable to the development of any strategy because it is experience for your brain, gotten while the market does its thing, which includes drastic changes of pace and changes in condition that your backtesting methods aren't likely to discover.

I did tons and tons of backtesting... it is astonishing how so much stuff is truly random! It is further astonishing how you can take a "promising" system and keep adding things to make it better and still have nothing at all that is not entirely random on your hands...

All that work saved me the difficulty of learning while losing money, that can be really damaging... I knew a guy that had a huge and lengthy drawdown and basically, he got phobic about his computer and could not bring himself to turn it on... he couldn't do an email even... but he never stopped deriding me for spending so much time in research and never trading live...

intradaybill · Apr 24, 2010

Quote from Hugin:

Still, there is a question mark around the process of eliminating some patterns and keeping others based on the out-of-sample results. To me there's a big risk this selection will cause live trading results to become disappointing.

Trading is all about risk, however, and that applies to system selection as well. Regardless, I cannot distinguish between selecting patterns and selecting a trading system in general. Theoretically, one could write a program to generate all possible independent systems because they are finite in number, no matter how large that number is. Not being able to select out of this set would mean that system trading is not theoretically sound. However, I have made money using systems and I know several people who have overall made money that way, some even a lot of money. I can only conclude that our notion of probability is either false or incomplete. There has been some talk about this lately in academic journals, regarding not only trading but other areas as well.

Selection is risky, I agree with you totally on this one, but that is where the beef is IMO. I am still looking for a sound selection process. I think this is the key. Rejection is easy, selection is hard.

Hugin · Apr 24, 2010

Quote from intradaybill:

Selection is risky, I agree with you totally on this one, but that is where the beef is IMO. I am still looking for a sound selection process. I think this is the key. Rejection is easy, selection is hard.

Agreed, definitely. We've been told by a (successful) hedge fund manager that what we do will not work because it is data mining/optimization/overfitting, but still it seems to work.

A sound selection process can improve your results significantly, but defining this process seems to be as hard as finding an edge.

Regarding selection bias, IMO everybody is doing it all the time (though they may not realize it themselves); it's just that we're using a computer, which requires you to be extra careful.

Hugin