About that article by Michael Harris: I did my own analysis with the same setup. Buy on the open, with both the profit target and the stop loss 7% away from entry, tested on the same data, daily QQQQ from 05/07/2002 to 08/22/2008. I simply bought on every open and got 1562 closed-out trades, 950 winners and 612 losers. I can now randomly pick some trades out of my pool of all 1562, and my winning percentage will still average to the same number, 60.82%. The long bias is caused by the data: QQQQ rose from 29.27 to 47.49 over the testing period. Now, what about Michael Harris? Source: http://www.tradingpatterns.com/Literature/article3/article3.html He had a total of 128 trades, of which 77 were profitable, so his winning percentage was 60.16%, slightly but insignificantly worse than random. In conclusion, his 8 long-only patterns had no edge whatsoever in the out-of-sample period. There was just a plain and simple uptrend in the data.
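For anyone who wants to reproduce this kind of baseline, here is a minimal sketch of the buy-every-open test described above. It assumes a pandas DataFrame of daily QQQQ bars with Open/High/Low columns; the column names, the data file, and the rule that a stop fills before a target when both lie inside the same bar's range are my own assumptions, not part of the original test.

```python
import pandas as pd

def baseline_win_rate(df: pd.DataFrame, target: float = 0.07, stop: float = 0.07):
    """Enter long on every open; exit at +target or -stop, whichever is hit first.

    If both levels fall inside the same bar's range, the stop is assumed to
    fill first (a conservative choice); trades never closed out are ignored,
    matching the "closed-out trades" count in the post.
    """
    opens = df["Open"].to_numpy()
    highs = df["High"].to_numpy()
    lows = df["Low"].to_numpy()

    wins = losses = 0
    for i in range(len(df)):                 # one hypothetical entry per day
        entry = opens[i]
        tgt, stp = entry * (1 + target), entry * (1 - stop)
        for j in range(i, len(df)):          # scan forward until an exit level is hit
            if lows[j] <= stp:               # stop loss hit
                losses += 1
                break
            if highs[j] >= tgt:              # profit target hit
                wins += 1
                break
    closed = wins + losses
    return wins, losses, (100.0 * wins / closed if closed else float("nan"))

# Example use (with a hypothetical CSV of daily QQQQ bars):
# df = pd.read_csv("qqqq_daily.csv", parse_dates=["Date"])
# wins, losses, pct = baseline_win_rate(df)
# print(f"{wins} winners, {losses} losers, {pct:.2f}% win rate")
```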
What you are doing is called hindsight. At the time, in 2002, Harris picked up a trading system from APS based on actual historical data before that point. He did not know whether the market would trend upward or downward; his system generated trades in an unknown environment. You generated trades in a known environment, because you know the data after the fact. Your comparison and analysis are subject to hindsight. His was not, because his system was designed on data from before the period during which you randomly selected trades; that later data was not available to him at the time. I do not assume you understand the difference. What you have proved is that Harris' system did well following the market trend, which was unknown to him when he designed his system.
Anyone have any thoughts on the difference between bad and good curve fitting?

Bad Curve Fitting Example: a trading strategy with 8 technical indicators, each with say 2 parameters and 2 rules. You optimize all of these and appear to have a profitable system. However, on unseen market data the performance is pedestrian or terrible.

Good Curve Fitting: you have a predictive analytics application, say a neural network, PCA, SVM or other. These tools are essentially non-linear curve fitters. You break the historical data into three sets: Train, Test and Validation. Training is used by the application to curve fit the optimal solution, with modification based on the Test set, which only gives performance feedback but no information about the TIs, patterns or whatever is the basis of the system. Once the analytic model is done and has the best performance possible, you test it on the Validation set, which is totally unknown to the application, and performance is very similar. Then you test on live data in real time, and the performance remains very similar.

Does this thread make this distinction?
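To make the "good" version concrete, here is a minimal sketch of that three-way split. A scikit-learn MLP stands in for the neural network and random placeholder data stands in for real features and labels; none of this is anyone's actual model, just an illustration of the Train/Test/Validation discipline.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score

# Placeholder data: in practice X would hold indicator/feature values per bar
# and y a forward-looking label (e.g. "price rose 5% within N bars").
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 10))
y = (X[:, 0] + 0.5 * rng.normal(size=1000) > 0).astype(int)

# Carve off the Validation set first and never touch it while building the model.
X_dev, X_val, y_dev, y_val = train_test_split(X, y, test_size=0.2, shuffle=False)
# Split the remainder into Train (fits the weights) and Test (guides tuning only).
X_train, X_test, y_train, y_test = train_test_split(X_dev, y_dev, test_size=0.25, shuffle=False)

model = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
model.fit(X_train, y_train)

# If Test and Validation performance are similar, the fit has generalized;
# a large gap between them is the signature of bad curve fitting.
print("test accuracy:      ", accuracy_score(y_test, model.predict(X_test)))
print("validation accuracy:", accuracy_score(y_val, model.predict(X_val)))
```

Note the splits are not shuffled, which keeps the time ordering intact so the Validation set really is "future" data relative to training.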
To specifically answer what you asked, here are my views; after my views are some additional subordinate thoughts. Both examples you gave have been put on the table. For each case there has been some positive discussion, meaning some helpful information to minimize bad curve fitting and some helpful information to augment good curve fitting.

My thoughts. For each example a single methodology could have been used. The examples emphasize trading approaches more than the heart of the problem, i.e., people looking for something to have to deal with. Most people look for edges as that something. "Train" probably is not a way to get to square one. If only a few of the market variables are on the table, training doesn't work. The example I see most is people examining price data.

For whatever reason, looking at markets for over 50 years did something to my mind. Rather, my mind did things to itself, and how it viewed and did trading got me to somewhere. Trading is like driving a car for me. I feel that I am trained as a consequence of all those hours spent in a particular orientation. Naturally, indicators were created and iteratively refined. They turned into tools. Tools work because they have rules. I focused on timing the market by using the leading aspects of indicators. This meant that my tools were oriented to parasitically front-running the events that occur.

There is a caveat. Nothing productive can come of spending the time if the person is oriented to how to make money instead of how markets work. For whatever reason, I obeyed going through the process of finding out how markets work. That is where the analogy of driving a car came into my picture. I thought about how curve fitting either helps or hinders a person's progress in becoming competent at driving a car. Driving is a mental process, and doing it correctly and defensively is all done unconsciously and confidently.

Last comment. I examined thoroughly what I do in trading and why it is done the way it is. I certainly respect the efforts you make and the help and support you give to others. As we each came to a given fork in the road, it appears we took two different paths. What you have formulated to inquire about is very different from the considerations that were imposed upon me over the years. In a nutshell, for whatever reason, I have been deductively oriented. I believe, and can see through my interactions with the full spectrum of active traders, that there are many possible paths and most people choose to use induction. This difference of viewpoints leads to different approaches to iterative refinement. The correlations as to why people fail are closely related to how they go about thinking about trading. I am so glad I built my tools the way I did. Volume is clearly the most important variable of the market; I think a lot of people leave it out as they reason about things.
Thank you for your thoughts. I cited indicators as a way to simplify the example; however, they have serious limitations: 1) Most were conceived and designed decades ago, long before our current computational and data environment. 2) They were designed to simplify market conditions for human interpretation and decision making...to dumb down the market into a single number and a simple rule. For analytics and prediction, the mathematical components used to calculate the typical TI work much better than the indicators themselves. And dispensing with these, there are much better ways to capture current market characteristics that have future predictive capacity, such as Landmark.
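As a simple illustration of that point (MACD is just my example here, not Landmark or any particular product), here is a minimal sketch: instead of handing a model only the single collapsed indicator value, keep the raw components it is built from as separate features. The "Close" column name and the standard 12/26/9 spans are assumptions.

```python
import pandas as pd

def macd_features(close: pd.Series) -> pd.DataFrame:
    """Return both the collapsed MACD value and its underlying components."""
    ema_fast = close.ewm(span=12, adjust=False).mean()
    ema_slow = close.ewm(span=26, adjust=False).mean()
    macd = ema_fast - ema_slow                      # the single "dumbed down" number
    return pd.DataFrame({
        "ema_fast": ema_fast,                       # components kept separate
        "ema_slow": ema_slow,                       # so the model can weigh them itself
        "macd": macd,
        "signal": macd.ewm(span=9, adjust=False).mean(),
    })

# Example: features = macd_features(df["Close"])
```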
That's not "good curve fitting"; that's just a really complicated way to curve fit badly. It solves none of the primary problems.
Could you explain what you mean? A predictive model, which in the case of, say, a neural network is in simple terms an optimized set of weights and connection strengths between perhaps a few dozen inputs (TIs, mathematical measures, etc.) and an output characteristic, say a 5% rise in average price over a specified prediction window. The model's performance is stable and similar both on unseen historical data and on totally unknown future data with live money. Since it continues to make money, I'd personally call that "good" curve fitting. What is not good here, in your opinion?
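For concreteness, here is a minimal sketch of how such an output characteristic could be labeled from a price series, i.e. whether average price rises at least 5% within a fixed prediction window. The "Close" column name, the 20-bar window, and the helper name are assumptions for illustration, not anyone's actual implementation.

```python
import pandas as pd

def rise_label(close: pd.Series, window: int = 20, threshold: float = 0.05) -> pd.Series:
    """1 if the mean close over the next `window` bars exceeds the current
    close by at least `threshold`, else 0. Bars without a full forward
    window are dropped so every label is fully known."""
    # Mean of close[t+1 .. t+window], aligned back to bar t.
    future_mean = close.shift(-1).rolling(window).mean().shift(-(window - 1))
    label = (future_mean / close - 1.0 >= threshold).astype(int)
    return label.iloc[:-window]

# Example:
# labels = rise_label(df["Close"], window=20, threshold=0.05)
# Each row of the feature matrix X (indicator values at bar t) is then paired
# with labels[t] when training the predictive model.
```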
Your query had a lot of definition and showed a lot of thought. I am an example from that era you mention who made contributions to facilitate what people were doing. Market characteristics are just a few steps away from how markets operate. The market cycle manifests at some point as a consequence of examining the principles that create the description of a market as a system.

What came to me from the people who did items 1 and 2 above was that I could create tools for dealing with the present and the future coming into the present. I had turned to Information Theory, I found out later. A great transition has been going on over the years, and technology has developed in support of efficiency and effectiveness. One line in the sand I crossed was not getting involved in prediction. I guess MACD was an example. The designer designated values to specify an ongoing market sentiment. It was simply a go/no-go gauge. MACD was named appropriately, too.

What allows a person not to get involved in prediction? My approach was keeping track of current events. By that I mean keeping current on the status of the market's movement through its cycles. Most people substitute prediction instead.

It was very interesting to see the advent of technological capabilities. They focused on data processing, usually for pragmatic purposes. The ticker tape and chalk dropped out of the picture at some point. Storage of data followed. Trading was very offensive to some classes of people. Keeping my counsel was an issue in some ways. I was considered a "cheater", and how I cheated was made into a mystery by others.

Today, many people do "mine" for things that will be helpful. I still choose to use Information Theory as the basis for front-running the market's offer. Boolean logic allows me to work in the non-probabilistic fork of Information Theory. I do keep current on the literature of the contemporary preferences of those who purchase talent and processing packages. It is almost entirely connected to probabilistic Information Theory, except where dealing with facts is involved; that is precision data processing and not "prediction".

The era of interlinking electronics and mechanical systems was an exciting one. I liked working in the radar and missile venue when I was in college during summertime. It was before digital systems took hold; analog was then predominant. It seemed that the math chosen had to do with problem solving. Trading today, by most systems-oriented people, is a common feedback-systems approach to keep iteratively refining a construct. The construct is more about trading than about how the market works, however. Few packages run parallel systems. Having parallel systems is a necessity for me. Both are logic based, in the non-probabilistic sector of Information Theory. The market is so much slower than radar-based defense systems, but in a way there is a lot of emphasis on speed and rep rates.

Monday was a really good example of a critically damped system through bar 46 (5 min). It was like taking a huge system up to speed and achieving a balance in the dynamic. My guess is that a lot of people did not know this. Today was underdamped.....lol.... Late in the day the DOM was running well over 15K on each side...lol..
Jack, I hate to burst your bubble but stochastics is not a leading indicator of price as you've said it is: http://www.elitetrader.com/vb/showthread.php?s=&postid=1302778&#post1302778 http://www.elitetrader.com/vb/showthread.php?s=&postid=2437787#post2437787 Did this flaw in your belief system help you do -24% in that trading contest?