Avoiding Curve fitting

Code7 · Dec 30, 2009

Quote from intradaybill:

You should understand that in order to have curve-fitting there must be a mapping operation. If you went to a college you should remember that functions map a domain to a range.

Are you saying that C > C(1) is not a function and does not involve a mapping operation, like mapping a pair of closing prices to true or false?

Quote from intradaybill:

You are telling people that if they decide to trade the simple system C > C(1) they are curve-fitting because they could have chosen C > C(2) instead. This is what you are saying.

No, I did not say that. You missed out the backtest of all variations. You are not looking for the best fit if you just randomly decide something.

MarkBrown · Dec 30, 2009

i personally like c-c[1] myself.

intradaybill · Dec 30, 2009

Quote from Code7:

Are you saying that C > C(1) is not a function and does not involve a mapping operation, like mapping a pair of closing prices to true or false?

No, I did not say that. You missed out the backtest of all variations. You are not looking for the best fit if you just randomly decide something.

C > C(1) is not a function that maps prices to prices but in a broad sense can be considered a function that maps prices to truth values. For any given time series of prices, C, the mapping is fixed and cannot be fitted. The only way to change the mapping is to change the function, choose for instance C > C(2).

Thus, any function that contains only logical operations and has a truth value cannot be best fitted to its input because there is nothing you can do to accomplish that. The fit is "pre-selected" when selecting the function.

On the other hand, a function that does not contain only logical operations but also involves arithmetical operations, like for example C > aC(1)+b, can be best fitted by adjusting the parameters a, b. In this case, in addition to selection we also have curve-fitting potential.
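The distinction drawn above can be sketched in Python. This is only an illustration, not anyone's actual system: the scoring function `next_bar_pnl`, the parameter grids and the helper names are all assumptions made for the example.

```python
from itertools import product

def signal_fixed(closes):
    """C > C(1): one True/False per bar, starting at bar 1. Nothing to tune."""
    return [closes[i] > closes[i - 1] for i in range(1, len(closes))]

def signal_affine(closes, a, b):
    """C > a*C(1) + b: same kind of rule, but a and b are free parameters."""
    return [closes[i] > a * closes[i - 1] + b for i in range(1, len(closes))]

def next_bar_pnl(closes, signal):
    """Hypothetical score: sum of next-bar changes on bars where the rule is true.
    signal[k] belongs to bar k + 1; the last bar has no next bar and is skipped."""
    total = 0.0
    for k, is_long in enumerate(signal):
        i = k + 1
        if is_long and i + 1 < len(closes):
            total += closes[i + 1] - closes[i]
    return total

def fit_affine(closes, a_grid, b_grid):
    """Grid search for the (a, b) pair with the best in-sample score."""
    return max(
        product(a_grid, b_grid),
        key=lambda ab: next_bar_pnl(closes, signal_affine(closes, ab[0], ab[1])),
    )
```

With a = 1 and b = 0 the second rule reduces to the first, so as long as that pair is on the grid, the fitted version can never score worse in-sample than C > C(1). That is the "curve-fitting potential" in a nutshell; it says nothing about how either rule behaves on new data.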

Selection is unavoidable in every aspect of life, whether random or analytic, whereas curve-fitting may not be desirable or effective. Consider the following example: You are looking for an attractive woman (trading system). You can attempt to modify a not so attractive woman (under-performing system) or even an ugly woman that is fat (total loser) to an attractive woman by plastic surgery for example. You are trying a best fit in this case by adjusting any parameters you can. The result has high probability to fail.

On the other hand, a reasonable and natural way is to go out and look for the nice woman you want. You analyze each woman from a sample based on your criteria and you select the one that satisfies them. You select a target but your choice is subject to selection bias. You could have chosen another sample and your analysis may be limited to that. But in this case it does not matter that much if your selection satisfies your criteria. It matters when you define the criteria based on the available collection of data, which is not the case.

Your reference to backtesting results versus random choices is a red herring. If someone decides to trade C > C(1) based on backtesting and someone else picks it randomly, do you think the performance of each one will be different because one is "fitted", according to you of course, and the other is not? Just this simple example should convince you that you wrongly believe that the method of selection of a function influences the behavior of a function itself. Another absurdity of yours that is.

Code7 · Dec 30, 2009

I'm surprised you didn't reply to MarkBrown. The functions below are equivalent.
C > C(1)
C - C(1) > 0

What does that tell us about the relevance of arithmetical operations for curve fitting?

You are correct about the prevalence of selection. However, when it comes to trading systems I still find it arbitrary what you call fixed, unfittable and pre-selected. If time delta amounts are user provided inputs, they are changeable parameters and suitable for curve fitting. Any adjustment to improve backtest results here constitutes curve fitting.
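A minimal Python sketch of both points, under stated assumptions: the scoring function `in_sample_score` and the list of candidate lags are invented for illustration and are not taken from any poster's system.

```python
def rule_compare(closes, n=1):
    """C > C(n), written as a pure comparison."""
    return [closes[i] > closes[i - n] for i in range(n, len(closes))]

def rule_subtract(closes, n=1):
    """C - C(n) > 0, the same rule written with an arithmetic operation."""
    return [closes[i] - closes[i - n] > 0 for i in range(n, len(closes))]

def in_sample_score(closes, n):
    """Hypothetical score: sum of next-bar changes while C > C(n) is true.
    The k-th signal belongs to bar k + n; the last bar is skipped."""
    total = 0.0
    for k, is_long in enumerate(rule_compare(closes, n)):
        i = k + n
        if is_long and i + 1 < len(closes):
            total += closes[i + 1] - closes[i]
    return total

def best_lag(closes, lags):
    """Pick the lag with the best backtest score: the lag is now a fitted parameter."""
    return max(lags, key=lambda n: in_sample_score(closes, n))
```

The two rule functions return identical signals, so whether a rule is written with or without arithmetic does not decide whether it can be fitted. Once `best_lag` is run over a list of lags, the time delta is chosen by backtest result, which is the adjustment described above.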

Quote from intradaybill:

If someone decides to trade C > C(1) based on backtesting and someone else picks it randomly, do you think the performance of each one will be different because one is "fitted", according to you of course, and the other is not?

No. I do think curve fitting is not necessarily bad and often unavoidable. It's just an issue that needs to be dealt with, not denied.

Code7 · Dec 30, 2009

For illustration purposes, think about a Donchian channel breakout system. Does it make sense to consider any different length parameter a different system? Like for example 3342, 3343, 3344 minutes are all different systems you select. Guess what, it doesn't matter, the end result is the same. Curve fitting to the past.
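A rough Python sketch of the Donchian example. It is simplified on purpose: it uses closes only (a real channel would normally use highs and lows), looks at long breakouts only, and the helper names are made up for illustration.

```python
def donchian_breakouts(closes, length):
    """True at bar i when the close exceeds the highest close of the
    previous `length` bars. The k-th entry belongs to bar k + length."""
    return [closes[i] > max(closes[i - length:i]) for i in range(length, len(closes))]

def agreement(closes, len_a, len_b):
    """Fraction of bars on which two channel lengths give the same signal,
    measured over the bars where both signals are defined."""
    start = max(len_a, len_b)
    a = donchian_breakouts(closes, len_a)[start - len_a:]
    b = donchian_breakouts(closes, len_b)[start - len_b:]
    return sum(x == y for x, y in zip(a, b)) / len(a)
```

Calling `agreement(closes, 3342, 3343)` on a long enough minute series shows how often two neighbouring lengths disagree at all. Whether you call them two systems or one system with a parameter, picking the length by backtest result is the same operation.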

Rocko Bonaparte · Dec 31, 2009

A lot of what traders seem to consider to be curve fitting would be known as selection bias to other people. So when people talk about curve fitting on here that's what I tend to think about. I wonder if a lot of arguments are stemming from that.

It's similar to how algorithmic trading is more regarded as a way of distributing large orders, rather than using algorithms to make general trading decisions. I would have figured they'd be under the same umbrella, yet people seem to regard it as automated trading instead.

bashatrader · Jan 2, 2010

Hey guys, a lot of this talk may be meaningless. You all agree that curve-fitting is no evil, just something we must deal with. I do not entirely understand selection bias. I think it deserves more attention on my part.

Code7, if you get a chance take a look at this article by the "guru" you mentioned

Article about the robustness of patterns

Out of the 8 patterns Harris found with his software in 2002 and published in a magazine, two became slightly unprofitable, two became more profitable and the rest remained profitable during a forward test period of 6 years.

I do not know if these patterns are still profitable because I do not have APS but I find this result very impressive. Even if the patterns are not profitable any longer, a 6 year period of profitability is long enough. Most systems I have developed so far fail right after the first year.

Code7 · Jan 4, 2010

@ bashatrader

There is no mention of slippage and commission. Besides that...
Do you think you would see that page on his website if those patterns failed?
Do you know in how many magazine issues Harris presented some patterns?
Looks like you can start right here with your study of selection bias.
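
A toy Python simulation of that kind of selection bias. It has nothing to do with Harris's actual patterns or data: every "pattern" here is pure noise with zero edge, and the counts (1000 candidates, 50 trades, 8 kept) are arbitrary assumptions.

```python
import random

def simulate_selection(n_patterns=1000, n_trades=50, keep=8, seed=0):
    """Generate zero-edge patterns, keep the best in-sample ones, and
    return (average in-sample mean, average out-of-sample mean) of the kept set."""
    rng = random.Random(seed)

    def mean_return():
        # Mean of n_trades random trade results with no edge at all.
        return sum(rng.gauss(0.0, 1.0) for _ in range(n_trades)) / n_trades

    # Each pattern gets an independent in-sample and out-of-sample result.
    patterns = [(mean_return(), mean_return()) for _ in range(n_patterns)]
    chosen = sorted(patterns, key=lambda p: p[0], reverse=True)[:keep]
    in_avg = sum(p[0] for p in chosen) / keep
    out_avg = sum(p[1] for p in chosen) / keep
    return in_avg, out_avg
```

The kept patterns look good in-sample only because they were picked for looking good. Their out-of-sample average has no such help, which is why it matters how many candidates were looked at before the ones on display were chosen.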

I don't think that's very impressive. Just have a look at the test period. From 05/07/2002 to 08/22/2008 and all 8 patterns are long only.

Code7 · Jan 4, 2010

Quote from bashatrader:

two became slightly unprofitable, two became more profitable and the rest remained profitable

Nice way of saying that 6 out of 8 became less profitable.

jack hershey · Jan 4, 2010

Quote from bashatrader:

Can anyone provide an accepted definition of what curve-fitting stands for?

This thread and three or four others stand a chance of moving forward the common understanding of the opportunity the market presents.

maninjapan introduces a pragmatic, compelling concern and frames an issue.

Muskoka Joe gives maninjapan the means to proceed.
The two linchpins of the varied multi-participant constructive conversation are "parameter" and "thinking".

It is also important to learn from those making mistakes.

To attack the opportunity rigorously, I make the plea that a person stop and go to "empty". It will probably take a period of repeated skillful meditation. It is so important to relax when considering seminal matters.

Skilled reasoning persons have spoken on "parameter" and "thinking" elsewhere. Many people who have posted here demonstrate the consequences of understanding and knowledge AND the skill of applying their experiences.

It has been explained to maninjapan that he has to take another fork in the road. He cannot continue on his current path but must adapt to the advice of others.

What is very clear is the non stationarity of the matter. (See Joe).

What is also clear is that "parameter" is defined clearly and it relates to the scope and bounds of the foundational logic of the opportunity. This is a statement about opportunity and NOT a statement about the construct (signal generator) used to view the opportunity.

As most of you see, data seems to be driving the conversation. maninjapan has taken an idea that relates to an opportunity and he is iteratively refining the idea as a way to realize the opportunity.

Were he to follow Joe's several seminal suggestions, he would return to the opportunity for a while and define its parameters, taking them FROM the scope and bounds of the opportunity.

As the posters explain, they strayed far from the opportunity, and further did not define it in the first place.

Parameters are measures of the hypothesis set that define the opportunity. For the recorded explicit history of the markets, market definitions have not changed.

Cogently, Intradaybill set the standard of how to relate to market parameters and non stationarity always applies.

By "thinking" that data defines a market, you saw Galt go off the deep end and make mud out of parameters. He leapt by doing data gathering instead of thinking of the defined opportunity as it is defined in order that Intradaybill's parametric measure requirement may be met.

The definition of curve fitting is seminal: It is using data to change the means of parametric measures of a fixed hypothesis set.

Muskoka Joe has a hypothesis set. So do others. It came to them by deduction related to the operation of the market.

Maninjapan does not have this, so it was suggested that he go back and look to deduce the defining aspects of the opportunity. Pouring data into a mold does not produce anything except how the mold was defined.

How do patterns emerge from a pristine deduced opportunity? The path is a simple one and it was presented in this thread.

1. ML and IDB stated that a hypothesis set and its parametric measure exist. Certainly they do.

2. All data is to be converted so that the parametric measures may be used. This generates many degrees of freedom all set in place by thinking about the opportunity. My recommendation is about 70.

3. Only logic may be used to deal with the degrees of freedom. This involves several functional logical tools.

4. At any time, about 6 or 7 degrees of freedom provide certainty about what is going on in the Present with absolutely due regard to the non stationarity of the opportunity (and not the tooling).

5. Curvefitting is eliminated in the four parts above.

6. The "yield" of the above is what is fitted to the rules of the market, that is, how a person participates in making money.

By always having certainty in the Present, it is always apparent that three questions are always informed in a pristine manner.

For reference the questions are:

1. Where are you in the cycle.

2. What is next, and

3. How fast is it changing.

Maninjapan wishes to trade the breakout of the price and take the offer for that segment of price movement. He wishes to detect, in advance I reason, when that time is coming. He implied the cyclic nature of markets. Market sentiment expresses this at all times.

Muskoka Joe suggested to him to put the opportunity and its measurable variables on the table. He wanted maninjapan to have that intellectual experience. I agree it is a terrific experience.

Look at ATR and see, as a degree of freedom, how remote it is from the initial degrees of freedom. Maninjapan has been requested to "lookback" at better resources.

It can be seen that retracting back from curve fitted is a just thing to do. Going to curvefitting is straying from "thinking" about the opportunity.

For me working through the "Riddle of Induction" was a good "thinking" opportunity. A paradigm based solution emerges; a solution to the riddle of induction. Take that trip by all means. This is far afield for those who have committed their lives and reasoning by using induction.

If a person thinks his way towards Intradaybill's requirements for handling degrees of freedom parametrically, then he sees the necessity to follow Keynes's admonition to deal in "like kind" for both hypotheses and for their parametric measure. In trading this is "simple".

If maninjapan goes back and sees the variable relationships as a hypothesis set, then he sees how to measure parametrically.

As he drills down into this "market system", he gets to have all the advantages of its construction. This precipitates a degree of precision to any extent he wishes to pursue.

Note: The 70 degrees of freedom have mostly to do with the fractal nature of what emerges from the market opportunity construct. The interwoven nature of fractals produces "parametric outputs" on many levels as the order of events unfolds. The term "breakout" is an example of a parametric output. It occurs well after the beginning of a profit segment and as such it is often used by some as what is known as "confirmation" or verification.

Because of the very great influence of inductive reasoning, it may be said from that context that the market is counterintuitive. What actually is at hand is that the real parametric measures of the market have largely slipped between the cracks.

If after a while, curve fitting is off the table and using data to inductively discover the market opportunity is discarded, then a new standard of trading performance emerges. The standard becomes the "market's offer". Discovering the market's offer takes some doing. One thing that can be said for sure is that the market's offer is always being inferred or implied. I hope most people begin to notice it.

When people work on methods of trading to make money, they usually curve fit. What they are doing is using data to test something they have created to extract money out of the market. This divorces the real signals of the market based on parametric measures of real market behavior from the artificial signal generator (coded software usually) the person has created.

In conflict in this process is the inductive work to create a signal generator as compared to a parametric measuring device based on a hypothesis set deductively obtained that signals the market's behavior at critical points in the market cycle.

The market operating paradigm was set forth and established many generations ago. At that time, reason prevailed. Today, the convenience of data processing has led the financial industry into massive inductive data processing. Often the search centers around seeking what are called advantages as a result of market anomalies and inefficiencies. As previously stated, the forest is no longer seen because the trees get in the way.

The alternative to curve fitting to achieve higher extraction velocities is to iteratively refine timing by using the fractal nature of patterns in the market. Fortunately, with full consideration to non stationarity and granularity, one pattern emerges and extraction comes down to more of a consideration of the market's capacity.