I was thinking about two types of parameters one can use in a model: open-ended parameters like SMAs or percentage-based filters, where the value can be anything from zero to infinity, and range-bound parameters like a percent, where the value can only fall within a certain range, such as 0% to 100%.
Is either of these two types more susceptible to curve-fitting? It would seem the first type is, because the ability to calibrate the parameter to historical data is nearly unlimited. Someone could conclude that a 53-day SMA with a 13.2% volatility filter were the optimal values for an entry strategy. With the second type, your ability to fit to a curve is limited by the range the variable can take on, meaning you'd be better off basing a model on the second type, to the extent that you can.
I'm just thinking out loud a bit, so if this simple comparison and conclusion is flawed, I'm happy to hear why. I realize that you'd kind of have to ignore the potential for infinite subdividing of the range-constrained parameter, so that you don't end up with a value like 15.898798798% as your model input.
Open-ended parameters are constrained by maxbarsback. If you have 4,000 bars in a file, it does not make any sense to use a 4,001-bar SMA. So there goes your otherwise nice try.
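To make the data limit concrete, here is a minimal sketch in plain Python. The `sma` helper and the synthetic 4,000-bar series are assumptions for illustration only, not any platform's actual maxbarsback logic. It simply counts how many bars have a defined average for a given length.

```python
def sma(values, length):
    """Simple moving average; None where fewer than `length` bars are available."""
    if length <= 0:
        raise ValueError("length must be positive")
    out = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= length:
            # Drop the bar that just fell out of the window.
            running -= values[i - length]
        out.append(running / length if i >= length - 1 else None)
    return out


def usable_bars(values, length):
    """Number of bars on which the SMA is actually defined."""
    return sum(x is not None for x in sma(values, length))


# Hypothetical file of 4,000 bars (synthetic prices, illustration only).
bars = [float(i) for i in range(1, 4001)]

for length in (53, 4000, 4001):
    print(length, usable_bars(bars, length))
```

With 4,000 bars, the 4,000-bar SMA is defined on exactly one bar and the 4,001-bar SMA on none, so the "open-ended" range is capped by the data in practice.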
OK, so there is a practical difficulty, in some cases.
Does that mean that the distinction is invalid and that there really aren't two types of parameter here?
I suppose on the most macro level, if a market has been traded for 50,000 days, it can't make sense to use the 50,001-day SMA, so not only would there be data limitations on the parameter values, there would be historical limitations as well.
It still seems intuitive (which isn't always correct, obviously) that the fewer values a parameter can take on, the less susceptible to curve-fitting the model would be, which would mean that binary parameters would be the least likely to be curve fit, which seems correct.
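One way to probe that intuition is a toy simulation. This is only a sketch under a strong assumption: every candidate parameter value has zero real edge, and its in-sample score is pure Gaussian noise. The function name `best_in_sample` and the seed are made up for the example.

```python
import random


def best_in_sample(num_candidates, seed=0):
    """Best score among `num_candidates` parameter values that all have zero true edge.

    Each candidate's in-sample score is drawn as pure noise (assumption),
    so any apparent "winner" is luck, not signal.
    """
    rng = random.Random(seed)
    scores = [rng.gauss(0.0, 1.0) for _ in range(num_candidates)]
    return max(scores)


# A binary parameter tries 2 values; a finely gridded one may try thousands.
for k in (2, 10, 100, 1000):
    print(k, round(best_in_sample(k), 3))
```

Because the same seed is reused, each larger set of candidates contains the smaller one, so the best score can only stay the same or improve as more values are tried. Note that this counts values actually tested, which is not the same thing as the width of the range.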
You have to look at the sensitivity to changes in parameter values and not at the range of values. There are infinitely many real numbers between 1 and 2, just as there are between -100 and +100. Ranges mean nothing. Sensitivity is important. You have to look at the partial derivative (where is that moron quant by the way?) of the objective with respect to that parameter as a function of time. It is a nasty problem, but it can be done numerically using polynomial fitting.
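A minimal sketch of the sensitivity idea, not the poster's actual method: fit a parabola through the objective at three neighboring parameter values and read off its slope and curvature at the center. For equally spaced points this reduces to the usual central differences. The two objective surfaces (`plateau` and `spike`) are hypothetical stand-ins for a real backtest result.

```python
def sensitivity(objective, p, h=1.0):
    """Slope at p of the parabola through objective(p-h), objective(p), objective(p+h).

    For equally spaced points this equals the central difference.
    """
    return (objective(p + h) - objective(p - h)) / (2.0 * h)


def curvature(objective, p, h=1.0):
    """Second derivative of the same fitted parabola."""
    return (objective(p + h) - 2.0 * objective(p) + objective(p - h)) / (h * h)


# Hypothetical objective surfaces (illustration only, not real backtests).
def plateau(p):
    # Broad, flat optimum around p = 50.
    return 10.0 - 0.001 * (p - 50.0) ** 2


def spike(p):
    # Narrow optimum around p = 53.
    return 10.0 - 1.0 * (p - 53.0) ** 2


for name, f in (("plateau", plateau), ("spike", spike)):
    print(name, sensitivity(f, 55.0), curvature(f, 55.0))
```

To get the "function of time" part, one would presumably recompute the objective on successive data windows and track how these numbers move; that step is left out here.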
You had me until you mentioned needing to look at it as a function of time.
Sounds like you are saying the less sensitive, the better, unless you want to use polynomials. Personally, I do not use them, with one exception.
I know relatively little about advanced statistics and the science of modeling. However, I have worked with Neuroshell (a neural networking program for the market) for a number of years. I have found that range-limited items work better for longer modeling periods. If I look at the change in the Close over a period of years, as the price of an instrument climbs, the network sees a $1 change in a $50 stock as different from a $2 change in the same stock when it is at $100. However, if I look at the percent change in the Close (2% in both cases), the net sees the same value. Thus, range-limited info seems to give more robust results for longer-term models. Short-term modeling does not suffer to the same extent. At least this is true in the models I can develop. More sophisticated modelers might not find this to be true in their models. This might not apply to instruments that by their nature are more range-bound, like short-bonds. But I've never tried modeling them.
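The arithmetic behind that example, as a small sketch (the helper names are made up, and this has nothing to do with how Neuroshell computes its inputs):

```python
def absolute_changes(closes):
    """Bar-to-bar change in price units."""
    return [b - a for a, b in zip(closes, closes[1:])]


def percent_changes(closes):
    """Bar-to-bar change as a percentage of the prior close."""
    return [100.0 * (b - a) / a for a, b in zip(closes, closes[1:])]


# Same stock at two price levels, years apart.
at_50 = [50.0, 51.0]     # a $1 move
at_100 = [100.0, 102.0]  # a $2 move

print(absolute_changes(at_50), absolute_changes(at_100))  # raw inputs differ
print(percent_changes(at_50), percent_changes(at_100))    # percent inputs match
```

The raw changes are 1.0 and 2.0, while both percent changes are 2.0, so a model fed percent changes sees the two moves as the same input.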
Jack