I would like to start a discussion on the optimal number of degrees of freedom a system can have without being curve fitted. What has your experience been in playing with different degrees of freedom? How did you go about finding the right balance: optimized enough that the system can extract the profits it is capable of, but not so much that it's curve fitted and will fall apart in the future?
Toughest question there is. We have 1-3 main parameters for each system, plus up to a dozen for signal filtering and money management. I guess most people won't have many fewer if they count honestly. A dozen equally important parameters for the signal alone is highly overfitted, IMO. A single one will not "make it". Four or five might be the upper limit, but that is a very generalised statement. Be careful not to lump different indicators together under one parameter and believe that it truly is just one when it is already, say, five ... I think that there is no real final answer to your question. It depends too much on the circumstances of the specific system. Check out some of acrary's posts. I would have thought that he is the ultimate fitter ... yet he proves he is not. So the same number of parameters can mean very different things when applied by a pro or by a newbie ...
Many degrees of freedom and overfitting do not mean the same thing. Think about it: the human brain is the best curve-fitting machine known to exist, and it is very good at generalization.
Stephen, you are right about the amazing human brain. Conversely, it is often just as amazing how difficult it is to translate the brain's seemingly clear conclusions into machine code that tries to duplicate the process. Further, in truth, the brain doesn't bother about "degrees of freedom" and "overfitting". These are rigorously defined mathematical concepts. If they help one to squeeze money out of the markets, good for the practitioner. Too many market losers juggle this kind of impressive-sounding terminology for lack of something better. nononsense
Nonsense, I dare to contradict: the brain is very aware of fitting and error processes. Just think of seeing someone in the distance who looks similar to someone else. We are very well able to decide to what extent we believe it really is that very person. Actually, the only really relevant factor in pattern recognition is the balance between fuzziness ...
I think that if you want to introduce a lot of degrees of freedom and optimisation, you can make it work well for you with a bit of effort. Here's one idea for how to do it. The probability that a single optimisation with many degrees of freedom will turn up some good parameter sets purely by chance is pretty high, so the first thing you should know is what that chance figure is. One way of finding out is to compare the results to an optimisation run in which each setting is chosen at random whenever one is required. At the end, map or scatter-plot both the random and non-random optimisations. If the degree of curve fitting is small, you should see a much greater degree of autocorrelation in the non-random optimisation. What would the plots look like if the results were not curve fit? Large areas of the non-random optimisation should be peaks and troughs with some noise between them (like smooth, rolling high hills and low valleys). The random optimisation should show mostly noise, like flat but very bumpy or uneven ground. If you want a figure to describe the result, you could do this: take every combination of nine neighbouring optimisation results for the random and non-random runs. Take the median value for the non-random run and determine which percentile of the random run it would fall into. In the end you could say, for example, that the median (50th percentile) non-random result was equivalent to the 99th percentile random result (in neighbouring correlation). Next, you want to know how profitable the best non-random nine-neighbour average is (again, use its percentile against the random set).
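The nine-neighbour comparison described above can be sketched roughly as follows. This is only an illustration under several assumptions that are not in the post: the optimisation results sit on a 2-D grid of two parameters, the surface itself is synthetic (`optimised` is made up, not real backtest output), the "random" run is approximated by scattering the same results over the grid at random, and "neighbouring correlation" is approximated by the spread (standard deviation) inside each 3x3 block, where a lower spread means more similar neighbours. The names `window_stats` and `percentile_of` are hypothetical helpers.

```python
import numpy as np

def window_stats(surface):
    """Spread and mean of every 3x3 block of neighbouring results."""
    windows = np.lib.stride_tricks.sliding_window_view(surface, (3, 3))
    flat = windows.reshape(-1, 9)          # one row per nine-neighbour block
    return flat.std(axis=1), flat.mean(axis=1)

def percentile_of(value, reference):
    """Percentage of reference values that are <= value."""
    return 100.0 * np.mean(reference <= value)

rng = np.random.default_rng(0)

# ASSUMPTION: a synthetic "net profit" surface over a 20x20 grid of two
# parameters, smooth hills and valleys plus a little noise.
x, y = np.meshgrid(np.linspace(0, 3, 20), np.linspace(0, 3, 20))
optimised = np.sin(x) * np.cos(y) + 0.1 * rng.standard_normal(x.shape)

# ASSUMPTION: stand-in for the random run, the same results placed in
# random grid cells so that neighbours are unrelated.
randomised = rng.permutation(optimised.ravel()).reshape(optimised.shape)

opt_spread, opt_mean = window_stats(optimised)
rnd_spread, rnd_mean = window_stats(randomised)

# Smoothness: where does the median non-random block rank among random
# blocks? Spreads are negated so that a higher percentile means smoother.
smoothness_pct = percentile_of(-np.median(opt_spread), -rnd_spread)

# Profitability: where does the best non-random nine-neighbour average
# rank among the random nine-neighbour averages?
best_pct = percentile_of(opt_mean.max(), rnd_mean)

print(f"median non-random smoothness = {smoothness_pct:.1f}th percentile of random")
print(f"best non-random block profit = {best_pct:.1f}th percentile of random")
```

With real data, `optimised` would be replaced by the actual grid of backtest results, and the random run would come from actually re-running the optimisation with randomly chosen settings rather than from a shuffle.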
Yes, of course it is incredibly hard to reverse this process. Some of the world's brightest minds have been working on this problem intensely, with lots of amazing discoveries made in the last few years. Degrees of freedom: "The number of degrees of freedom in a problem, distribution, etc., is the number of parameters which may be independently varied." I'd dare say that the human brain has a huge number of parameters that are varied constantly throughout the life of a brain, and there is a high level of interaction between the data and the parameters. Overfitting can be avoided by carefully understanding how you are optimizing, using regularization techniques, Bayesian methods, reinforcement learning, etc. This terminology is very basic and is not "impressive sounding". The brain actually CAN overfit: someone who is not reasoning properly might make causality assumptions when none exist in reality.
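As one concrete instance of the regularization mentioned above, here is a minimal sketch of ridge (L2-penalised) least squares. It is an illustration only, not anyone's trading method: the data are synthetic, the twelve "parameters" are made-up regressors, and `ridge_fit` and the penalty value of 10 are hypothetical choices. The point is that the penalty shrinks the fitted weights, which limits how hard many free parameters can chase noise.

```python
import numpy as np

def ridge_fit(X, y, penalty):
    """Closed-form ridge solution: (X'X + penalty * I)^-1 X'y."""
    n_params = X.shape[1]
    return np.linalg.solve(X.T @ X + penalty * np.eye(n_params), X.T @ y)

rng = np.random.default_rng(1)

# ASSUMPTION: 30 observations of 12 candidate inputs, of which only the
# first one carries any real signal; the rest is noise.
X = rng.standard_normal((30, 12))
y = 0.5 * X[:, 0] + rng.standard_normal(30)

w_free = ridge_fit(X, y, penalty=0.0)    # ordinary least squares
w_ridge = ridge_fit(X, y, penalty=10.0)  # penalised fit

print("unpenalised weight norm:", np.linalg.norm(w_free))
print("penalised weight norm:  ", np.linalg.norm(w_ridge))
```

The penalised weights always have a smaller norm than the unpenalised ones; whether that improves out-of-sample results depends on the data and on how the penalty is chosen.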
Stephen, your piece lacks some common sense. All I meant to say is that if you have to worry about the degrees of freedom and overfitting within your brain, I would suggest that not much fruitful thinking is taking place. Don't forget that all mathematical concepts, including the "overfitting" and "degrees of freedom" so beloved in these posts, are only fictions of human construction. Sometimes mathematics has been helpful in gaining a frail human understanding of the physical world. Such understanding is almost certainly dated and will be considered crude and obsolete in the not too distant future. This applies of course to your: "Overfitting can be avoided by carefully understanding how you are optimizing, using regularization techniques, bayesian methods, reinforcement learning, etc.", your "etc." being quite prudent. As we are dealing here with theories of speculation, I can tell you, after having spent a considerable part of my life on the above, that most of these are useless if you ever aspire to make some money in the markets. Especially, don't mix in pretensions of a simplistic understanding of the inner workings of the brain. Stop dreaming and look for something better. In fact, you could put your brain to work on this instead of mimicking control and estimation procedures that do appear useful for some physical processes. If you don't, come back in about 10-20 years to tell us about your pipe dreams. nononsense