Avoiding Curve fitting

Jerry030 · Jan 5, 2010

Quote from Random.Capital:

How did you pick the weights for the NN?
More...

The weights for the NN or any similar analytics software are not picked by the human but discovered by the software. The process is similar in some ways to learning in the human brain (which is why the software is called a neural network) through trial and error. Like learning to ride a bike...after a number of attempts and falling down a bit your brain and body learn how to balance.

This automated discovery of the unknown is where the term data mining comes from.

The only human decision process here is deciding what data elements to present to the software (TI, pattern dynamics, mathematical transforms) and what goal to predict in the future (change in price, end of trend, etc)

What might make this process âgoodâ curve fitting is the non-linear nature of the math. As discussed so called system optimization with a few dozen TIs tend to create systems doomed to fail. This process is linear (If RIS > .8 AND MACD > X and Channel status = Low, etc., etc., then buy. The success of non-linear system in the market suggests that the market itself is non-linear. So trying to trade it with linear tools is a bit like trying to contain water on a piece of paper. A bucket works better as it has one more dimension than a flat sheet of paper and water is 3 dimensional so the tool matches the task.

Code7 · Jan 6, 2010

Quote from intradaybill:

What you are doing is called hindsight.
More...

Yes, evaluating walk-forward performance has to be done in hindsight. However, that's completely irrelevant for the question if the results were better than random. In this case, the results were amazingly close to random, essentially identical. As a matter of fact, the APS patterns had no edge in the out-of-sample period because random entries did equally well.

Harris made the strategic decision to trade long-only back in 2002 but that has nothing to do with the robustness of APS patterns. I did the same and also traded long-only to recreate his setup, only difference was not using APS patterns to expose their effect. They hade none.

Quote from intradaybill:

Your comparison and analysis is subject to hindsight. His was not...
More...

Wrong. Harris published the updates after the fact and I bet he would not have posted results in case of an overall loss.

Random.Capital · Jan 6, 2010

Quote from Jerry030:

The process is similar in some ways to learning in the human brain (which is why the software is called a neural network) through trial and error.
More...

The "trial and error" process is nothing more than iterating variations of the parameters over and over over again, against the same set of data, until the desired result is found. That should sound familiar, as it is also the description for curve-fitting.

This automated discovery of the unknown is where the term data mining comes from.
More...

Data-mining is a somewhat related concept, in that you're looking for correlations (in the conceptual sense), but it's a different process. Training a NN isn't mining, it's fitting.

It doesn't sound like you have a solid understanding of the math behind all this - be careful!

Jerry030 · Jan 6, 2010

Quote from Random.Capital:

The "trial and error" process is nothing more than iterating variations of the parameters over and over over again, against the same set of data, until the desired result is found. That should sound familiar, as it is also the description for curve-fitting.

Data-mining is a somewhat related concept, in that you're looking for correlations (in the conceptual sense), but it's a different process. Training a NN isn't mining, it's fitting.

It doesn't sound like you have a solid understanding of the math behind all this - be careful!
More...

RandomCapital,

Your understanding on data mining might have some random ideas mixed in. Perhaps you'd care to read the definition of data mining on Wikipedia?

Data mining
Data mining commonly involves four classes of task:[11]

Classification - Arranges the data into predefined groups. For example an email program might attempt to classify an email as legitimate or spam. Common algorithms include Decision Tree Learning, Nearest neighbor, naive Bayesian classification and Neural network.

Clustering - Is like classification but the groups are not predefined, so the algorithm will try to group similar items together.

Regression - Attempts to find a function which models the data with the least error.

Association rule learning - Searches for relationships between variables. For example a supermarket might gather data on customer purchasing habits. Using association rule learning, the supermarket can determine which products are frequently bought together and use this information for marketing purposes. This is sometimes referred to as "market basket analysis".

http://en.wikipedia.org/wiki/Data_mining

Note: Neural Networks as a Classification method in "data mining". Perhaps you want to rewrite the article for them?

Does anyone know why ET has so many people who think they have great understanding on many topics when they in fact are clueless? I don't see that on other similar groups.

In any case, while RandomCapital does a refresher on this concept, rest assured that the discovery of useful data in the markets, data mining using neural networks is possible and represents a kind of Good Curve fitting. Good because the result continues to work in live trading.

Muskoka Joe · Jan 6, 2010

As a curious observer I was wondering which institutions you are familiar with that are using NN to trade?

bashatrader · Jan 6, 2010

Quote from Code7:

Yes, evaluating walk-forward performance has to be done in hindsight. However, that's completely irrelevant for the question if the results were better than random. In this case, the results were amazingly close to random, essentially identical. As a matter of fact, the APS patterns had no edge in the out-of-sample period because random entries did equally well.
More...

I think you are not being fair for whatever reason. Maybe you are not very experienced with trading system analysis. Maybe you are a competitor of Harris, who knows. These are the facts though and I agree with Bill on this one:

(1) Harris, or better APS, designed the QQQQ system based on data prior to year 2003.

(2) You know the data after 2003 and your system is not random at all. You decide to go long only because you know the data and the system Harris developed. There is nothing random about the system you selected. The only randomness is in your selection of trades to evaluate the performance. I did my own analysis and I calculated a success rate slightly better than 50% in the case of long-only trades. This makes sense. It appears that you selected a sample biased enough to match the performance of the Harris system so you can declare it random.

(3) Long-only systems during an uptrend do not always quarantee profits as many of us have unfortunately experienced in the past. During uptrends prices do not go up on a straight line and to stay profitable a system must avoid buying the highs and getting stopped at the lows.

(4) I also think that your analysis if seriously flawed by both hindsight as Bill mentioned but also by the fact that you mix prior and posteriori probabilities in a peculiar fashion that is an indication you do not understand their significance.

Quote from Code7:

Wrong. Harris published the updates after the fact and I bet he would not have posted results in case of an overall loss.
More...

I do not understand what point you are trying to make here. Could Harris have published the performance of his QQQQ system before the fact? The point Harris tried to make was to show that some patterns from APS he has published already had a high survival rate after 6 years from their discovery. You are making a hypothesis contrary to the fact, the hypothesis that he would not have posted the resutls should they have not been profitable. This accounts to a serious logical fallacy. I may not agree with some parts of Harris work and I do not like the way he runs his company but your analysis is not fair and actually flawed, especially when confusing prior and posterior probabilites.

Quote from Code7:

As a matter of fact, the APS patterns had no edge in the out-of-sample period because random entries did equally well.
More...

No, you are completely off. An edge is the mean amount the performance of a system exceeds break-even performance. An edge is an absolute measure, not a comparison to some other system performance. You can say that a system has the same edge as a random system but you cannot claim that a system has no edge because it performed as well as a random system. Saying that a system has the same adge as a random system means nothing too. I am happy my current system has a success rate of 56% and a profit factor of 1.87 although some "random system" you can think of may do better than that.

I think you misunderstand some basic concepts in trading system design and I have to align with Bill on this matter.

Rocko Bonaparte · Jan 6, 2010

I've seen neural networks used for classification by encoding the outputs to represent different situations. Say, detecting letters in an image would be a classification problem, and the neural network could have a different output neuron for each letter.

The "regression" method as in the Wikipedia article is something I tried to do once by clobbering together some random indicators with their first and second level derivatives (derivatives in the differential equations sense, not in the financial sense). It didn't work, and I'm pretty sure it was because the indicators weren't good to begin with. None of them were profitable alone, nor did I ever find a combination that helped. The neural network was a red herring that side tracked me for months due to a few bugs in its implementation; it was fools gold in the end.

I could see a neural network basically curve fitting problematically if it's applied to a narrow set of data without having enough data to counter it. And that's where curve fitting would be a problem with anything else.

Rocko Bonaparte · Jan 6, 2010

The argument still seems to be good/bad curve fitting, and even if it's completely bad in the first place. And then you can complicate it by accounting for trading style. Maybe for one person's trading style it's a bad idea, but essential for another trader.

The strategy I'm trying to develop would basically run swing trades from market open to market open. I can take historical data and map out from that the "perfect" indicator that would, if I acted on a particular day, show what my percent gain would be. Of course I'm time shifting to make that function, so it's never available for forward testing. But if there's something to try to fit a curve to, it's that.

I think a big problem would be if I found a way to fit, say, 20 consecutive points, but outside of that it's complete nonsense. That would be curve fitting in the bad way. If instead I found something that generally was able to approximate the function, then I've found my magic button.

BTW I doubt I'd ever find that magic button. So long as it's "good enough" I'm happy.

Random.Capital · Jan 6, 2010

Quote from Jerry030:

Does anyone know why ET has so many people who think they have great understanding on many topics when they in fact are clueless?
More...

You need a mirror, badly. It is 100% clear you do not understand what you are talking about. Regression is exactly how "classification" NN are trained.

I'm out.

Cheers.

Wonder what the odds are of a product-pump coming next...?

Jerry030 · Jan 6, 2010

Quote from Muskoka Joe:

As a curious observer I was wondering which institutions you are familiar with that are using NN to trade?
More...

Youâre not going to find many large or small institutions who will reveal much about the specifics, either for the conceptual design or software used at the center of their automated trade decisions. Itâs a highly proprietary and competitive area. Your best bet is to find firms making the software like shown below and try to find investment firms listed on their web site as customers.

Also IT job boards are a good way. Position requirements often list the specific software experience as a plus.

I know a few companies from my own consulting work but confidentiality agreements prevent their disclosure.

http://www.prlog.org/10244909-addaptron-software-releases-neural-network-stock-predictor.html

Addaptron Software Releases Neural Network Stock Predictor

Addaptron Software released a new upgraded version of Neural Network Stock Trend Predictor, NNSTP-2. It is a tool for stock market investors and traders to predict stock prices for short terms and to find the best timing to buy and sell.

Forecast and back-testing by NNSTP-2
FOR IMMEDIATE RELEASE

PR Log (Press Release) â May 27, 2009 â Neural networks can discover patterns in data and successfully predict the future trend. A small Canadian company, Addaptron Software - the developer of decision support tools for stock investors and traders - has developed NNSTP-2 (http://www.addaptron.com/neural-network-forecast.htm), neural network computer tool, to help stock traders in predicting stock prices within 1-60 days. NNSTP-2 predicts future share prices using Fuzzy Neural Network (FNN). It operates automatically when creating the FNN, training it, and mapping to classify a new input vector. The input data â¦â¦..