Testing variables

OddTrader · May 26, 2004

The following article seems interesting.

http://www.cs.rpi.edu/~zaki/dmcourse/notes/9-8-03/9-8-03.pdf

Grob109 · May 26, 2004

How do I open this?

I am glad you restarted the thread and the RPI ref intrigues me (alma mater thing). I had a physics prof whom I ran into at NFS years later. His name began with Z. We didn't hit it off.

OddTrader · May 26, 2004

Just clicking the link works for me. Otherwise, Googling "correlation zaki" will do; it's the very first result.

Perhaps Zaki is the professor you met.

Grob109 · May 26, 2004

When I click it asks if I want to show it or stow it. I fill in show and it says a list of ways to show. I click other out of ignorance.

Then a menu comes down that has four choices all of which are related to nothing related to anything.

Naturally, I have an aversion to getting into programming because in my past it was all consuming and I was regarded as whatever at IBM.

OddTrader · May 26, 2004

Perhaps try the HTML version (which however is incomplete):

http://www.google.com.au/search?q=c...otes/9-8-03/9-8-03.pdf+Correlation+zaki&hl=en

BKuerbs · May 26, 2004

To illustrate my opinion about the differences (and commonalities) between the three areas, I want to use the example of pivot points.

Do pivot points define a profitable trading system? The idea is to enter a short trade, when e.g. the first level of resistance has been reached. Vice versa for the level of support. Positions are closed when the predefined stop loss is reached or a target level has been reached or the market closes.
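
The post does not say which pivot formula is meant. A minimal sketch, assuming the classic floor-trader definition computed from the previous day's high, low and close; the function name and the sample prices are made up for illustration:

```python
def pivot_levels(prev_high, prev_low, prev_close):
    """Classic floor-trader pivots (an assumption; the post names no formula)."""
    pivot = (prev_high + prev_low + prev_close) / 3.0
    r1 = 2.0 * pivot - prev_low    # first level of resistance
    s1 = 2.0 * pivot - prev_high   # first level of support
    return pivot, r1, s1


# Hypothetical previous-day prices, for illustration only.
pivot, r1, s1 = pivot_levels(prev_high=110.0, prev_low=100.0, prev_close=105.0)
print(pivot, r1, s1)
```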

This question is already formulated the way a trader would ask it: s/he asks for profits that are stable, with low drawdown etc. But these questions are secondary: when the system is not profitable, it is of no further interest.

In statistics, the question will be formulated differently. The "system" defines a condition for entering a trade: price must reach R1. Now you will ask: are the returns conditional on my entry (price reaches R1) different from the returns when the condition is not fulfilled? This may be tested using a contingency table. Preparation of the data is the work involved here (how do you define "condition not fulfilled"?).
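
A contingency-table test of this kind could look roughly like the sketch below. It assumes the simplest case, a 2x2 table (condition fulfilled or not, day's return down or up), and uses the plain Pearson chi-square without continuity correction. The counts are invented, and how to classify the days is exactly the data preparation left open above.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square for the 2x2 table [[a, b], [c, d]].

    No continuity correction. Returns (statistic, p_value) for
    one degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if min(row1, row2, col1, col2) == 0:
        raise ValueError("every row and column needs at least one observation")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom the upper tail equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Invented counts, for illustration only.
# Rows: price reached R1 / did not. Columns: day closed lower / higher.
stat, p_value = chi_square_2x2(30, 20, 45, 55)
print(f"chi-square = {stat:.2f}, p = {p_value:.3f}")
```

A large p-value says the returns after the condition look no different from the rest. A small one is only the statistical edge, not yet the economic one.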

Another way to test this may be to ask whether R1 is "near" the high of the day. In such a case, opening a short position may be promising. You may do so by comparing the distance |R1 - High| to the range of the day, i.e. |High - Low|. If these two distances are statistically different (using ANOVA, a rank test, or whatever is appropriate), you may have an edge. (Testing R1 for correlation with the high is not a valid test.)
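
The data preparation for this comparison might be sketched as follows. It only computes the two distances per day and whether R1 was reachable at all; the statistical test itself (ANOVA, rank test) is left out. The helper name and the price triples are hypothetical.

```python
def r1_distance_ratio(r1, high, low):
    """|R1 - High| expressed as a fraction of the day's range |High - Low|."""
    day_range = abs(high - low)
    if day_range == 0:
        raise ValueError("day with zero range")
    return abs(r1 - high) / day_range


# Hypothetical (R1, high, low) triples, one per day.
days = [(110.0, 112.0, 102.0), (108.0, 107.0, 103.0), (105.0, 105.5, 100.5)]

ratios = [r1_distance_ratio(r1, high, low) for r1, high, low in days]
# A limit sell at R1 can only be filled if the high got at least that far.
reachable = [r1 <= high for r1, high, low in days]
print(ratios, reachable)
```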

This is the crucial point where statistics differs from backtesting programs: the backtest program may tell you that you have a nice system with a lot of nice numbers, while the statistical test tells you that the returns based upon your "condition" are nothing special. They are no better than entering the market at random.
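
One way to make "no better than random" concrete is a Monte Carlo comparison: draw as many trades at random as the system took and see how often the random picks do at least as well. This is only a sketch of that idea, not something the post prescribes; the function name, the trial count and the return series are assumptions.

```python
import random


def random_entry_p_value(all_returns, system_returns, n_trials=10_000, seed=0):
    """Share of random trade selections whose mean return is at least
    as high as the mean return of the system's trades.

    all_returns: return of every candidate trade (e.g. one per day).
    system_returns: returns of the trades the system actually took.
    """
    rng = random.Random(seed)
    k = len(system_returns)
    observed = sum(system_returns) / k
    hits = 0
    for _ in range(n_trials):
        sample = rng.sample(all_returns, k)
        if sum(sample) / k >= observed:
            hits += 1
    return hits / n_trials


# Invented returns, for illustration only.
all_returns = [-2, -1, 0, 1, 2] * 4
p_best = random_entry_p_value(all_returns, [2, 2, 2, 2])       # system picked the best days
p_worst = random_entry_p_value(all_returns, [-2, -2, -2, -2])  # system picked the worst days
print(p_best, p_worst)
```

A value near 1 means random entries routinely match the system. Only a small value hints at a statistical edge.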

But having a statistical edge does not necessarily translate into an economic edge: you still have to check for profits, drawdowns etc. One reason is the usual suspects: commission, slippage (you will not be able to exit exactly at the close), and in some cases your limit order will simply not be filled. R1 may be equal to the open; you entered a limit sell order, but the market dropped straight away, so no fill for you.

In the pivot example, there is another reason: R1 may be above the high, so your limit order cannot be filled.

Unfortunately, using statistics is not easy. You do not have to understand all the mathematics behind it, but you certainly have to understand the concepts (and how to use the programs). E.g. the often quoted concept of (linear) correlation: there is much more to it, than just a number (the coefficient). Have a look at http://www.tufts.edu/~gdallal/anscombe.htm .
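
Judging by its name, the linked page covers Anscombe's quartet: four small data sets that look completely different when plotted yet share almost the same correlation coefficient. A short sketch, with the values typed in from the published quartet and a hand-written Pearson function:

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
x4 = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]
quartet = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": (x4, [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

correlations = {name: pearson(x, y) for name, (x, y) in quartet.items()}
for name, r in correlations.items():
    print(name, f"{r:.2f}")  # the four values come out nearly identical
```

Only a plot shows that set II is a curve, set III a line with one outlier, and set IV a single point carrying the whole relationship.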

Pasting two columns into Excel and using the add-in to get a statistically significant number is not enough: you have to check all the prerequisites and even then you may be wrong. E.g. compare the height of people to the length of their hair: you will get a significant (inverse) correlation. Do you see why this is wrong? That is, you should have a "sound" idea before you start testing.
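
One common reading of the hair example, offered here as an assumption rather than as the poster's stated answer, is that a hidden third variable (men versus women) produces the correlation. With made-up numbers the effect is easy to reproduce: pooled, the correlation is strongly negative; within each group it vanishes.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def corr(pairs):
    heights, hair = zip(*pairs)
    return pearson(heights, hair)


# Made-up (height in cm, hair length in cm) pairs, for illustration only.
men = [(175, 5), (180, 3), (185, 5)]
women = [(160, 30), (165, 40), (170, 30)]

pooled = corr(men + women)
within_men = corr(men)
within_women = corr(women)
print(f"pooled {pooled:.2f}, men {within_men:.2f}, women {within_women:.2f}")
```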

But, how do you get ideas for trading systems? Some may be more "natural", like buying, when the market opens above yesterday's high (or should you better sell? ;-))), most involve more or less intricate data mining: you have to research your data, to come up with ideas.

In a recent magazine, I read an article from someone, who tested pivot points for the FDAX. He used Tradestation, to test several ideas, like going short at R1, R2, or buying, when the open (of the next bar) was above the pivot point etc. Five of these systems were rejected: not profitable, lousy equity curve etc. Then he came up with a sixth variation, which looked "promising". Does it really?

I think he had fallen prey to the sin of "Data mining": you twist your ideas long enough to get a profitable system, i.e. till they fit your data. Which is nothing else but curve-fitting.
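
A rough back-of-the-envelope sketch of why the sixth variation deserves suspicion: if each variation has a 5% chance of looking good by pure luck, the chance that at least one of several does grows quickly. The 5% level and the independence of the variations are assumptions for illustration; variations on the same pivot idea are certainly not independent.

```python
def chance_of_lucky_system(n_systems, alpha=0.05):
    """Probability that at least one of n worthless systems passes a test
    with false-positive rate alpha, assuming independent tests."""
    return 1.0 - (1.0 - alpha) ** n_systems


for n in (1, 6, 20):
    print(n, round(chance_of_lucky_system(n), 3))
```

With six tries the chance of a lucky "promising" result is already above one in four under these assumptions.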

There are a lot of articles about data mining. One of them, "Mining Fool's Gold" by Grant McQueen and Steven Thorley, illuminates the problem without using too much math.

Another one is "Data-Snooping, Technical Trading Rule Performance, and the Bootstrap" by Ryan Sullivan, Allan Timmermann, and Halbert White. Very mathematical, but just ignore the math and try to understand the ideas.

Regards

Bernd Kuerbs

OddTrader · May 26, 2004

Requirement of statistical control of a test method

Q
No precision, good or bad, can be ascribed to a test method unless instrument and observer as a combination show statistical control. This is so regardless of the cost of testing equipment.

--- Out of the Crisis (W. Edwards Deming), P. 269
UQ

Statistical Control

Q
A stable process, one with no indication of a special cause of variation, is said to be, following Shewhart, In Statistical Control, or Stable. It is a random process. Its behavior in the near future is predictable.

--- Out of the Crisis (W. Edwards Deming), P. 321
UQ
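
The quotes do not prescribe a particular check for statistical control. A common one in the Shewhart tradition is the individuals (XmR) chart: limits at the mean plus or minus 2.66 times the average moving range, with points outside the limits flagged as possible special causes. A minimal sketch with invented measurements:

```python
def xmr_limits(values):
    """Individuals-chart limits: mean +/- 2.66 * average moving range."""
    if len(values) < 2:
        raise ValueError("need at least two observations")
    mean = sum(values) / len(values)
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    avg_mr = sum(moving_ranges) / len(moving_ranges)
    return mean - 2.66 * avg_mr, mean, mean + 2.66 * avg_mr


def out_of_control(values):
    """Indices of points outside the limits (possible special causes)."""
    lower, _, upper = xmr_limits(values)
    return [i for i, v in enumerate(values) if v < lower or v > upper]


# Invented measurements, for illustration only.
stable = [10, 11, 9, 10, 11, 9, 10, 11]
with_spike = [10, 11, 9, 10, 12, 10, 9, 11, 10, 25, 10, 9]
print(out_of_control(stable), out_of_control(with_spike))
```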

A first lesson in application of statistical theory

Q
Courses in statistics often commence with study of distributions and comparison of distributions. Students are not warned in classes nor in the books that for analytic purposes (such as to improve a process), distributions and calculations of mean, mode, standard deviation, chi-square, t-test, etc. serve no useful purpose for improvement of a process unless the data were produced in a state of statistical control.

--- Out of the Crisis (W. Edwards Deming), P. 312
UQ
http://www.elitetrader.com/vb/showthread.php?s=&threadid=33025&perpage=6&pagenumber=11