Testing variables

Discussion in 'Strategy Development' started by acrary, Mar 7, 2003.

  1. acrary

    Suppose you had an idea for a trading method. To find out if it had any validity you'd probably create a set of rules and backtest the idea. Even if the results look good, how do you know it's worth pursuing? One idea is to test each variable for correlation and dependency.

    For example, suppose I want to create a daytrading system using support and resistance. I'm thinking of using the pivot point work done by so many people. For those new to this, the pivot point for tomorrow is pvt = (today's high + low + close)/3. R1, or first resistance, indicates an area where an upmove might stop and turn: r1 = (2*pvt) - low. S1, or first support, indicates an area where a downmove might stop and turn: s1 = (2*pvt) - high.
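    The formulas above can be written as a small Python sketch. The function and field names (`pivot_levels`, `PivotLevels`) and the sample prices are made up for illustration; only the three formulas come from the post.

```python
from typing import NamedTuple


class PivotLevels(NamedTuple):
    """Tomorrow's levels derived from today's bar (hypothetical container)."""
    pivot: float
    r1: float
    s1: float


def pivot_levels(high: float, low: float, close: float) -> PivotLevels:
    """Compute pvt, r1 and s1 from today's high, low and close."""
    pivot = (high + low + close) / 3.0   # pvt = (high + low + close)/3
    r1 = 2.0 * pivot - low               # first resistance
    s1 = 2.0 * pivot - high              # first support
    return PivotLevels(pivot, r1, s1)


# Illustrative prices only, not market data.
levels = pivot_levels(high=110.0, low=100.0, close=105.0)
print(levels)
```

    The system described here would then treat `levels.s1` as a candidate buy point and `levels.r1` as a candidate sell point for the next session.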

    My system might say buy at s1 (because it's a support point) and sell at r1 (because it's a resistance point). If the results are promising, how can I have more confidence in trading this?
    r1 and s1 are prediction points of possible highs and lows. A simple test is to take r1 and s1 and check how closely they correlate with tomorrow's high and low. In this example, I tested the SP market from 1996 - 2002 and did a simple correlation test in Excel. r1 correlated to the next day's high at .998909. Likewise s1 correlated to the next day's low at .99839. These numbers sound impressive; however, to find out if they are really good, we need to do another test using a variable that these should beat if the idea is sound. In this instance, I've chosen to use the open of tomorrow as a prediction for the high and low. This will help describe how much better than a random point r1 and s1 are. A correlation test of the open versus the high resulted in .999174. Likewise the open versus the low had a correlation of .999113. In both cases the open was more closely correlated to the high and low than r1 and s1 were. In this case, I could safely exclude r1 and s1 as possible variables to be used for creating a support and resistance trading model. (One interesting side note: the correlation between s1 and tomorrow's open was .999184 and the correlation between r1 and tomorrow's open was .999174. These might have some value in predicting whether tomorrow will start as an up or down day.)
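    For anyone who wants to repeat this kind of check outside Excel, here is a rough Python sketch of the comparison. It is not the original spreadsheet: the bars come from a made-up random-walk generator (`make_synthetic_bars`, a hypothetical helper), so the numbers it prints say nothing about the SP. Swap in real (open, high, low, close) bars to run the actual test.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def make_synthetic_bars(n, seed=0):
    """Hypothetical stand-in for real data: random-walk (open, high, low, close) bars."""
    rng = random.Random(seed)
    bars = []
    prev_close = 1000.0
    for _ in range(n):
        open_ = prev_close + rng.gauss(0, 2)
        close = open_ + rng.gauss(0, 8)
        high = max(open_, close) + abs(rng.gauss(0, 3))
        low = min(open_, close) - abs(rng.gauss(0, 3))
        bars.append((open_, high, low, close))
        prev_close = close
    return bars


def next_day_correlations(bars):
    """Correlate today's r1/s1 and tomorrow's open with tomorrow's high/low."""
    r1s, s1s, opens, highs, lows = [], [], [], [], []
    for today, tomorrow in zip(bars, bars[1:]):
        _, high, low, close = today
        pivot = (high + low + close) / 3
        r1s.append(2 * pivot - low)
        s1s.append(2 * pivot - high)
        opens.append(tomorrow[0])
        highs.append(tomorrow[1])
        lows.append(tomorrow[2])
    return {
        "r1_vs_next_high": pearson(r1s, highs),
        "open_vs_next_high": pearson(opens, highs),   # baseline to beat
        "s1_vs_next_low": pearson(s1s, lows),
        "open_vs_next_low": pearson(opens, lows),     # baseline to beat
    }


results = next_day_correlations(make_synthetic_bars(500))
for name, value in results.items():
    print(f"{name}: {value:.6f}")
```

    The comparison that matters is whether `r1_vs_next_high` beats `open_vs_next_high` and whether `s1_vs_next_low` beats `open_vs_next_low`. If the baseline wins, the variable gets dropped, as in the post.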

    Had that test done well, I would've moved on to do a test for dependency. Two types of dependency are typically found in time series. The first is deterministic. That is, for every point, we can predict with 100% accuracy the next point based on some parameter. For instance, we could have a table with the switch position represented by 1 = on and 0 = off. Next to it could be a table of light on and light off. After a brief check we'd find that when switch = 1, light = on. While we can dream about finding this type of dependency, it's not likely to be found in the markets. The other type is called stochastic dependency. It's based upon the idea that exact predictions are impossible and must be replaced by the idea that future values have a probability distribution based on a knowledge of past values.
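    To make the distinction concrete, here is a toy Python sketch. It is only an illustration of the two definitions, not the stochastic dependency test promised for a later post. The helper name `conditional_distribution` and both data sets are invented. It tabulates how often each outcome follows each condition: the switch/light table gives probabilities of exactly 1, while a sequence of up/down days gives a spread of probabilities.

```python
from collections import Counter, defaultdict


def conditional_distribution(pairs):
    """Estimate P(outcome | condition) from (condition, outcome) pairs."""
    counts = defaultdict(Counter)
    for condition, outcome in pairs:
        counts[condition][outcome] += 1
    return {
        condition: {o: c / sum(ctr.values()) for o, c in ctr.items()}
        for condition, ctr in counts.items()
    }


# Deterministic dependency: the light always follows the switch.
switch = [1, 0, 0, 1, 1, 0]
light = ["on" if s == 1 else "off" for s in switch]
switch_table = conditional_distribution(zip(switch, light))

# Stochastic dependency: today's direction only shifts the odds for tomorrow.
# Invented sequence, not market data.
moves = ["up", "up", "down", "up", "down", "down", "up", "up"]
move_table = conditional_distribution(zip(moves, moves[1:]))

print(switch_table)  # every condition maps to a single outcome
print(move_table)    # each condition maps to a distribution over outcomes
```

    With real data, the question is whether those conditional distributions differ from the unconditional one by more than chance would explain.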

    Since this post is so long, I'll just add a link to a paper on this topic for those interested in more detail. I'll come back to this with an example of stochastic dependency tests when I have more time.

    http://www.math.ethz.ch/~mcneil/ftp/pitfalls.pdf
     
  2. nitro

    This is a nice paper. I have seen similar analyses of the weakness of linear correlation vs cointegration.

    nitro
     
  3. H2O

  4. man

    acrary,
    I am trying to understand your way of thinking. You take whatever method comes along - in this case pivot points - and look for consistent dependencies. Then, if you find one or more, you build a trading system around that finding. Plus, as I know from previous posts of yours, you run random data against it to identify overfitting.
    I would like to know how consistent your trading results are with your results in these tests. The reason I ask is that I am always careful not to use one time series twice within one test. By this I mean that the backtest will necessarily look good and beat random, since you started from a finding you had already derived from that same time series. Do not get me wrong: I do not doubt the validity of your approach, I just wonder about the amount of fitting involved.
    I mailed you once about how astonished I was that somebody claims to have identified an inconsistency in weekly SP data. I am still puzzled whenever I think of it, so I would like to know what your ratio between reality and test would be.


    peace
     
  5. man

    I changed the intention of my original post but left the subject line the same. I wanted to say that it might be better to test changes rather than the values themselves for correlation. I thought that the very high readings in acrary's post might stem from this. I ran it in Excel myself and found that moving correlations give much more variable readings (between 0.5 and 0.999), and that only from 500 days onwards do the readings end up very close to 1. So this initial idea is not valid, and thus the subject line of my last post is useless.
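    For anyone who wants to try the same two variations outside Excel, here is a rough Python sketch: correlation of day-to-day changes instead of levels, and a moving (rolling) correlation over a fixed window. The helper names and the two short series are invented for illustration and are not the data behind the readings above.

```python
import math


def pearson(xs, ys):
    """Pearson correlation; returns NaN if either series is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return math.nan
    return cov / math.sqrt(var_x * var_y)


def changes(values):
    """First differences: value[t] - value[t-1]."""
    return [b - a for a, b in zip(values, values[1:])]


def rolling_correlation(xs, ys, window):
    """Correlation over each consecutive window of the given length."""
    return [
        pearson(xs[i:i + window], ys[i:i + window])
        for i in range(len(xs) - window + 1)
    ]


# Invented example series: a predicted level and the level it tries to predict.
predicted = [100.0, 101.5, 103.0, 102.0, 104.5, 106.0, 105.0, 107.5]
actual = [100.5, 102.0, 102.5, 103.0, 105.0, 105.5, 106.0, 108.0]

level_corr = pearson(predicted, actual)
change_corr = pearson(changes(predicted), changes(actual))
moving = rolling_correlation(predicted, actual, window=4)

print(f"levels:  {level_corr:.4f}")
print(f"changes: {change_corr:.4f}")
print("moving: ", [round(c, 4) for c in moving])
```

    Two price-like series that both drift with the market share most of their variance, so level correlations tend to sit near 1 over long samples. Differencing or shortening the window removes much of that shared drift.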


    peace
     
  6. I would like to ask if it is possible to program and backtest trendline intersections with support and resistance levels as described by acrary. I believe that there is an answer to the most profitable method possible.
    Walter
     
  7. man

    Walter,
    the problem in programming is not the pivot variations but the trend lines. I personally think that defining the starting point is the most challenging part of trend lines. I am considering using splines to define highs and lows and then using those as starting points for the trend lines. What is your concept for the trend lines?


    peace
     
  8. man

    As acrary figured out, the basic idea does not work - at least not on the SPX.


    peace
     
  9. I am using 5 or 10 min candles and start drawing from where the body starts. I used to take averages of 3 highs or lows, but it did not make much of a difference since I use a geometry signal as a setup in most cases. For programming, I think, one would just use openings or closes of highs or lows.
    Walter
     
  10. In maths you have to get to grips with significant figures before you can post on the substantive value of something. You can't post answers that appear more accurate than your data's accuracy.

    Apparently none of this is known by posters in this thread.

    Anyone participating has to see this stuff as trivial because of its basis as well.
     
    #10     Mar 12, 2003