how do u determine if something is statistically significant?

Discussion in 'Trading' started by Gordon Gekko, Jul 11, 2003.

  1. There are liars, outliers, and out-and-out liars.

    :p
     
    #11     Jul 11, 2003
  2. IN2WIN

    IN2WIN

    Personally, I feel a system needs to test successfully over at least a six-month period, and better still if six months are tested in each of at least three different years (bullish, bearish, and flat), as the nature of the market tends to change from time to time.

    A system that withstands the test of time is one worth learning, mastering and following!
     
    #12     Jul 11, 2003
  3. yabz

    yabz

    95% is considered statistically significant so 20 trades should be enough.

    In some situations (testing a nuclear power plant, for instance) a higher confidence level would be required, probably 99%.

    Testing a trading system, where all you've got to lose is something trivial like money, you can probably get away with 90% confidence.
     
    #14     Jul 12, 2003
  4. The number 30 is the practical recommended size for a sample of size n (n = 30 items here), in which each individual item follows any probability law BUT IS INDEPENDENT OF EVERY OTHER ITEM, for the SAMPLE as a whole to tend towards the normal (Laplace-Gauss) law. THAT IS TO SAY, IT CONCERNS THE LAW OF THE MEAN, not the law of the INDIVIDUAL ITEM.

    This implies that with only 30 trades you won't learn much about the consistency of the mean, since you have only one SINGLE SAMPLE of 30. In fact it would be better to reduce the number thirty in order to get more samples, rather than the contrary (in quality control one often uses MULTIPLE samples of only 4 or 5 items). Secondly, the premise is important: INDEPENDENCE. If your trades are too close together in time, for example, the independence will probably be fake. Even taking "diversified" contracts won't guarantee you independence all the time, because they follow cycles that can hide their dependence, but in some risky situations this dependence will show itself by surprise. It is a rare event, but it is just when it occurs that all your statistical calculations are proved false. That is not the fault of statistics; it is your fault for not having rigorously taken into account the conditions under which the statistical law applies.
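One rough way to look at the independence premise is the lag-1 autocorrelation of consecutive trade results. The sketch below is only an illustration: the function name and both data series are made up, and a value near zero does not prove independence, it only fails to show the simplest kind of dependence.

```python
import statistics

def lag1_autocorrelation(xs):
    """Correlation between each result and the one that follows it."""
    mean = statistics.mean(xs)
    num = sum((a - mean) * (b - mean) for a, b in zip(xs, xs[1:]))
    den = sum((x - mean) ** 2 for x in xs)
    return num / den

# Hypothetical series, invented for illustration only.
trending = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]           # each result follows the last
alternating = [1, -1, 1, -1, 1, -1, 1, -1, 1, -1]    # each result reverses the last

print(lag1_autocorrelation(trending))     # strongly positive
print(lag1_autocorrelation(alternating))  # strongly negative
```

Both series are clearly dependent from one item to the next, so neither would satisfy the premise behind the "sample of 30" rule.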

    If you are an investor, the mean is enough. If you are a speculator, the mean is not enough: it is the variance that is the problem and the primary source of risk. And it is not the variance of the mean, it is the variance of a single item, and the variance of a single item is always greater than the variance of a mean. That's why, if you don't have much capital, you will be exposed to the full variation of a single item and risk ruin with near certainty if you only care about the mean. Even Nobel Prize B&S fell into the trap, hee hee! Because they thought that efficiency means mean reversion, whereas efficiency in the stock market is the contrary in my opinion: it means maximisation of uncertainty, and so of variance. See http://www.elitetrader.com/vb/showthread.php?s=&threadid=19770&perpage=6&pagenumber=3
    "I will profit from this example to give you the intuition of what efficiency really means in truth."
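The gap between the variance of a single item and the variance of a mean can be seen in a small simulation. This is a sketch under stated assumptions: the trade results are simulated as independent normal draws (mean 0, standard deviation 1), and the sample counts are arbitrary. For independent items, theory gives a ratio close to the sample size.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

N_SAMPLES = 2000   # number of independent samples (arbitrary, for illustration)
SAMPLE_SIZE = 30   # items per sample

# Hypothetical independent trade results: mean 0, standard deviation 1.
samples = [[random.gauss(0.0, 1.0) for _ in range(SAMPLE_SIZE)]
           for _ in range(N_SAMPLES)]

items = [x for sample in samples for x in sample]
means = [statistics.mean(sample) for sample in samples]

var_item = statistics.variance(items)   # spread of individual results
var_mean = statistics.variance(means)   # spread of the 30-item averages

print(f"variance of a single item: {var_item:.4f}")
print(f"variance of the mean:      {var_mean:.4f}")
print(f"ratio: {var_item / var_mean:.1f} (theory for independent items: {SAMPLE_SIZE})")
```

An account that can only survive a few individual trades faces the first number, not the second.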


     
    #15     Jul 12, 2003
  5. :D

    cheers to HarryTrader's sensible reply.
     
    #16     Jul 12, 2003
  6. Vishnu

    Vishnu

    There is no one sample size that can give you a comfort level. The way you determine statistical significance is by calculating the "z score" which tells you how many standard deviations away you are from the mean.

    Assume that the mean result of a random daytrade is 0. This is not an unreasonable assumption. The S&P has gone up about 0.04% a day since 1950.

    Calculate the z score as z = Average / SEM.

    SEM (the Standard Error) = standard deviation / sqrt(sample size)

    A z > 2 or < -2 is considered statistically significant (2 standard deviations covers about 95% of the area of the normal distribution). You can have statistically significant results on fairly low sample sizes if your volatility has been very low (the standard deviation is low). Or you can have a result on 1000 observations that is not statistically significant if the volatility has been very high (or if the return has been average).
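The calculation above can be sketched in a few lines. The trade results below are invented for illustration, and using the sample standard deviation (n - 1 denominator) is an assumption, since the post does not say which one is meant. Both series have the same average and the same number of trades; only the volatility differs.

```python
import math
import statistics

def z_score(returns, null_mean=0.0):
    """z = (average - null_mean) / SEM, with SEM = stdev / sqrt(n)."""
    n = len(returns)
    average = statistics.mean(returns)
    stdev = statistics.stdev(returns)  # sample standard deviation (assumption)
    sem = stdev / math.sqrt(n)
    return (average - null_mean) / sem

# Hypothetical per-trade results, made up for illustration only.
low_vol = [0.5, 0.7, 0.4, 0.6, 0.5, 0.8, 0.3, 0.6]
high_vol = [5.0, -4.0, 6.0, -5.5, 4.5, -3.0, 7.0, -5.6]

print(z_score(low_vol))   # well above 2
print(z_score(high_vol))  # well inside the -2..2 band
```

The low-volatility series clears the z > 2 threshold on only eight trades, while the high-volatility series with the same average does not.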

     
    #17     Jul 12, 2003
  7. This is textbook math from school. It is always true there because it supposes all conditions are ideal. But in the real world these conditions are not fulfilled automatically, and that makes the big difference between theory and practice. The z score is just a normalisation procedure, a protocol for making things comparable. You can always make any statistical calculation on any sample; that doesn't mean that YOU CAN USE THE INFERENTIAL CONCLUSION IF, FOR EXAMPLE, HOMOSCEDASTICITY OF VARIANCE IS NOT CHECKED. (I explained the other day the contrary term, heteroscedasticity, which means instability of variance; that is the official term, since "instability" is too fuzzy and could be confused with variance itself!) That's why one sample of 30, 100, 1000 or even 10000000 can be meaningless in PRACTICAL STATISTICAL INFERENCE! In theory a sample of 1000000 is great, but in reality a sample of 1000000 is far more likely to contain heteroscedasticity than a small sample. That is why in practice it is better to have multiple samples with few items than a single sample with a huge collection of items. It is only when you have made sure that the premises of inferential theory hold that you can, at the extreme, use one sample only.

    That's a general problem at school: they teach you how to calculate, not really how to think and feel things that relate in fact to common sense. It is when you get some experience in real-world industries that you will know that what's in the book is 1/4th of what you should know :D.
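A crude way to look for unstable variance is to cut one long series into several small consecutive samples and compare their variances. This is only a sketch: the function name, the chunk size and the data are assumptions made for illustration, and a formal test for equal variances would be needed before drawing any real conclusion.

```python
import statistics

def chunk_variances(xs, chunk_size):
    """Sample variance of each consecutive, non-overlapping chunk."""
    chunks = [xs[i:i + chunk_size]
              for i in range(0, len(xs) - chunk_size + 1, chunk_size)]
    return [statistics.variance(chunk) for chunk in chunks]

# Hypothetical series: a calm period followed by a period ten times wilder.
calm = [0.1, -0.1, 0.2, -0.2, 0.1, -0.1, 0.2, -0.2, 0.1, -0.1]
wild = [10 * x for x in calm]
series = calm + wild

variances = chunk_variances(series, chunk_size=5)
ratio = max(variances) / min(variances)

print(variances)
print(f"largest / smallest variance: {ratio:.0f}")
```

A z score computed over the whole series would use one pooled standard deviation that describes neither period.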

     
    #18     Jul 12, 2003
  8. Vishnu

    Vishnu

    Thanks for the reply Harry. I guess I need some more experience in real world industries.
    -James Altucher
     
    #19     Jul 12, 2003
  9. Statistically Significant...

    For me...

    Intraday:

    3 years of data + 10 years of out-of-sample.

    EOD (Daily+):

    10 years of data + 30 years of out-of-sample.

    Basically, I create systems first with 3 years and/or 10 years. Then I put the system under 10 and 30 years to see if the edge has been valid.
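A chronological split of this kind might look like the sketch below. The function name and the year range are hypothetical, and it assumes the design window is the most recent stretch of data with everything earlier held out; the post does not say which side of the design window the out-of-sample data sits on.

```python
def split_in_out(records, design_len):
    """Split chronologically ordered records without shuffling.

    Assumption: the most recent `design_len` records form the design
    window, and everything earlier is held out as out-of-sample data.
    """
    if not 0 < design_len < len(records):
        raise ValueError("design_len must leave both parts non-empty")
    out_of_sample = records[:-design_len]
    design = records[-design_len:]
    return design, out_of_sample

years = list(range(1990, 2003))  # 13 hypothetical years: 1990 through 2002
design, out_of_sample = split_in_out(years, design_len=3)

print(design)              # the 3 most recent years
print(len(out_of_sample))  # number of held-out years
```

Keeping the records in order, rather than shuffling them, preserves the sequence of market cycles.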

    My opinion and perception of the market is that the market moves in cycles, and that the changing market cycles have a flow, so I don't like to use Monte Carlo or other methods that mix things up.
     
    #20     Jul 12, 2003