Joe Doaks' Data Analysis

Discussion in 'Data Sets and Feeds' started by Joe Doaks, Jan 20, 2008.

  1. Hypo, differentiating or differencing? Surely you imply the latter.
     
    #11     Jan 22, 2008
  2. With discrete (not continuous) data, the same thing. Call it sample to sample change if you like. Keep me dishonest.
     
    #12     Jan 22, 2008
  3. Perhaps you could apply frequency doubling to this signature.
     
    #13     Jan 22, 2008
  4. Very insightful! In fact, given that variable B never goes negative, to compare it with the difference data for A (which WILL have negative samples) we must frequency double the difference values by taking their absolute value, in effect full-wave rectifying them. I am several steps ahead of my posting to make sure I make no missteps. I thank you for your comment. And it is fun for me that you see where I am going. Do you already know the answer? Half of ET will believe it and the other half won't, so in the end I will have accomplished nothing but having a little mathemagical fun with the great innumerate unwashed.
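
    For anyone following along, here is a minimal sketch of the two steps just described: sample-to-sample differencing, then taking absolute values (the "full-wave rectification"). The price numbers are made up for illustration and the function names are hypothetical, not taken from the posted spreadsheet.

```python
def diff(series):
    """Sample-to-sample change: series[i] - series[i - 1]."""
    return [b - a for a, b in zip(series, series[1:])]


def rectify(series):
    """Full-wave rectification: keep the magnitude, drop the sign."""
    return [abs(x) for x in series]


# Made-up prices, purely illustrative (not the thread's data set).
prices = [100.0, 101.5, 101.0, 102.25, 101.75]

changes = diff(prices)         # can be negative
magnitudes = rectify(changes)  # never negative, like variable B

print(changes)
print(magnitudes)
```

    Note that differencing shortens the series by one sample, so the other variable has to lose its first sample as well before the two are compared.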
     
    #14     Jan 22, 2008
  5. Ok, I'll bite.

    The chart of lagged correlations between variable A (Price) and variable B (Volume) confused me. So much so that I decided to recreate your analysis.

    For the life of me, my brain can't comprehend why the values have such an orderly increase from lags 0 to 30.

    As our lag window increases, why should price level become a better predictor of volume? A price level 30 periods in the future predicts volume better than one 10 periods in the future? WTF?

    I suppose price level would have some predictive value; however, I'd expect to see much greater noise in the correlations from lags 0 to 30.

    Perhaps it is because when we lag a time series, we are excluding a portion of the data? That portion happens to be the beginning part of the day when the volatility is greatest...
    Care to enlighten me? I need to brush up on my statistics.

    Thanks,
     
    #15     Jan 23, 2008
  6. Excel file for anyone interested...
     
    #16     Jan 23, 2008
  7. You're giving me the DT's, DT. I'll have to get back to you, setting up for trading right now. Thanks for posting!
     
    #17     Jan 23, 2008
  8. DT, you done broke de code! (Very old joke, called Four Roses, whose punch line is "Dass a likker, ain't it?")

    I believe you have correctly identified the mystery data as being the January 18th minute-by-minute closing price and volume data of the NASDAQ 100 CME E-Mini future with March 2008 expiration, known to some as NQ H8.

    That information sheds valuable light on my analysis consulting assignment. I'll get back to you!
     
    #18     Jan 23, 2008
  9. DT, there are several answers to the questions you pose:

    - as the analyses show, it is nearly meaningless to analyze pure price and volume, because what is important is sample-to-sample price change (you could look at volume change, but that's a waste of time, been there)

    - knowing that the sample interval is one minute, and recognizing that a 30-minute shift either way in time should be enough to show a tradeable causality, the correlation results clearly are too weak to suggest causality (I did not say so, but I normalized both series to a maximum value of 1 for numerical convenience, so when you see a 0.3 correlation, it is at best only slightly better than random)

    - analysis at shifts of five minute increments shows only a broad cyclicality for this data set

    - in pure price data, there is only the merest hint that price might lead volume, but a near certainty that it cannot be the other way 'round.
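
    The lagged correlations discussed in the points above can be sketched roughly as follows. This is an illustration under stated assumptions, not the calculation in the posted spreadsheet: it assumes a Pearson coefficient, it reads "normalized to a maximum value of 1" as division by the largest absolute value, and the toy numbers and function names are made up. Two things it makes visible: a lag of k drops k samples from each series, and a Pearson coefficient is unchanged by that kind of rescaling.

```python
import math


def normalize_to_max(series):
    """Scale so the largest absolute value is 1 (assumed meaning of the normalization)."""
    peak = max(abs(x) for x in series)
    return [x / peak for x in series]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def lagged_correlation(a, b, lag):
    """Correlate a[t] with b[t + lag], using only the overlapping samples."""
    if lag >= 0:
        xs, ys = a[:len(a) - lag], b[lag:]
    else:
        xs, ys = a[-lag:], b[:len(b) + lag]
    return pearson(xs, ys)


# Made-up toy series, purely illustrative (not the thread's data set).
price = [10.0, 11.0, 13.0, 12.0, 15.0, 14.0, 16.0, 18.0]
volume = [5.0, 3.0, 4.0, 8.0, 6.0, 9.0, 7.0, 10.0]

p = normalize_to_max(price)
v = normalize_to_max(volume)
for lag in range(-2, 3):
    print(lag, round(lagged_correlation(p, v, lag), 3))
```

    A positive lag here pairs each sample of the first series with a later sample of the second, so a peak at positive lag would hint that the first series leads.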

    Price change and volume correlations to come.

    (A hint to impatient readers: it's really not worth hanging in here. If there were an edge I certainly wouldn't post it. My objectives are purely iconoclastic. Iconoplastic? For plastic icons?)
     
    #19     Jan 23, 2008
  10. Attached is the distribution of the bar-to-bar change in the close. The mean is -0.029, the standard deviation is 1.985, the kurtosis is 0.985, and the skewness is 0.371. That is about as close to a Gaussian (normal) distribution as you get with real-world data. So for this sample set, price change is approximately normally distributed. That is not to say necessarily random. On ET the only thing truly random is opinions about market direction. Cross correlation of price change with volume to come.
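
    For readers who want to reproduce summary statistics like those above on their own data, here is a minimal sketch. It uses population (divide-by-n) moments and reports excess kurtosis, for which a Gaussian gives 0. Both choices are assumptions; a spreadsheet may apply small-sample corrections and give slightly different numbers. The function name and the toy input are made up.

```python
import math


def moments(series):
    """Return (mean, std, skewness, excess kurtosis) using population moments."""
    n = len(series)
    mean = sum(series) / n
    m2 = sum((x - mean) ** 2 for x in series) / n  # variance
    m3 = sum((x - mean) ** 3 for x in series) / n
    m4 = sum((x - mean) ** 4 for x in series) / n
    std = math.sqrt(m2)
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0  # 0 for a Gaussian
    return mean, std, skewness, excess_kurtosis


# Made-up symmetric toy data, purely illustrative (not the thread's data set).
toy_changes = [-2.0, -1.0, 0.0, 1.0, 2.0]
mean, std, skewness, excess_kurtosis = moments(toy_changes)
print(mean, std, skewness, excess_kurtosis)
```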
     
    #20     Jan 23, 2008