what's better correlation - diffs or cumulative sums?

Discussion in 'Strategy Development' started by zedDoubleNaught, Dec 5, 2012.

  1. Which is better for measuring correlation: price differences or their cumulative sum? A course I took covered efficient portfolios and optimization. Part of that was computing covariance and correlation matrices, using simple and logarithmic returns. Instead of returns, I'd like to use price differences (see below), and for shorter time-frame trading strategies rather than a portfolio. It seems to me that, in this case, the correlation of the cumulative sums of the price differences would be better than the correlation of the differences themselves. My reasoning is that the cumulative sums do a better job of reflecting the whole path of the price series than the individual differences do. But I don't know if that is correct, or if it even makes a difference. What I have seen so far (from very preliminary work) is that the cumulative sums have higher correlation values than the price differences. I was wondering if anyone has more insight on this. Thanks in advance for any comments.

    price diff: price[bar] - price[previous bar]
    cumulative sum: price[bar] - price[first bar in series]
    shorter time frame: trade of 2-4 instruments, for 3-10 days
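To make the two definitions above concrete, here is a minimal Python sketch. The price list and the function names `price_diffs` and `cum_changes` are invented for illustration and are not from the course or the original post. It also checks that summing the bar-to-bar differences reproduces price[bar] - price[first bar].

```python
from itertools import accumulate

def price_diffs(prices):
    """Bar-to-bar changes: price[bar] - price[previous bar]."""
    return [b - a for a, b in zip(prices, prices[1:])]

def cum_changes(prices):
    """Change since the start: price[bar] - price[first bar], for every bar."""
    return [p - prices[0] for p in prices]

# Made-up prices, for illustration only.
prices = [100.0, 101.5, 101.0, 102.25, 103.0]

diffs = price_diffs(prices)
cums = cum_changes(prices)

print("diffs:", diffs)
print("cumulative:", cums)

# Running sum of the differences equals the cumulative series
# (skipping its leading 0 for the first bar).
assert list(accumulate(diffs)) == cums[1:]
```

The differences series is one element shorter than the price series, so the two have to be aligned before being compared.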
  2. Sergio77


    I see no difference. You are talking about changing the timeframe period from daily to N days. Usually that will be better if you also trade longer term positions. If you trade daily data then daily changes sound more appropriate to me. But I think it is a good question you asked. Another example: if you trade intraday then you must look at intraday correlations.
  3. Just occurred to me -- the correlation of the differences does not allow for lag: if asset 2 lags asset 1, the correlation of the differences will be low. With the cumulative sums, the correlation remains even if there is a lag between the two.

    I tested this with a single asset against itself. With the differences, the correlation is 1 when the series are in sync and -0.04 when one is shifted by 1 bar.
    With the cumulative sums, the correlation is 1 in sync and 0.9995 when shifted by 1 bar.
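The shift experiment described above can be approximated with simulated data. This is only a sketch: the price changes are random Gaussian draws rather than real market data, the seed is arbitrary, and `pearson` and `lagged_corr` are helper names introduced here. The exact numbers will differ from the ones quoted in the post.

```python
import random
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def lagged_corr(series, lag):
    """Correlation of a series with a copy of itself shifted by `lag` bars."""
    if lag == 0:
        return pearson(series, series)
    return pearson(series[:-lag], series[lag:])

random.seed(0)  # arbitrary seed so the run is repeatable
diffs = [random.gauss(0, 1) for _ in range(1000)]  # simulated price changes

# Cumulative sums of the simulated changes.
cums, total = [], 0.0
for d in diffs:
    total += d
    cums.append(total)

diff_lag1 = lagged_corr(diffs, 1)
cum_lag1 = lagged_corr(cums, 1)

print("diffs, shifted 1 bar:", round(diff_lag1, 4))
print("cumulative sums, shifted 1 bar:", round(cum_lag1, 4))
```

Because the simulated changes are independent by construction, the high lagged correlation of the cumulative sums here comes from the slowly moving level, so a high value at a lag is not by itself evidence of a lead-lag relationship.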
  4. Thanks for sharing.

    My 2 cents ...

    The series price[bar] - price[previous bar] for each asset will show more % variation from bar to bar than price[bar] - price[first bar in series]...

    Wouldn't this fact alone give you better correlations for the two price[bar] - price[first bar in series] series? And this correlation would just get better as more bars are added (which I guess you can test as you did before)?

    As you add more bars, the correlation derives less and less from the correlation of the bar-to-bar variations of the two assets; instead it derives increasingly from the larger (in % terms) component relating the current bar to the first bar in the series.

    So, my 2 cents would be that if you are looking for a method to force the correlation of two price series, then price[bar] - price[first bar in series] would let you do this ... and the more bars you take the better.

    But if you are looking for a methodology to help you uncover an underlying correlation between two assets, then the technique introduces a false correlation that may hide what you are looking for.
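The concern about false correlation can be checked with a small simulation. The sketch below assumes two assets whose price changes are independent Gaussian draws, so any correlation found is an artifact. The trial count, series length, seed, and the helper names `pearson` and `simulate` are arbitrary choices made for this example.

```python
import random
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def simulate(n_bars, rng):
    """Return (price changes, cumulative sums) for one simulated asset."""
    diffs = [rng.gauss(0, 1) for _ in range(n_bars)]
    cums, total = [], 0.0
    for d in diffs:
        total += d
        cums.append(total)
    return diffs, cums

rng = random.Random(1)  # arbitrary seed
trials, n_bars = 200, 250

abs_diff, abs_cum = [], []
for _ in range(trials):
    d1, c1 = simulate(n_bars, rng)
    d2, c2 = simulate(n_bars, rng)  # drawn independently of the first asset
    abs_diff.append(abs(pearson(d1, d2)))
    abs_cum.append(abs(pearson(c1, c2)))

mean_abs_diff = sum(abs_diff) / trials
mean_abs_cum = sum(abs_cum) / trials

print("mean |corr| of differences:", round(mean_abs_diff, 3))
print("mean |corr| of cumulative sums:", round(mean_abs_cum, 3))
```

Absolute values are averaged because the spurious correlation of the cumulative sums can come out strongly positive or strongly negative from one trial to the next.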
  5. Thanks, I think I see your point: a single bar is too short a span, while the whole series is too long. This leads me to think I should estimate how many bars my trades usually last and use that number of bars when estimating correlation. E.g., if the average trade is 30 bars, split the whole series for both assets into chunks of 30 bars each and measure the correlation within each chunk. Maybe this will give me estimates closer to how my trades would play out.

    Also, to clarify the definition in my opening post: it is not only the last bar less the first bar, but the difference from the first bar for each bar in the series. After looking at it more closely in R, I think the description should be:

    price[each bar] - price[first bar]
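The idea of measuring correlation over trade-length chunks could look roughly like the Python sketch below. It uses the 30-bar example from this post as the window, correlates bar-to-bar changes within each non-overlapping chunk, and runs on simulated prices. The data, the seed, the 0.5 link between the two assets, and the names `pearson` and `windowed_corr` are all assumptions made for illustration.

```python
import random
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def windowed_corr(prices_a, prices_b, window):
    """Correlation of bar-to-bar changes in consecutive, non-overlapping windows."""
    diffs_a = [b - a for a, b in zip(prices_a, prices_a[1:])]
    diffs_b = [b - a for a, b in zip(prices_b, prices_b[1:])]
    results = []
    # Any leftover bars that do not fill a whole window are ignored.
    for start in range(0, len(diffs_a) - window + 1, window):
        end = start + window
        results.append(pearson(diffs_a[start:end], diffs_b[start:end]))
    return results

# Simulated prices: asset B's changes are partly driven by asset A's.
rng = random.Random(2)  # arbitrary seed
prices_a, prices_b = [100.0], [50.0]
for _ in range(300):
    change_a = rng.gauss(0, 1)
    change_b = 0.5 * change_a + rng.gauss(0, 1)
    prices_a.append(prices_a[-1] + change_a)
    prices_b.append(prices_b[-1] + change_b)

corrs = windowed_corr(prices_a, prices_b, window=30)
for i, c in enumerate(corrs):
    print(f"window {i}: corr = {c:.3f}")
```

Looking at the spread of the per-window values, rather than a single number for the whole series, gives a rough sense of how stable the relationship is over a typical trade length.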
  6. Where did you take this course?
  7. Self-study on Coursera -- https://www.coursera.org/course/compfinance
    It's the one on computational finance and econometrics. It was really good for covering probability, some statistics, and some linear algebra, working up to portfolio optimization and the efficient frontier. It includes R with full code samples, which made it much faster to pick up.
  8. Thanks for the link, I'll check it out.
  9. Sergio77


    How much? I found this:

    Financial Data Modeling and Analysis in R (AMATH 542)
    Web, Online, Winter 2013, Coursera

    Instructor: Guy Yollin

    M, W, 1/7 - 3/15, 2013, 4:30-6:20 p.m.

    Cost: $3,600 | 4 Credits
  10. Sergio77


    How can correlation possibly help you with trading? I always wondered about that but I never figured it out.
    #10     Dec 16, 2012