Help with Correlation Analysis ->

Discussion in 'Strategy Building' started by Stok, Jan 19, 2012.

  1. Stok

    Stok

    Hey fellow ET'ers, need some help:

    1.) I want to run a correlation matrix/analysis on 7 markets going back at least 10 years. Any insight to a website or software that can pull that info together and run the analysis (based on daily or even weekly bars).

    2.) I would also like to run a correlation analysis on equity curves on seven markets. Meaning, I have individual equity curves and would like to run the correlation they have all together. I assume I need to build something in Excel, but looking for advise.

    Thanks!
     
  2. You might be able to get the data from yahoo or google finance, depending on the markets.

    I've never tried correlation analysis, other than seeing how the general stock index looked when I made long or short trades on individual stocks. I found huge correlation just by plotting my trades on the index chart and noting the success rate when I was aligned with the prevailing trend at the time.

    You may try building a correlation matrix in excel based on a simple trend indicator like a moving average.

    With a little intellectual exertion, this could be built in excel in a matter of a few days.

    If you're not into that sort of thing, an engineering intern could do it. They'll work for about $17 an hour.

    :D
     
  3. rosy2

    rosy2

    import pandas as pd
    from pandas.io.data import DataReader

    symbols = ['MSFT', 'GOOG', 'AAPL']
    data = dict((sym, DataReader(sym, "yahoo"))for sym in symbols)
    panel = pd.Panel(data).swapaxes('items', 'minor')
    close_px = panel['Close']
    rets = close_px / close_px.shift(1) - 1
    rets.corr()


    that will be $10,000
     
  4. Thanks for noting pandas, I had never heard of it. I've been looking for something to do with python to learn it.
     
  5. More like ten minutes.

    Put your 7 10-year price series in columns A through G.
    That should be about 2550 rows. Then highlight cells I1
    through O7. A 7x7 matrix of cells should be highlighted.
    Enter the following forumla in the top left cell in the
    highlighted range:

    MMULT(TRANSPOSE(LN(A2:G2550/A1:G2549)),LN(A2:G2550/A1:G2549))/2548

    This is an array formula, so press [ctrl][shift][enter]
    instead of just [enter] when you are done.

    This is your variance-covariance matrix Actually is is
    your scatter matrix, as the var-covar requires demeaned
    data. However I wouldn't worry about it as the mean
    of daily log returns is rarely significantly different
    from zero.

    Now highlight another 7x7 range in cells I9 through O15.
    In the top left cell enter the following formuala:

    SQRT(I1:O7*IF(ROW(OFFSET(INDIRECT("A1"),0,0,7,7))=COLUMN(OFFSET(INDIRECT("A1"),0,0,7,7)),1,0)

    Again, this is an array formua so use [ctrl][shift][enter]
    The numbers in the diagonal of this 7x7 matrix are the
    standard deviations of each column of log returns.

    Now highlight another 7x7 range in cells I17 through O23.
    In the top left cell enter the following formuala:

    MMULT(MINVERSE(I9:O17),MMULT(I1:O7,MINVERSE(I9:O17)))

    Use [ctrl][shift][enter]
    The off-diagonal elements are the pairwise correlations.

    You might be more interested in the partial or conditional
    correlations. However Excel, to my surprise does not have
    a matrix square root function. And since they have neither
    a Cholesky nor an Eigen depompostion built in, it is not
    easy to roll your own. You can, if you like, derive the
    partial correlations, the multiple r-squareds, and the
    beta coefficients on a cell-by-cell basis from the inverse
    correlation matrix. To do that highlight another 7x7 range
    in cells I25 through O31. In the top left cell enter the
    following formuala:

    MINVERSE(I17:O23)

    Use [ctrl][shift][enter]
    To get the partial correlation of off-diagonal element
    e(i,j) use p = -e(i,j)/sqrt(e(i,i)*e(j,j))
    For r-squared on diagonal elements e(i,i) use
    R = sqrt(1-1/e(i,i)
    For Beta coefficients on multiple regression
    of diagonal element e(i,i) vs any other e(j,j) nse
    use b = -1/e(i,j)


    There you go. And I charged less than $17 an hour
    for it.
     
  6. What a guy. It pays to know how to ask.

    :D
     
  7. I made at least one typo in that post and it is too late to edit.

    the formuaa:

    MMULT(MINVERSE(I9:O17),MMULT(I1:O7,MINVERSE(I9:O17)))

    should be

    MMULT(MINVERSE(I9:O15),MMULT(I1:O7,MINVERSE(I9:O15)))
     
  8. ronblack

    ronblack

    You're too good!
     
  9. rosy2

    rosy2

    looks complicated compared to my approach which took about 3 minutes.
     
    #10     Jan 24, 2012