Exactly right. Excel is a spreadsheet. It's intended for accountants who need to add columns of numbers. Many push Excel beyond its comfort zone and winding up with an undebuggable, unverifiable mess, then have to backtrack and start over in another programming environment. Re your pandas/py solution, the same thing is doable in a few lines of R also.
library(quantmod) getSymbols("YHOO",src="google") getSymbols("GOOG",src="google") getSymbols("MSFT",src="google") df = data.frame( dailyReturn(YHOO), dailyReturn(GOOG), dailyReturn(MSFT) ) cor(df) took less than 10 minutes but for me over 3. I am not quite as fast with R
The mere correlation matrix, leaving out the internal mechanics, can be got in Excel with a couple fo clicks, assuming the user has turned on the Analysis Toolpack. From the "Data" ribbon, click "Data Analysis" on the far right of the ribbon. In the pop-up dialog choose "Correlation" and click the OK button. Click in the "Input Range:" box and then highlight your data columns. Then click OK. The correlation matrix will appear in a new sheet. Less than 3 minutes and the OP would not have to install and learn Python. My previous complicated Excel reply illustrates the internal mechanics of calculating the correlation and conditional correlation matrices, something your Python script does not do.
The OP seems barely literate in Excel. Trying to learn R would be a complete waste of time for him and an excercise in frustration. According to this post from Xignite, OP has a lot of company: http://www.hftreview.com/pg/blog/xi...nology-challenges-all-blown-away-by-the-cloud Read down to Tech Challenge #5, where it is claimed that "Letâs face it most of the hedge fund industryâs important activities such as modeling and reporting are still done in Microsoft Excel."
I really like Python now. In just a few hours over the last few days I've gone from nothing to some lines of code that can get a value from a specific cell in a list of old spreadsheet files I made last year, then make a Series and get some stats on them, so they're no longer dead data. Before I didn't look at them, the spreadsheets looked like walls of numbers, and too cumbersome to open them all. Very fast and easy to get something working with Python, thanks again for pointing me in the right direction. It's probably doable in R too, I haven't looked into R though.
Wow .. that was quick. daily.returns daily.returns.1 daily.returns.2 daily.returns 1.0000000 0.3759001 0.3630186 daily.returns.1 0.3759001 1.0000000 0.5659315 daily.returns.2 0.3630186 0.5659315 1.0000000 I'm guessing the OP wants to long/short a combination of something and something else around an estimated mean. How about a 3 minute particle filter?
If you want to beautify a bit, you could replace the last line of the script with > names(df)<- c("YHOO","GOOG","MSFT"); round(cor(df),2) <pre> YHOO GOOG MSFT YHOO 1.00 0.38 0.36 GOOG 0.38 1.00 0.57 MSFT 0.36 0.57 1.00</pre>
I know this is an old thread, but hope some of the contributors are still around. I am looking at creating an indicator based on the correlation of a moving period between 2 futures contracts (same way a moving average works). This seems to be easily achievable in excel using correl function but I have a couple of questions regarding this. 1. will this provide relatively accurate calculations. I have read that excel correl function is not provide accurate results ( hence recommendations to use R squared or similar for more accurate results) 2. Am I better off using close to close as the comparison basis or daily returns? from memory I believe using the daily returns is the best way to compare futures contracts. (not that my memory is anything to go by...) Thanks in advance
May use the SCX index: http://www.datatime.eu/public/gbot/MetricsForAlgorithmicTrading.htm#SCX It captures any form dependence. Not only the "linear" component (as r does). (And it is *much* more efficient computationally speaking)