But if the same two stocks show a similarly high correlation month after month, then I'd say you've got something predictable. Such findings are the essential building blocks for pairs trading and similar methods; see also https://en.wikipedia.org/wiki/Pairs_trade. I especially like how the NatGas tickers BOIL and KOLD behave (and the ticker names fit very well! Btw, neither is a stock or an index; both are ETFs): they move opposite to each other (cf. their charts), i.e. if one rises 5%, the other falls about 5% at the same time. I think this behavior could be put to good use in some trading method, i.e. go long one and short the other. Unfortunately I haven't been able to test it yet, but it's on my TODO list...
If there is good correlation between 2 or more stocks, does it have predictive value? They may be in sync in terms of ups and downs, but the fundamental question is what comes next: up or down?
I dream of a "chained system" where you only need to correctly predict the first link in the chain, which then triggers the desired "multi-gear chain reaction"...
That is some huge and complex gear system. I am not discouraging your effort, just giving a heads-up: in the real world, a lot of monkey wrenches get thrown in. BTW, you may already be aware of charting software that can do pairwise correlation analysis.
At HFT frequency the correlation between stocks at tick level will be effectively zero and uninformative, since the price changes are dominated by the noise of the bid/ask bounce. Five-minute data is probably the highest frequency at which correlations make sense. Actually, correlations are highly predictable: if you regress next month's correlations on those of the previous 6 months, you get a pretty decent R squared. Assuming correlations are predictable is reasonably safe as long as you're not relying on them too much (e.g. by doing a massively leveraged pairs trade because the correlation is 0.99 *cough* LTCM *cough*). As the old saying goes, you can't trade correlation*, only cointegration.... GAT * except with some weird option basket trade
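The "regress next month's correlation on the previous 6 months" idea can be sketched roughly like this. It is only an illustration under my own assumptions: 21 trading days per month, a 126-day lookback, points pooled across several pairs, and purely synthetic returns (the helper `synthetic_pair` and all parameter values are made up here, not taken from the post).

```python
import math
import random

def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

def correlation_persistence(pairs, lookback=126, horizon=21):
    """Pool (trailing corr, next-period corr) points over all pairs.

    Returns (R squared of a one-predictor OLS fit, number of points).
    """
    past, future = [], []
    for ra, rb in pairs:
        # Step forward one horizon at a time so the "next month" windows
        # do not overlap each other.
        for t in range(lookback, len(ra) - horizon + 1, horizon):
            past.append(pearson(ra[t - lookback:t], rb[t - lookback:t]))
            future.append(pearson(ra[t:t + horizon], rb[t:t + horizon]))
    # For a regression with one predictor plus intercept, R^2 equals r^2.
    return pearson(past, future) ** 2, len(past)

def synthetic_pair(rho, n_days, rng):
    """Hypothetical helper: two daily-return series with true correlation rho."""
    ra = [rng.gauss(0.0, 0.01) for _ in range(n_days)]
    noise = [rng.gauss(0.0, 0.01) for _ in range(n_days)]
    rb = [rho * a + math.sqrt(1.0 - rho * rho) * e for a, e in zip(ra, noise)]
    return ra, rb

rng = random.Random(42)
# Five made-up pairs with different true correlations, about 2 years each.
pairs = [synthetic_pair(rho, 504, rng) for rho in (-0.8, -0.4, 0.0, 0.4, 0.8)]
r2, n_points = correlation_persistence(pairs)
print(f"R^2 = {r2:.2f} from {n_points} (trailing, next) points")
```

Whatever R squared this prints only reflects how the synthetic pairs were built; to check the claim you would feed in real return series instead.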
You would not get the same info at all. Correlation is not transitive: if X correlates with Y and Y with Z, it does not follow that X is correlated with Z.
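A tiny toy example of the non-transitivity point, with made-up numbers (not market data): Y is driven by both X and Z, so it correlates with each of them, while X and Z are uncorrelated with each other.

```python
import math

def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

# Toy series, chosen so that x and z are exactly uncorrelated.
x = [1, -1, 1, -1]
z = [1, 1, -1, -1]
y = [xi + zi for xi, zi in zip(x, z)]  # y = x + z

r_xy = pearson(x, y)
r_yz = pearson(y, z)
r_xz = pearson(x, z)
print(round(r_xy, 3), round(r_yz, 3), round(r_xz, 3))  # prints: 0.707 0.707 0.0
```

So skipping pairs and inferring them through a common third stock would miss or invent relationships, which is why every pair has to be checked directly.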
Ok, I'll use the full number of pairwise comparisons, i.e. n*(n-1)/2. As said, I want to use EOD data for this correlation analysis. Going backwards from the current date, how much history should each stock have at minimum (months or years)? I also think it makes sense to run it on a smoothed price curve, so what MA period would be best for that? What are the usual parameter values for such a correlation run? To recap: for short-term trading (i.e. DTE up to about 2 months max) I'm interested in finding stock pairs with strong positive as well as strong negative correlation. How much EOD data should the stocks have available for such an analysis? Will the last 12 months be enough?
Ok, I think 1y of EOD data should be sufficient for the use case described above. I just got the data for 5513 tickers (all US). Next I'll write the code to perform the correlation analysis... This may take many days, as I'm also doing a lot of other stuff in parallel...
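For what it's worth, a minimal sketch of such a pairwise scan could look like the following. The tickers AAA..DDD, the toy closes and the 0.8 threshold are all made up for illustration. It also correlates daily returns rather than raw or smoothed price levels, which is my assumption and not what was proposed above: two trending price curves tend to look correlated even when their day-to-day moves are unrelated.

```python
import math
from itertools import combinations

def pair_count(n):
    """Number of distinct pairs among n tickers: n*(n-1)/2."""
    return n * (n - 1) // 2

def daily_returns(closes):
    """Simple close-to-close returns."""
    return [closes[i] / closes[i - 1] - 1.0 for i in range(1, len(closes))]

def pearson(a, b):
    """Pearson correlation, or None if either series has zero variance."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return None
    return cov / math.sqrt(va * vb)

def scan_pairs(closes_by_ticker, threshold=0.8):
    """Return (ticker_a, ticker_b, r) for every pair with |r| >= threshold,
    strongest first. Assumes all series cover the same dates."""
    returns = {t: daily_returns(c) for t, c in closes_by_ticker.items()}
    hits = []
    for a, b in combinations(sorted(returns), 2):
        r = pearson(returns[a], returns[b])
        if r is not None and abs(r) >= threshold:
            hits.append((a, b, r))
    hits.sort(key=lambda h: abs(h[2]), reverse=True)
    return hits

def closes_from_returns(start, rets):
    """Hypothetical helper to build a toy close series from returns."""
    closes = [start]
    for r in rets:
        closes.append(closes[-1] * (1.0 + r))
    return closes

base = [0.01, -0.02, 0.015, -0.005, 0.02]
toy = {
    "AAA": closes_from_returns(100.0, base),
    "BBB": closes_from_returns(50.0, [2 * r for r in base]),   # moves with AAA
    "CCC": closes_from_returns(20.0, [-r for r in base]),      # moves against AAA
    "DDD": closes_from_returns(30.0, [0.01, 0.01, -0.02, 0.015, -0.005]),
}
hits = scan_pairs(toy)
for a, b, r in hits:
    print(f"{a}-{b}: {r:+.2f}")
print(pair_count(5513), "pairs to check for 5513 tickers")
```

With 5513 tickers that is about 15.2 million pairs, so for the real run a vectorized correlation matrix (e.g. via NumPy or pandas) would probably be much faster than this plain-Python loop.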