Real-world Pairs Trading?

Hi all,

Happy holidays! I have a question about pairs trading in the real world. How is it actually done?

My understanding of the procedure:

1. Test stock A using ADF to decide whether A is I(0) or I(1).
2. Test stock B using ADF to decide whether B is I(0) or I(1).
3. Use the Johansen or Engle-Granger two-step method to decide whether A and B are cointegrated.
4. Trade the pair if they are found to be cointegrated.

My questions are:

1. Are steps 1 and 2 necessary? In the real world, most price series are either I(0) or I(1). And if two stock price series are both I(0), we can still do pairs trading on them, no? Do we really need both series to be I(1) to do pairs trading?

2. In steps 1 and 2, there are different versions of the ADF test: for example, in R, the function ur.df has "none", "const", "trend", "both" options. How can we select the best ADF option programmatically and automatically, both cross-sectionally and along the time axis? For example, if we have a large universe of stocks and need to run rolling-window ADF tests, choosing the ADF options "automatically" and "programmatically" becomes tricky...

3. In steps 1 and 2, does the choice of "none", "const", "trend" or "both" matter? For example, if "trend" leads to the conclusion that the series is stationary while "const" leads to the conclusion that it is non-stationary, should the series be declared "stationary" or "non-stationary" for trading purposes?

4. I read that cointegration is more or less a long-term concept. How "long" is long term here? Suppose I am running these tests on 15-minute bars: how many data points should I use in my rolling-window tests along the time axis? I am thinking of 500 data points. But maybe that's too many; remember that markets change and we have to be a bit adaptive... Any thoughts on this?

5. Johansen has the advantage of being symmetrical in A and B. But Johansen is not stable at all if we look at the rolling-window cointegrating vectors (the eigenvectors). It seems that to get the hedge ratio, one still needs to use linear regression, since it's more stable. But then would you regress A onto B or B onto A? The two give different results in my experiments... Or does that not matter?

6. Some literature also mentions using returns for all of this. My understanding is that returns are used to find the hedge ratios approximately. Ultimately we are still trading prices; those are the tradables. We are not trading the returns. So after we run all the tests and obtain hedge ratios using returns or other series, we still come back to prices to form a pair and pairs-trade the price levels... The hedge ratio obtained from regressing returns of A onto returns of B is an approximation to the hedge ratio obtained from regressing prices of A onto prices of B. When the prices of A and B are I(1), regressing prices onto prices will lead to spurious regression, but the estimate of beta (the hedge ratio) itself shouldn't be a problem. It's the inference that is messed up. Am I understanding this correctly?

Thanks a lot!

[Cross-posted on Wilmott et al.]
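
To make question 5 concrete, here is a minimal Python sketch (standard library only) of why regressing A onto B and B onto A give different hedge ratios. Everything in it is an assumption for illustration: the simulated prices, the true ratio of 1.5, and the helper names `ols_slope`, `tls_slope` and `simulate_pair`. Total least squares is shown only as one symmetric alternative, not as a recommendation from this thread.

```python
import math
import random


def ols_slope(y, x):
    """Slope of the least-squares line y = alpha + beta * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


def tls_slope(y, x):
    """Slope of the total-least-squares (orthogonal) line through (x, y)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    diff = syy - sxx
    return (diff + math.sqrt(diff ** 2 + 4 * sxy ** 2)) / (2 * sxy)


def simulate_pair(n=500, beta=1.5, seed=7):
    """Hypothetical data: B is a random walk, A = 10 + beta * B + noise."""
    rng = random.Random(seed)
    b, level = [], 50.0
    for _ in range(n):
        level += rng.gauss(0.0, 0.5)
        b.append(level)
    a = [10.0 + beta * bi + rng.gauss(0.0, 1.0) for bi in b]
    return a, b


a, b = simulate_pair()

beta_ab = ols_slope(a, b)   # from A = alpha + beta * B
beta_ba = ols_slope(b, a)   # from B = alpha' + beta' * A
tls_ab = tls_slope(a, b)
tls_ba = tls_slope(b, a)

print("OLS A on B:          ", beta_ab)
print("1 / (OLS B on A):    ", 1.0 / beta_ba)
print("TLS A on B:          ", tls_ab)
print("1 / (TLS B on A):    ", 1.0 / tls_ba)
```

The product of the two OLS slopes equals the squared sample correlation, so the two OLS hedge ratios agree only when the fit is perfect. The two TLS slopes are exact reciprocals, so the ratio does not depend on which leg is treated as the dependent variable.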

I wonder why you think that pairs trading makes sense intraday. Did someone tell you that, or did you just decide that it can be applied to intraday timeframes? AFAIK, pairs trading timeframes are weeks or even months.

Yeah, there's a lot of noise intraday... the success rate is higher when you stretch out the time frame.

It's actually hours or days (sometimes weeks). Months is not trading... it's investing. In today's EFFICIENT markets, pairs/basket trading is more a way to keep a portfolio market neutral while one scalps continuously and captures the spread than the profitable cash cow it was 10 years ago. It's about 50 times harder to make money than when I started trading 15 years ago, so you have to make it up in automated volume.

I would bet that most intraday program trading is not momentum based and is betting on convergence. High frequency automated basket pairs is pretty much what most HFT firms are doing or some variation.

Steps 1 and 2 are necessary; otherwise you are open to spurious cointegration. Intraday can work at different resolutions: http://www.ljmu.ac.uk/Images_Everyone/Jozef_1st(1).pdf Variable order matters under Engle-Granger, but the software should be able to handle that. Hope this helps.
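
A rough Python sketch of the Engle-Granger two-step idea, run in both variable orders, may help here. It uses only the standard library and a plain Dickey-Fuller t-statistic with no lagged differences, so it is a simplification of the ADF test rather than a replacement for it; in practice one would more likely reach for a library routine such as `adfuller` or `coint` in statsmodels. The simulated data and the helper names `ols`, `df_tstat`, `engle_granger` and `simulate_pair` are assumptions for illustration. The commonly tabulated approximate 5% critical values are about -2.86 for a single series with a constant and about -3.34 for the residual test with two series.

```python
import math
import random


def ols(y, x):
    """Return (intercept, slope) of the least-squares line y = a + b * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def df_tstat(series, constant=True):
    """Dickey-Fuller t-statistic: regress diff(x)_t on x_{t-1}.

    No lagged differences are included, so this is DF, not full ADF.
    """
    lagged = series[:-1]
    diffs = [nxt - cur for cur, nxt in zip(series[:-1], series[1:])]
    n = len(diffs)
    ml = sum(lagged) / n if constant else 0.0
    md = sum(diffs) / n if constant else 0.0
    sxx = sum((l - ml) ** 2 for l in lagged)
    sxy = sum((l - ml) * (d - md) for l, d in zip(lagged, diffs))
    rho = sxy / sxx
    resid = [(d - md) - rho * (l - ml) for l, d in zip(lagged, diffs)]
    dof = n - (2 if constant else 1)
    s2 = sum(e * e for e in resid) / dof
    return rho / math.sqrt(s2 / sxx)


def engle_granger(y, x):
    """Step 1: regress y on x. Step 2: DF statistic of the residuals."""
    intercept, slope = ols(y, x)
    resid = [yi - intercept - slope * xi for yi, xi in zip(y, x)]
    return slope, df_tstat(resid, constant=False)


def simulate_pair(n=500, beta=1.5, seed=11):
    """Hypothetical data: B is a random walk, A = 10 + beta * B + noise."""
    rng = random.Random(seed)
    b, level = [], 50.0
    for _ in range(n):
        level += rng.gauss(0.0, 0.5)
        b.append(level)
    a = [10.0 + beta * bi + rng.gauss(0.0, 1.0) for bi in b]
    return a, b


a, b = simulate_pair()

# Steps 1 and 2: unit-root statistics for each leg (compare with about -2.86).
print("DF stat, A:", df_tstat(a))
print("DF stat, B:", df_tstat(b))

# Step 3 in both orders (compare each residual statistic with about -3.34).
slope_ab, t_ab = engle_granger(a, b)
slope_ba, t_ba = engle_granger(b, a)
print("A on B: slope", slope_ab, "residual DF stat", t_ab)
print("B on A: slope", slope_ba, "residual DF stat", t_ba)
```

On this simulated pair both orderings should reject the unit root in the residuals, but the slopes are not reciprocals of each other and the two test statistics differ, which is the order dependence mentioned above.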

It's all about baskets, not pairs. If you don't have at least 50 pairs in big long/short baskets... And are not scalping it all like there's no tomorrow... You are totally screwed. Convergence used to make big money... But, today, it just helps you stay market neutral... So you can focus on capturing as much of the bid-ask spread as possible. Like if one of your 50 pairs is VXX/VXZ... Are you making money on the convergence... Or are you capturing that $0.06-0.07 VXZ spread all day long... And collecting roll yield on the former?