Thanks for the feedback Craig! I see you were there at one point (almost 10 years ago!). https://www.elitetrader.com/et/threads/pair-trading-question.171595/ I am looking for a simple approach; however, I have yet to be led astray by Kevin, so I will most likely go down the route he is mentioning.

Hey Kevin, I found this link really helpful in explaining how you went from PCA to TLS (if anyone else wanted further explanation on why you used princomp in the code): https://stats.stackexchange.com/que...ogonal-regression-total-least-squares-via-pca Minimizing orthogonal errors makes a lot of sense to me for this problem. The function worked great.

It's not that I want to give up on VAR; it's that every time I see the model in use it is on Y1 through Yn with n usually greater than 3. I thought it might make more sense for dispersion trading, where you are trading (say) SPY vs AAPL, MSFT, GOOGL, AMZN, NFLX. But you are saying that for a pair of vectors (Y1, Y2) it's still a viable model (performs better than TLS/GLS)? Even with Diebold, he was looking at connectedness between multiple assets. But if you are saying it's good for a pairs trade, I will be more than happy to dive deeper.

I really like the idea of just trading the error-correcting leg. Maybe we can talk about this further down the road, because I am sure I will have some questions with regard to selection, i.e. if long SPY is the error-correcting leg on one pair and long QQQ is the error-correcting leg on another pair, I end up with too much market exposure. Am I thinking about that right?
No problem, go with what you're comfortable with. I wouldn't draw too many conclusions from my old posts; I was so clueless back then.
To see why that might be the case, let's reduce the two-vector case to a single vector Y* = Y1* - Y2*, where Y1* and Y2* are standard scores of Y1 and Y2 vs your smoothed individual instantaneous vol estimates. Y* is a single series conforming to what you would be trading -- when it goes down the pair is converging; when up, diverging. If you run a GLS of Y* on its lag(s), and fit an ARMA/ARIMA model on Y*, you should get the same or nearly the same Y*hat vector. But there are two main reasons to prefer the AR(I)MA model: 1) in R it has a useful predict-a-few-periods-ahead feature; and 2) it is easier to calculate the impulse function (armimp in the timsac package), which relates to my point about directional cross entropy (in the VAR case).

The Y* single-vector representation of the pair-trade problem is also useful in case you want to fit a regime-shift model on the process -- say an HMM with OU emissions where, in the Euler-Maruyama version of OU, you'd have regimes of perhaps strongly positive, weakly positive, and negative lambda.

Yes, that is correct. That is why you want to trade enough of those low-volume ETF pairs so that you can balance out longs and shorts to neutralize the market factor at least.
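If it helps, here's a minimal sketch of that Y* construction and the ARIMA fit in R. Treat it as illustrative only: I'm using a plain rolling mean/sd as a crude stand-in for a proper smoothed instantaneous vol estimate, an arbitrary ARMA(1,1) order, and SPY/QQQ purely as placeholder tickers.

Code:
library(quantmod)
library(zoo)

# placeholder tickers and a crude rolling standardization -- not a recommendation
getSymbols(c("SPY", "QQQ"), from = "2015-01-01")
y1 <- log(Ad(SPY)); y2 <- log(Ad(QQQ))

zscore <- function(y, n = 20) {
  (y - rollmeanr(y, n, fill = NA)) / rollapplyr(y, n, sd, fill = NA)
}
ystar <- na.omit(zscore(y1) - zscore(y2))            # Y* = Y1* - Y2*

fit <- arima(as.numeric(ystar), order = c(1, 0, 1))  # ARMA(1,1) on Y*
predict(fit, n.ahead = 5)$pred                       # forecast a few periods ahead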
Quick question as I study VECM. When I test for cointegration, Yt - BXt = e, where e is iid, should I be using OLS, TLS, or GLS to find the error term to feed into the cointegration test? Do you have a preferred cointegration test/function? It seems there are quite a few in R. Originally I thought to use lm(y ~ x), where x is stock1 px and y is stock2 px, for the error term, but autocorrelation will affect my results. So maybe I should use GLS for the error term to test for cointegration?
As was mentioned previously - what I have personally found to work best as “quick and dirty” is to monetize the volatility and average trading range for each spread leg on a rolling 20-day average. That will get you plenty close enough, especially for shorter-term timeframes. I think you will find reliable and efficient execution to be a much bigger and more formidable challenge than getting your beta ratio tuned to the Nth degree. Don’t assume that your broker will automatically have the shares available that you want to short at the precise time you need to enter a sell-short order. And getting decent fills on simultaneous spread legs in the real live stock market (not a simulator) is QUITE a challenge - that’s where your real test will come.
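To make that concrete, here is a rough sketch of the kind of thing I mean. The 20-day window matches what I said above; the tickers and the per-day dollar-range budget are just placeholders, and TTR's ATR is only one way to measure the trading range.

Code:
library(quantmod)   # loads TTR as well

getSymbols(c("SPY", "QQQ"), from = "2018-01-01")

# rolling 20-day average true range for each leg (in $ per share)
atr.spy <- ATR(HLC(SPY), n = 20)[, "atr"]
atr.qqq <- ATR(HLC(QQQ), n = 20)[, "atr"]

# size each leg so it carries roughly the same dollar range per day
budget     <- 10000                      # placeholder $ of daily range per leg
shares.spy <- budget / as.numeric(last(atr.spy))
shares.qqq <- budget / as.numeric(last(atr.qqq))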
I generally prefer Box-Tiao as elucidated by Bewley, Yang, et al in a series of papers starting in about 1985. In this case though, the particular test is not that important, take your pick. You'd be fine with the original Engle-Granger two-stage method based on simple regression. Your choice of test dictates the answer to your first question (OLS vs TLS ...), but again, not that important because ... if you have a valid ECM, your vectors are cointegrated and there is Granger causality lurking somewhere in there, even if time-varying, and you'll want to know the strength and direction of that causality. Unit root tests don't tell you those two things -- distance from a critical value is not a measure of the strength of the cointegrating relationship, just as p-value is neither power nor size. So, in the end, you'll have to fit a model of some kind anyway, so why not just test the model.

I also suggest you look for group or basket (more than two assets) cointegration relationships. A good rule of thumb is that you will need at least one more asset than the number of risk factors you want neutralized in your trading. For example, if you're looking for a cointegrated basket of rate instruments and you want to neutralize the usual three components (level, slope, curvature), you'll need at least 4 reasonably widely separated maturities to fit your cointegrating basket. In the ETF space, you might want to neutralize the market and industry factors, or perhaps some or all of the Fama-French risk factors.

Edit: obviously, if you're arbing off-the-run vs on-the-run, the maturities are so close that you don't have much exposure to the three components and my rule of thumb above doesn't hold. Also, I should have mentioned previously about TLS: when you want to use a Kalman filter to model the time-varying coefficients, as I think you mentioned in your original post, don't use princomp or eigs for the TLS computation.
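For reference, a bare-bones sketch of the Engle-Granger two-stage approach in R, with plain OLS in the first stage (swap in TLS if you prefer); tseries for the ADF test and GLD/GDX purely as an example pair:

Code:
library(quantmod)
library(tseries)

getSymbols(c("GLD", "GDX"), from = "2015-01-01")
y <- as.numeric(log(Ad(GLD)))
x <- as.numeric(log(Ad(GDX)))

# stage 1: cointegrating regression, keep the residuals
stage1 <- lm(y ~ x)
e <- residuals(stage1)

# unit root test on the residuals (strictly you should use Engle-Granger
# critical values here, not the plain ADF ones)
adf.test(e)

# stage 2: ECM -- regress dy on dx and the lagged residual (the ECT)
dy <- diff(y); dx <- diff(x)
ecm <- lm(dy ~ dx + head(e, -1))
summary(ecm)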
Kevin, I have finished my first round of studying and will make my first attempt at a simple VECM model. I would greatly appreciate it if you could evaluate my work. Next will be SVECM and then basket cointegration. If you don't mind explaining it as if you were talking to a 10 year old, that would really help solidify the information. Here are some links I found very useful (the edx course was great):

https://courses.edx.org/courses/cou...9b5f9d8b832f4fa4837d09447db2dd2c/?child=first
https://rpubs.com/simasiami/384720
https://ses.library.usyd.edu.au/bit...d=FC2AFBEDBA9C129F1427C06F6F834248?sequence=1
https://stats.stackexchange.com/questions/tagged/vecm
https://stats.stackexchange.com/questions/tagged/var
https://cran.r-project.org/web/packages/vars/vignettes/vars.pdf

Going through the questions on Cross Validated really helped me. Richard does a great job of breaking VAR and VECM down. You originally mentioned using SVECM, so hopefully after I get the okay from you on VECM we can move on.

On a side note, if I have two assets that are not cointegrated and non-stationary, I can't use VAR or VECM. So for pairs like KO/PEP or GDX/GLD, where trends and large structural shifts exist, should I completely avoid them?

Before I get into this I would also like to mention for future readers: this was a great exercise and I learned a lot. The VAR family is applicable to many fields in finance. If any of you want my notes, send me a PM and I will share my Google doc with you.

For simplicity we are going to look at SPY/QQQ. The goal is to identify when the spread is too large, how fast it will mean revert, which leg is the error-correcting leg, and what our hedge ratio is.

Code:
# let's import the data
library(quantmod)
library(urca)
library(vars)
getSymbols(c("SPY", "QQQ"), from = "2010-01-01")
df = merge(log(SPY$SPY.Adjusted), log(QQQ$QQQ.Adjusted))
ts.plot(df, col = c("red", "blue"))

There is a bit of a trend in the residuals, but if we take a quick look at the spread using the tlsHedgeRatio function, there does seem to be a linear combination that turns the residuals into an I(0) process.

Code:
# tlsHedgeRatio is the TLS function Kevin posted earlier in the thread
ksSpread = tlsHedgeRatio(df$SPY.Adjusted, df$QQQ.Adjusted)
plot(ksSpread$spread)

So right off the bat these assets seem cointegrated. Let's do a Johansen test just to make sure.

Code:
cointegration <- ca.jo(df, type = "trace", ecdet = "trend", spec = "transitory")
summary(cointegration)
cointegration@teststat

It seems we have one cointegrating relationship: we reject r = 0 but not r <= 1. (Full rank, r = 2, would mean both our assets were already I(0).)

Next we look for causality: does SPY cause QQQ, and vice versa? For this we will use the Granger causality test. To do this we first calculate the stock returns.

Code:
diff.spy = diff(df$SPY.Adjusted)
diff.qqq = diff(df$QQQ.Adjusted)
df.rets = na.omit(cbind(diff.spy, diff.qqq))

We then fit a VAR model, letting the function minimize the AIC. On a side note, I am not 100% sure about the intuition behind choosing between c("both", "trend", "const"), but I have chosen "both" here.

Code:
rets.var = VAR(df.rets, type = "both", lag.max = 8, ic = "AIC")
causality(rets.var, cause = "QQQ.Adjusted")$Granger
causality(rets.var, cause = "SPY.Adjusted")$Granger

It seems we cannot reject the null, and neither SPY nor QQQ Granger-causes the other. Next we estimate our VAR model, take a look at the summary statistics, and build our VECM.

Code:
# here we let the model choose the optimal lag length;
# we want to minimize the AIC, so we choose n = 2
VARselect(df, lag.max = 8, type = "both")$selection

# Kevin, could you give some intuition on the choice between the eigen and
# trace parameters? I have only seen math formulas for the reasoning and
# I can't get my head around it.

# next we build the VECM with lag length 2
var1 = ca.jo(df, K = 2, type = "eigen", ecdet = "const", spec = "transitory")

# the VECM shows only 1 lag here because a VECM has one lag fewer than the VAR
vecm = cajorls(var1)
summary(vecm$rlm)

# since neither SPY nor QQQ causes the other, I did not add any restriction.
# What do you think? Should I be adding any restrictions to this model?
v1.VAR = vec2var(var1)

# let's make a forecast for 10 days ahead and plot it
v1.VAR.fcst = predict(v1.VAR, n.ahead = 10)

Voila! We have our hedge ratio and constant! I am using ect1 for this, so our hedge ratio if we go long SPY is short 0.7762 QQQ + 1.6. Kev, what's the intuition here? Is this dollar-weighted or share-weighted? If it is not dollar-weighted, I am not too sure how to interpret the constant coefficient for my hedge ratio.

Here is how well our rlm does. And here is our (zoomed in) forecast for 10 days ahead.

The main areas where I currently need some help are:
1) my code (am I writing the code correctly?)
2) Where do I find the error-correcting constant in the VECM output? Once I find it, how do I understand which leg is more likely to correct (assuming neither is zero, or is one always zero)?
3) I would like to forecast the spread; however, it seems I am forecasting each leg on its own. Does the vars package have a function where I can graph the spread with its forecast, or do I have to construct the spread using the coefficients provided above and run an ARMA/ARIMA model?

Sorry I have all these questions for you Kevin, but I am not comfortable taking advice about this stuff from my professors. Last time I asked about vol being in backwardation going into earnings, I got the answer "because farther dated options are less liquid and most of the demand is for the near dated options".
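For question 3, in case it clarifies what I am asking, here is roughly what I had in mind: rebuild the spread from the ect1 cointegrating vector and forecast it directly. The beta and constant below are just the numbers I quoted above, so treat them (and their signs) as placeholders to be read off the cajorls output rather than the "right" values.

Code:
# placeholder coefficients from ect1 -- check the signs against vecm$rlm
beta  <- 0.7762
const <- 1.6

spread <- df$SPY.Adjusted - beta * df$QQQ.Adjusted + const
plot(spread)

# forecast the spread on its own with a simple AR model
fit.sp <- arima(as.numeric(spread), order = c(1, 0, 0))
predict(fit.sp, n.ahead = 10)$pred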
I'll answer your full post later when I have the time, but for now I'd like to clear up this misapprehension. If you regress y on X, both non-stationary and not cointegrated, and its lags (like we used to do with distributed lag models pre-VAR) without including lags of y on the right-hand side, then yes, the model will be spurious and inconsistent. However, if lags of y are included, as in a VAR, the model is consistent and useful as-is, especially for forecasting/prediction. It is perfectly OK to run a VAR in levels on non-stationary, non-cointegrated series. The standard errors will need adjustment, the IRF is tricky, and tests (e.g. for Granger causality) are problematic, but forecasts are consistent and unbiased. In the econometrics literature, VAR in levels on non-stationary series is quite common. Modeling shocks/IRFs is difficult because theoretically the Wold decomposition doesn't exist (the matrix is not invertible), but if it is close, you can still get decent IRF estimates. Tests can still be run jointly; Sims wrote on this 30 years ago and IIRC a couple of Japanese researchers expanded on it (don't remember their names).

I wouldn't give up on them quite yet. I would be very surprised if there were no tradable cointegration relationship in those two pairs. For time, seasonal, or deterministic trends, remove them first. Stochastic trend is not a big problem with VAR (and remember that any VECM has a VAR(1) representation). Model shocks like structural shifts in the VAR model (in fact the degenerate Wold case (MA representation) implies some shocks have a permanent effect). Remember that most cointegration tests have low power when the series is very close to I(1) but actually not.

A note on TLS -- it may be useful to think of TLS as a special case of ridge regression. Both essentially adjust the X'X matrix by a small diagonal matrix (TLS subtracts it, ridge adds it).

Edit: you can always run a chain-weighted OU fit like Carr and López de Prado do in a recent paper (complete with Python code!). These are two of my least favorite researchers, but even a blind squirrel finds an acorn occasionally. OU (Euler form) can be easily fit with simple OLS.
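To show what I mean by that last point, a bare OLS fit of the Euler-form OU; x would be your spread series, the dt of one trading day and the discretization below are my assumptions, and the half-life is then log(2)/lambda:

Code:
# Euler-Maruyama discretization: dx = lambda*(mu - x)*dt + sigma*dW,
# so regress dx on lagged x and back out the parameters.
fit.ou <- function(x, dt = 1) {
  x  <- as.numeric(x)
  dx <- diff(x)
  xl <- head(x, -1)
  m  <- lm(dx ~ xl)
  b  <- coef(m)
  lambda <- -b["xl"] / dt                      # mean-reversion speed
  mu     <- b["(Intercept)"] / (lambda * dt)   # long-run mean
  sigma  <- sd(residuals(m)) / sqrt(dt)        # diffusion coefficient
  c(lambda = unname(lambda), mu = unname(mu),
    sigma = sigma, halflife = log(2) / unname(lambda))
}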
Attached is the Toda and Yamamoto paper I referenced above. The method is fully described in the abstract. It can be implemented in the vars package by including max-integration-order-lagged x and y (e.g. GLD and GDX) as exogenous variables in your VAR (exogen parameter in the VAR call) and then using the causality method to test for Granger causality (actually the null is no Granger causality) using the Wald test.
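A rough sketch of that setup in R, per the description above -- assumptions: GLD/GDX in log levels, a maximum order of integration d = 1, and the VAR lag p picked by AIC; the extra lag enters only through the exogen argument and is not part of the causality test:

Code:
library(quantmod)
library(vars)

getSymbols(c("GLD", "GDX"), from = "2015-01-01")
lv <- na.omit(merge(log(Ad(GLD)), log(Ad(GDX))))
colnames(lv) <- c("gld", "gdx")

p <- unname(VARselect(lv, lag.max = 8, type = "const")$selection["AIC(n)"])
d <- 1                                   # assumed max order of integration

# the extra d lag(s) of both series enter only as exogenous regressors
exo <- lag(lv, p + d)
colnames(exo) <- c("gld.exo", "gdx.exo")
dat <- na.omit(merge(lv, exo))

ty.var <- VAR(dat[, c("gld", "gdx")], p = p, type = "const",
              exogen = dat[, c("gld.exo", "gdx.exo")])

# Wald tests; the null is no Granger causality
causality(ty.var, cause = "gld")$Granger
causality(ty.var, cause = "gdx")$Granger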
Thanks for the paper(s) Kevin. I am going to get started on the VAR one right now. It looks a bit dense, so it might take me a few days to work out the equations. After that I'll check back in with you to pick your brain a bit more on this topic. In the meantime, I am going to migrate over to CV/QF to ask more questions (give you a bit of a breather).