Hello, I am looking at pairs trading from a very high level. I would like to build a simple model that gives me a decent hedge ratio. My first look was at a VAR (vector autoregressive) model because I played around with it a few months back; the problem is, it seems most useful when we have more than two series. From recent reading, the Kalman filter looks like a very popular tool for measuring the hedge ratio, but it adds more parameters to the model that need estimating. For my current situation, I think a rolling beta will do the job. With that being said, I am having a bit of a hard time understanding it. Below is some R code with commentary as I try to work through an example.

Before I begin, I found this series on using the Kalman filter and I was wondering what others thought about it: https://robotwealth.com/kalman-filter-pairs-trading-r/

In my example, I will be looking at the GLD/GDX spread. Let's load the necessary packages and get the data:

Code:
library(urca)
library(MASS)
library(quantmod)
library(zoo)

getSymbols(c("GDX", "GLD"))

Test for cointegration using Engle-Granger. First find the error terms; we will use a robust regression with the method set to "MM":

Code:
new.df = merge(GLD$GLD.Adjusted, GDX$GDX.Adjusted)
mod = rlm(GLD.Adjusted ~ GDX.Adjusted, data = as.data.frame(new.df), method = "MM")
plot(mod$residuals, type = "l")

It looks like there was a trend in the residuals for the first 1500 data points before it flattened out. I am not even going to run the test; due to the trend we will most likely find that the assets are not cointegrated. I could adjust for the trend, but I do not think it's necessary for this example. Correct me if I am wrong.

R has the dlm package, which fits dynamic linear models. Hopefully someone has used this package before and can touch on it, but for this example we will use a simple rolling OLS regression to measure the hedge ratio:

Code:
regression.window = 60
GLD.rets = ROC(GLD$GLD.Adjusted)
GDX.rets = ROC(GDX$GDX.Adjusted)
rets.df = na.omit(merge(GLD.rets, GDX.rets))
head(rets.df)
colnames(rets.df) = c("GLD", "GDX")

mod.coef = rollapply(zoo(rets.df), width = regression.window,
                     FUN = function(Z) {
                       t = lm(GLD ~ GDX, data = as.data.frame(Z))
                       return(t$coef)
                     },
                     by.column = FALSE, align = "right")
tail(mod.coef)

This is what the rolling beta looks like. The next part is where I am having the issue. If I flip GDX and GLD so that the formula is lm(GDX ~ GLD) instead of lm(GLD ~ GDX), our current beta is 2.03. Why is it not 3 (the inverse of 0.33)? Since we are in return space, the current beta of 0.337 would mean that if I buy $10,000 worth of GLD, I sell short $3,370 worth of GDX? If we instead regress GDX on GLD, we have a current beta of 2.03, which would mean that if we are long $10,000 worth of GLD, we are short about $5,000 worth of GDX.

So in short, I am looking to construct a simple pairs strategy (I will be venturing into the micro-cap ETF space) where I can easily estimate a decent hedge ratio. I will also be keeping an eye on idiosyncratic and hidden-factor risks.
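For completeness, here is roughly how I would have run the Engle-Granger second stage had I bothered: a unit-root test on the residuals of the rlm fit above, using ur.df from the urca package that is already loaded. This is only a sketch; the type = "none" choice is my assumption (the residuals should already be roughly mean zero), and strictly the critical values should come from the Engle-Granger tables rather than the standard Dickey-Fuller ones.

Code:
# Engle-Granger stage 2 (sketch): test the price-regression residuals for a unit root
eg.test = ur.df(mod$residuals, type = "none", selectlags = "AIC")
summary(eg.test)
# A test statistic more negative than the critical value would suggest the
# residuals are stationary, i.e. the pair behaves as cointegrated.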
A couple of thoughts:

1) Maths: The estimation method really doesn't matter, be it Kalman, OLS or simply some kind of ratio. A Kalman filter ends up being a sort of exponentially forgetting OLS anyway, and has the disadvantage of being complex.

2) Residual "trend": Whether this is a problem depends on your holding time. If it's intra-day then it's not so much of a problem; if you're going to be holding to convergence over multiple days then you're going to be taking a bath.

3) Regression order: Remember that lm(GDX ~ GLD) implies an intercept term in the regression. Try lm(GDX ~ GLD + 0) vs. lm(GLD ~ GDX + 0); this may give you more intuitive results.
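To make point 3 concrete, a minimal sketch of the comparison (using the rets.df returns frame from the original post; whether you run it on returns or on prices is up to you):

Code:
# Slope with an intercept vs. slope forced through the origin
coef(lm(GDX ~ GLD, data = as.data.frame(rets.df)))      # intercept + slope
coef(lm(GDX ~ GLD + 0, data = as.data.frame(rets.df)))  # slope only
coef(lm(GLD ~ GDX + 0, data = as.data.frame(rets.df)))  # the other direction

If the two no-intercept slopes still aren't reciprocals of each other, that is the regression-direction issue discussed further down the thread, not the intercept.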
Hi Craig, can you elaborate a bit more on the "trend" factor? I will be looking at 5-minute to 30-minute time intervals across semi-liquid ETFs. Both regressions actually had an intercept very close to 0 (.00xx), so that's not the issue. Without a rolling beta, here are the slopes of (GDX ~ 0 + GLD) and (GLD ~ 0 + GDX). I am curious to know why the slopes are not the inverse of each other, and how I might set up the hedge ratio for this trade given the betas shown. For finding the hedge ratio, should I always be running my line through the origin? On the other hand, if I am only trading cointegrated pairs, the intercept will be close to 0.
I figured out the answer for why x~y is not the same as y~x: in y~x we minimize the vertical distances and in x~y we minimize the horizontal distances. But I am still having a hard time wrapping my head around this intuitively (in regards to pairs trading and finding the hedge ratio). https://stats.stackexchange.com/que...ear-regression-on-y-with-x-and-x-with-y/22721
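One way I convinced myself numerically: the two slopes multiply to R², so they are only reciprocals when the fit is perfect. A quick sketch with simulated data (since the exact GLD/GDX numbers will vary):

Code:
set.seed(1)
x = rnorm(500)
y = 0.5 * x + rnorm(500, sd = 0.5)   # deliberately noisy relationship

b.yx = coef(lm(y ~ x))[2]            # slope of y on x
b.xy = coef(lm(x ~ y))[2]            # slope of x on y

b.yx * b.xy                          # equals cor(x, y)^2, not 1
cor(x, y)^2

With the GLD/GDX returns, 0.337 * 2.03 is roughly 0.68, which should just be the R² of the fit; the gap from 1/0.337 = 2.97 is the noise in the relationship.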
- "Trend Factor": I misread what you wrote, but you're correct, it's not worth running the test. - Origin vs. No Origin: No origin is a simpler model. - Hedge Ratio: Once again, it depends what you want to do. If you're just going to do pairs then $ neutral is probably sufficient. It's only when people do baskets that they start worrying about regressions etc. Regrading deriving the hedge ratio from a regression, regressing the prices will give you the inverse relationship between y~x & x~y.
Craig, I don't want to come off as obtuse; I am genuinely curious here. Why would I want to be dollar neutral over beta neutral? If I am placing a bet on GLD/GDX, dollar neutral is probably not the greatest idea considering GDX moves about 2x more than GLD. The problem I was having with regressing prices is that the autocorrelation in the residuals is really high; I am trying to make the error term as close to independent and identically distributed (IID) as possible. You obviously have way more experience here than I do. Would you mind going over a pair of your choice and how you might hedge it?
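For reference, this is roughly how I was checking the residual autocorrelation on the price-level regression (mod is the rlm fit from my first post; the Ljung-Box lag of 10 is arbitrary):

Code:
# Inspect autocorrelation in the price-regression residuals
acf(mod$residuals, main = "ACF of GLD ~ GDX price residuals")

# Ljung-Box test: a small p-value means the residuals are autocorrelated, not IID
Box.test(mod$residuals, lag = 10, type = "Ljung-Box")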
Hedging GLD vs. GDX: you're correct! You can work this out for yourself, really. Let's say GDX is at $30 and GLD is at $140. To be $ neutral you'd buy, say, 100 shares of GLD against 500 shares of GDX (roughly). As you say, GDX is twice as volatile in % terms, so how would you alter GDX's size to allow for that? It's pretty straightforward: halve GDX's size. The next question you may want to consider is what the best measure of volatility to hedge with is (percents, ranges, etc.), because it does make a difference; testing will tell you. I guess the overall message I'm trying to get across (because I've been there) is to keep it as simple as possible. Don't get bogged down in the math, just do the simplest things that make sense and use testing to select between alternate ideas. Hedge ratios will always be approximate; remember that you're going to be buying some multiple of shares anyway.
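A minimal sketch of that sizing logic in R. The prices and the 2x volatility figure are just the round numbers from above, and the vol values are purely illustrative assumptions; in practice you'd plug in a historical or rolling vol estimate for each leg:

Code:
capital  = 14000          # target $ per leg
px.gld   = 140
px.gdx   = 30
vol.gld  = 0.01           # assumed daily % vol of GLD (illustrative)
vol.gdx  = 0.02           # assumed daily % vol of GDX (~2x GLD)

# Dollar-neutral sizing
shares.gld = round(capital / px.gld)                      # ~100 shares
shares.gdx = round(capital / px.gdx)                      # ~467 shares

# Volatility-adjusted sizing: scale the GDX leg down by the vol ratio
shares.gdx.vol = round(shares.gdx * vol.gld / vol.gdx)    # roughly half the size

c(GLD = shares.gld, GDX.dollar.neutral = shares.gdx, GDX.vol.adjusted = shares.gdx.vol)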
Using regression to find your hedge ratios is probably not your best bet. If you insist on using regression, then total least squares (orthogonal regression) is the way to go. Something along these lines:

Code:
# Pair Hedge Ratio using Total Least Squares
# ==========================================
tlsHedgeRatio <- function(p, q) {
  lstOut <- list()
  # First principal component of (p, q) gives the orthogonal-regression line
  r <- princomp(~ p + q)
  # Slope of p on q along the first component
  lstOut[[1]] <- r$loadings[1, 1] / r$loadings[2, 1]
  lstOut[[2]] <- p - lstOut[[1]] * q
  # Intercept so that p ~ intercept + beta * q through the variable means
  lstOut[[3]] <- r$center[1] - lstOut[[1]] * r$center[2]
  names(lstOut) <- c('beta', 'spread', 'intercept')
  return(lstOut)
}

However, for a two-asset portfolio the regression marginalizes out and the hedge ratio is just the ratio of the individual anticipated vols. Forecast the anticipated vols, individually and jointly, and you have your hedge ratio. The insight here is that for two standardized time series the eigenvectors (note the princomp in the function above) of the covariance matrix are always (0.7, 0.7) and (0.7, -0.7), i.e. lines at 45 and -45 degrees; then scale by inverse vols to "de-standardize".

Overall though, you are better off fitting an SVECM (SVEC in the vars package) or an OU model and taking your hedge ratio from there. I think it is a bad idea to give up on VAR models. Remember that a VECM has a VAR representation in levels, and a VAR(p) has a VAR(1) representation (companion form) and from there an MA representation (Wold decomposition; the Phi call in the vars package), which would allow you to use the Diebold directional flow method you posted about last week, because what you really want to know is which of the two is going to error-correct over the next few periods. So for your basket of low-liquidity ETF pairs you can allocate your scarce buying power to the error-correcting halves and leave the non-error-correcting halves at zero. Also, with VAR models you can test for unit roots by looking at the complex eigenvalues of the companion A matrix (you want the moduli of the roots to all be under one; the roots function in the vars package).

Edit: to deal with autocorrelated residuals in a regression model when you don't want to use anything in the AR family (VAR, ARMA, ARIMA, etc.), you probably want generalized least squares (GLS). It is OLS where both sides of the regression equation have been multiplied by the inverse of a Cholesky square root (LL') of an estimated conditional error covariance matrix, Cov(e|X) = Omega. The Gls function in the rms package does a pretty good job of it.
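For what it's worth, a quick usage sketch of that function against the GLD/GDX adjusted closes pulled earlier in the thread (new.df is from the first post; coercing to a plain data frame since princomp's formula interface just needs numeric vectors):

Code:
# Total-least-squares hedge ratio for the GLD/GDX pair
px  = na.omit(as.data.frame(new.df))
tls = tlsHedgeRatio(px$GLD.Adjusted, px$GDX.Adjusted)

tls$beta                      # GDX shares to short per GLD share held
plot(tls$spread, type = "l")  # the resulting TLS spread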
Couldn't you just 'dollar volatility adjust' the positions (historical vol/trailing vol/rolling vol) for a simple solution, generally?
Sure, that is using historical/rolling vol as a proxy for instantaneous or future vol. It's OK, but I'm sure that Big Short can do better than that.