Cointegration is one of the most
appealing and most controversial analyses. It is appealing since it is the
basis of the pair trading strategy. And is controversial since it has a
property to break, sometimes for an extended period and additionally there are
different techniques that from time to time produce conflicting results.
In a nutshell, the approach is:
(1)
run an OLS linear regression (the coefficient beta of the regression is
the hedge ratio)
(2)
test the residuals for presence
of unit root (via Augmented-Dickey-Fuller Unit Root Test). The residuals of the
regression represent the spread. And the spread is in fact: Dependent variable – Hedge ratio * Independent variable. If we reject the Null Hypothesis of Unit Root we conclude that the
spread is stationary and the two variables are mean-reverting. This also means
that two variables are cointegrated.
An alternative technique to the ADF test of the residuals is the
Johansen cointegration test (Trace and Eigenvalue test) applied to the analysed
variables. However, in most of the practical guides I’ have read the ADF
approach is preferred. I think because it is more intuitive and easier to
follow than the Johansen test.
Looking at five ETFs: SPY (SPDR S&P 500), IYY (iShares Dow Jones),
IWM (iShares Russel 2000), GLD (SPDR Gold Shares) and GDX (VanEck Vectors Gold
Miners). The last two pairs are viewed somehow as a textbook example.
Codes and results are highlighted.
library(quantmod)
library(fUnitRoots) #for the ADF test
tickers <- c('GDX','GLD', 'IWM' , 'IYY','SPY')
startDate="2013-01-01"
endDate="2016-05-04"
f <- function(x) {
x1
<- getSymbols(x[1], from=startDate, to=endDate, auto.assign=FALSE)
x2
<- getSymbols(x[2], from=startDate, to=endDate,auto.assign=FALSE)
y
<- merge(Ad(x1),Ad(x2))
eqn <- as.formula(paste(colnames(y), collapse=" ~ 0 + "))
m
<- lm(eqn, data=y)
adf<-adfTest(m$residuals, type="nc")
cat("Period is from:", startDate, "to:", endDate, "\n",
"Dependent variable and independent variable are:", colnames(y), "\n",
"The value of the test statistics is:", adf@test$statistic, "\n")
cat("ADF p-value is", adf@test$p.value, "\n")
}
p <- combn(tickers, 2, FUN=f, simplify=FALSE)
Period is from: 2013-01-01 to: 2016-05-04
Dependent variable and independent variable are: GDX.Adjusted GLD.Adjusted
The value of the test statistics is: -3.413356
ADF p-value is 0.01
Period is from: 2013-01-01 to: 2016-05-04
Dependent variable and independent variable are: GDX.Adjusted IWM.Adjusted
The value of the test statistics is: -3.717138
ADF p-value is 0.01
Period is from: 2013-01-01 to: 2016-05-04
Dependent variable and independent variable are: GDX.Adjusted IYY.Adjusted
The value of the test statistics is: -3.647598
ADF p-value is 0.01
Period is from: 2013-01-01 to: 2016-05-04
Dependent variable and independent variable are: GDX.Adjusted SPY.Adjusted
The value of the test statistics is: -3.629596
ADF p-value is 0.01
Period is from: 2013-01-01 to: 2016-05-04
Dependent variable and independent variable are: GLD.Adjusted IWM.Adjusted
The value of the test statistics is: -3.073779
ADF p-value is 0.01
Period is from: 2013-01-01 to: 2016-05-04
Dependent variable and independent variable are: GLD.Adjusted IYY.Adjusted
The value of the test statistics is: -2.953128
ADF p-value is 0.01
Period is from: 2013-01-01 to: 2016-05-04
Dependent variable and independent variable are: GLD.Adjusted SPY.Adjusted
The value of the test statistics is: -2.91571
ADF p-value is 0.01
Period is from: 2013-01-01 to: 2016-05-04
Dependent variable and independent variable are: IWM.Adjusted IYY.Adjusted
The value of the test statistics is: -0.9656891
ADF p-value is 0.3085453
Period is from: 2013-01-01 to: 2016-05-04
Dependent variable and independent variable are: IWM.Adjusted SPY.Adjusted
The value of the test statistics is: -0.8500551
ADF p-value is 0.3454008
Period is from: 2013-01-01 to: 2016-05-04
Dependent variable and independent variable are: IYY.Adjusted SPY.Adjusted
The value of the test statistics is: -0.9760423
ADF p-value is 0.3052455
Looking
at the values of test statistics of the ADF test if they are greater in
absolute terms than the critical values we can reject the Null Hypothesis of
unit root and conclude that the residuals are stationary (we also see the
p-values that are smaller than 1%, 5% or 10% levels of significance). Hence,
for our 10 pairs we can see there
is cointegration between 7 pairs:
GDX and GLD
GDX and IWM
GDX and IYY
GDX and SPY
GLD and IWM
GLD and IYY
GLD and SPY
(Please note that the regression distinguishes between
dependent and independent variable in the respective pair).
A way to look at the standard Dickey
Fuller test (differ with the
Augmented-Dickey Fuller test with the number of lags -> in the standard
Dickey Fuller test the number of lags is
0) of the residuals from the regression is to run again an OLS regression of
first difference of residuals on lagged residuals. Let’s play a bit step-by-step
with this and see what we get in R:
For the example, we will use ur.df function in R to get the ADF
and standard DF test values.
startDate="2013-01-01"
endDate="2016-05-04"
GLD <- getSymbols("GLD",
from=startDate, to=endDate, auto.assign=FALSE)
GDX <- getSymbols("GDX",
from=startDate, to=endDate,auto.assign=FALSE)
y <- merge(Ad(GDX), Ad(GLD))
eqn <- as.formula(paste(colnames(y),
collapse=" ~ 0 + "))
m <- lm(eqn, data=y)
ur_df_0<-ur.df(m$residuals, type="none",
lags=0) #now we run a standard Dickey Fuller test with 0 lags and no constant
summary(ur_df_0)
# to see the test stat and p-value
###############################################
# Augmented Dickey-Fuller Test Unit
Root Test #
###############################################
Test regression none
Call:
lm(formula = z.diff ~ z.lag.1 - 1)
Residuals:
Min 1Q Median
3Q Max
-1.46514 -0.29108 -0.03333 0.25579
1.60938
Coefficients:
Estimate Std. Error t value
Pr(>|t|)
z.lag.1 -0.01385 0.00370
-3.744 0.000193 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’
1
Residual standard error: 0.4456 on
839 degrees of freedom
Multiple R-squared: 0.01643, Adjusted
R-squared: 0.01526
F-statistic: 14.02 on 1 and 839
DF, p-value: 0.0001935
Value of test-statistic is: -3.744
Critical values for test
statistics:
1pct
5pct 10pct
tau1 -2.58 -1.95 -1.62
#now the long way to do this – via an OLS
regression of the first difference of residuals against lagged residuals
(lag=1)
n<-lm(diff(m$residuals)
~ lag(m$residuals[-length(m$residuals)], k=1)
+0)
summary(n)
#to see full output
# or alternatively to see only t-value and
p-value:
summary(n)$coef[,3]
# to see t-value
-3.743962
summary(n)$coef[,4]
# to see p-value
0.0001934995
We have both approaches
producing the same result: value of the test statistics is -3.744 and p-value of 0.000193.
Now, the Augmented-Dickey
Fuller:
adf<-ur.df(m$residuals, type="none",
lags=1)
summary(adf)
# to see the test stat and p-value
###############################################
# Augmented Dickey-Fuller Test Unit
Root Test #
###############################################
Test regression none
Call:
lm(formula = z.diff ~ z.lag.1 - 1 +
z.diff.lag)
Residuals:
Min 1Q Median
3Q Max
-1.45579 -0.29098 -0.03797 0.24630
1.56707
Coefficients:
Estimate Std. Error t value
Pr(>|t|)
z.lag.1 -0.01263
0.00370 -3.413 0.000673 ***
z.diff.lag -0.09048 0.03408
-2.655 0.008089 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’
1
Residual standard error: 0.4418 on
837 degrees of freedom
Multiple R-squared: 0.02144, Adjusted
R-squared: 0.01911
F-statistic: 9.171 on 2 and 837
DF, p-value: 0.0001148
Value of test-statistic is: -3.4134
Critical values for test
statistics:
1pct
5pct 10pct
tau1 -2.58 -1.95 -1.62
#now the long way to do this –
via an OLS regression of the first difference of residuals against lagged
residuals (lag=1) and the lagged first difference of residuals. Complicated? Yep,
a bit but here it is:
z.diff ~ z.lag.1 - 1 + z.diff.lag
n<-lm((diff(m$residuals))[-1] ~ lag(m$residuals[-length(m$residuals)], k=1)[-1] + lag(diff(m$residuals)[-length(diff(m$residuals))], k=1) +0)
summary(n) #to see full output
# or
alternatively to see only t-value and p-value:
summary(n)$coef[,3] # to see t-value
lag(m$residuals[-length(m$residuals)], k
= 1)[-1]
-3.413356
lag(diff(m$residuals)[-length(diff(m$residuals))],
k = 1)
-2.654686
summary(n)$coef[,4] # to see p-value
lag(m$residuals[-length(m$residuals)],
k = 1)[-1]
0.0006725011
lag(diff(m$residuals)[-length(diff(m$residuals))],
k = 1)
0.0080889750
The value of the test
statistics is -3.413 (from ur.df) and the value of the test stat of the term
of the lagged residuals from the linear regression is the same: -3.413356. (The same as the result of the ADF test of the 10 pairs).
It should also be noted, as Ernie Chan highlighted (http://epchan.blogspot.bg/2013/11/cointegration-trading-with-log-prices.html),
the difference between price spreads and log price spreads.
In the price spread the number of
shares should be kept
fixed, and short the
spread when it is much higher than average (say 1.5 standard deviations above the historical
average or some other number), and long the spread when it is much lower. For a stationary
log price spread themarket values of stocks should be kept fixed, which means that at the end
of every bar, we need to rebalance the shares due to price changes.