QuantX Research: 2016

Friday, 5 August 2016

Cointegration: ADF Test Critical Values

The second step in the Engle-Granger cointegration approach is to test whether the residuals from the regression have a unit root, using the ADF test. When we apply the ADF test to residuals (which are estimates) instead of to an observed time series, we cannot use the Dickey and Fuller critical values or the p-values that are reported. The critical values for the residual-based test are stricter than the original ones: they are more negative, so the null hypothesis of a unit root is less likely to be rejected. There are several sets of such critical values, for example Engle and Yoo (1987), Phillips and Ouliaris (1990) and MacKinnon (1991).
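The shift towards more negative critical values can be checked with a small Monte Carlo experiment. The sketch below is illustrative only: the helper `df_stat`, the sample size and the number of replications are assumptions, and it uses the simplest Dickey-Fuller regression (no constant, no lagged differences) rather than a full ADF test.

```r
set.seed(1)

# Dickey-Fuller t-statistic without constant or lags:
# regress diff(e) on the lagged level of e and return the t-ratio of the slope.
df_stat <- function(e) {
  de  <- diff(e)
  lag <- e[-length(e)]
  rho <- sum(lag * de) / sum(lag^2)
  res <- de - rho * lag
  se  <- sqrt(sum(res^2) / (length(de) - 1) / sum(lag^2))
  rho / se
}

n_obs  <- 200    # assumed sample size
n_sims <- 2000   # assumed number of replications

raw_stats   <- numeric(n_sims)
resid_stats <- numeric(n_sims)
for (i in seq_len(n_sims)) {
  y <- cumsum(rnorm(n_obs))  # random walk
  x <- cumsum(rnorm(n_obs))  # independent random walk, so no cointegration
  raw_stats[i]   <- df_stat(y)                       # test on an observed series
  resid_stats[i] <- df_stat(residuals(lm(y ~ x)))    # test on OLS residuals
}

# Simulated 5% critical values under the null of a unit root
q_raw   <- quantile(raw_stats, 0.05)
q_resid <- quantile(resid_stats, 0.05)
cat("5% quantile, raw series:   ", round(q_raw, 2), "\n")
cat("5% quantile, OLS residuals:", round(q_resid, 2), "\n")
```

With these settings the quantile for the raw series should land near the familiar Dickey-Fuller value of about -1.95, while the quantile for the residuals should be markedly lower, in the neighbourhood of the residual-based values discussed in this post.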

Phillips and Ouliaris critical values are available here: http://finpko.faculty.ku.edu/myssi/FIN938/Phillips%20%26%20Ouliaris_Asymp%20Props%20of%20Resid%20Based%20Tests%20for%20Coint_Econometrica_1990.pdf (Table IIa: no intercept and no trend; Table IIb: intercept but no trend; Table IIc: both intercept and trend).

We can express formally the three equations as follows:
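In the usual notation, the three cointegrating regressions behind Tables IIa, IIb and IIc can be sketched as below. The symbols are assumptions: $y_t$ is the dependent variable, $x_t$ the explanatory variable(s), $t$ a linear time trend and $u_t$ the residual that is then tested for a unit root.

```latex
\begin{align}
\text{(a)}\quad y_t &= \beta' x_t + u_t \\
\text{(b)}\quad y_t &= \alpha + \beta' x_t + u_t \\
\text{(c)}\quad y_t &= \alpha + \delta t + \beta' x_t + u_t
\end{align}
```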

From the table above, with one explanatory variable and a constant only, the critical value at the 5% level of significance is -3.37 (regression b); with both a constant and a trend, the critical value is -3.80.

These are the Phillips and Ouliaris critical values. MacKinnon (http://qed.econ.queensu.ca/working_papers/papers/qed_wp_1227.pdf) approaches the situation differently: he estimates response surface regressions, and the function used to calculate the critical values is:
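A response surface of the following form reproduces the MacKinnon values quoted in this post. The notation is an assumption: $p$ is the significance level, $T$ the number of observations, and $\beta_\infty$, $\beta_1$, $\beta_2$ the tabulated coefficients for a given $p$, number of variables and deterministic specification.

```latex
\[
C(p, T) = \beta_\infty + \frac{\beta_1}{T} + \frac{\beta_2}{T^{2}}
\]
```

As $T$ grows, the last two terms vanish and the critical value converges to the asymptotic value $\beta_\infty$.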

Here is a table with MacKinnon critical values for the second regression case, with intercept and no trend. When comparing the critical values, note that in MacKinnon's approach N is the number of variables in the cointegrating relationship, counting the dependent variable, while in Phillips and Ouliaris N is the number of explanatory variables:

Based on the same procedure I interpolated the critical values for the third case – intercept and trend:

From the table of MacKinnon critical values: at the 5% level of significance, with two variables, 200 observations and only a constant included in the ADF regression, the critical value is -3.368; with 500 observations, the critical value is -3.350.

If we have both constant and trend, at 5% level of significance and 200 observations, the MacKinnon critical value is -3.828.
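The MacKinnon figures quoted above can be reproduced from a response surface of the form beta_inf + beta_1/T + beta_2/T^2. The snippet below is a minimal sketch: the function name `mackinnon_cv` is hypothetical, and the coefficients are assumed values for N = 2 at the 5% level that should be checked against MacKinnon's table before being relied on.

```r
# Response-surface coefficients at the 5% level for N = 2 variables.
# ASSUMPTION: illustrative values that reproduce the critical values quoted
# in this post; verify them against MacKinnon's table.
mackinnon_5pct_n2 <- list(
  constant       = c(b_inf = -3.3377, b1 = -5.967, b2 = -8.98),
  constant_trend = c(b_inf = -3.7809, b1 = -9.421, b2 = -15.06)
)

# Critical value for a sample of n_obs observations:
# beta_inf + beta_1 / T + beta_2 / T^2
mackinnon_cv <- function(coefs, n_obs) {
  unname(coefs["b_inf"] + coefs["b1"] / n_obs + coefs["b2"] / n_obs^2)
}

cv_c_200  <- mackinnon_cv(mackinnon_5pct_n2$constant, 200)
cv_c_500  <- mackinnon_cv(mackinnon_5pct_n2$constant, 500)
cv_ct_200 <- mackinnon_cv(mackinnon_5pct_n2$constant_trend, 200)

cat(sprintf("constant only,      T = 200: %.3f\n", cv_c_200))   # -3.368
cat(sprintf("constant only,      T = 500: %.3f\n", cv_c_500))   # -3.350
cat(sprintf("constant and trend, T = 200: %.3f\n", cv_ct_200))  # -3.828
```

The dependence on the sample size is mild: moving from 200 to 500 observations changes the constant-only critical value only in the second decimal place.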

It is generally accepted that if an intercept is included in the cointegrating regression, it is omitted from the ADF equation.

Tuesday, 17 May 2016

Cointegration in R

Cointegration is one of the most appealing and most controversial analyses. It is appealing because it is the basis of the pairs trading strategy. It is controversial because a cointegrating relationship can break down, sometimes for an extended period, and because different techniques sometimes produce conflicting results.

In a nutshell, the approach is:

(1) run an OLS linear regression (the coefficient beta of the regression is the hedge ratio)

(2) test the residuals for the presence of a unit root (via the Augmented Dickey-Fuller unit root test). The residuals of the regression represent the spread, which is: Dependent variable - Hedge ratio * Independent variable. If we reject the null hypothesis of a unit root, we conclude that the spread is stationary, that is, mean-reverting. This also means that the two variables are cointegrated.
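The two steps can be sketched on simulated data using base R only. Everything in the block is an assumption made for illustration: the simulated series, the true hedge ratio of 2, the use of an intercept in step (1), and the plain Dickey-Fuller regression (no lagged differences) standing in for a full ADF test. The statistic is compared with -3.37, the Phillips and Ouliaris 5% value for one explanatory variable with an intercept.

```r
set.seed(42)

# Simulated cointegrated pair: x is a random walk, y = 2 * x + stationary noise
n_obs <- 500
x <- cumsum(rnorm(n_obs))
y <- 2 * x + rnorm(n_obs)

# Step 1: OLS regression; the slope is the hedge ratio
fit <- lm(y ~ x)
hedge_ratio <- unname(coef(fit)["x"])

# The residuals are the spread (net of the fitted intercept)
spread <- residuals(fit)

# Step 2: Dickey-Fuller regression on the spread, without a constant:
# diff(spread) regressed on the lagged spread; the t-ratio is the test statistic
d_spread   <- diff(spread)
lag_spread <- spread[-length(spread)]
df_fit  <- lm(d_spread ~ 0 + lag_spread)
df_stat <- summary(df_fit)$coefficients["lag_spread", "t value"]

# Residual-based 5% critical value (one explanatory variable, intercept)
crit_5pct <- -3.37

cat("Estimated hedge ratio:", round(hedge_ratio, 3), "\n")
cat("DF statistic on the spread:", round(df_stat, 2), "\n")
cat("Reject unit root at 5%:", df_stat < crit_5pct, "\n")
```

Because the simulated spread is stationary by construction, the statistic should fall far below the critical value and the unit root null should be rejected.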

An alternative technique to the ADF test of the residuals is the Johansen cointegration test (trace and maximum eigenvalue tests) applied to the analysed variables. However, in most of the practical guides I have read, the ADF approach is preferred, I think because it is more intuitive and easier to follow than the Johansen test.

The example looks at five ETFs: SPY (SPDR S&P 500), IYY (iShares Dow Jones), IWM (iShares Russell 2000), GLD (SPDR Gold Shares) and GDX (VanEck Vectors Gold Miners). The last two, GLD and GDX, are often viewed as a textbook pair.

Code and results are highlighted.

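library(quantmod)    # getSymbols(), Ad()
library(fUnitRoots)  # adfTest()

tickers <- c('GDX', 'GLD', 'IWM', 'IYY', 'SPY')
startDate <- "2013-01-01"
endDate <- "2016-05-04"

# Engle-Granger test for one pair; x is a character vector of two tickers
f <- function(x) {
  x1 <- getSymbols(x[1], from=startDate, to=endDate, auto.assign=FALSE)
  x2 <- getSymbols(x[2], from=startDate, to=endDate, auto.assign=FALSE)
  # adjusted closing prices of both tickers, aligned by date
  y <- merge(Ad(x1), Ad(x2))
  # step 1: regress the first series on the second without an intercept;
  # the slope is the hedge ratio
  eqn <- as.formula(paste(colnames(y), collapse=" ~ 0 + "))
  m <- lm(eqn, data=y)
  # step 2: ADF test on the residuals (the spread), no constant and no trend
  adf <- adfTest(m$residuals, type="nc")
  cat("Period is from:", startDate, "to:", endDate, "\n", "Dependent variable and independent variable are:", colnames(y), "\n", "The value of the test statistics is:", adf@test$statistic, "\n")
  # note: this p-value is based on the standard Dickey-Fuller distribution,
  # not on residual-based critical values
  cat("ADF p-value is", adf@test$p.value, "\n")
}

# run the test on every pair of tickers
p <- combn(tickers, 2, FUN=f, simplify=FALSE)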