Wednesday 21 June 2017

Output Gap in Romania and Bulgaria (Hodrick-Prescott Filter)

In this post I use the Hodrick-Prescott filter (a very simple and widely used, but also controversial, filter) to estimate the output gap of Bulgaria and Romania over the period Q1 2000 - Q1 2017. The output gap is estimated in EViews, but I also attach a step-by-step Excel implementation for Bulgaria (https://drive.google.com/file/d/0B5OoJUuowhtZMzZLMlBBb1JhOTQ/view?usp=sharing).


I also attach a PDF file with the theoretical part: https://drive.google.com/file/d/0B5OoJUuowhtZaDBZdWRLem9ZbHM/view?usp=sharing
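
As a rough illustration of what the filter does, here is a minimal base-R sketch on simulated data (it is not the EViews or Excel workbook from this post). It assumes the standard penalised least-squares form of the filter with lambda = 1600, the usual choice for quarterly data. The function name hp_trend and the simulated series are made up for the example.

```r
# Hodrick-Prescott trend: minimise
#   sum((y - tau)^2) + lambda * sum(diff(tau, differences = 2)^2)
# Closed-form solution: tau = (I + lambda * D'D)^(-1) y,
# where D is the second-difference matrix.
hp_trend <- function(y, lambda = 1600) {
  n <- length(y)
  D <- diff(diag(n), differences = 2)   # (n - 2) x n second-difference operator
  as.numeric(solve(diag(n) + lambda * crossprod(D), y))
}

set.seed(42)
quarters <- 69                          # Q1 2000 to Q1 2017
# Simulated log GDP (illustrative only, not Bulgarian or Romanian data)
log_gdp <- 10 + 0.008 * seq_len(quarters) + cumsum(rnorm(quarters, sd = 0.01))

trend <- hp_trend(log_gdp)
output_gap <- 100 * (log_gdp - trend)   # gap in percent of trend (log approximation)
round(head(output_gap), 2)
```

With real data, log_gdp would be replaced by the log of seasonally adjusted real GDP. The well-known end-point sensitivity of the filter is one reason it is considered controversial.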

Thursday 5 January 2017

Volatility Spillover between VIX and Brent Oil Price (Multivariate GARCH in R)


Volatility transmission is a very important feature of the financial markets. Harris and Pisedtasalasai (“Return and Volatility Spillovers Between Large and Small Stocks in the UK”, 2005) outline the following points: “(1) transmission mechanisms tell us something about market efficiency. In an efficient market, and in the absence of time-varying risk premia, it should not be possible to forecast the returns of one stock using the lagged returns of another stock. The finding that there are spillover effects in returns implies the existence of an exploitable trading strategy and, if trading strategy profits exceed transaction costs, potentially represents evidence against market efficiency. (2) transmission mechanisms may be useful for portfolio management, where knowledge of return spillover effects may be useful for asset allocation or stock selection. (3) information about volatility spillover effects may be useful for applications in finance that rely on estimates of conditional volatility, such as option pricing, portfolio optimization, value at risk and hedging.”


Building on this research, we can try to answer the following questions:

Do volatility spillover effects exist between the CBOE VIX index and the Brent oil price?

Which one (VIX or oil price) is the main volatility transmitter?


The model applied to answer these questions is a multivariate GARCH model, in particular BEKK-GARCH. The BEKK-GARCH model (named after Baba, Engle, Kraft and Kroner) is a multivariate extension of the GARCH model that captures volatility transmission among different financial assets, as well as the persistence of volatility within each of the assets analysed.

A little bit of methodology to explain the idea:
 

The vector autoregressive stochastic process of the returns can be presented in the following form (Karunanayake et al., 2009):

Then the conditional variance-covariance matrix H is of dimension n x n (where n is the number of assets analysed), with the diagonal elements representing the variances and the off-diagonal elements the covariances:
 
 
where vech(H) is an operator that stacks the columns of the lower triangular part of its square matrix argument, and H is the covariance matrix of the residuals.
C is the upper triangular matrix of constants.
A and B in the equation are both symmetric matrices. The off-diagonal elements of matrix A (i.e. ARCH effects) measure the effect of innovations (shocks) in market i on market j, while the diagonal elements measure the own-innovation effect (shocks) of market i. The off-diagonal elements of matrix B (i.e. GARCH effects) measure the persistence of conditional volatility spillover between markets (cross-volatility spillover), while the diagonal elements measure own-volatility persistence.
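
To make the roles of C, A and B concrete, here is a small base-R sketch of one step of a bivariate BEKK(1,1) recursion. It assumes the common textbook parametrisation H_t = C'C + A' e e' A + B' H B; the BEKK11 function in MTS may transpose or order the matrices differently. The function name bekk_step and all numbers are hypothetical, chosen only for illustration.

```r
# One step of a bivariate BEKK(1,1) recursion (assumed parametrisation):
#   H_t = C'C + A' e_{t-1} e_{t-1}' A + B' H_{t-1} B
bekk_step <- function(C, A, B, e_prev, H_prev) {
  crossprod(C) +
    t(A) %*% tcrossprod(e_prev) %*% A +
    t(B) %*% H_prev %*% B
}

# Illustrative values only (not estimates from this post)
C <- matrix(c(0.07, 0.01,
              0.00, 0.02), nrow = 2, byrow = TRUE)   # upper triangular constants
A <- matrix(c(0.30, 0.05,
              0.05, 0.25), nrow = 2, byrow = TRUE)   # ARCH (shock) effects
B <- matrix(c(0.90, 0.02,
              0.02, 0.92), nrow = 2, byrow = TRUE)   # GARCH (persistence) effects
e_prev <- c(0.05, -0.01)            # previous-day shocks in markets 1 and 2
H_prev <- diag(c(0.004, 0.0005))    # previous-day conditional covariance matrix

H_t <- bekk_step(C, A, B, e_prev, H_prev)
H_t                                 # today's conditional variances and covariance
```

Because A and B enter through quadratic forms, each element of H_t depends on squares and cross-products of the coefficients. This is also why H_t stays symmetric and positive definite by construction.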
And the implementation in R:
 
library(Quandl)
library(PerformanceAnalytics)
library(MTS)
vix<-Quandl("YAHOO/INDEX_VIX", collapse="daily", start_date="2006-01-01", type="zoo")
vix<-vix[, "Adjusted Close"] #take only the adjusted close values
vix.ret<-CalculateReturns(vix, method="log")
vix.ret<-vix.ret[-1,] #removes the first row since it is NA
 
brent<-Quandl("EIA/PET_RBRTE_D", start_date="2006-01-01", type="zoo")
brent.ret<-CalculateReturns(brent, method="log")
brent.ret<-brent.ret[-1,] #removes the first row since it is NA
 
#merge VIX and BRENT returns and exclude NA cases:
data<-cbind(vix.ret, brent.ret)
data<-na.omit(data)
 
# Charting the 20-day rolling correlation and covariance:
 
chart.RollingCorrelation(vix.ret, brent.ret, width=20)
cor.fun = function(x){
  cor(x)[1,2]
}
cov.fun = function(x){
  cov(x)[1,2]
}
roll.cov = rollapply(as.zoo(data), FUN=cov.fun, width=20,
                     by.column=FALSE, align="right")
roll.cor = rollapply(as.zoo(data), FUN=cor.fun, width=20,
                     by.column=FALSE, align="right")

par(mfrow=c(2,1))
plot(roll.cov, main="20-day rolling covariances",
     ylab="covariance", lwd=2, col="blue")
grid()
abline(h=cov(data)[1,2], lwd=2, col="red")
plot(roll.cor, main="20-day rolling correlations",
     ylab="correlation", lwd=2, col="blue")
grid()
abline(h=cor(data)[1,2], lwd=2, col="red")
par(mfrow=c(1,1))
 
#Now fit a BEKK(1,1) model with BEKK11 from the MTS package:
m1<-BEKK11(data) #takes some time to calculate it
 

And we get the following result:
Coefficient(s):
                  Estimate   Std. Error   t value   Pr(>|t|)   
mu1.vix.ret   -6.65408e-05  3.06049e-03  -0.02174 0.98265383   
mu2.brent.ret -2.13211e-05  1.61574e-03  -0.01320 0.98947151   
A011           7.33477e-02  1.29265e-03  56.74214 < 2.22e-16 ***
A021          -3.92517e-03  1.06010e-03  -3.70265 0.00021336 ***
A022           2.16678e-02  4.40651e-04  49.17241 < 2.22e-16 ***
A11            1.00000e-01  3.31692e-02   3.01485 0.00257110 **
A21            2.00000e-02  1.54254e-02   1.29656 0.19478119   
A12            2.00000e-02  7.67921e-02   0.26044 0.79452162   
A22            1.00000e-01  4.28668e-02   2.33281 0.01965831 * 
B11            8.00000e-01  1.05377e-02  75.91781 < 2.22e-16 ***
B21            1.00000e-01  4.06332e-03  24.61044 < 2.22e-16 ***
B12            1.00000e-01  1.53339e-02   6.52152 6.9601e-11 ***
B22            8.00000e-01  6.95043e-03 115.10074 < 2.22e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

It is important here to distinguish the diagonal and off-diagonal elements of the matrices A and B. The diagonal elements (A11, A22 and B11, B22) represent, respectively, the influence of own past shocks and of own past volatility. The off-diagonal elements (A12, A21 and B12, B21) represent the cross-market effects of shocks and the volatility spillover between the markets. All B coefficients are statistically significant, indicating both own-volatility persistence and volatility spillover effects between VIX and the Brent oil price. The off-diagonal A elements are not statistically significant, indicating that no significant cross-market effect of shocks exists, while the diagonal elements of A are statistically significant.
Both off-diagonal elements of B are estimated at 0.1, which points to volatility transmission in both directions, from VIX to the Brent oil price and from Brent to VIX. Because the BEKK coefficients enter the variance equation through quadratic forms, they should not be read as elasticities (a 1% increase in VIX does not simply transmit 10% volatility to Brent). B21 has a lower p-value than B12, so the statistical evidence for transmission from VIX to Brent is stronger, although a p-value measures the strength of the evidence rather than the size of the effect. Note also that the A and B estimates are exactly round numbers (0.1, 0.02, 0.8 and 0.1), which may mean that the optimiser stayed at its starting values, so these results should be read with caution.


 

Friday 5 August 2016

Cointegration: ADF Test Critical Values


The second step in the Engle-Granger cointegration approach is to test whether the residuals from the regression have a unit root, via the ADF test. When we apply the ADF test to residuals (estimates) instead of an actual time series, we cannot use the reported Dickey-Fuller critical values and p-values. The critical values for the ADF test on residuals are stricter than the original ones (they are more negative, so it is less likely that we reject the null hypothesis of a unit root). There are several sets of critical values, such as Engle and Yoo (1987), Phillips and Ouliaris (1990) and MacKinnon (1991).

Phillips and Ouliaris critical values are available here: http://finpko.faculty.ku.edu/myssi/FIN938/Phillips%20%26%20Ouliaris_Asymp%20Props%20of%20Resid%20Based%20Tests%20for%20Coint_Econometrica_1990.pdf – Table IIa (no intercept and no trend), IIb (intercept but no trend) and IIc  (both intercept and trend).
We can express formally the three equations as follows:
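
A sketch of the three cases for a single explanatory variable is given here. The symbols y_t, x_t, alpha, beta, delta and u_t are illustrative notation, not copied from the paper. The unit root test is applied to the residuals of whichever regression is chosen.

```latex
\begin{aligned}
\text{(a) no intercept, no trend:} \quad & y_t = \beta x_t + u_t \\
\text{(b) intercept, no trend:}    \quad & y_t = \alpha + \beta x_t + u_t \\
\text{(c) intercept and trend:}    \quad & y_t = \alpha + \delta t + \beta x_t + u_t
\end{aligned}
```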


From the Phillips and Ouliaris tables, with one explanatory variable and a constant only, the critical value at the 5% level of significance is -3.37 (regression b case); with both a constant and a trend, the critical value is -3.80.

This is for the Phillips and Ouliaris critical values. MacKinnon (http://qed.econ.queensu.ca/working_papers/papers/qed_wp_1227.pdf) approaches the problem differently: he estimates response surface regressions, and the function used to calculate the critical values is:

C(p) = β∞ + β1/T + β2/T^2, where p is the level of significance and T is the number of observations.

Here is a table with the MacKinnon critical values corresponding to the second regression case, intercept and no trend (when comparing critical values, note that in the MacKinnon approach N is the number of cointegrating variables, while in Phillips and Ouliaris N is the number of explanatory variables):


Based on the same procedure I interpolated the critical values for the third case – intercept and trend:


From the table of MacKinnon critical values, at the 5% level of significance, with two variables, 200 observations and only a constant included in the ADF regression, the critical value is -3.368; with 500 observations, the critical value is -3.350.
With both a constant and a trend, at the 5% level of significance and 200 observations, the MacKinnon critical value is -3.828.
It is generally accepted that if an intercept is included in the cointegrating regression, it is omitted from the ADF equation.
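
The MacKinnon values quoted above can be reproduced with a few lines of R. This is a sketch assuming the response surface has the form b_inf + b1/T + b2/T^2. The 5% coefficients for N = 2 are given as I recall them from MacKinnon's Table 1, so please verify them against the paper. The function name mackinnon_cv is made up for the example.

```r
# Response-surface critical value: CV(T) = b_inf + b1 / T + b2 / T^2
mackinnon_cv <- function(b, T_obs) {
  unname(b[1] + b[2] / T_obs + b[3] / T_obs^2)
}

# 5% level, N = 2 variables (coefficients recalled from MacKinnon's Table 1:
# an assumption of this sketch, to be checked against the paper)
b_const <- c(b_inf = -3.3377, b1 = -5.967, b2 = -8.98)    # constant, no trend
b_trend <- c(b_inf = -3.7809, b1 = -9.421, b2 = -15.06)   # constant and trend

cv_const_200 <- mackinnon_cv(b_const, 200)
cv_const_500 <- mackinnon_cv(b_const, 500)
cv_trend_200 <- mackinnon_cv(b_trend, 200)

round(c(const_T200 = cv_const_200,
        const_T500 = cv_const_500,
        trend_T200 = cv_trend_200), 3)
```

Rounded to three decimals, these give -3.368, -3.350 and -3.828, matching the values in the text. As T grows, the critical value approaches the asymptotic value b_inf.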




Tuesday 17 May 2016

Cointegration in R

Cointegration is one of the most appealing and most controversial analyses. It is appealing because it is the basis of the pairs trading strategy. It is controversial because the relationship can break down, sometimes for an extended period, and because different techniques occasionally produce conflicting results.
In a nutshell, the approach is:
(1)    run an OLS linear regression (the coefficient beta of the regression is the hedge ratio)
(2)    test the residuals for the presence of a unit root (via the Augmented Dickey-Fuller unit root test). The residuals of the regression represent the spread, and the spread is in fact: Dependent variable – Hedge ratio * Independent variable. If we reject the null hypothesis of a unit root, we conclude that the spread is stationary and mean-reverting. This also means that the two variables are cointegrated.
An alternative technique to the ADF test of the residuals is the Johansen cointegration test (trace and maximum eigenvalue tests) applied to the analysed variables. However, in most of the practical guides I have read, the ADF approach is preferred, I think because it is more intuitive and easier to follow than the Johansen test.


We look at five ETFs: SPY (SPDR S&P 500), IYY (iShares Dow Jones), IWM (iShares Russell 2000), GLD (SPDR Gold Shares) and GDX (VanEck Vectors Gold Miners). The last two (GLD and GDX) are often viewed as a textbook pair.
The code and results are shown below.
library(quantmod) 
library(fUnitRoots) #for the ADF test

tickers <- c('GDX','GLD', 'IWM' , 'IYY','SPY')

startDate="2013-01-01"
endDate="2016-05-04"

f <- function(x) {
        x1 <- getSymbols(x[1], from=startDate, to=endDate, auto.assign=FALSE)
        x2 <- getSymbols(x[2], from=startDate, to=endDate,auto.assign=FALSE)
        y <- merge(Ad(x1),Ad(x2))
        eqn <- as.formula(paste(colnames(y), collapse=" ~ 0 + "))
        m <- lm(eqn, data=y)
        adf<-adfTest(m$residuals, type="nc")
        cat("Period is from:", startDate, "to:", endDate, "\n", "Dependent variable and independent variable are:", colnames(y), "\n", "The value of the test statistics is:", adf@test$statistic, "\n")
        cat("ADF p-value is", adf@test$p.value, "\n")
}

p <- combn(tickers, 2, FUN=f, simplify=FALSE)

Period is from: 2013-01-01 to: 2016-05-04
 Dependent variable and independent variable are: GDX.Adjusted GLD.Adjusted
 The value of the test statistics is: -3.413356
ADF p-value is 0.01

Period is from: 2013-01-01 to: 2016-05-04
 Dependent variable and independent variable are: GDX.Adjusted IWM.Adjusted
 The value of the test statistics is: -3.717138
ADF p-value is 0.01

Period is from: 2013-01-01 to: 2016-05-04
 Dependent variable and independent variable are: GDX.Adjusted IYY.Adjusted
 The value of the test statistics is: -3.647598
ADF p-value is 0.01

Period is from: 2013-01-01 to: 2016-05-04
 Dependent variable and independent variable are: GDX.Adjusted SPY.Adjusted
 The value of the test statistics is: -3.629596
ADF p-value is 0.01

Period is from: 2013-01-01 to: 2016-05-04
 Dependent variable and independent variable are: GLD.Adjusted IWM.Adjusted
 The value of the test statistics is: -3.073779
ADF p-value is 0.01

Period is from: 2013-01-01 to: 2016-05-04
 Dependent variable and independent variable are: GLD.Adjusted IYY.Adjusted
 The value of the test statistics is: -2.953128
ADF p-value is 0.01

Period is from: 2013-01-01 to: 2016-05-04
 Dependent variable and independent variable are: GLD.Adjusted SPY.Adjusted
 The value of the test statistics is: -2.91571
ADF p-value is 0.01

Period is from: 2013-01-01 to: 2016-05-04
 Dependent variable and independent variable are: IWM.Adjusted IYY.Adjusted
 The value of the test statistics is: -0.9656891
ADF p-value is 0.3085453

Period is from: 2013-01-01 to: 2016-05-04
 Dependent variable and independent variable are: IWM.Adjusted SPY.Adjusted
 The value of the test statistics is: -0.8500551
ADF p-value is 0.3454008

Period is from: 2013-01-01 to: 2016-05-04
 Dependent variable and independent variable are: IYY.Adjusted SPY.Adjusted
 The value of the test statistics is: -0.9760423
ADF p-value is 0.3052455

Looking at the ADF test statistics: if they are greater in absolute terms than the critical values, we can reject the null hypothesis of a unit root and conclude that the residuals are stationary (the p-values are then also smaller than the 1%, 5% or 10% levels of significance). Hence, of our 10 pairs, 7 appear to be cointegrated:

GDX and GLD
GDX and IWM
GDX and IYY
GDX and SPY
GLD and IWM
GLD and IYY
GLD and SPY
(Please note that the regression distinguishes between the dependent and the independent variable in each pair.)

One way to look at the standard Dickey-Fuller test of the residuals from the regression is to run another OLS regression, of the first difference of the residuals on the lagged residuals. (The standard test differs from the Augmented Dickey-Fuller test only in the number of lags: in the standard Dickey-Fuller test the number of lags is 0.) Let’s go through this step by step and see what we get in R:

For this example, we will use the ur.df function (from the urca package) to get the ADF and standard DF test values.

library(urca) #provides ur.df
startDate="2013-01-01"
endDate="2016-05-04"

GLD <- getSymbols("GLD", from=startDate, to=endDate, auto.assign=FALSE)
GDX <- getSymbols("GDX", from=startDate, to=endDate,auto.assign=FALSE)
y <- merge(Ad(GDX), Ad(GLD))
eqn <- as.formula(paste(colnames(y), collapse=" ~ 0 + "))
m <- lm(eqn, data=y)
ur_df_0<-ur.df(m$residuals, type="none", lags=0) #now we run a standard Dickey Fuller test with 0 lags and no constant
summary(ur_df_0) # to see the test stat and p-value
###############################################
# Augmented Dickey-Fuller Test Unit Root Test #
###############################################

Test regression none


Call:
lm(formula = z.diff ~ z.lag.1 - 1)

Residuals:
     Min       1Q   Median       3Q      Max
-1.46514 -0.29108 -0.03333  0.25579  1.60938

Coefficients:
        Estimate Std. Error t value Pr(>|t|)   
z.lag.1 -0.01385    0.00370  -3.744 0.000193 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.4456 on 839 degrees of freedom
Multiple R-squared:  0.01643,  Adjusted R-squared:  0.01526
F-statistic: 14.02 on 1 and 839 DF,  p-value: 0.0001935


Value of test-statistic is: -3.744

Critical values for test statistics:
      1pct  5pct 10pct

tau1 -2.58 -1.95 -1.62


#now the long way to do this – via an OLS regression of the first difference of residuals against lagged residuals (lag=1)
n<-lm(diff(m$residuals) ~ lag(m$residuals[-length(m$residuals)], k=1) +0)
summary(n) #to see full output
# or alternatively to see only t-value and p-value:
summary(n)$coef[,3] # to see t-value
-3.743962

summary(n)$coef[,4] # to see p-value

0.0001934995


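Both approaches produce the same result: a test statistic of -3.744 and a p-value of 0.000193. (This p-value comes from the standard t-distribution of the OLS output; the Dickey-Fuller test itself compares the statistic with its own critical values.)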

Now, the Augmented-Dickey Fuller:
adf<-ur.df(m$residuals, type="none", lags=1)
summary(adf) # to see the test stat and p-value
###############################################
# Augmented Dickey-Fuller Test Unit Root Test #
###############################################

Test regression none


Call:
lm(formula = z.diff ~ z.lag.1 - 1 + z.diff.lag)

Residuals:
     Min       1Q   Median       3Q      Max
-1.45579 -0.29098 -0.03797  0.24630  1.56707

Coefficients:
           Estimate Std. Error t value Pr(>|t|)   
z.lag.1    -0.01263    0.00370  -3.413 0.000673 ***
z.diff.lag -0.09048    0.03408  -2.655 0.008089 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.4418 on 837 degrees of freedom
Multiple R-squared:  0.02144,  Adjusted R-squared:  0.01911
F-statistic: 9.171 on 2 and 837 DF,  p-value: 0.0001148


Value of test-statistic is: -3.4134

Critical values for test statistics:
      1pct  5pct 10pct

tau1 -2.58 -1.95 -1.62

#now the long way to do this: an OLS regression of the first difference of the residuals on the lagged residuals (lag=1) and the lagged first difference of the residuals. Complicated? Yep, a bit, but here it is:
# z.diff ~ z.lag.1 - 1 + z.diff.lag

n<-lm((diff(m$residuals))[-1] ~ lag(m$residuals[-length(m$residuals)], k=1)[-1] + lag(diff(m$residuals)[-length(diff(m$residuals))], k=1) +0)

summary(n) #to see full output

# or alternatively to see only t-value and p-value:
summary(n)$coef[,3] # to see t-value
       lag(m$residuals[-length(m$residuals)], k = 1)[-1]
                                                -3.413356
lag(diff(m$residuals)[-length(diff(m$residuals))], k = 1)
                                                -2.654686


summary(n)$coef[,4] # to see p-value
        lag(m$residuals[-length(m$residuals)], k = 1)[-1]
                                             0.0006725011
lag(diff(m$residuals)[-length(diff(m$residuals))], k = 1)
                                             0.0080889750
The value of the test statistic is -3.413 (from ur.df), and the t-value of the lagged-residuals term in the linear regression is the same: -3.413356 (also the same as the ADF result for GDX and GLD among the 10 pairs above).
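
The same regressions can be written a little more compactly with embed(), which builds the lagged columns in one call. The sketch below uses only base R and a simulated stationary AR(1) series instead of the ETF residuals. The helper name adf_stat is made up for the example, and it covers only the no-constant, no-trend case.

```r
# ADF t-statistic (no constant, no trend) via a plain OLS regression:
#   diff(x)_t = gamma * x_{t-1} + sum_j phi_j * diff(x)_{t-j} + error
adf_stat <- function(x, lags = 0) {
  dx <- diff(x)
  n <- length(x)
  E <- embed(dx, lags + 1)            # column 1: current diff, columns 2+: lagged diffs
  z.diff <- E[, 1]
  z.lag.1 <- x[(lags + 1):(n - 1)]    # level lagged one period, aligned with z.diff
  if (lags > 0) {
    z.diff.lag <- E[, -1, drop = FALSE]
    fit <- lm(z.diff ~ 0 + z.lag.1 + z.diff.lag)
  } else {
    fit <- lm(z.diff ~ 0 + z.lag.1)
  }
  unname(summary(fit)$coefficients["z.lag.1", "t value"])
}

set.seed(123)
x <- as.numeric(arima.sim(list(ar = 0.5), n = 500))  # simulated stationary series

stat0 <- adf_stat(x, lags = 0)   # standard Dickey-Fuller
stat1 <- adf_stat(x, lags = 1)   # augmented with one lagged difference
c(DF = stat0, ADF1 = stat1)
```

The t-value on z.lag.1 is the test statistic. It still has to be compared with Dickey-Fuller type critical values, or with the stricter residual-based ones when x is a regression residual.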

It should also be noted, as Ernie Chan highlighted (http://epchan.blogspot.bg/2013/11/cointegration-trading-with-log-prices.html), that price spreads and log price spreads differ. For a price spread, the number of shares is kept fixed: we short the spread when it is much higher than average (say 1.5 standard deviations above the historical average, or some other threshold) and go long the spread when it is much lower. For a stationary log price spread, the market values of the stocks should be kept fixed, which means that at the end of every bar we need to rebalance the shares due to price changes.
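
A minimal sketch of the price-spread rule on simulated data is shown here. The two price series, the 1.5 standard deviation threshold and the variable names are all illustrative. The mean and standard deviation are computed over the whole sample, which a real backtest would need to replace with rolling or out-of-sample estimates.

```r
set.seed(1)
n <- 500
x <- 50 + cumsum(rnorm(n))        # simulated price of the independent leg
y <- 2 * x + rnorm(n, sd = 2)     # dependent leg, cointegrated by construction

# Step 1: hedge ratio from an OLS regression without intercept
beta <- coef(lm(y ~ 0 + x))[["x"]]

# Step 2: price spread and its z-score (fixed number of shares: 1 of y, beta of x)
spread <- y - beta * x
z <- (spread - mean(spread)) / sd(spread)

# Step 3: short the spread when it is far above average, long when far below
threshold <- 1.5
position <- ifelse(z > threshold, -1, ifelse(z < -threshold, 1, 0))
table(position)
```

For a log price spread, the regression would be run on log(y) and log(x), and the positions would be held in fixed market values rather than fixed share counts.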