Tuesday 29 September 2015

Geweke and Porter-Hudak (GPH) approach for estimating the Hurst exponent: A Python Way

Geweke and Porter-Hudak (GPH) approach is is based on observations of the slope of the spectral density function of a fractionally integrated series. It is essentially a semi-parametric procedure to estimate the fractional differencing parameter. The GPH method is used to estimate the Hurst-exponent.
The estimation procedure begins with calculating the periodogram (Fourier Transformation) and the next step is to estimate the slope of the logarithms of power spectral density versus log frequency and after that to find the H exponent.
Here is a short exxample in Python (I use a cutoff value of 0.5, but it can be changed) that follows Rafal Weron Matlab implementation:

import numpy as np
import math
from datetime import datetime
from pandas.io.data import DataReader
from numpy import fft
from numpy import sin, linspace, pi
from numpy import cumsum, log, polyfit, sqrt, std, subtract
from scipy.stats import norm

cutoff = 0.5
aapl = DataReader("AAPL", "yahoo", datetime(2012,1,1), datetime(2015,9,18)) # get Apple historical stock price data
ret=np.log( aapl['Adj Close'] / aapl['Adj Close'].shift(1) ) # calculate log daily returns
ret=ret[1:] # remove first row since it contanins NA

Y = fft.fft(ret)
N=len(Y)
power=(np.absolute(Y[range(1,N/2)])**2)/len(ret)
freq=np.divide(range(1,N/2),(N/2)*2.0)
period=1.0/freq

T=[i for i in range(len(freq)) if freq[i] <N**-cutoff]
PT = polyfit(log(4*sin(freq[T]/2)**2),log(power[T]),1)
h = 0.5 - PT[0] # this is Hurst exponent

# compute the confidence intervals at 95%, with first computing the standard deviation (sigma, scale):
sig = math.pi/(np.sqrt(6*(len(T)-1)) * std(log(4*sin(freq[T]/2)**2)))
high_conf=norm.ppf(1-(1-0.95)/2, 0.5, scale=sig)
low_conf=norm.ppf((1-0.95)/2, loc=0.5, scale=sig)


Result:
h=0.8811
sig=0.14
high_conf=0.7791
low_conf=0.2209