Monday 18 May 2015

Marginal and Component Value-at-Risk: A Python Example

Value-at-risk (VaR), despite its drawbacks, is a solid basis for understanding the risk characteristics of a portfolio. There are many approaches to calculating VaR (historical simulation, variance-covariance, Monte Carlo simulation). Marginal VaR is defined as the additional risk that a new position adds to the portfolio. The practical reading of marginal VaR, as very nicely stated on Quant at Risk http://www.quantatrisk.com/2015/01/18/applied-portfolio-value-at-risk-decomposition-1-marginal-and-component-var/, is: the higher the marginal VaR, the more the exposure to the respective asset should be reduced to lower the portfolio VaR. Component VaR shows the reduction in portfolio value-at-risk resulting from the removal of a position. The sum of the component VaRs of the shares in the portfolio equals the diversified portfolio VaR, while the sum of the individual VaRs gives the undiversified portfolio VaR, which assumes perfect correlation between the assets in the portfolio.

It is also worth noting that, with some simple algebra, the equations can be rearranged, i.e. there are several equivalent ways to calculate the component VaR, as the sketch below shows.
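For example, with some hypothetical inputs (the variable names here are illustrative, not the ones used in the code further below), component VaR can be computed either via the asset betas or directly via the covariances with the portfolio:

import numpy as np
from scipy.stats import norm

w=np.array([0.4,0.35,0.25]) # hypothetical weights
S=np.array([[0.090,0.030,0.020],[0.030,0.060,0.025],[0.020,0.025,0.080]]) # hypothetical annualized covariance matrix
V=1e6 # portfolio value
z=norm.ppf(0.99) # standard normal quantile at the 99% confidence level

sigma_p=np.sqrt(np.dot(w,np.dot(S,w))) # portfolio volatility
cov_ip=np.dot(S,w) # covariance of each asset with the portfolio
beta=cov_ip/sigma_p**2 # asset betas to the portfolio

cVaR_beta=w*beta*z*sigma_p*V # component VaR via beta
cVaR_cov=w*cov_ip/sigma_p*z*V # the same thing, via the covariance directly
assert np.allclose(cVaR_beta,cVaR_cov) # the two formulations agree
assert np.isclose(np.sum(cVaR_beta),z*sigma_p*V) # components sum to the portfolio VaR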

We take the following steps in Python to come up with the marginal VaR of a 3-asset portfolio:
(This works for Python 2.7; for Python 3.4, change the print statements in the code. Additionally, some formatting could be applied to a few of the lines - for instance, returns are not presented as percentages.)

import numpy as np
import pandas as pd
import pandas.io.data as web
from scipy.stats import norm

Value=1e6 # $1,000,000
CI=0.99 # set the confidence interval

tickers =['AAPL', 'MSFT', 'YHOO']
numbers=len(tickers)

data=pd.DataFrame()
for share in tickers:
    data[share]=web.DataReader(share, data_source='yahoo', start='2011-01-01', end='2015-05-15')['Adj Close']
data.columns=tickers

ret=data/data.shift(1)-1 # calculate the simple returns
ret.mean()*252 #annualize the returns
covariances=ret.cov()*252 #gives the annualized covariance of returns
variances=np.diag(covariances) #extracts variances of the individual shares from covariance matrix
volatility=np.sqrt(variances) #gives standard deviation


weights=np.random.random(numbers)
weights/=np.sum(weights) # simulate a random percentage of exposure for each share, summing up to 1; to plug in our own weights use: weights=np.array([xx,xx,xx])

Pf_ret=np.sum(ret.mean()*weights)*252 #Portfolio return

Pf_variance=np.dot(weights.T,np.dot(ret.cov()*252,weights)) #Portfolio variance
Pf_volatility=np.sqrt(Pf_variance) #Portfolio standard deviation

USDvariance=np.square(Value)*Pf_variance
USDvolatility=np.sqrt(USDvariance)

covariance_asset_portfolio=np.dot(weights.T,covariances)
covUSD=np.multiply(covariance_asset_portfolio,Value)
beta=np.divide(covariance_asset_portfolio,Pf_variance)

def VaR():
    # calculates the portfolio Value-at-risk (Pf_VaR) in USD terms and the individual Value-at-risk (IndividualVaR) of the shares in the portfolio
    Pf_VaR=norm.ppf(CI)*USDvolatility
    IndividualVaR=np.multiply(volatility,Value*weights)*norm.ppf(CI)
    IndividualVaR=['$%.2f' % elem for elem in IndividualVaR]
    print 'Portfolio VaR: ', '$%0.2f' % Pf_VaR
    print 'Individual VaR: ', [[tickers[i], IndividualVaR[i]] for i in range(min(len(tickers), len(IndividualVaR)))]

VaR() #call the function to get portfolio VaR and Individual VaR in USD-terms

def marginal_component_VaR():
    # calculates the marginal Value-at-risk in USD terms and the component Value-at-risk of the shares in the portfolio
    marginalVaR=np.divide(covUSD,USDvolatility)*norm.ppf(CI)
    componentVaR=np.multiply(weights,beta)*USDvolatility*norm.ppf(CI)
    marginalVaR=['%.3f' % elem for elem in marginalVaR]
    componentVaR=['$%.2f' % elem for elem in componentVaR]
    print 'Marginal VaR: ', [[tickers[i], marginalVaR[i]] for i in range(min(len(tickers), len(marginalVaR)))]
    print 'Component VaR: ', [[tickers[i], componentVaR[i]] for i in range(min(len(tickers), len(componentVaR)))]

marginal_component_VaR() # call the function
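As a quick sanity check on the statement from the introduction, we can reuse the variables already defined above (a sketch; the unformatted recomputed versions are only for comparison):

z=norm.ppf(CI)
componentVaR=weights*beta*USDvolatility*z # component VaR, unformatted
marginalVaR=covUSD/USDvolatility*z # marginal VaR, unformatted
print np.isclose(np.sum(componentVaR), z*USDvolatility) # True: the components sum to the diversified portfolio VaR
print np.allclose(componentVaR, marginalVaR*weights*Value) # True: component VaR = marginal VaR times the USD position size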

Friday 8 May 2015

How to Get High Frequency Stock Prices with Python (2): Netfonds Example

Getting high-frequency and even real-time stock price data (for US and European stocks in particular) with Python can be very simple indeed! Just use the http://hopey.netfonds.no site. Here is a simple piece of code for generating the URL (depending on ticker and date) and for reading and saving the data. Additionally, historical daily high-frequency data is available going back several days (about 15 days).

The URL format for the high-frequency trade data is (the example uses OMV shares, traded on the Vienna Stock Exchange): http://hopey.netfonds.no/tradedump.php?date=20150507&paper=E-OMVV.BTSE&csv_format=txt, while for the market depth the URL format is http://hopey.netfonds.no/posdump.php?date=20150507&paper=E-OMVV.BTSE&csv_format=txt. However, it seems that depth data is not available for US shares.

Here is the code and enjoy the data:


import urllib.request
# urllib.request is for Python 3.4; for Python 2.7 use urllib instead and change the read_url line in get_data() below accordingly

def get_data(date,ticker,exchange):
    base_url = 'http://hopey.netfonds.no/tradedump.php?'
    search_query = 'date={}&paper={}.{}&csv_format=txt'.format(date,ticker,exchange)
    search_url = '{}{}'.format(base_url, search_query)
    read_url=urllib.request.urlopen(search_url).read()
    return read_url

# for Python 3.4: read_url=urllib.request.urlopen(search_url).read(); for Python 2.7: read_url=urllib.urlopen(search_url).read()

if __name__ == '__main__':
    ticker = 'AAPL' # US stocks use the ticker directly, for instance 'AAPL' is Apple and 'MSFT' is Microsoft; the format for European shares is 'E-XXXX', for instance 'E-BAYND' - Bayer, 'E-TKAV' - Telekom Austria, 'E-LLOYL' - Lloyds Banking Group
    date=20150507 # date format is YYYYMMDD
    exchange='O' # 'O' - US shares, 'BTSE' - European shares

    data=get_data(date,ticker,exchange)

    with open("trades.txt", "wb") as f:
        f.write(data)
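For the depth data mentioned above, the same pattern should work - only the endpoint changes from tradedump.php to posdump.php. A minimal sketch (the function name get_depth is mine, and it reuses the urllib.request import from above; remember that depth does not seem to be available for US shares):

def get_depth(date,ticker,exchange):
    # same structure as get_data(), but hitting the posdump.php endpoint
    base_url = 'http://hopey.netfonds.no/posdump.php?'
    search_query = 'date={}&paper={}.{}&csv_format=txt'.format(date,ticker,exchange)
    return urllib.request.urlopen('{}{}'.format(base_url,search_query)).read()

depth=get_depth(20150507,'E-OMVV','BTSE')
with open("depth.txt", "wb") as f:
    f.write(depth)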

Tuesday 5 May 2015

Generating Correlated Random Numbers: Cholesky Decomposition

When it comes to generating correlated random numbers, most people treat the exercise as a black box. To add some colour to it: there are several approaches to generating correlated random numbers, of which two are mainly used, namely Cholesky decomposition and spectral (eigenvalue) decomposition. Both types of decomposition are used in copula estimation.

Cholesky decomposition factors the correlation (or covariance) matrix into a triangular matrix that, applied to independently generated random numbers, correlates them. Cholesky decomposition has two, let's call it, versions: lower and upper triangular. The Cholesky decomposition can be done in Python via the NumPy and SciPy linear algebra (linalg) libraries:

(1) np.linalg.cholesky(A) # using the NumPy linear algebra library, and

(2) scipy.linalg.cholesky(A, lower=True) # using the SciPy linear algebra library, with lower=True indicating we want the lower triangular matrix; for the upper triangular one, use lower=False


I prefer to use the lower triangular matrix. I also prefer working with arrays instead of matrices.

Put into practice in Python, things look like this (we work with 3 variables):

import numpy as np
from scipy.linalg import cholesky
from scipy.stats import norm

A=np.array([[1,0.1558, 0.2227], [0.1558,1,0.5118], [0.2227,0.5118,1]]) # set the correlation matrix
B= norm.rvs(size=(3, 10000)) # 10,000 simulations of 3 independent standard normally distributed variables

Cholesky = cholesky(A, lower=True) # Cholesky decomposition - lower triangular matrix
result= np.dot(Cholesky, B) # generates correlated variables


correlation=np.corrcoef(result) #check the correlation of the simulated variables

By the way, checking whether the initial matrix is positive definite is also quite easy in Python:

from scipy.linalg import eigh
eigh(A) # the first array is eigenvalues and the second is eigenvectors
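
A compact True/False version of the same check:

np.all(eigh(A)[0] > 0) # True if all eigenvalues are positive, i.e. the matrix is positive definite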

In our example, the correlation matrix is positive definite, meaning that all its eigenvalues are positive.  The eigenvalues of the above correlation matrix are: 0.4832, 0.8903, 1.6265.
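
For comparison, the spectral (eigenvalue) decomposition mentioned at the beginning can be used for the same purpose. A minimal sketch, reusing A and B from the example above (any matrix L satisfying L times its transpose equals A will do):

eigenvalues, eigenvectors = eigh(A)
L = eigenvectors*np.sqrt(eigenvalues) # scales each eigenvector column so that np.dot(L,L.T) reproduces A
result_spectral = np.dot(L,B) # correlated variables, just as with the Cholesky version
np.corrcoef(result_spectral) # should be close to the original correlation matrix A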



Cholesky decomposition can also be used in the opposite direction - to decorrelate variables that are correlated.
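
A minimal sketch of that reverse direction, reusing Cholesky and result from the example above: since result was obtained as np.dot(Cholesky, B), solving the triangular system recovers the uncorrelated draws.

from scipy.linalg import solve_triangular

B_recovered = solve_triangular(Cholesky, result, lower=True) # undoes the correlation step
np.corrcoef(B_recovered) # should be close to the identity matrix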