[R-SIG-Finance] cointegration
Paul Teetor
paulteetor at yahoo.com
Tue Oct 19 16:03:01 CEST 2010
Stephen,
It depends what you mean by "logic".
If you mean statistical logic, I'll defer to Eric Zivot and Sarbo, who are far wiser than I am. I will note, however, that you are testing at a p-value threshold of 0.05, so I expect roughly 5% of your test results to be misleading. In other words, for every 20 pairs tested by your batch job, I expect about one will be suspect.
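As a rough check on that arithmetic, here is a minimal sketch. It assumes, purely for illustration, that the tests are independent and that none of the pairs is truly cointegrated; the variable names are hypothetical.

```r
# Expected number of false positives when every test is run at level alpha
# and (by assumption) no pair is truly cointegrated.
alpha   <- 0.05   # p-value threshold used in the batch job
n_pairs <- 20     # number of pairs tested (illustrative)

expected_false_positives <- n_pairs * alpha

# Probability that at least one pair is flagged by chance,
# assuming the tests are independent of each other.
p_at_least_one <- 1 - (1 - alpha)^n_pairs

cat("Expected false positives:", expected_false_positives, "\n")
cat("P(at least one false positive):", round(p_at_least_one, 3), "\n")
```

Under these assumptions a batch of 20 pairs yields one false positive on average, and the chance of seeing at least one is about 64%.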
"Spurious cointegration" is a serious problem. I suggest Googling that topic. You may be surprised what you learn. (The irony, of course, is that cointegration was supposed to cure "spurious correlation." Oh well.)
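One way to get a feel for the problem is to run the same kind of test on pairs that are unrelated by construction. The sketch below is only an illustration under stated assumptions: the "prices" are independent simulated random walks, and the sample size, number of pairs, and seed are arbitrary choices. It uses a no-intercept regression for the hedge ratio and `PP.test` from base R on the spread, and counts how many pairs are flagged. The flagged fraction is not guaranteed to equal 5%, because the hedge ratio is estimated before the unit-root test is applied, so the nominal level of the test need not hold.

```r
set.seed(42)
n_obs   <- 500   # observations per series (illustrative)
n_pairs <- 200   # number of unrelated pairs to test (illustrative)

flagged <- replicate(n_pairs, {
  x <- cumsum(rnorm(n_obs))              # independent random walk 1
  y <- cumsum(rnorm(n_obs))              # independent random walk 2
  beta <- unname(coef(lm(y ~ x + 0))[1]) # hedge ratio, no intercept
  sprd <- y - beta * x                   # spread between the two series
  PP.test(sprd, lshort = FALSE)$p.value < 0.05
})

false_positive_rate <- mean(flagged)
cat("Share of unrelated pairs flagged as mean-reverting:",
    false_positive_rate, "\n")
```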
If you mean financial logic, I strongly suggest not blindly risking money on your statistical test. Some filtering is required. Look for trades that make sense.
For example, my software reports that the stocks of MSFT and GOOG form a mean-reverting pair. But I would not trade that spread: too much idiosyncratic risk. My software also reports that Corn futures and Soybean Oil futures form a mean-reverting pair. But I would not trade that spread because the economic connection between corn and bean oil is too weak.
Hope that helps.
Paul
_____
From: r-sig-finance-bounces at stat.math.ethz.ch [mailto:r-sig-finance-bounces at stat.math.ethz.ch] On Behalf Of Stephen Choularton
Sent: Monday, October 18, 2010 9:46 PM
To: r-sig-finance at stat.math.ethz.ch
Subject: [R-SIG-Finance] cointegration
Hi Folks
I'm using this to find cointegrated stocks on the AX.
library(xts)
library(quantmod)

# quickly re-source this file
s <- function() source('meanrev.R')

checkPairFromYahoo <- function(sym1, sym2, dateFilter='::')
{
  t.xts <- getCombined(sym1, sym2, dateFilter=dateFilter)
  cat("Date range is", format(start(t.xts)), "to", format(end(t.xts)), "\n")

  # Build linear model
  m <- buildLM(t.xts)

  # Note beta -- http://en.wikipedia.org/wiki/Beta_(finance)
  beta <- getBeta(m)
  cat("Assumed hedge ratio is", beta, "\n")

  # Build spread
  sprd <- buildSpread(t.xts, beta)

  # Test cointegration
  ht <- testCoint(sprd)
  cat("PP p-value is", as.double(ht$p.value), "\n")

  if (as.double(ht$p.value) < 0.05)
  {
    cat("###############################################################\n",
        sym1 ,":", sym2 ," is likely mean-reverting.\n",
        "###########################################################\n" )
  }
  else
  {
    #cat(sym1 ,":", sym2 ," is not mean-reverting.\n")
  }
}

getCombined <- function(sym1, sym2, dateFilter='::')
{
  # Grab historical data for both symbols
  one <- getSymbols(sym1, auto.assign=FALSE)
  two <- getSymbols(sym2, auto.assign=FALSE)

  # Give columns more usable names
  colnames(one) <- c('Open', 'High', 'Low', 'Close', 'Volume', 'Adjusted')
  colnames(two) <- c('Open', 'High', 'Low', 'Close', 'Volume', 'Adjusted')

  # Build combined object
  return(merge(one$Close, two$Close, all=FALSE)[dateFilter])
}

buildLM <- function(combined)
{
  return(lm(Close ~ Close.1 + 0, combined))
}

getBeta <- function(m)
{
  return(as.double(coef(m)[1]))
}

buildSpread <- function(combined, beta)
{
  return(combined$Close - beta*combined$Close.1)
}

testCoint <- function(sprd)
{
  return(PP.test(sprd, lshort = FALSE))
}
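The steps in the script above (hedge ratio from a no-intercept regression, spread, Phillips-Perron test) can be exercised without downloading anything. The following is a minimal offline sketch, not the poster's code: it replaces the Yahoo data with a simulated pair that is cointegrated by construction, and uses a plain data frame instead of an xts object. The hedge ratio of 1.5, the AR coefficient of 0.9, and the starting price of 100 are made-up values for illustration.

```r
set.seed(1)
n <- 1000

# Simulated "prices": x is a random walk, y tracks 1.5 * x plus a
# stationary AR(1) deviation, so the pair is cointegrated by construction.
x     <- 100 + cumsum(rnorm(n))
noise <- as.numeric(arima.sim(list(ar = 0.9), n = n))
y     <- 1.5 * x + noise

# Same column names the script produces after merge()
combined <- data.frame(Close = y, Close.1 = x)

# Hedge ratio from a regression through the origin
m    <- lm(Close ~ Close.1 + 0, data = combined)
beta <- as.double(coef(m)[1])

# Spread and Phillips-Perron unit-root test on it
sprd <- combined$Close - beta * combined$Close.1
ht   <- PP.test(sprd, lshort = FALSE)

cat("Estimated hedge ratio is", beta, "\n")
cat("PP p-value is", as.double(ht$p.value), "\n")
```

Running a known-good case like this before the batch job is a cheap way to confirm that the plumbing behaves as intended; it says nothing about whether a real pair makes economic sense.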
I run it on batches of stock pairs and then have a look at those which are cointegrated, assuming my code is right (and if anyone thinks there is something wrong with it, please let me know ;-).
I just wondered whether anyone simply goes with the results, or whether a test of logic is required. I found, for example, that AGL (a big gas company) was cointegrated with Bunnings Warehouse (a hardware superstore chain). I can't see the reason for that. AMP (a major insurer) cointegrates with AXA (another major insurer). That makes sense. It also cointegrates with Westpac (a major bank): still some logic, but a bit thinner. And it cointegrates with Fortescue Metals (a big iron ore operation). Not much logic there. Anyway, the question is: do you get better results by using informed judgement on these things, or by just trusting the figures?
Any comments most welcome.
Stephen Choularton Ph.D., FIoD
9999 2226
0413 545 182
for insurance go to www.netinsure.com.au
for markets go to www.organicfoodmarkets.com.au
On 19/10/2010 12:35 PM, Yihao Lu aeolus_lu wrote:
I am running a rolling ADF test on some time series to check for mean reversion. When I use a short rolling window, I find the residual is not stationary at all. However, when I use a horizon longer than 5 years, I find very significant stationarity. On the other hand, I find the half-life is only around 30 days.
Can anyone give me a possible explanation or point me to a reference? Thanks.
Best,
Yihao
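The pattern described in this question can be reproduced on simulated data. The sketch below is one possible reading, not an answer given in the thread: it simulates a stationary AR(1) series whose half-life is 30 steps, then compares a simple Dickey-Fuller t-statistic (no augmentation lags, which is a simplification of a full ADF test) on rolling one-year windows against the same statistic on the full ten-year sample. The window length, step, and seed are arbitrary assumptions.

```r
set.seed(7)
half_life <- 30
phi <- 0.5^(1 / half_life)   # AR(1) coefficient implied by a 30-step half-life
n   <- 2500                  # roughly ten years of daily observations
x   <- as.numeric(arima.sim(list(ar = phi), n = n))

# Dickey-Fuller style t-statistic: regress the change on the lagged level
# (with intercept) and return the t-value of the lagged-level coefficient.
df_tstat <- function(x) {
  dx   <- diff(x)
  lagx <- x[-length(x)]
  fit  <- lm(dx ~ lagx)
  summary(fit)$coefficients["lagx", "t value"]
}

window <- 250                              # about one trading year
starts <- seq(1, n - window + 1, by = 50)  # rolling window start points
roll_t <- sapply(starts, function(s) df_tstat(x[s:(s + window - 1)]))
full_t <- df_tstat(x)

cat("Median rolling-window t-statistic:", median(roll_t), "\n")
cat("Full-sample t-statistic:          ", full_t, "\n")
```

The series is mean-reverting throughout, yet the short windows carry much less evidence against a unit root than the long sample does, because an AR coefficient this close to one is hard to distinguish from one with few observations.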
________________________________
Date: Tue, 19 Oct 2010 09:03:55 +1100
From: stephen at organicfoodmarkets.com.au
To: r-sig-finance at stat.math.ethz.ch
CC: bjorn.skogtro at gmail.com
Subject: Re: [R-SIG-Finance] Ornstein-Uhlenbeck
Hi
I am still trying to sort this one out. Any comments from anyone would
be most welcome.
Stephen Choularton Ph.D., FIoD
On 14/10/2010 7:29 AM, Stephen Choularton wrote:
Thanks for this help.
Trying to make sense of it so I have added some notes to the code. I
have marked them #?#
I would be delighted if you could tell me whether I am right or wrong, or add any comments or
answers.
#?# This appears to be the function that is doing the 'Ornstein-Uhlenbeck
#?# process work' particularly via dcOU
#?# I have noted in several places that I am after:
#?# 'the half-life of the decay equals ln(2)/θ'
#?# 'The half-life is given as log(2)/mean-reversion speed.'
#?# and I see theta appearing at a number of points in the code.
#?# Can you tell me why 3 thetas viz theta1, theta2, theta3 and what they do?
#?# eg is one of these the theta I am after?
# ex3.01.R
OU.lik <- function(theta1, theta2, theta3){
  n <- length(X)
  dt <- deltat(X)
  -sum(dcOU(X[2:n], dt, X[1:(n-1)], c(theta1,theta2,theta3), log=TRUE))
}
require(stats4)
require(sde)
#?# random number generation seed
set.seed(123)
#?# creation of a data set
X <- sde.sim(model="OU", theta=c(3,1,2), N=1000, delta=1)
#?# If I Look at X its like this:
#?# Time Series:
#?# Start = 0
#?# End = 1000
#?# Frequency = 1
#?# [1] 1.00000000 etc
#?# What sort of data object is it and how would I coerce an object with one
#?# column from a read.csv into it?
mle(OU.lik, start=list(theta1=1, theta2=0.5, theta3=1),
method="L-BFGS-B", lower=c(-Inf,0,0)) -> fit
summary(fit)
#?# This gives:
#?# Maximum likelihood estimation
#?# Call:
#?# mle(minuslogl = OU.lik, start = list(theta1 = 1, theta2 = 0.5,
#?# theta3 = 1), method = "L-BFGS-B", lower = c(-Inf, 0, 0))
#?# Coefficients:
#?# Estimate Std. Error
#?# theta1 3.355322 0.28159504
#?# theta2 1.106107 0.09010627
#?# theta3 2.052815 0.07624441
#?# -2 log L: 3366.389
#?# What's this telling me?
# ex3.01.R (cont.)
prof <- profile(fit)
par(mfrow=c(1,3))
plot(prof)
par(mfrow=c(1,1))
vcov(fit)
confint(fit)
#?# This provides me with this output using 'fit' from before:
#?# > vcov(fit)
#?# theta1 theta2 theta3
#?# theta1 0.07929576 0.024620718 0.016634557
#?# theta2 0.02462072 0.008119141 0.005485549
#?# theta3 0.01663456 0.005485549 0.005813209
#?# > confint(fit)
#?# Profiling...
#?# 2.5 % 97.5 %
#?# theta1 2.8448980 3.960982
#?# theta2 0.9433338 1.300629
#?# theta3 1.9147136 2.216113
#?# and 'fit' is:
#?# Call:
#?# mle(minuslogl = OU.lik, start = list(theta1 = 1, theta2 = 0.5,
#?# theta3 = 1), method = "L-BFGS-B", lower = c(-Inf, 0, 0))
#?# Coefficients:
#?# theta1 theta2 theta3
#?# 3.355322 1.106107 2.052815
#?# plus some graphic output
#?# Again, what's this telling me?
#?# This looks like a further example?
# ex3.01.R (cont.)
set.seed(123)
X <- sde.sim(model="OU", theta=c(3,1,2), N=1000, delta=1e-3)
mle(OU.lik, start=list(theta1=1, theta2=0.5, theta3=1),
method="L-BFGS-B", lower=c(-Inf,0,0)) -> fit2
summary(fit2)
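On the #?# questions about the three thetas: a minimal sketch follows, assuming the parametrization described in ?dcOU is dX = (theta1 - theta2*X) dt + theta3 dW (please check the help page; this is stated from memory). Under that assumption theta2 is the mean-reversion speed, so it is the theta in "half-life = ln(2)/θ"; theta1/theta2 is the long-run mean and theta3 is the volatility. The numbers are copied from the summary(fit) and confint(fit) output quoted above; the variable names are hypothetical.

```r
# Point estimates copied from the summary(fit) output above
theta1 <- 3.355322
theta2 <- 1.106107
theta3 <- 2.052815

# Profile confidence limits for theta2 copied from confint(fit)
theta2_ci <- c(lower = 0.9433338, upper = 1.300629)

# Quantities implied by dX = (theta1 - theta2 * X) dt + theta3 dW (assumed)
long_run_mean <- theta1 / theta2            # level the process reverts to
half_life     <- log(2) / theta2            # in units of the time axis
stationary_sd <- theta3 / sqrt(2 * theta2)  # long-run standard deviation

# A larger theta2 means faster reversion, so the interval endpoints swap.
half_life_ci <- rev(log(2) / theta2_ci)
names(half_life_ci) <- c("lower", "upper")

cat("Long-run mean:", long_run_mean, "\n")
cat("Half-life:", half_life, "\n")
cat("Half-life interval:", half_life_ci, "\n")
```

The half-life is expressed in the time units of the series, here the delta=1 step used in sde.sim. The long-run mean comes out close to 3, which matches the theta=c(3,1,2) used to simulate the data.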
Please excuse the length of this email (and my lack of understanding)
Hope you can help and thanks.
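On the question of what kind of object X is: the printed header ("Time Series: Start = 0 ... Frequency = 1") is what base R shows for a `ts` object. A minimal sketch of turning a one-column CSV into such an object is below. The file, the column name `price`, and the values are made up for illustration; `deltat = 1` treats one row as one time unit, and something like `deltat = 1/252` would be an assumption that the rows are daily and time is measured in years.

```r
# Write a tiny one-column CSV to a temporary file (illustrative data only)
csv_path <- tempfile(fileext = ".csv")
write.csv(data.frame(price = c(1.00, 1.12, 0.97, 1.05)),
          csv_path, row.names = FALSE)

# Read it back and coerce the single column to a regularly spaced series
df <- read.csv(csv_path)
X  <- ts(df[[1]], start = 0, deltat = 1)

print(class(X))    # class of the resulting object
print(deltat(X))   # time step that OU.lik picks up via deltat(X)
```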
Stephen Choularton Ph.D., FIoD
On 13/10/2010 2:41 AM, stefano iacus wrote:
Just for completeness: the OU process is Gaussian and its transition density is known in exact form. So maximum likelihood estimation works fine and I suggest avoiding GMM.
The sde package contains the exact transition density for this process (e.g. ?dcOU), which you can use to build the likelihood to pass to the mle() function.
This example is taken from the "inst" directory of the sde package. For the parametrization of the model see ?dcOU
# ex3.01.R
OU.lik <- function(theta1, theta2, theta3){
  n <- length(X)
  dt <- deltat(X)
  -sum(dcOU(X[2:n], dt, X[1:(n-1)], c(theta1,theta2,theta3), log=TRUE))
}
require(stats4)
require(sde)
set.seed(123)
X <- sde.sim(model="OU", theta=c(3,1,2), N=1000, delta=1)
mle(OU.lik, start=list(theta1=1, theta2=0.5, theta3=1),
method="L-BFGS-B", lower=c(-Inf,0,0)) -> fit
summary(fit)
# ex3.01.R (cont.)
prof <- profile(fit)
par(mfrow=c(1,3))
plot(prof)
par(mfrow=c(1,1))
vcov(fit)
confint(fit)
# ex3.01.R (cont.)
set.seed(123)
X <- sde.sim(model="OU", theta=c(3,1,2), N=1000, delta=1e-3)
mle(OU.lik, start=list(theta1=1, theta2=0.5, theta3=1),
method="L-BFGS-B", lower=c(-Inf,0,0)) -> fit2
summary(fit2)
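For readers who want to see what the exact likelihood looks like without the sde package, here is a base-R sketch. It is not the package's code: it assumes the model is dX = (theta1 - theta2*X) dt + theta3 dW, writes out the Gaussian transition density by hand, simulates data from it, and maximises the likelihood with optim(). The helper name `ou_logdens`, the lower bounds of 1e-4, and the simulation loop are illustrative choices.

```r
# Log transition density of X[t + dt] given X[t] = x0 for the assumed model
# dX = (theta1 - theta2 * X) dt + theta3 dW (Gaussian with known mean/variance).
ou_logdens <- function(x, dt, x0, theta) {
  mu <- theta[1] / theta[2]                          # long-run mean
  m  <- mu + (x0 - mu) * exp(-theta[2] * dt)         # conditional mean
  v  <- theta[3]^2 * (1 - exp(-2 * theta[2] * dt)) / (2 * theta[2])
  dnorm(x, mean = m, sd = sqrt(v), log = TRUE)
}

# Simulate a path exactly from the transition density
set.seed(123)
true_theta <- c(3, 1, 2)
dt <- 1
n  <- 1000
X  <- numeric(n + 1)
X[1] <- 1
mu      <- true_theta[1] / true_theta[2]
decay   <- exp(-true_theta[2] * dt)
sd_step <- true_theta[3] * sqrt((1 - decay^2) / (2 * true_theta[2]))
for (i in 1:n) X[i + 1] <- mu + (X[i] - mu) * decay + sd_step * rnorm(1)

# Negative log-likelihood over consecutive observations
neg_loglik <- function(theta) {
  -sum(ou_logdens(X[-1], dt, X[-length(X)], theta))
}

fit_base <- optim(c(1, 0.5, 1), neg_loglik, method = "L-BFGS-B",
                  lower = c(-Inf, 1e-4, 1e-4))
print(fit_base$par)   # estimates of theta1, theta2, theta3
```

The random draws differ from those of sde.sim, so the estimates will not match the figures quoted in the thread, but they should land in the neighbourhood of the true values (3, 1, 2).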
I hope this helps out
stefano
On 12 Oct 2010, at 12:33, Bjorn Skogtro wrote:
Hi Stephen,
You could take a look at
http://sitmo.com/doc/Calibrating_the_Ornstein-Uhlenbeck_model
for the linear regression method, or take a look at the package "sde" which
contains some examples using GMM (not for the Ornstein-Uhlenbeck process,
though, only the CIR).
The half-life is given as log(2)/mean-reversion speed.
Do keep an eye on the partition of the time-axis, e.g. what frequency you
are using (daily, yearly) for interpreting the half-life.
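The point about the time axis can be sketched with the linear regression approach on simulated data. This is an illustration under assumptions, not the method from the linked page verbatim: the series is a simulated AR(1) with a true half-life of 30 steps, each step is treated as one trading day, and 252 trading days per year is an assumed convention. The helper `half_life_from` is hypothetical.

```r
set.seed(99)
n      <- 5000
b_true <- 0.5^(1 / 30)   # AR(1) slope implied by a 30-day half-life

# Simulate x[t] = b_true * x[t-1] + noise
x <- numeric(n)
for (t in 2:n) x[t] <- b_true * x[t - 1] + rnorm(1)

# Regress each observation on the previous one; the slope estimates exp(-speed * dt)
fit_ar <- lm(x[-1] ~ x[-n])
b_hat  <- unname(coef(fit_ar)[2])

# Mean-reversion speed is -log(b) / dt, and half-life is log(2) / speed
half_life_from <- function(b, dt) log(2) / (-log(b) / dt)

hl_days  <- half_life_from(b_hat, dt = 1)         # time axis in days
hl_years <- half_life_from(b_hat, dt = 1 / 252)   # time axis in years

cat("Half-life in days: ", hl_days, "\n")
cat("Half-life in years:", hl_years, "(times 252 =", hl_years * 252, "days)\n")
```

The same fitted slope gives a different-looking number depending on the dt supplied, so the half-life only makes sense once the unit of the time axis is fixed.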
BR,
Bjørn
------------------------------
Message: 2
Date: Tue, 12 Oct 2010 05:43:32 -0400
From: Sarbo
To: r-sig-finance at stat.math.ethz.ch
Subject: Re: [R-SIG-Finance] Ornstein-Uhlenbeck
By half-life, do you mean the speed of mean-reversion?
If so, there's a bit of algebraic tomfoolery that's required to
discretise the equation and then fit the data to it. I don't have the
time right now to go into all the details, but it's not hard: you can
parameterise the process using simple linear regression. If you need
help with that I'll try and get back to you tonight about it.
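The discretisation mentioned here can be written out as follows. This is a sketch of the standard result, using the theta1, theta2, theta3 notation that appears elsewhere in the thread and assuming the model is written as dX = (theta1 - theta2 X) dt + theta3 dW; the regression coefficients a and b are names introduced for illustration.

```latex
% Assumed model
dX_t = (\theta_1 - \theta_2 X_t)\,dt + \theta_3\,dW_t

% Exact solution over one time step \Delta
X_{t+\Delta} = \frac{\theta_1}{\theta_2}\bigl(1 - e^{-\theta_2 \Delta}\bigr)
             + e^{-\theta_2 \Delta} X_t + \varepsilon_t,
\qquad
\varepsilon_t \sim N\!\left(0,\; \theta_3^2\,\frac{1 - e^{-2\theta_2 \Delta}}{2\theta_2}\right)

% Matching terms with the regression X_{t+\Delta} = a + b X_t + \varepsilon_t
b = e^{-\theta_2 \Delta}
\;\Rightarrow\;
\theta_2 = -\frac{\ln b}{\Delta},
\qquad
\frac{\theta_1}{\theta_2} = \frac{a}{1 - b}

% Half-life: time for a deviation from the long-run mean to halve
e^{-\theta_2 t_{1/2}} = \tfrac{1}{2}
\;\Rightarrow\;
t_{1/2} = \frac{\ln 2}{\theta_2}
```

So a simple regression of each observation on the previous one gives b, and from b the mean-reversion speed and the half-life, both in the units of the time step.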
On Tue, 2010-10-12 at 13:47 +1100, Stephen Choularton wrote:
Hi
I wonder if anyone could point me to how I can use this method (Ornstein-Uhlenbeck) to find the
half-life of a mean-reverting process.
I am looking into pair trading and the time it takes for a
cointegrated pair to revert to the norm.
--
Stephen Choularton Ph.D., FIoD
_______________________________________________
R-SIG-Finance at stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-finance
-- Subscriber-posting only. If you want to post, subscribe first.
-- Also note that this is not the r-help list where general R questions
should go.
End of R-SIG-Finance Digest, Vol 77, Issue 8
-----------------------------------
Stefano M. Iacus
Department of Economics,
Business and Statistics
University of Milan
Via Conservatorio, 7
I-20123 Milan - Italy
Ph.: +39 02 50321 461
Fax: +39 02 50321 505
http://www.economia.unimi.it/iacus
------------------------------------------------------------------------------------
Please don't send me Word or PowerPoint attachments if not
absolutely necessary. See:
http://www.gnu.org/philosophy/no-word-attachments.html