[R-SIG-Finance] Number of data points required for Cointegration

Mark Leeds markleeds2 at gmail.com
Tue Jan 27 20:05:50 CET 2015


Hi Eric: Yes, multiple testing (data mining) is another problem. The whole
pairs thing is a messy undertaking that I never cracked. I want to go back
to it someday. Paul has a paper that discusses a different statistical
methodology that looks interesting, but I think that, however one
approaches it statistically, it also needs "heuristic-y" techniques to add
robustness. All the best, and thanks again for your comments.
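
To put rough numbers on the multiple-testing problem, here is a minimal
sketch in Python. It assumes the tests are independent, which is not true
for pairs that share a ticker, so treat the output as an order-of-magnitude
illustration only. The universe size of 100 assets and the function names
are hypothetical.

```python
from math import comb

def familywise_error(alpha, m):
    """Chance of at least one false rejection in m independent tests at level alpha."""
    return 1.0 - (1.0 - alpha) ** m

def bonferroni_level(alpha, m):
    """Per-test level that keeps the familywise error rate at or below alpha."""
    return alpha / m

n_assets = 100                 # hypothetical universe size
m = comb(n_assets, 2)          # number of distinct pairs
alpha = 0.05

print("pairs tested:", m)
print("expected false 'cointegrated' pairs if none are:", alpha * m)
print("P(at least one false positive):", familywise_error(alpha, m))
print("Bonferroni per-test level:", bonferroni_level(alpha, m))
```

Bonferroni is only one (conservative) correction; it is shown here because
it is the simplest to state.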


Mark

On Tue, Jan 27, 2015 at 1:58 PM, Eric Zivot <ezivot at u.washington.edu> wrote:

> Mark
>
> I completely agree with you. My comments were oriented to the "best case
> scenario". There are obviously many real-world considerations that make
> the issue very difficult, as you point out. And, of course, one has to
> consider the dreaded "multiple testing issue" if one is searching for the
> "best" cointegrated pair of assets.
>
>
>
> *From:* Mark Leeds [mailto:markleeds2 at gmail.com]
> *Sent:* Tuesday, January 27, 2015 10:54 AM
> *To:* Eric Zivot
> *Cc:* amol gupta; Paul Teetor; r-sig-finance at r-project.org
>
> *Subject:* Re: [R-SIG-Finance] Number of data points required for
> Cointegration
>
>
>
> Hi Eric: Thanks for the educational and thorough explanation. But I think
> it's worse than that. Any econometric test, whether asymptotic or
> finite-sample, depends on a certain underlying DGP that often just doesn't
> hold. So, even in the asymptotic case, cointegration tests will break down
> due to mergers, buybacks, bankruptcies, etc. There is no concept of
> cointegration or any DGP that can take these things into account.
>
> I'm not trying to start a flame war, and I use econometrics in finance, so
> I don't think it's bogus. Rather, I'm just pointing out that, particularly
> with respect to cointegration testing, things can go haywire in a hurry
> because the underlying DGP assumption is just not true. So, any type of
> test is, to some extent, useless.
>
>
> Mark
>
> On Tue, Jan 27, 2015 at 1:41 PM, Eric Zivot <ezivot at u.washington.edu>
> wrote:
>
> Some quick comments on this issue. From a statistical point of view, the
> phrase "how many data points are required for cointegration" is not well
> defined. Technically speaking, if two series are cointegrated then they are
> cointegrated for any number of observations. This issue is really about the
> size (probability of rejecting the null hypothesis when the null is true)
> and power (probability of rejecting the null when the alternative is true)
> of tests for cointegration for a given sample size.
>
> In intermediate statistics textbooks, the chapters on hypothesis testing
> usually have some discussion of the relationship between sample size and
> power. In simple toy examples you can work out the number of observations
> required to have power equal to some specified value (e.g. 0.90). In this
> case, if the alternative is true, you can say that you will reject the null
> at the 5% level with probability 0.90 if the sample size is n=75 (say).
>
> Unfortunately, this exercise is extremely difficult to do with tests for
> cointegration (usually the null is no cointegration, so rejecting the null
> is evidence of cointegration). Why? There are only general asymptotic
> results (as the sample size goes to infinity) for tests for no
> cointegration (e.g. the Engle-Granger two-step test and the Johansen rank
> tests). There are no general finite-sample results (for fixed sample sizes)
> for power functions. Hence, you cannot analytically compute a sample size
> that will give you a certain power.
>
> What to do? You can try to set up some Monte Carlo experiments with fixed
> sample sizes to approximate power functions. The problem with this is that
> the results are not general. They will depend on the parameters used for
> the Monte Carlo setup (e.g. parameters for serial correlation, volatility,
> etc.). The best you can do is to carefully characterize the distributions
> of the series in question and try out some Monte Carlo experiments for
> these data. My guess is that you will get the best results when you use the
> class of tests that have been found to be asymptotically optimal (where the
> asymptotic power curve is tangent to the infeasible power curve of the
> optimal test at a set power). These tests have been developed by Graham
> Elliott at UCSD and Michael Jansson at UCB.
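
The Monte Carlo idea can be sketched roughly as follows. This is a minimal
illustration in Python (standard library only), not a procedure taken from
anyone in this thread. The data-generating process (a random walk x plus an
AR(1) spread with coefficient phi), the function names, and the use of the
asymptotic 5% Engle-Granger critical value of about -3.34 (two variables,
constant in the cointegrating regression, no lag augmentation) are all
assumptions made for the example. The rejection rates it produces apply
only to this particular setup.

```python
import math
import random

def ols_fit(x, y):
    """Intercept and slope from an OLS regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def df_tstat(u):
    """t-statistic on rho in diff(u)_t = rho * u_{t-1} + e_t (no constant, no lags)."""
    lag = u[:-1]
    du = [b - a for a, b in zip(u[:-1], u[1:])]
    sll = sum(v * v for v in lag)
    rho = sum(l * d for l, d in zip(lag, du)) / sll
    ssr = sum((d - rho * l) ** 2 for l, d in zip(lag, du))
    return rho / math.sqrt(ssr / (len(du) - 1) / sll)

def engle_granger_stat(x, y):
    """Step 1: regress y on x. Step 2: Dickey-Fuller statistic on the residuals."""
    a, b = ols_fit(x, y)
    resid = [yi - a - b * xi for xi, yi in zip(x, y)]
    return df_tstat(resid)

def simulate_pair(n, phi, rng):
    """x is a random walk; y = x + u with AR(1) spread u.
    phi < 1 gives a cointegrated pair, phi = 1 gives no cointegration."""
    x, u, xs, ys = 0.0, 0.0, [], []
    for _ in range(n):
        x += rng.gauss(0.0, 1.0)
        u = phi * u + rng.gauss(0.0, 1.0)
        xs.append(x)
        ys.append(x + u)
    return xs, ys

def rejection_rate(n, phi, reps=500, crit=-3.34, seed=0):
    """Share of simulated samples in which 'no cointegration' is rejected."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        x, y = simulate_pair(n, phi, rng)
        if engle_granger_stat(x, y) < crit:
            hits += 1
    return hits / reps

for n in (50, 100, 200):
    print(n, "power (phi=0.8):", rejection_rate(n, 0.8, reps=200),
          "size (phi=1.0):", rejection_rate(n, 1.0, reps=200))
```

Calling rejection_rate(n, phi) with phi < 1 for increasing n traces out an
approximate power curve for this one DGP, and phi = 1.0 estimates the size.
Changing phi or the noise structure changes the answer, which is the lack
of generality described above.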
>
>
> -----Original Message-----
> From: R-SIG-Finance [mailto:r-sig-finance-bounces at r-project.org] On Behalf
> Of amol gupta
> Sent: Tuesday, January 27, 2015 10:09 AM
> To: Paul Teetor
> Cc: r-sig-finance at r-project.org
> Subject: Re: [R-SIG-Finance] Number of data points required for
> Cointegration
>
> Paul
>
> You say that ADF is not really stable. I agree. Other options to explore
> are
>
>    - Use other unit root and stationarity tests.
>    - Use other cointegration tests, such as the Johansen test.
>    - Run PCA, choose one of the lower-variance portfolios, and test it
>    for stationarity. (I need to understand PCA more.)
>
> I will take some time and test these ideas. Have you tried any of these?
> If yes, please share your experiences.
>
> Thank you for your insights.
>
> On Tue, Jan 27, 2015 at 5:29 AM, Paul Teetor <paulteetor at yahoo.com> wrote:
>
> > Amol,
> >
> > I don't have a formula or a guideline for determining the number of
> > data points. But I can share two experiences.
> >
> > First, when I traded mean-reverting spreads, I used 3 to 5 years of
> > daily data. That's 750 to 1,250 data points. Less data did not work
> > well for my spreads.
> >
> > Second, in my experience, the ADF test was quite unstable. That is, it
> > might fail to reject for a while, then start rejecting the null
> > hypothesis when the market showed some trending back to the mean. Then
> > it would fail to reject again as the market wandered away.
> >
> > Perhaps smarter people than I have had better luck trading with the
> > ADF, but for me, it did not provide a complete answer to the question
> > of mean-reversion.
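
The on-and-off behaviour described above can be reproduced on simulated
data. A minimal sketch in Python, under stated assumptions: the spread is
simulated as an AR(1) process with coefficient 0.97 (mean-reverting, but
slowly), the window is 250 observations, the test is a plain Dickey-Fuller
regression with a constant and no lags, and -2.86 is used as the asymptotic
5% critical value. All names and parameter values are hypothetical and not
taken from any real spread.

```python
import math
import random

def df_tstat_const(u):
    """Dickey-Fuller t-statistic from diff(u)_t = a + rho * u_{t-1} + e_t (no lags)."""
    lag = u[:-1]
    du = [b - a for a, b in zip(u[:-1], u[1:])]
    n = len(du)
    mean_lag = sum(lag) / n
    mean_du = sum(du) / n
    sll = sum((l - mean_lag) ** 2 for l in lag)
    rho = sum((l - mean_lag) * (d - mean_du) for l, d in zip(lag, du)) / sll
    ssr = sum((d - mean_du - rho * (l - mean_lag)) ** 2 for l, d in zip(lag, du))
    return rho / math.sqrt(ssr / (n - 2) / sll)

def rolling_df(spread, window):
    """Dickey-Fuller statistic on every trailing window of the given length."""
    return [df_tstat_const(spread[end - window:end])
            for end in range(window, len(spread) + 1)]

def simulate_ar1(n, phi, seed=0):
    """Simulated spread: u_t = phi * u_{t-1} + standard normal noise."""
    rng = random.Random(seed)
    u, out = 0.0, []
    for _ in range(n):
        u = phi * u + rng.gauss(0.0, 1.0)
        out.append(u)
    return out

CRIT_5PCT = -2.86                      # asymptotic value, constant only
spread = simulate_ar1(1250, 0.97)      # about five years of daily data
stats = rolling_df(spread, 250)
rejects = [s < CRIT_5PCT for s in stats]
share = sum(rejects) / len(rejects)
flips = sum(1 for a, b in zip(rejects, rejects[1:]) if a != b)

print("windows:", len(stats))
print("share of windows rejecting a unit root:", share)
print("times the verdict flipped:", flips)
```

The simulated spread is mean-reverting by construction in every window, so
any window that fails to reject reflects low power at that sample size, not
a change in the underlying process.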
> >
> > Paul
> >
> > Paul Teetor, Elgin, IL USA
> > http://quantdevel.com/public <http://quanttrader.info/public>
> >
> >   ------------------------------
> >  *From:* amol gupta <amolgupta87 at gmail.com>
> > *To:* "r-sig-finance at r-project.org" <r-sig-finance at r-project.org>
> > *Sent:* Friday, January 23, 2015 3:24 AM
> > *Subject:* [R-SIG-Finance] Number of data points required for
> > Cointegration
> >
> > Hi
> >
> > I need help in figuring out the length of historical data that I
> > should use. I took stock prices (daily close) for two tickers from
> > Yahoo (200 days). I tried finding the regression coefficient using PCA,
> > using 150 points for the PCA. I find a coefficient Beta.
> >
> > Now, to see whether the spread is mean reverting or not, I use ADF. If I
> > use a 150-point spread, it comes out to be nonstationary. If I use
> > 200 points of data, the outcome is stationary.
> >
> > I again used 200 points to do the PCA and find the regression. The spread
> > comes out to be nonstationary. From all these observations *I think*
> > that this is not a stable relationship.
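
The attached script is not reproduced in the archive, so the following is
only a guess at what "finding the regression coefficient using PCA" might
look like: a minimal sketch in Python that takes the slope of the first
principal component of the two price series (orthogonal, or total least
squares, regression) as Beta. The function names and the toy prices are
hypothetical.

```python
import math

def pca_beta(x, y):
    """Intercept and slope of the first principal axis of the points (x, y).

    The slope is the ratio of the components of the leading eigenvector of
    the 2x2 sample covariance matrix, written in closed form.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x) / (n - 1)
    syy = sum((b - my) ** 2 for b in y) / (n - 1)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    if sxy == 0.0:
        raise ValueError("series are uncorrelated; the slope is not defined")
    d = syy - sxx
    beta = (d + math.sqrt(d * d + 4.0 * sxy * sxy)) / (2.0 * sxy)
    return my - beta * mx, beta

def spread(x, y, alpha, beta):
    """Residual series y - alpha - beta * x, the candidate stationary spread."""
    return [yi - alpha - beta * xi for xi, yi in zip(x, y)]

# Toy prices, made up for illustration only.
px = [10.0, 10.4, 10.1, 10.9, 11.3, 11.0, 11.8, 12.1]
py = [20.3, 20.9, 20.1, 22.0, 22.4, 22.3, 23.5, 24.4]

alpha, beta = pca_beta(px, py)
print("alpha:", alpha, "beta:", beta)
print("spread:", spread(px, py, alpha, beta))
```

Unlike an OLS regression of y on x, this slope treats both series
symmetrically, so swapping x and y gives the reciprocal Beta. Re-estimating
on 150 versus 200 points changes Beta, and with it the spread being tested.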
> >
> > So the following are my questions:
> >
> >    - Is there a way to decide the length of historical data to use?
> >    - Some relationships may be more stable than others. Is there a way to
> >    quantify it?
> >
> > Any other insight in this regard will be appreciated (time frame, pairs
> > vs basket). I have attached the plot and the script that was used to
> > generate the plot.
> >
> >
> > --
> > Regards
> > Amol
> >
> > If all the seas were ink,
> > And all the reeds were pens,
> > And all the skies were parchment,
> > And all the men could write,
> > These would not suffice
> > To write down all the red tape
> > Of this Government.
> >
> >
> > _______________________________________________
> > R-SIG-Finance at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-sig-finance
> > -- Subscriber-posting only. If you want to post, subscribe first.
> > -- Also note that this is not the r-help list where general R
> > questions should go.
> >
> >
>
>
> --
> Regards
> Amol
>
