[R-SIG-Finance] Number of data points required for Cointegration

Tue Jan 27 19:53:57 CET 2015

Hi Eric: Thanks for the educational and thorough explanation. But I think
it's worse than that. Any econometric test, whether asymptotic or
finite-sample, depends on an assumed underlying DGP that often just doesn't
hold. So, even in the asymptotic case, cointegration tests will break down
due to mergers, buybacks, bankruptcies, etc. There is no concept of
cointegration or any DGP that can take these things into account.

I'm not trying to start a flame war, and I use econometrics in finance, so I
don't think it's bogus. I'm just pointing out that, particularly with respect
to cointegration testing, things can go haywire in a hurry because the
underlying DGP assumption is just not true. So any type of test is, to
some extent, useless.
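
To make that concrete, here is a rough Monte Carlo sketch in Python of the
kind of experiment Eric describes, with a crude structural break added.
Everything in it is an assumption chosen for illustration, not something
taken from this thread: the cointegrating vector is treated as known, so the
test reduces to a Dickey-Fuller t-test (no constant) on the spread, using
the usual asymptotic 5% critical value of about -1.95; the spread is AR(1)
with phi = 0.95 under the alternative; and the "merger" is a one-time level
shift of 10 halfway through the sample. With an estimated beta you would
need Engle-Granger critical values instead. The function names are
hypothetical.

```python
import math
import random

# Approximate asymptotic 5% critical value for the Dickey-Fuller t-test
# with no constant or trend (appropriate only when the spread is built
# from a known, not estimated, cointegrating vector).
DF_CRIT_5PCT = -1.95


def simulate_spread(n, phi, rng, shift=0.0):
    """AR(1) deviations around a level that jumps by `shift` at n // 2."""
    u, spread = 0.0, []
    for t in range(n):
        u = phi * u + rng.gauss(0.0, 1.0)
        spread.append(u + (shift if t >= n // 2 else 0.0))
    return spread


def df_tstat(s):
    """t-statistic on rho in the regression diff(s)[t] = rho * s[t-1] + e[t]."""
    lag = s[:-1]
    diff = [b - a for a, b in zip(s[:-1], s[1:])]
    sxx = sum(x * x for x in lag)
    rho = sum(x * d for x, d in zip(lag, diff)) / sxx
    sse = sum((d - rho * x) ** 2 for x, d in zip(lag, diff))
    se = math.sqrt(sse / (len(diff) - 1) / sxx)
    return rho / se


def rejection_rate(n, phi, reps=1000, shift=0.0, seed=1):
    """Share of simulated spreads for which the unit-root null is rejected."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        if df_tstat(simulate_spread(n, phi, rng, shift)) < DF_CRIT_5PCT:
            hits += 1
    return hits / reps


if __name__ == "__main__":
    print("   n   size(phi=1)  power(phi=.95)  power with break")
    for n in (50, 150, 400):
        size = rejection_rate(n, 1.0)                  # null: unit root
        power = rejection_rate(n, 0.95)                # stable relationship
        broken = rejection_rate(n, 0.95, shift=10.0)   # level shift mid-sample
        print(f"{n:4d}   {size:11.3f}  {power:14.3f}  {broken:16.3f}")
```

Rerunning with a different phi, shift size, or break date changes the
numbers, which is Eric's point that such results are not general. The last
column is only meant to show how rejection rates can move once the simulated
DGP stops matching the one the test assumes.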

Mark

On Tue, Jan 27, 2015 at 1:41 PM, Eric Zivot <ezivot at u.washington.edu> wrote:

> Some quick comments on this issue. From a statistical point of view, the
> phrase "how many data points are required for cointegration" is not well
> defined. Technically speaking, if two series are cointegrated then they are
> cointegrated for any number of observations. This issue is really about the
> size (probability of rejecting the null hypothesis when the null is true)
> and power (probability of rejecting the null when the alternative is true)
> of tests for cointegration for a given sample size.
>
> In intermediate statistics textbooks, the chapters on hypothesis testing
> usually have some discussion of the relationship between sample size and
> power. In simple toy examples you can work out the number of observations
> required to have power equal to some specified value (e.g. 0.90). In that
> case, if the alternative is true, you can say that you will reject the null
> at the 5% level with probability 0.90 if the sample size is n=75 (say).
> Unfortunately, this exercise is extremely difficult to do with tests for
> cointegration (usually the null is no cointegration, so rejecting the null
> is evidence of cointegration). Why? Well, there are only general asymptotic
> results (as the sample size goes to infinity) for tests for no cointegration
> (e.g. Engle-Granger two-step, Johansen rank tests). There are no general
> finite-sample results (for fixed sample sizes) for power functions. Hence,
> you cannot analytically compute a sample size that will give you a certain
> power.
>
> What to do? Well, you can try to set up some Monte Carlo experiments with
> fixed sample sizes to approximate power functions. The problem with this is
> that the results are not general. They will depend on the parameters used
> for the Monte Carlo setup (e.g. parameters for serial correlation,
> volatility, etc.). The best you can do is to try to carefully characterize
> the distributions of the series in question and try out some Monte Carlo
> experiments for these data. My guess is that you will have the best results
> when you use the class of tests that have been found to be optimal
> asymptotic tests (where the asymptotic power curve is tangent to the
> infeasible power curve of the optimal test at a set power). These tests
> have been developed by Graham Elliott at UCSD and Michael Jansson at UC
> Berkeley.
>
>
> -----Original Message-----
> From: R-SIG-Finance [mailto:r-sig-finance-bounces at r-project.org] On Behalf
> Of amol gupta
> Sent: Tuesday, January 27, 2015 10:09 AM
> To: Paul Teetor
> Cc: r-sig-finance at r-project.org
> Subject: Re: [R-SIG-Finance] Number of data points required for
> Cointegration
>
> Paul
>
> You say that the ADF test is not really stable. I agree. Other options to
> explore are:
>
>    - Use other unit-root and stationarity tests.
>    - Use other cointegration tests, such as the Johansen test.
>    - Run PCA, choose one of the lower-variance portfolios, and test it
>    for stationarity. (I need to understand PCA more.)
>
> I will take some time and test these ideas. Have you tried any of these?
> If yes, please share your experiences.
>
> Thank you for your insights.
>
> On Tue, Jan 27, 2015 at 5:29 AM, Paul Teetor <paulteetor at yahoo.com> wrote:
>
> > Amol,
> >
> > I don't have a formula or a guideline for determining the number of
> > data points. But I can share two experiences.
> >
> > First, when I traded mean-reverting spreads, I used 3 to 5 years of
> > daily data. That's 750 to 1,250 data points. Less data did not work
> > well for my spreads.
> >
> > Second, in my experience, the ADF test was quite unstable. That is, it
> > might fail to reject for a while, then start rejecting the null
> > hypothesis when the market showed some trending back to the mean. Then
> > it would fail to reject again as the market wandered away.
> >
> > Perhaps smarter people than I have had better luck trading with the
> > ADF, but for me, it did not provide a complete answer to the question
> > of mean-reversion.
> >
> > Paul
> >
> > Paul Teetor, Elgin, IL USA
> > http://quantdevel.com/public <http://quanttrader.info/public>
> >
> >   ------------------------------
> >  *From:* amol gupta <amolgupta87 at gmail.com>
> > *To:* "r-sig-finance at r-project.org" <r-sig-finance at r-project.org>
> > *Sent:* Friday, January 23, 2015 3:24 AM
> > *Subject:* [R-SIG-Finance] Number of data points required for
> > Cointegration
> >
> > Hi
> >
> > I need help in figuring out the length of historical data that I
> > should use. I took daily closing prices for two tickers from Yahoo
> > (200 days). I estimated the regression coefficient using PCA, using
> > 150 points for the PCA, and obtained a coefficient beta.
> >
> > To see whether the spread is mean-reverting, I use the ADF test. If I
> > use a 150-point spread, it comes out nonstationary. If I use 200
> > points of data, the outcome is stationary.
> >
> > I then used all 200 points to do the PCA and find the regression
> > coefficient. The spread comes out nonstationary. From all these
> > observations *I think* that this is not a stable relationship.
> >
> > So my questions are:
> >
> >    - Is there a way to decide the length of historical data to use?
> >    - Some relationships may be more stable than others. Is there a way to
> >    quantify that?
> >
> > Any other insight in this regard will be appreciated (time frame, pairs
> > vs. basket). I have attached the plot and the script that was used to
> > generate the plot.
> >
> >
> > --
> > Regards
> > Amol
> >
> > If all the seas were ink,
> > And all the reeds were pens,
> > And all the skies were parchment,
> > And all the men could write,
> > These would not suffice
> > To write down all the red tape
> > Of this Government.
> >
> >
> > _______________________________________________
> > R-SIG-Finance at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-sig-finance
> > -- Subscriber-posting only. If you want to post, subscribe first.
> > -- Also note that this is not the r-help list where general R
> > questions should go.
> >
> >
>
>
> --
> Regards
> Amol