[R-SIG-Finance] ljung-box tests in arma and garch models
Patrick Burns
patrick at burns-stat.com
Mon Dec 31 11:46:43 CET 2007
I thought I'd start off with some background for those who
don't know what we are talking about.
The Ljung-Box test in this context is used to see whether the fitted
model has captured all of the signal. So in hypothesis-testing
terms, we have things backwards -- we are satisfied when we
see large p-values rather than small ones.
The working paper referred to below shows that the Ljung-Box
test is fantastically robust to the data being non-Gaussian. However,
there is a practical setting in which it is not robust enough. That is
when testing whether a garch model has captured all of the variation in
variance by using the squared residuals (which will themselves be
long-tailed in practice).
One symptom is seeing p-values for the Ljung-Box test that are very
close to 1, such as .998. (This is essentially saying that the model has
overfit the data, but overfitting a couple thousand observations with a
handful of parameters is unlikely.)
A good remedy is to use the ranks of the squared residuals rather than
the actual squared residuals in the Ljung-Box test.
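A minimal sketch of the two versions in R, assuming 'z' holds the
standardised residuals from a garch fit (the fit itself is not shown):

## usual Ljung-Box test on the squared residuals at 15 lags
Box.test(z^2, lag = 15, type = "Ljung-Box")

## rank equivalent: replace the squared residuals by their ranks
Box.test(rank(z^2), lag = 15, type = "Ljung-Box")

The ranks keep a few huge squared residuals from dominating the
autocorrelations.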
This thread is really about the degrees of freedom to use when getting
the p-value from the test statistic. In the big picture I regard this
as rather unimportant -- it doesn't matter much if the p-value is 3.3%
or 3.4%. However, I do believe in doing things as well as possible.
The asymptotics seem to be saying to use 'm - g' degrees of freedom rather
than 'm'. Asymptotics are nice but the real question is what happens in
a finite sample with a long-tailed distribution.
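Box.test() in the stats package has (at least in recent versions) a
'fitdf' argument for exactly this adjustment, so the two choices are
easy to compare by simulation. A rough sketch of such an experiment
(purely illustrative, not from the working paper), fitting an AR(1) to
data with t(4) innovations to mimic long tails:

set.seed(1)
m <- 15                                  # lags in the test
pvals <- replicate(1000, {
    x <- arima.sim(list(ar = 0.5), n = 500,
                   rand.gen = function(n, ...) rt(n, df = 4))
    fit <- arima(x, order = c(1, 0, 0))
    r <- residuals(fit)
    c(df.m  = Box.test(r, lag = m, type = "Ljung-Box")$p.value,
      df.mg = Box.test(r, lag = m, type = "Ljung-Box",
                       fitdf = 1)$p.value)
})
## with the right degrees of freedom the rejection rate at the 5%
## level should be close to 5%
apply(pvals, 1, function(p) mean(p < 0.05))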
Spencer, no I didn't look at degrees of freedom when I was doing the
simulations for the paper.
Pat
Spencer Graves wrote:
> Hi, Michal and Patrick:
>
> PATRICK:
> In your 2002 paper on the "Robustness of the Ljung-Box Test and
> its Rank Equivalent"
> (http://www.burns-stat.com/pages/Working/ljungbox.pdf), do you
> consider using m-g degrees of freedom, where m = number of lags and g
> = number of parameters estimated (ignoring an intercept)? I didn't
> read every word, but I only saw you using 'm' degrees of freedom, and
> I did not notice a comment on this issue.
> Your Exhibit 3 (p. 7) presents a histogram of the "Distribution
> of the 50-lag Ljung-Box p-value under the Gaussian distribution with
> 100 observations". It looks to me like a Beta(a, b) distribution,
> with a < b < 1 but with both a and b fairly close to 1. The excess of
> p-values in the lower tail suggests to me that the real degrees of
> freedom for a reference chi-square should in this case be slightly
> greater than 50. Your Exhibit 10 shows a comparable histogram for the
> "Distribution of the Ljung-Box 15 lag p-value for the square of a t
> with 4 degrees of freedom with 10,000 observations." This looks to me
> like a Beta(a, b) distribution with b < a < 1 but with many fewer
> p-values near 0 than near 1. This in turn suggests to me that the
> degrees of freedom of the reference chi-square test would be less than
> 15 in this case. Apart from this question, your power curves,
> Exhibits 14-22 provide rather persuasive support for your recommended
> use of the rank equivalent to the traditional Ljung-Box.
>
> MICHAL:
> Thanks very much for your further comments on this. The standard
> asymptotic theory would support Enders' and Tsay's usage of m-g
> degrees of freedom, with m = number of lags and g = number of
> parameters estimated, apart from an intercept -- PROVIDED the
> parameters were estimated by minimizing the Ljung-Box statistic.
> However, the parameters are typically estimated to maximize a
> likelihood. The effect of this would likely be to understate the
> p-value, which we generally want to avoid.
> However, we never actually use these statistics with infinite sample
> sizes and degrees of freedom. The asymptotic theory is therefore
> only a guideline, preferably with some adjustment for finite sample
> sizes and degrees of freedom. It is thus wise to evaluate the
> adequacy of the asymptotics with appropriate simulations. These may
> have been done; I have not researched the literature on this, apart
> from Burns (2002). If anyone knows of other relevant simulations, I'd
> like to hear about them.
>
> By the way, Tsay's second edition (2005, p. 44) includes a
> similar comment: "For an AR(p) model, the Ljung-Box statistic Q(m)
> follows asymptotically a chi-square distribution with m-g degrees of
> freedom, where g denotes the number of AR coefficients used in the
> model." This is similar to but different from your quote from the
> first edition.
>
> Best Wishes,
> Spencer Graves
>
> michal miklovic wrote:
>
>> Hi,
>>
>> First, I would like to thank Patrick and Spencer for their comments
>> and suggestions.
>>
>> Second, I did a literature search on the computation of degrees of
>> freedom for the Ljung-Box Q-statistic when testing residuals from an
>> arma model. I do not mean an optimum number of lags for the ACF or
>> the LB Q-statistic; rather, I tried to find an answer to the question:
>> how do I determine the degrees of freedom for a given LB Q-statistic
>> from an arma(p,q) model?
>> Enders states the following in Applied Econometric Time Series (2nd
>> edition, 2004, Wiley & Sons) on pp. 68 - 69: "The Box-Pierce and
>> Ljung-Box Q-statistics also serve as a check to see if the residuals
>> from an estimated arma(p,q) model behave as a white noise process.
>> However, when the s correlations from an estimated arma(p,q) model
>> are formed, the degrees of freedom are reduced by the number of
>> estimated coefficients. Hence, using the residuals of an arma(p,q)
>> model, Q has a chi-squared [distribution] with s - p - q degrees of
>> freedom."
>> Tsay states the following in Analysis of Financial Time Series (1st
>> edition, 2002, Wiley & Sons) on p. 52: "The Ljung-Box statistics of
>> the residuals can be used to check the adequacy of a fitted model. If
>> the model is correctly specified, then Q(m) follows asymptotically a
>> chi-squared distribution with m - g degrees of freedom, where g
>> denotes the number of parameters used in the model."
>>
>> The two quotations above are in line with Spencer's and my
>> opinions. Considering what the books say, I would suggest that the
>> computation of the degrees of freedom and, consequently, p-values
>> could be altered in the next release of fArma and fGarch.
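>> In the meantime the adjustment is easy to do by hand. A sketch,
>> assuming 'r' holds the residuals of a fitted arma(1,1) tested at 10
>> lags (so p + q = 2):
>>
>> ## Box.test() in the stats package has (in recent versions) a 'fitdf'
>> ## argument, so the s - p - q degrees of freedom can be requested directly
>> Box.test(r, lag = 10, type = "Ljung-Box", fitdf = 2)
>>
>> ## equivalently, starting from the unadjusted Q-statistic
>> Q <- Box.test(r, lag = 10, type = "Ljung-Box")$statistic
>> pchisq(Q, df = 10 - 2, lower.tail = FALSE)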
>>
>> I did not find any exact formulations concerning the computation of
>> degrees of freedom for the LB Q-statistics when testing squared
>> standardised residuals from an estimated garch model.
>>
>> Best regards
>>
>> Michal Miklovic
>>
>>
>>
>> ----- Original Message ----
>> From: Patrick Burns <patrick at burns-stat.com>
>> To: Spencer Graves <spencer.graves at pdf.com>
>> Cc: michal miklovic <mmiklovic at yahoo.com>;
>> r-sig-finance at stat.math.ethz.ch
>> Sent: Friday, December 28, 2007 11:21:33 AM
>> Subject: Re: [R-SIG-Finance] ljung-box tests in arma and garch models
>>
>> I heartily agree with Spencer that a simulation is the
>> way to answer the question. However, my intuition is
>> the opposite of Spencer's regarding what the answer
>> will be.
>>
>> The Burns Statistics working paper on Ljung-Box tests
>> makes it clear that using rank tests for testing the garch
>> adequacy will be much more important than messing with
>> the degrees of freedom.
>>
>>
>> Patrick Burns
>> patrick at burns-stat.com
>> +44 (0)20 8525 0696
>> http://www.burns-stat.com
>> (home of S Poetry and "A Guide for the Unwilling S User")
>>
>> Spencer Graves wrote:
>>
>> >Dear Michal:
>> >
>> > The best way to check something like this is to do a simulation,
>> >tailored to your application. If you do such, I'd like to hear the
>> >results.
>> >
>> > Absent that, my gut reaction is to agree with you. The chi-square
>> >distribution with k degrees of freedom is defined as the distribution
>> >of the sum of squares of k independent N(0, 1) variates
>> >(http://en.wikipedia.org/wiki/Chi-square_distribution). In 1900, Karl
>> >Pearson published "On the criterion that a given system of deviations
>> >from the probable in the case of a correlated system of variables is
>> >such that it can be reasonably supposed to have arisen from random
>> >sampling", Philosophical magazine, t.50
>> >(http://fr.wikipedia.org/wiki/Karl_Pearson). In this test, Pearson
>> >assumed that the sums of squares of k N(0, 1) variates, independent or
>> >not, would follow a chi-square(k). R. A. Fisher determined that the
>> >number of degrees of freedom should be reduced by the number of
>> >parameters estimated
>> >(http://www.mrs.umn.edu/~sungurea/introstat/history/w98/RAFisher.html).
>> >This led to a feud that continued after Pearson died.
>> >
>> > The "Box-Pierce" and "Ljung-Box" tests are both available in
>> >'Box.test{stats}' and discussed in Tsay (2005) "Analysis of Financial
>> >Time Series" (Wiley, p. 27), which includes a comment that "Simulation
>> >studies suggest that the choice of" the number of lags included in the
>> >Ljung-Box statistic should be roughly log(number of observations) for
>> >"better power performance."
>> >
>> > Based on this, the "FinTS" package includes a function "ARIMA"
>> >that calls "arima", computes Box.test on the residuals and adjusts the
>> >number of degrees of freedom to match the examples in Tsay (2005). I
>> >haven't looked at this in depth, but it would seem to conform with
>> >Eviews, etc., and not with fArma, etc., as you mentioned.
>> >
>> > I haven't done a substantive literature search on this, but if
>> >anyone has evidence bearing on this issue beyond the original Ljung-Box
>> >paper, I'd like to know.
>> >
>> > Hope this helps.
>> > Spencer Graves
>> >
>> >michal miklovic wrote:
>> > >
>> >> Hi,
>> >>
>> >>I would like to ask/clarify how degrees of freedom (and
>> p-values) for the Ljung-Box Q-statistics in arma and garch models should
>> be computed. The reason for the question is that I have encountered two
>> different approaches. Let us say we have an arma(p,q) garch(m,n)
>> model. The two approaches are as follows:
>> >>
>> >>1) In R and fArma and fGarch packages, the arma and garch orders
>> are disregarded in the computation of degrees of freedom for the
>> Ljung-Box (LB) Q-statistics. In other words, regardless of p, q, m
>> and n, the LB Q-statistic computed from the first x autocorrelations
>> of (squared) standardised residuals has x degrees of freedom. Given
>> the statistic and degrees of freedom, the corresponding p-value is
>> computed.
>> >>
>> >>2) In EViews, TSP and other statistical software, the LB
>> Q-statistic computed from the first x autocorrelations of
>> standardised residuals has (x - (p+q)) degrees of freedom. Degrees of
>> freedom and p-values are not computed for the first (p+q) LB
>> Q-statistics. A similar method is applied to squared standardised
>> residuals: the LB Q-statistic computed from the first x autocorrelations
>> >>of squared standardised residuals has (x - (m+n)) degrees of freedom.
>> >>Degrees of freedom and p-values are not computed for the first
>> (m+n) LB
>> >>Q-statistics.
>> >>
>> >>I think the second approach is better because the first (p+q)
>> orders in standardised residuals and the first (m+n) orders in
>> squared standardised residuals should not exhibit any pattern and
>> higher orders should be checked for any remaining arma and garch
>> structures. Am I right or wrong?
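>> >>(As a concrete sketch of the second approach for the garch part:
>> with 'z' the standardised residuals of a garch(1,1) fit, so m + n = 2,
>> the test at 10 lags would be
>>
>> Box.test(z^2, lag = 10, type = "Ljung-Box", fitdf = 2)
>>
>> which uses 10 - 2 = 8 degrees of freedom.)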
>> >>
>> >>Thanks for answers and suggestions.
>> >>
>> >>Best regards
>> >>
>> >>Michal Miklovic