[R-SIG-Finance] ljung-box tests in arma and garch models
Patrick Burns
patrick at burns-stat.com
Mon Dec 31 11:46:43 CET 2007
I thought I'd start off with some background for those who
don't know what we are talking about.
The Ljung-Box test in this context is used to see whether the fitted
model has captured all of the signal. So in hypothesis-testing
terms, we have things backwards -- we are satisfied when we
see large p-values rather than small ones.
The working paper referred to below shows that the Ljung-Box
test is fantastically robust to the data being non-Gaussian. However,
there is a practical setting in which it is not robust enough. That is
when testing whether a garch model has captured all of the variation in
variance by using the squared residuals (which will themselves be
long-tailed in practice).
One symptom is seeing p-values for the Ljung-Box test that are very
close to 1, such as .998. (This is essentially saying that the model has
overfit the data, but overfitting a couple thousand observations with a
handful of parameters is unlikely.)
A good remedy is to use the ranks of the squared residuals rather than
the actual squared residuals in the Ljung-Box test.
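A minimal sketch of the two versions in R, assuming 'z' holds the
standardised residuals from a garch fit (the fit itself is not shown):

## usual Ljung-Box test on the squared residuals at 15 lags
Box.test(z^2, lag = 15, type = "Ljung-Box")

## rank equivalent: replace the squared residuals by their ranks
Box.test(rank(z^2), lag = 15, type = "Ljung-Box")

The ranks keep a few huge squared residuals from dominating the
autocorrelations.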
This thread is really about the degrees of freedom to use when getting
the p-value from the test statistic. In the big picture I regard this
as rather unimportant -- it doesn't matter much if the p-value is 3.3%
or 3.4%. However, I do believe in doing things as well as possible.
The asymptotics seem to be saying to use 'm - g' degrees of freedom rather
than 'm'. Asymptotics are nice but the real question is what happens in
a finite sample with a long-tailed distribution.
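Box.test() in the stats package has (at least in recent versions) a
'fitdf' argument for exactly this adjustment, so the two choices are
easy to compare by simulation. A rough sketch of such an experiment
(purely illustrative, not from the working paper), fitting an AR(1) to
data with t(4) innovations to mimic long tails:

set.seed(1)
m <- 15                                  # lags in the test
pvals <- replicate(1000, {
    x <- arima.sim(list(ar = 0.5), n = 500,
                   rand.gen = function(n, ...) rt(n, df = 4))
    fit <- arima(x, order = c(1, 0, 0))
    r <- residuals(fit)
    c(df.m  = Box.test(r, lag = m, type = "Ljung-Box")$p.value,
      df.mg = Box.test(r, lag = m, type = "Ljung-Box",
                       fitdf = 1)$p.value)
})
## with the right degrees of freedom the rejection rate at the 5%
## level should be close to 5%
apply(pvals, 1, function(p) mean(p < 0.05))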
Spencer, no I didn't look at degrees of freedom when I was doing the
simulations for the paper.
Pat
Spencer Graves wrote:
> Hi, Michal and Patrick:
>
> PATRICK:
> In your 2002 paper on the "Robustness of the Ljung-Box Test and
> its Rank Equivalent"
> (http://www.burns-stat.com/pages/Working/ljungbox.pdf), do you
> consider using m-g degrees of freedom, where m = number of lags and g
> = number of parameters estimated (ignoring an intercept)? I didn't
> read every word, but I only saw you using 'm' degrees of freedom, and
> I did not notice a comment on this issue.
> Your Exhibit 3 (p. 7) presents a histogram of the "Distribution
> of the 50-lag Ljung-Box p-value under the Gaussian distribution with
> 100 observations". It looks to me like a Beta(a, b) distribution,
> with a < b < 1 but with both a and b fairly close to 1. The excess of
> p-values in the lower tail suggests to me that the real degrees of
> freedom for a reference chi-square should in this case be slightly
> greater than 50. Your Exhibit 10 shows a comparable histogram for the
> "Distribution of the Ljung-Box 15 lag p-value for the square of a t
> with 4 degrees of freedom with 10,000 observations." This looks to me
> like a Beta(a, b) distribution with b < a < 1 but with many fewer
> p-values near 0 than near 1. This in turn suggests to me that the
> degrees of freedom of the reference chi-square test would be less than
> 15 in this case. Apart from this question, your power curves,
> Exhibits 14-22 provide rather persuasive support for your recommended
> use of the rank equivalent to the traditional Ljung-Box.
>
> MICHAL:
> Thanks very much for your further comments on this. The standard
> asymptotic theory would support Enders' and Tsay's usage of m-g
> degrees of freedom, with m = number of lags and g = number of
> parameters estimated, apart from an intercept -- PROVIDED the
> parameters were estimated by minimizing the Ljung-Box statistic.
> However, the parameters are typically estimated to maximize a
> likelihood. The effect of this would likely be to understate the
> p-value, which we generally want to avoid.
> However, we never actually use these statistics with infinite sample
> sizes and degrees of freedom. The asymptotic theory is therefore
> only a guideline, preferably with some adjustment for finite sample
> sizes and degrees of freedom. It is thus wise to evaluate the
> adequacy of the asymptotics with appropriate simulations. These may
> have been done; I have not researched the literature on this, apart
> from Burns (2002). If anyone knows of other relevant simulations, I'd
> like to hear about them.
>
> By the way, Tsay's second edition (2005, p. 44) includes a
> similar comment: "For an AR(p) model, the Ljung-Box statistic Q(m)
> follows asymptotically a chi-square distribution with m-g degrees of
> freedom, where g denotes the number of AR coefficients used in the
> model." This is similar to but different from your quote from the
> first edition.
>
> Best Wishes,
> Spencer Graves
>
> michal miklovic wrote:
>
>> Hi,
>>
>> First, I would like to thank Patrick and Spencer for their comments
>> and suggestions.
>>
>> Second, I did a literature search on the computation of degrees of
>> freedom for the Ljung-Box Q-statistic when testing residuals from an
>> arma model. I do not mean an optimum number of lags for the ACF or
>> the LB Q-statistic; rather, I tried to find an answer to the question:
>> how do I determine the degrees of freedom for a given LB Q-statistic
>> from an arma(p,q) model?
>> Enders states the following in Applied Econometric Time Series (2nd
>> edition, 2004, Wiley & Sons) on pp. 68 - 69: "The Box-Pierce and
>> Ljung-Box Q-statistics also serve as a check to see if the residuals
>> from an estimated arma(p,q) model behave as a white noise process.
>> However, when the s correlations from an estimated arma(p,q) model
>> are formed, the degrees of freedom are reduced by the number of
>> estimated coefficients. Hence, using the residuals of an arma(p,q)
>> model, Q has a chi-squared [distribution] with s - p - q degrees of
>> freedom."
>> Tsay states the following in Analysis of Financial Time Series (1st
>> edition, 2002, Wiley & Sons) on p. 52: "The Ljung-Box statistics of
>> the residuals can be used to check the adequacy of a fitted model. If
>> the model is correctly specified, then Q(m) follows asymptotically a
>> chi-squared distribution with m - g degrees of freedom, where g
>> denotes the number of parameters used in the model."
>>
>> The two quotations above are in line with Spencer's and my
>> opinions. Considering what the books say, I would suggest that the
>> computation of the degrees of freedom and, consequently, p-values
>> could be altered in the next release of fArma and fGarch.
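>> In the meantime the adjustment is easy to do by hand. A sketch,
>> assuming 'r' holds the residuals of a fitted arma(1,1) tested at 10
>> lags (so p + q = 2):
>>
>> ## Box.test() in the stats package has (in recent versions) a 'fitdf'
>> ## argument, so the s - p - q degrees of freedom can be requested directly
>> Box.test(r, lag = 10, type = "Ljung-Box", fitdf = 2)
>>
>> ## equivalently, starting from the unadjusted Q-statistic
>> Q <- Box.test(r, lag = 10, type = "Ljung-Box")$statistic
>> pchisq(Q, df = 10 - 2, lower.tail = FALSE)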
>>
>> I did not find any exact formulations concerning the computation of
>> degrees of freedom for the LB Q-statistics when testing squared
>> standardised residuals from an estimated garch model.
>>
>> Best regards
>>
>> Michal Miklovic
>>
>>
>>
>> ----- Original Message ----
>> From: Patrick Burns <patrick at burns-stat.com>
>> To: Spencer Graves <spencer.graves at pdf.com>
>> Cc: michal miklovic <mmiklovic at yahoo.com>;
>> r-sig-finance at stat.math.ethz.ch
>> Sent: Friday, December 28, 2007 11:21:33 AM
>> Subject: Re: [R-SIG-Finance] ljung-box tests in arma and garch models
>>
>> I heartily agree with Spencer that a simulation is the
>> way to answer the question. However, my intuition is
>> the opposite of Spencer's regarding what the answer
>> will be.
>>
>> The Burns Statistics working paper on Ljung-Box tests
>> makes it clear that using rank tests for testing the garch
>> adequacy will be much more important than messing with
>> the degrees of freedom.
>>
>>
>> Patrick Burns
>> patrick at burns-stat.com
>> +44 (0)20 8525 0696
>> http://www.burns-stat.com
>> (home of S Poetry and "A Guide for the Unwilling S User")
>>
>> Spencer Graves wrote:
>>
>> >Dear Michal:
>> >
>> > The best way to check something like this is to do a simulation,
>> >tailored to your application. If you do such, I'd like to hear the
>> >results.
>> >
>> > Absent that, my gut reaction is to agree with you. The chi-square
>> >distribution with k degrees of freedom is defined as the distribution
>> >of the sum of squares of k independent N(0, 1) variates
>> >(http://en.wikipedia.org/wiki/Chi-square_distribution). In 1900, Karl
>> >Pearson published "On the criterion that a given system of deviations
>> >from the probable in the case of a correlated system of variables is
>> >such that it can be reasonably supposed to have arisen from random
>> >sampling", Philosophical magazine, t.50
>> >(http://fr.wikipedia.org/wiki/Karl_Pearson). In this test, Pearson
>> >assumed that the sums of squares of k N(0, 1) variates, independent or
>> >not, would follow a chi-square(k). R. A. Fisher determined that the
>> >number of degrees of freedom should be reduced by the number of
>> >parameters estimated
>> >(http://www.mrs.umn.edu/~sungurea/introstat/history/w98/RAFisher.html).
>> >This led to a feud that continued after Pearson died.
>> >
>> > The "Box-Pierce" and "Ljung-Box" tests are both available in
>> >'Box.test{stats}' and discussed in Tsay (2005) "Analysis of Financial
>> >Time Series" (Wiley, p. 27), which includes a comment that "Simulation
>> >studies suggest that the choice of" the number of lags included in the
>> >Ljung-Box statistic should be roughly log(number of observations) for
>> >"better power performance."
>> >
>> > Based on this, the "FinTS" package includes a function "ARIMA"
>> >that calls "arima", computes Box.test on the residuals and adjusts the
>> >number of degrees of freedom to match the examples in Tsay (2005). I
>> >haven't looked at this in depth, but it would seem to conform with
>> >Eviews, etc., and not with fArma, etc., as you mentioned.
>> >
>> > I haven't done a substantive literature search on this, but if
>> >anyone has evidence bearing on this issue beyond the original Ljung-Box
>> >paper, I'd like to know.
>> >
>> > Hope this helps.
>> > Spencer Graves
>> >
>> >michal miklovic wrote:
>> > >
>> >> Hi,
>> >>
>> >>I would like to ask/clarify how degrees of freedom (and
>> p-values) for the Ljung-Box Q-statistics in arma and garch models should
>> be computed. The reason for the question is that I have encountered two
>> different approaches. Let us say we have an arma(p,q) garch(m,n)
>> model. The two approaches are as follows:
>> >>
>> >>1) In R and fArma and fGarch packages, the arma and garch orders
>> are disregarded in the computation of degrees of freedom for the
>> Ljung-Box (LB) Q-statistics. In other words, regardless of p, q, m
>> and n, the LB Q-statistic computed from the first x autocorrelations
>> of (squared) standardised residuals has x degrees of freedom. Given
>> the statistic and degrees of freedom, the corresponding p-value is
>> computed.
>> >>
>> >>2) In EViews, TSP and other statistical software, the LB
>> Q-statistic computed from the first x autocorrelations of
>> standardised residuals has (x - (p+q)) degrees of freedom. Degrees of
>> freedom and p-values are not computed for the first (p+q) LB
>> Q-statistics. A similar method is applied to squared standardised
>> residuals: the LB Q-statistic computed from the first x autocorrelations
>> >>of squared standardised residuals has (x - (m+n)) degrees of freedom.
>> >>Degrees of freedom and p-values are not computed for the first
>> (m+n) LB
>> >>Q-statistics.
>> >>
>> >>I think the second approach is better because the first (p+q)
>> orders in standardised residuals and the first (m+n) orders in
>> squared standardised residuals should not exhibit any pattern and
>> higher orders should be checked for any remaining arma and garch
>> structures. Am I right or wrong?
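>> >>(As a concrete sketch of the second approach for the garch part:
>> with 'z' the standardised residuals of a garch(1,1) fit, so m + n = 2,
>> the test at 10 lags would be
>>
>> Box.test(z^2, lag = 10, type = "Ljung-Box", fitdf = 2)
>>
>> which uses 10 - 2 = 8 degrees of freedom.)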
>> >>
>> >>Thanks for answers and suggestions.
>> >>
>> >>Best regards
>> >>
>> >>Michal Miklovic