[R] How does ccf() really work?
spencer.graves at pdf.com
Fri Apr 21 17:35:50 CEST 2006
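The standard estimate of cross correlation uses the same denominator
for all lags and ignores the reduction in the number of observations
as the lag grows: the sum of cross-products at each lag is divided by
one quantity computed from the full series. Consider the following: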
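# a and b are the two series from your example
a <- c(1.4429135, 0.8470067, 1.2263730, -1.8159190, -0.6997260)
b <- c(-0.4227674, 0.8602645, -0.6810602, -1.4858726, -0.7008563)
a. <- a-mean(a)   # centered series
b. <- b-mean(b)
SSa <- sum(a.^2)
SSb <- sum(b.^2)
SaSb <- sqrt(SSa*SSb)   # common denominator for every lag
sum(a.*b.)/SaSb
# 0.618 = cor(a, b)
sum(a.[-1]*b.[-5])/SaSb
# 0.568 = cc lag 1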
These numbers match the results you reported below. If I'm not
mistaken, this also matches the definition of the cross correlation
function in the original Box and Jenkins book [or the more recent Box,
Jenkins, Reinsel], "Time Series Analysis, Forecasting and Control". The
rationale, as I recall, is to reduce the false alarm rate by biasing
estimates with larger lags toward zero, thereby compensating slightly
for their increased random variability.
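
To make the calculation concrete for every lag, here is a minimal
sketch that repeats it for lags -4 to 4 and compares the result with
ccf(). The helper name manual_ccf is hypothetical (it is not part of
R), and the data are the two series from the quoted example. The
sketch assumes the convention documented for ccf(x, y), in which lag k
pairs x[t+k] with y[t].

```r
# Data from the quoted example
a <- c(1.4429135, 0.8470067, 1.2263730, -1.8159190, -0.6997260)
b <- c(-0.4227674, 0.8602645, -0.6810602, -1.4858726, -0.7008563)

# Hypothetical helper (not part of R): cross correlation at lag k,
# pairing x[t + k] with y[t]. The denominator uses the sums of squares
# of the FULL centered series, so it is the same for every lag.
manual_ccf <- function(x, y, k) {
  n  <- length(x)
  xc <- x - mean(x)
  yc <- y - mean(y)
  t  <- seq_len(n)
  ok <- (t + k >= 1) & (t + k <= n)   # keep pairs inside the series
  sum(xc[t[ok] + k] * yc[t[ok]]) / sqrt(sum(xc^2) * sum(yc^2))
}

lags <- -4:4
by_hand <- sapply(lags, function(k) manual_ccf(a, b, k))
names(by_hand) <- lags
print(round(by_hand, 3))

# Compare with the built-in estimate (lag.max set explicitly)
builtin <- drop(ccf(a, b, lag.max = 4, plot = FALSE)$acf)
print(all.equal(unname(by_hand), builtin))

# For contrast: an ordinary correlation over only the overlapping
# pairs at lag 3. With two pairs it can only be +1 or -1.
pairwise_lag3 <- cor(a[4:5], b[1:2])
print(pairwise_lag3)
```

The rounded by_hand values should reproduce the table in the quoted
message. The last line shows what a pairwise correlation on the
overlapping observations would give instead, which is not what ccf()
computes.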
Hope this helps.
p.s. Thanks for including such a simple, self-contained example. Posts
that don't include examples like this are typically much more difficult
to understand, which in turn increases the chances that a response will
not help the questioner.
Robert Lundqvist wrote:
> I can't understand the results from cross-correlation function ccf()
> even though it should be simple.
> Here's my short example:
>  1.4429135 0.8470067 1.2263730 -1.8159190 -0.6997260
>  -0.4227674 0.8602645 -0.6810602 -1.4858726 -0.7008563
> Autocorrelations of series 'X', by lag
>     -4     -3     -2     -1      0      1      2      3      4
> -0.056 -0.289 -0.232  0.199  0.618  0.568 -0.517 -0.280 -0.012
> With lag 4 and vectors of length 5 there should as far as I can see
> only be 2 pairs of observations. The correlation would then be 1.
> Guess I am missing something really simple here. Anyone who could
> explain what is happening?
> R-help at stat.math.ethz.ch mailing list
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html