[Rd] Bug in acf function?
steve at squaregoldfish.co.uk
Tue Jul 13 14:05:22 CEST 2010
I'm using R-2.9.1, so forgive me if this has already been resolved.
One element of the object returned from the acf function is n.used,
described in the man page as "The number of observations in the time
series."
However, I've noticed that this value is set to nrow(x) via the sampleT
variable, i.e. the number of rows in the passed-in series. This forces
an assumption that all rows of the series contain values.
Since it's possible to calculate an acf from an incomplete series by
passing 'na.action=na.pass', I would suggest that the value of n.used in
this instance should be set to 'sum(!is.na(series))'.
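
A minimal sketch of what I mean, using a made-up series (the data and the
object names x and res are purely illustrative):

```r
# Toy series of 100 values with 12 of them set to NA (illustrative data only)
set.seed(1)
x <- rnorm(100)
x[c(5, 20:29, 60)] <- NA

# na.pass lets acf() work on the incomplete series; plot = FALSE skips the plot
res <- acf(x, na.action = na.pass, plot = FALSE)

n_reported <- res$n.used       # what acf() reports as the number of observations
n_valid    <- sum(!is.na(x))   # number of non-missing values, as suggested above

print(c(reported = n_reported, valid = n_valid))
```

If the behaviour is as I describe, the two numbers differ by the number of
missing values.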
This has knock-on effects too: the plot produced by the acf function
includes a horizontal line showing the threshold of statistical
significance, which is dependent on the number of measurements in the
series:

    qnorm((1 + 0.95)/2)/sqrt(corr$n.used)

For a given set of time series of fixed length, the threshold is
therefore constant regardless of the number of valid measurements in the
series, which I believe to be incorrect.
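
To show the size of the effect, here is the same expression evaluated for a
hypothetical series of 100 rows of which only 88 hold valid measurements
(both counts are assumptions chosen for illustration):

```r
ci      <- 0.95   # confidence level used in the expression above
n_rows  <- 100    # assumed series length, missing values included
n_valid <- 88     # assumed number of non-missing measurements

# Threshold based on the row count
thr_rows  <- qnorm((1 + ci) / 2) / sqrt(n_rows)

# Threshold based on the count of valid measurements, as I am suggesting
thr_valid <- qnorm((1 + ci) / 2) / sqrt(n_valid)

print(c(rows = thr_rows, valid = thr_valid))
```

The threshold based on the valid count is the larger of the two, so a line
drawn from the row count would flag more correlations as significant.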
As a side note, I also think that this significance threshold should be
returned as part of the output of the acf function - as it stands, the
value is shown in a plot but there's no way to actually get the value.
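
In the meantime the value can be recomputed by hand. A rough sketch follows;
acf_threshold is a hypothetical helper of my own, not part of R, and it
simply re-applies the expression quoted above, with an optional override for
the number of observations:

```r
# Hypothetical helper: recompute the significance threshold from an acf object.
# 'n' defaults to the n.used element but can be overridden, e.g. with the
# number of non-missing observations.
acf_threshold <- function(obj, ci = 0.95, n = obj$n.used) {
  qnorm((1 + ci) / 2) / sqrt(n)
}

# Example with the 'lh' series from the datasets package (48 observations)
res <- acf(lh, plot = FALSE)
thr <- acf_threshold(res)
print(thr)
```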