[R-sig-ME] time series with autcorrelated errors - matrix not invertible

Sat Jul 30 00:12:45 CEST 2016

  I am analyzing time series of a statistic (coefficient of variation or 
CV) generated by a Monte Carlo simulation of population dynamics. A 
given simulation might generate thousands of such time series.

For any given time series, the sequential estimates of CV are correlated 
with one another (because they are based on partially overlapping sets 
of the population numbers). Accordingly, I am using gls models with 
autocorrelated errors to analyze each time series.

Although the stochastic process generating all sample paths is the same, 
any given sample path might consist of a time series of CV for which 
AR(0), AR(1), AR(2), etc. generates the smallest AIC value, as produced 
by ar() (see below).

In practice, I have seen time series for which AR(0), AR(1), AR(2) 
generate the smallest AIC value and so always just using a specific 
order (e.g., AR(1)) does not seem appropriate. This is especially so 
because I am developing methods that would apply to time series for 
which the generating dynamics are not known.

For any given sample path y, the code is

#assess order of AR model using ar. use order with smallest AIC
arorder <- ar(moving_statistics$CV, order.max = 5)$order

#evaluate model with constant and time
if (arorder < 1) {
summary(int_model <- lm(CV ~ 1, data = moving_statistics))
summary(time_model <- lm(CV ~ 1 + time, data = moving_statistics))

#extract statistics for sample path y
pvalue_cv$pvalue[y] <- anova(int_model,time_model)$Pr
pvalue_cv$slope_time[y] <- time_model$coefficients[2]
} else {
summary(int_model <- gls(CV ~ 1, data = moving_statistics, correlation = 
corARMA(p=arorder), method="ML"))
summary(time_model <- gls(CV ~ 1 + time, data = moving_statistics, 
correlation = corARMA(p=arorder), method="ML"))
#extract statistics for sample path y
pvalue_cv$pvalue[y] <- anova(int_model,time_model)$p[2]
pvalue_cv$slope_time[y] <- time_model$coefficients[2]
}

When I run the simulation, there are some sample paths for which AR(2) 
has the lowest AIC and so this is the model called, as in

summary(time_model <- gls(CV ~ 1 + time, data = moving_statistics, 
correlation = corARMA(p=2), method="ML"))

This call appears to always generate this error message

Error in `coef<-.corARMA`(`*tmp*`, value = value[parMap[, i]]) :
   Coefficient matrix not invertible

Here are my questions:

1. how does ar generate an AIC value for p =2 even though the 
coefficient matrix is not invertible and so the model fit fails?

2. Is there a way to actually fit such a model, say, by adjusting 
tolerances?

3. a related thread 
(http://stackoverflow.com/questions/6706143/error-with-gls-function-in-nlme-package-in-r)

indicates that there are too many parameters being estimated given the 
length of the time series.

Is there a numerical criterion that can be queried before the gls call 
so that I can test for invertibility and fit, say, a lower order model 
(e.g., AR(1)), so as to avoid an error?

I cannot figure out from the code for gls what actually generates the 
error message. A pointer to the specific numerical criterion that 
generates this error would be very much appreciated.

In the stackoverflow thread, this is mentioned:

The general rule of thumb is that you should have at least 10 times as 
many data points as parameters, and that's for standard fixed 
effect/regression parameters. (Generally variance structure parameters 
such as AR parameters are even a bit harder/require a bit more data than 
regression parameters to estimate.)

Where in the literature is this rule discussed?

4. If the only option is to fit a lower order model of error, say, 
AR(1), instead of AR(2), what kind of bias does this generate in the 
analysis?

Many thanks in advance,

S.

-- 
Steven Orzack
Fresh Pond Research Institute
173 Harvey Street
Cambridge, MA 02140
617 864-4307

www.freshpond.org