# [R-sig-ME] time series with autcorrelated errors - matrix not invertible

Steven Orzack orzack at freshpond.org
Sat Jul 30 00:12:45 CEST 2016

```  I am analyzing time series of a statistic (coefficient of variation or
CV) generated by a Monte Carlo simulation of population dynamics. A
given simulation might generate thousands of such time series.

For any given time series, the sequential estimates of CV are correlated
with one another (because they are based on partially overlapping sets
of the population numbers). Accordingly, I am using gls models with
autocorrelated errors to analyze each time series.

Although the stochastic process generating all sample paths is the same,
any given sample path might consist of a time series of CV for which
AR(0), AR(1), AR(2), etc. generates the smallest AIC value, as produced
by ar() (see below).

In practice, I have seen time series for which AR(0), AR(1), AR(2)
generate the smallest AIC value and so always just using a specific
order (e.g., AR(1)) does not seem appropriate. This is especially so
because I am developing methods that would apply to time series for
which the generating dynamics are not known.

For any given sample path y, the code is

#assess order of AR model using ar. use order with smallest AIC
arorder <- ar(moving_statistics\$CV, order.max = 5)\$order

#evaluate model with constant and time
if (arorder < 1) {
summary(int_model <- lm(CV ~ 1, data = moving_statistics))
summary(time_model <- lm(CV ~ 1 + time, data = moving_statistics))

#extract statistics for sample path y
pvalue_cv\$pvalue[y] <- anova(int_model,time_model)\$Pr
pvalue_cv\$slope_time[y] <- time_model\$coefficients
} else {
summary(int_model <- gls(CV ~ 1, data = moving_statistics, correlation =
corARMA(p=arorder), method="ML"))
summary(time_model <- gls(CV ~ 1 + time, data = moving_statistics,
correlation = corARMA(p=arorder), method="ML"))
#extract statistics for sample path y
pvalue_cv\$pvalue[y] <- anova(int_model,time_model)\$p
pvalue_cv\$slope_time[y] <- time_model\$coefficients
}

When I run the simulation, there are some sample paths for which AR(2)
has the lowest AIC and so this is the model called, as in

summary(time_model <- gls(CV ~ 1 + time, data = moving_statistics,
correlation = corARMA(p=2), method="ML"))

This call appears to always generate this error message

Error in `coef<-.corARMA`(`*tmp*`, value = value[parMap[, i]]) :
Coefficient matrix not invertible

Here are my questions:

1. how does ar generate an AIC value for p =2 even though the
coefficient matrix is not invertible and so the model fit fails?

2. Is there a way to actually fit such a model, say, by adjusting
tolerances?

3. a related thread
(http://stackoverflow.com/questions/6706143/error-with-gls-function-in-nlme-package-in-r)

indicates that there are too many parameters being estimated given the
length of the time series.

Is there a numerical criterion that can be queried before the gls call
so that I can test for invertibility and fit, say, a lower order model
(e.g., AR(1)), so as to avoid an error?

I cannot figure out from the code for gls what actually generates the
error message. A pointer to the specific numerical criterion that
generates this error would be very much appreciated.

In the stackoverflow thread, this is mentioned:

The general rule of thumb is that you should have at least 10 times as
many data points as parameters, and that's for standard fixed
effect/regression parameters. (Generally variance structure parameters
such as AR parameters are even a bit harder/require a bit more data than
regression parameters to estimate.)

Where in the literature is this rule discussed?

4. If the only option is to fit a lower order model of error, say,
AR(1), instead of AR(2), what kind of bias does this generate in the
analysis?

Many thanks in advance,

S.

--
Steven Orzack
Fresh Pond Research Institute
173 Harvey Street
Cambridge, MA 02140
617 864-4307

www.freshpond.org

```