[R-sig-eco] Autoregressive modelling

Thu Nov 11 15:26:21 CET 2010

Dear Saskia,

1) Autoregressive residuals (from a stationary AR process) do not bias 
your coefficient estimates, but they do bias the standard errors and 
hence your confidence intervals. (Unfortunately, accounting for this 
mostly turns apparently significant variables insignificant.) If the AR 
model for your residuals is not stationary, you are in statistical trouble.
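
A small simulation can illustrate point 1. This is only a sketch under 
assumptions of my own choosing (AR(1) errors and an AR(1) covariate, both 
with coefficient 0.7, 50 time points, true slope 0.5); none of these 
values come from your data.

```r
# Sketch: OLS with stationary AR(1) errors. All settings are illustrative.
set.seed(1)
n_sim <- 1000   # number of simulated data sets
n     <- 50     # length of each time series
b1    <- 0.5    # true slope (assumed)
phi   <- 0.7    # AR(1) coefficient of errors and covariate (assumed)

est     <- numeric(n_sim)
covered <- logical(n_sim)

for (i in seq_len(n_sim)) {
  x <- as.numeric(arima.sim(list(ar = phi), n = n))  # autocorrelated covariate
  e <- as.numeric(arima.sim(list(ar = phi), n = n))  # autocorrelated errors
  y <- 1 + b1 * x + e

  fit <- lm(y ~ x)               # ordinary least squares, ignoring the AR(1)
  ci  <- confint(fit)["x", ]     # naive 95% confidence interval for the slope

  est[i]     <- coef(fit)[["x"]]
  covered[i] <- ci[1] <= b1 && b1 <= ci[2]
}

mean(est)      # average slope estimate; should sit close to b1
mean(covered)  # share of naive intervals containing b1; nominal value is 0.95
```

In runs like this the average estimate stays near the true slope, while 
the naive intervals typically cover it less often than the nominal 95%.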

2) I think there are different ways of proceeding. The most 
conservative would be a two-stage process: (A) First, identify your 
model using detrended dependent and independent variables. The 
variables that are significant are then carried into the second 
stage. (B) Build the model using the variables found in A. This can 
be done using gls (nlme package) or mixed models, which model the 
autocovariance structure.

Some people skip A and fit a gls model directly. I think that when 
there are strong trends in the data (both X and Y), A could be 
insightful. On the other hand, the identification might fail for 
variables that increase evenly with a strong time trend, without much 
variation in the rate of increase, since detrending leaves little 
variation in them to work with.
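
The two-stage idea could be sketched roughly as follows. Everything here 
is an assumption for illustration: the simulated data frame `dat`, the 
column names `abun`, `cov1`, `cov2`, linear detrending against year, the 
0.05 cut-off, and fitting stage B on the original (not detrended) 
variables with corAR1(form = ~ year).

```r
library(nlme)  # provides gls() and corAR1()

# Hypothetical example data: 40 years, one relevant and one irrelevant covariate
set.seed(42)
n   <- 40
dat <- data.frame(year = 1971:2010)
dat$cov1 <- 0.05 * (dat$year - 1971) + rnorm(n)   # covariate with a time trend
dat$cov2 <- rnorm(n)                              # covariate unrelated to abun
e        <- as.numeric(arima.sim(list(ar = 0.6), n = n, sd = 0.5))
dat$abun <- 2 + 0.8 * dat$cov1 + e                # response with AR(1) errors

# Stage A: remove a linear time trend from every variable, then identify
# the covariates that are significant in the detrended regression
detrend <- function(v) resid(lm(v ~ dat$year))
det <- data.frame(abun = detrend(dat$abun),
                  cov1 = detrend(dat$cov1),
                  cov2 = detrend(dat$cov2))
stage_a <- summary(lm(abun ~ cov1 + cov2, data = det))
pvals   <- coef(stage_a)[-1, "Pr(>|t|)"]          # drop the intercept row
keep    <- names(pvals)[pvals < 0.05]
if (length(keep) == 0) keep <- "1"                # fall back to intercept only

# Stage B: refit with the selected covariates and an AR(1) error structure
form_b <- reformulate(keep, response = "abun")
fit_b  <- gls(form_b, data = dat, correlation = corAR1(form = ~ year))
summary(fit_b)

# Estimated AR(1) parameter of the residual correlation structure
coef(fit_b$modelStruct$corStruct, unconstrained = FALSE)
```

Whether stage B should use the original or the detrended variables 
depends on whether the trend itself is of interest; the sketch keeps the 
original variables so that the trend remains in the model.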

Frank

> Date: Wed, 10 Nov 2010 13:56:33 +0100
> From: Saskia Otto<Saskia.Otto at uni-hamburg.de>
> To: r-sig-ecology at r-project.org
> Subject: [R-sig-eco] handling autocorrelation in regression data
>
> Dear list members,
>
> to identify the drivers of a temporal trend in fish abundance, I applied
> a linear regression including several covariates in the full model:
> Abun_t ~ b_0 + b_1 *cov1_t + b_2 *cov2_t + ... + e_t , where t
> represents the year.
>
> The problem is that the residuals of the full model show strong
> autocorrelation (order 1 autoregressive (AR(1) ) errors), thus violating
> the independence assumptions.
> My question is now: what can I do alternatively?
> Since I want to model explicitly the trend, I do not want to detrend the
> time series.
> Should I fit an autoregressive time series model where I include other
> covariates as well:
> Abun_t ~ b_0 + b_1 *Abun_t-1 + b_2 *cov1_t + b_3 *cov2_t +...+ e_t ?
> Can I do this using the lm() function: lm(Abun_t ~ Abun_t-1 + cov1_t +
> cov2_t + ...)
> or do I have to use the function arima(): arima(Abun_t, order =
> c(1,0,0), xreg = cbind(cov1,cov2,...) )  ?
>
> What is the difference between an AR1 model (by using for instance the
> arima function where I define only the AR order) and the gls() function
> where I include an AR1 correlation structure:
> gls(Abun_t ~ cov1_t + cov2_t + ..., correlation = corAR1(form =~ 1 |
> Year) )  ?
>
> To me it seems quite similar, but I only came across literature that
> dealt with either one of these approaches.
>
> I would be glad if someone of you could help me out and get things
> clearer on this issue.
>
> Thanks a lot in advance!
> Saskia
>
>