[R-sig-eco] Autoregressive modelling (Gavin Simpson)

Gavin Simpson gavin.simpson at ucl.ac.uk
Wed Nov 24 17:31:30 CET 2010


On Wed, 2010-11-24 at 08:55 -0400, Highland Statistics wrote:
> ------------------------------
> 
<snip />
> 
> Whether you detrend or not (before fitting a model) is an important
> consideration - statistician colleagues of mine have told me *not* to
> detrend as you are throwing away information (amongst other reasons).
> Instead, model the trend explicitly. Of course, you then have to posit a
> valid reason for the relationship between the response and your
> covariates to guard against spurious regressions - where you get a
> significant covariate because both it and the response have a trend but
> there is no mechanistic reason to presume that the covariate is
> controlling the response.
> 
> >  I still do not understand the difference between an AR1 model where
> >  other covariates are included as well (e.g. by using the arima()
> >  function) and a model where I included an AR1 correlation structure
> >  (by using e.g. gls() or lme() )
> 
> Zuur et al [1] suggest a different approach, along the lines of i)
> fitting the full model, ii) then fit something for the autocorrelation in
> the residuals of this full model, then iii) having included ii), refine
> the fitted model by getting rid of insignificant covariates etc.
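
On the arima() versus gls() question quoted above: as far as R's
documentation goes, arima() with an xreg argument fits a regression with
ARMA errors, so with AR(1) errors it targets the same model as gls() with
a corAR1() structure. A minimal sketch on simulated data follows; the
data, seed, and object names are made up for illustration, and both fits
use ML so the estimates are comparable.

```r
library(nlme)  # recommended package shipped with R

set.seed(123)
n <- 200
x <- rnorm(n)                                       # hypothetical covariate
e <- as.numeric(arima.sim(list(ar = 0.6), n = n))   # AR(1) noise
y <- 1 + 2 * x + e
dat <- data.frame(y = y, x = x)

# Regression with AR(1) errors via arima(); covariate passed through xreg
fit_arima <- arima(y, order = c(1, 0, 0), xreg = cbind(x = x), method = "ML")

# The same model via gls() with an AR(1) correlation structure
fit_gls <- gls(y ~ x, data = dat, correlation = corAR1(form = ~ 1),
               method = "ML")

slope_arima <- unname(coef(fit_arima)["x"])
slope_gls   <- unname(coef(fit_gls)["x"])
phi_arima   <- unname(coef(fit_arima)["ar1"])
phi_gls     <- unname(coef(fit_gls$modelStruct$corStruct,
                           unconstrained = FALSE))

print(rbind(arima = c(slope = slope_arima, phi = phi_arima),
            gls   = c(slope = slope_gls,   phi = phi_gls)))
```

The two rows should agree closely; any practical difference between the
functions lies mostly in what else each allows (e.g. random effects in
lme(), higher-order ARIMA terms in arima()).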

> **
> My experience with this is that if you include both a trend component
> and an auto-regressive correlation structure on the residuals in the
> same model, AND you estimate them together (at the same time), then
> they are going to fight with each other over who gets the information.
> Hence the suggestion to:
> 1. Fit a model without correlation
> 2. Get an impression of the strength of the correlation
> 3. Refit the model while keeping the autoregressive parameter(s) fixed.
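
One possible reading of the three steps above, sketched with gls() from
nlme on simulated data. The linear trend, the AR(1) form, and the use of
the lag-1 residual autocorrelation as the "impression" of the correlation
are all assumptions made for illustration, not necessarily the exact
procedure intended.

```r
library(nlme)  # recommended package shipped with R

set.seed(1)
n <- 200
time <- seq_len(n)
y <- 0.02 * time + as.numeric(arima.sim(list(ar = 0.5), n = n))
dat <- data.frame(y = y, time = time)

# 1. Fit the model without any correlation structure
m0 <- gls(y ~ time, data = dat)

# 2. Get an impression of the correlation: lag-1 autocorrelation of residuals
rho <- acf(as.numeric(resid(m0)), plot = FALSE)$acf[2]

# 3. Refit with the AR(1) parameter held fixed at that value
m1 <- gls(y ~ time, data = dat,
          correlation = corAR1(value = rho, form = ~ time, fixed = TRUE))

phi_fixed <- unname(coef(m1$modelStruct$corStruct, unconstrained = FALSE))
se0 <- unname(sqrt(diag(vcov(m0)))["time"])  # trend SE, independence assumed
se1 <- unname(sqrt(diag(vcov(m1)))["time"])  # trend SE, AR(1) held fixed

print(c(rho = rho, se_independent = se0, se_ar1_fixed = se1))
```

Because the AR parameter is not re-estimated in step 3, trend and
correlation no longer compete for the same information, at the cost of
ignoring the uncertainty in rho.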

Yes, indeed. I've been battling with some palaeoceanographic data of a
PhD student in our group. Sometimes these models work well and we can
partition into trend + autocorrelated noise. Other times, you wait a
week for the model to converge (yep lots of data!) and it either
interpolates the points and has an effectively zero correlation
parameter or it fits an almost flat, straight line through the data and
a large, well bounded away from zero, correlation parameter. (I'm using
gamm() models.)

> It is a bit dodgy I guess... well... pragmatic. Note: I would only do
> this with these AR and ARMA type structures, and the same goes for
> spatial correlation structures. Things like a random intercept (and
> the associated correlation structure) are much easier to work with.

For the palaeo data I've been working with, I think I've given up on
trying to fit models with correlation structures directly. Instead I'm
looking at fitting the model I want (with some constraint on how
wiggly/smooth my fitted trend should be - i.e. I limit the df on the
spline used for my trend), then estimating a covariance matrix from the
residuals to use like a sandwich estimator, and plugging that in rather
than the assumed covariance matrix. Well, at least before I switch back
to trying to get my head round DLMs...
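
A rough sketch of that plug-in idea, under stated assumptions: the trend
is a natural spline with a fixed df fitted by lm() (standing in for a
smoother with limited df), the residuals are modelled as AR(1), and the
resulting covariance matrix Sigma is plugged into the usual sandwich form
(X'X)^-1 X' Sigma X (X'X)^-1. The data and all object names are
hypothetical, and this is only one way the description could be
implemented.

```r
library(splines)  # ships with base R

set.seed(42)
n <- 150
time <- seq_len(n)
y <- sin(time / 25) + as.numeric(arima.sim(list(ar = 0.6), n = n, sd = 0.3))

# Trend model with limited wiggliness: natural spline with 4 df
fit <- lm(y ~ ns(time, df = 4))

# Model the residuals as a zero-mean AR(1) process
ar_fit <- arima(resid(fit), order = c(1, 0, 0), include.mean = FALSE)
phi <- unname(coef(ar_fit)["ar1"])

# Implied covariance of the errors: marginal variance times AR(1) correlation
sigma2 <- ar_fit$sigma2 / (1 - phi^2)
Sigma <- sigma2 * toeplitz(phi^(0:(n - 1)))

# Sandwich-style covariance of the coefficients with Sigma plugged in
X <- model.matrix(fit)
bread <- solve(crossprod(X))
V <- bread %*% t(X) %*% Sigma %*% X %*% bread

se_naive <- sqrt(diag(vcov(fit)))  # assumes independent errors
se_adj <- sqrt(diag(V))            # allows for AR(1) errors

print(cbind(naive = se_naive, adjusted = se_adj))
```

Tests on coefficients would then use se_adj in place of the standard
errors reported by summary(fit).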

Cheers Alain,

All the best,

G

> 
> Alain Zuur
> **
>   
> I might modify this a bit (maybe Zuur et al already suggest this?), by
> thinking about what model I want to fit, what is plausible, and fit
> that. Then check the residuals for lack of independence. If residuals
> are dependent, fit a model that allows for autocorrelation in residuals
> directly by specifying a simple process for the covariance matrix (AR or
> ARMA say), such as via GLS.
> 
> Alternatively, we can make use of sandwich estimators for the covariance
> matrix. Recall that it is the standard errors of the coefficients that
> are too small. These standard errors come from the model covariance
> matrix. This covariance matrix is essentially a plug-in (several of the
> assumptions of OLS essentially arise because it assumes a particular
> form for the covariance matrix) and we can estimate a different
> covariance matrix that accounts for correlations between residuals, by
> estimating the parameters of an AR or ARMA process fitted to the model
> residuals, and use those parameters to form a new covariance matrix,
> from which we can get standard errors.
> 
> This latter approach is very flexible because it can be applied to lots
> of modelling situations, but you have to do all the heavy lifting as, in
> many cases, you will have to estimate the model for the residuals
> yourself, and then compute all the standard errors and tests on
> coefficients yourself.
> 
> [1] Zuur et al 2009 Mixed Effects Models and Extensions in Ecology with
> R. Springer.
> 
> An alternative book I very much recommend, though it is not quite
> published yet, is Chandler and Scott (2011) Statistical methods for trend
> detection and analysis in the environmental sciences. John Wiley and
> Sons. This book covers what I discuss above and a whole lot more.
> 
> HTH
> 
> G
> 
> _______________________________________________
> R-sig-ecology mailing list
> R-sig-ecology at r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology

-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
 Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%