[R-sig-eco] Mixed effect (intercept) model with AR1 autocorrelation structure

Thu Jul 18 16:34:19 CEST 2013

Gavin,

Welcome to the prairies...

I think this thread is getting off track... a few comments on the autocorrelation model, then bringing it back to the question I was trying to ask:

I agree completely that we *must* model the time-series because our readings are not independent and hence we are violating the assumptions of a standard model.  There is no clear deterministic trend to the temperature readings: we are only modelling summer temperature and the single arch of the season is lost in weekly stochasticity.  There is also no significant linear trend (p-value = 0.3).  As I state in my first paragraph I have modelled the summer trend with a gam as an alternative approach (highly significant with 9 df).  However, even the basic AR1 model gives cleaner residuals.  Also, as I mention in bullet 3, I have an ARMA(2,0) model working that does a better job of dealing with the autocorrelation (as shown by ACF); I will expand this model once the conceptual problems I presented are sorted out.

The problems I have with modelling the summer temperatures as a deterministic effect (fixed effect, gam) are outlined in my previous post: it limits our inference (as per Zuur 327, although he is discussing a slightly different context); and whatever gam I come up with this year won't be consistent with next year, whereas an autocorrelation model will be.  I also think it is more ecologically valid. I can see there could be different points of view here however and debate is healthy.

Coming back to what I see as the problem at hand...  I _have_ several models that adequately deal with the autocorrelation.  Normalized residuals for the AR1 model plotted through time are shown in Figure 2 (see end of paragraph)  However, residuals plotted against Plot# are shown in Figure 1.  I don't have (too much) of a problem with Figure 2 (although variance seems to be decreasing with time)... I do have a big problem with figure 1!  This is what I wanted to address in my new model -- a random intercept model to account for the different means (and variances) of the residuals in each plot.

[My plots don't include in the posting so I've put them on the web:
http://homepage.usask.ca/~jmh842/Figure%201.pdf
http://homepage.usask.ca/~jmh842/Figure%202.pdf
]

However, this model doesn't work! -- it runs, but gives no random effect (see original post for code and results...).  The residuals of this new model show an identical pattern to Figure 1.  Hence something is going wrong in the computation element within nlme::lme.  This is the question I wanted to address with this post.

I was up examining Zuur last night and found a passing reference to this problem on page 160 "It may be an option to extend the model with the AR1 correlation, with a random intercept on nest [a case which would correspond exactly to my scenario]. Such a model allows for the compound correlation between all observations from the same nest, and temporal correlation between observations from the same nest _and_ night.  *But the danger is that the random intercept and auto-correlation will fight with each other for the same information.* " [* are mine...] I think this is exactly what is happening: the two correlation specifications are fighting with each other and the mixed intercept is loosing.

I have had many suggestions about other ways to model this dataset.  I have been and will continue to pursue these ideas (I'd also like to build a simple Baysian model as Zuur does in chapter 23 to deal with a similar problem).  However, I have not had anyone comment directly on my two main questions:
1) is this analysis doable or is nlme perhaps the wrong package?; and 
2) Is this a reasonable question to ask?  

I think question 2) is answered given the citation above -- it is valid conceptually.  Perhaps nlme IS the wrong package, or there is a bug in nlme since the information _seems_ to be there -- can anyone comment on this.  But I do still wonder about suppostion 3: Is it valid to use a two-step approach: run a basic autocorrelation model to remove the autocorrelation effect (normalized residuals shown in Figs 1,2), then run a mixed intercept model on the residuals to deal with the nested structure (and also the heterogeneity between plots).  This runs, gives reasonable results that agree with the graphs, but is it statistically valid?

Thanks,
Jonathan
________________________________________
From: Gavin Simpson [gavin.simpson at ucl.ac.uk]
Sent: July 17, 2013 10:06 PM
To: Henkelman, Jonathan
Cc: r-sig-ecology at r-project.org; Johnstone, Jill
Subject: Re: [R-sig-eco] Mixed effect (intercept) model with AR1 autocorrelation structure

On Wed, 2013-07-17 at 16:10 +0000, Henkelman, Jonathan wrote:
> Perhaps I should clarify.  There is a time-series trend -- daily
> temperature fluctuates randomly throughout the summer.  But there is
> not a clear long-term signal.  I have modelled the time-series effect
> using a gam to see if that can adequately compensate for the effect.
> However, I believe this is a fundamentally flawed approach:
>
> 1) We are not interested in modelling the time-series; it merely is a
> way of estimating the temperature response in plot.  That is, I don't
> want to ask the question, can I predict the temperature given the
> treatment and the date, but rather, is there a treatment effect.  My
> inference ability is badly reduced if a model a gam.

It is *irrelevant* whether you want to model the time series. You *must*
model it either via the fixed effects or in the covariance matrix of the
residuals.

You have two options:

1) include a time effect (either linear or spline) plus possibly a
simple time series model for the residuals (AR(1) would be a start).
Your null would include the time effect thus you can assess the effect
of Treatment relative to this null.

2) proceed as you did but use a much more complex ARMA model for the
residuals. AR(1) is clearly inappropriate with such a high \phi
estimate.

Which is most useful will depend on the data; it sounds as if you don't
have a deterministic trend as you seem to be describing a stochastic
trend. However, I'd be very surprised if you didn't have a cyclic
temperature effect with temperatures varying smoothly through the year
in a seasonal signal/cycle, with noise superimposed. But I am just
guessing.

Perhaps start with the mixed effects and **no** correlation structure if
you don't want to do 1). Extract the normalised residuals and look at
the ACF and the PACF of these. That will help identify what order ARMA
model might be needed. Then add the correlation structure with
corARMA().

I didn't notice you were at USask; I move to the UofR a few months ago.
If it would help to discuss offlist, my UofR contact details are in the
footer.

HTH

G

> 2) The current analysis is for a single season.  In a few years we
> will be re-running this analysis of 5 years of data.  I do not expect
> the random fluctuation in seasonal temperature will be the same each
> year.  Hence, while this analysis sort of works now, it won't in the
> future.  However, it seems reasonable to model the autocorrelation
> effect within the time-series as constant through time.
>
> 3) When I look at the process of temperature I can say, yes today is
> more likely to be similar to yesterday than the day before.  There is
> autocorrelation and random fluctuation, hence it makes sense to model
> it this way.  For the record, as simple AR1 model better account for
> the seasonal fluctuation than a gam, and my ARMA(2,0) model does an
> even better job.
>
> Hope this helps, J
>

--
Gavin Simpson, PhD                          [t] +1 306 337 8863
Adjunct Professor, Department of Biology    [f] +1 306 337 2410
Institute of Environmental Change & Society [e] gavin.simpson at uregina.ca
523 Research and Innovation Centre          [tw] @ucfagls
University of Regina
Regina, SK S4S 0A2, Canada