[R-sig-eco] Mixed effect (intercept) model with AR1 autocorrelation structure

Wed Jul 17 17:59:11 CEST 2013

On Mon, 2013-07-15 at 23:37 +0000, Henkelman, Jonathan wrote:
> R-sig-ecology folks,
> 
> I am analyzing a dataset (dataframe 'df') structured as follows:
> - we have 12 plots (df$plot); 6 are warmed (the treatment df$Tmt) and 6 are not.
> - in each plot we have a single time-series (n=92) of mean daily temperature (actually an electronic average of 3 probes, df$avg).
> 
> Our aim is to find out whether we have a significant warming effect, recognizing our readings are not independent, and that the mean and variance structure within each plot/treatment is different.
> 
> I won't discuss my basic analysis but will enter this discussion with my AR1 model run with the following code (from the nlme library), and giving the (trimmed) output below:
> ===========
> summary(mod.gls <- gls (avg~Tmt, data=df, correlation = corAR1(form=~1 | plot)))
> 
> AIC  = 2676.283
> Correlation Structure: AR(1); Formula: ~1 | plot
> Phi = 0.914847
> 
> Coefficients:
>                Value Std.Error   t-value p-value
> (Intercept) 5.813455 0.3613071 16.090064  0.0000
> TmtW        1.383048 0.5109654  2.706735  0.0069
> 
>  Correlation:
>      (Intr)
> TmtW -0.707
> 
> Residual standard error: 1.988193
> Degrees of freedom: 1104 total; 1102 residual
> ===========
> 
> My interpretation:
> - we have a very strong autocorrelation effect! Phi = 0.914;
> - we have a significant treatment effect;
> - and when I calculate effective degrees of freedom (after Zuur et al "Mixed Effects Models and Extensions in Ecology with R" pg.113) I get 13.1; hence we aren't getting much extra information from each time-series given the level of autocorrelation, but at least we have dealt with data in an appropriate way.

With a phi that large it suggests you have a trend in the data. How
about modelling the Time effect as a fixed effect and see if you need
the autocorrelation to mop up residual autocorrelation in the residuals.

G

> However, an analysis of the residuals (see Figure 1.pdf attached) shows that each plot has it's own mean (and variance), hence I would like combine this model with a random effects (intercept) model to partition out that portion of the variance.  For the record I have attached a plot of the residuals with the AR1 structure removed (Figure 2.pdf)
> 
> What I'd like to do
> When I combine a (working but philosophically flawed) random effects model with the autocorrelation model above I get the following result:
> ===========
> summary (mod.MEcor <- lme(avg~Tmt, random=~1|plot, correlation=corAR1(form=~1|plot), data=df))
> 
> AIC = 2678.283
> Correlation Structure: AR(1); Formula: ~1 | plot
> Phi = 0.9148461
> 
> Random effects:
>  Formula: ~1 | plot
>          (Intercept) Residual
> StdDev: 0.0002816537 1.988183
> 
> Fixed effects: avg ~ Tmt
>                Value Std.Error   DF   t-value p-value
> (Intercept) 5.813457 0.3613038 1092 16.090217  0.0000
> TmtW        1.383047 0.5109608   10  2.706758  0.0221
> 
> Correlation:
>      (Intr)
> TmtW -0.707
> 
> Number of Observations: 1104
> Number of Groups: 12
> ===========
> 
> This output is nearly identical to the simple AR1 model (mod.gls) given above:
> - very little of the variance is being accounted for by the mixed effect;
> - intercept/treatment effect are nearly the same (although there is a loss of significance);
> - and AIC goes up slightly, presumably due to increased model complexity.
> Other examination shows the results are nearly identical, most notably that the plot of residuals is indistinguishable from the one shown above.  Hence I conclude we are not effectively dealing with the mixed effect in this model!
> 
> When I run a simple mixed-effects model without autocorrelation I get a much higher portion of the error attributed to between plot variability (recognizing this model is invalid because it assumes the incorrect variance structure for our data):
> ===========
> summary(mod.MEresid <- lme(avg~Tmt, random=~1|plot, data=df))
> Random effects:
>  Formula: ~1 | plot
>         (Intercept) Residual
> StdDev:   0.4748359 1.952297
> ===========
> 
> Questions...:
> 1. Is this type of analysis doable? Perhaps nlme is the wrong package?
> 2. Is this even a reasonable question to ask?
> 
> Some thoughts on why it might not be working (the way I expect) to spur the discussion:
> 1. Perhaps this is simply a problem with parameter estimation in nlme.  For example it is computing A before B and hence can't come back and estimate A in a reasonable way.  Can anyone recommend a different package that can do this type of analysis?
> 
> 2. There simply isn't enough information in this dataset to answer this questions.  Once we have removed the autocorrelation we have approximately one datapoint per plot and hence we don't have enough degrees of freedom left over to reasonably estimate the mixed effect. This was my first conclusion, but I have cooled on this idea considerably. Figure 1 shows there is a measurable mean and variance for each plot -- it is not hard to estimate by hand... and it is easy to see the values line up with those shown in Figure 1.
> ===========
> Plots <- as.character(levels(df$plot))
> M <- vector(length=12); for (i in 1:12) M[i] <- mean (En[df$plot==Plots[i]]); M
> [1]  0.833 -0.436  0.126 -0.849  0.536  0.077  0.877  0.240 -1.674  1.419  0.277  -1.146
> S <- vector(length=12); for (i in 1:12) S[i] <- sd (En[df$plot==Plots[i]]); S
> [1] 1.453  0.704  0.400  0.648  0.594  0.523  0.551  1.014  0.934  0.639  0.634  0.794
> ===========
> 
> Also a mixed effect model run on the residuals shows there is information that can yet be extracted: that is, there is still a stong mixed effect but no fixed effect left in the residuals.
> ===========
> df.gls <- df; df.gls$avg <- resid (mod.gls, type="normalized");
> summary (mod.MEgls <- lme(avg~Tmt, random=~1|plot, data=df.gls))
> 
> Random effects:
>  Formula: ~1 | plot
>         (Intercept)  Residual
> StdDev:   0.9317715 0.7889055
> 
> Fixed effects: avg ~ Tmt
>                  Value Std.Error   DF    t-value p-value
> (Intercept)  0.1529750 0.3818732 1092  0.4005911  0.6888
> TmtW        -0.2589626 0.5400503   10 -0.4795157  0.6419
> ===========
> 
> 3. Perhaps the problem is that I am trying to specify two different types of correlation structure and they just don't agree: the induced correlation structure from the mixed effect on intercept (compound symmetric); whereas I am also specifying an AR1 correlation structure with the correlation argument.  Still, it should be possible to say, remove the AR1 correlation structure, then I assume all the readings within a plot are more correlated to each other than to other plots (as is shown in Figure 1), that is, a compound symmetric structure. This is functionally what has been done in the last code block with the analysis of the residuals.  But is this a valid method?  I feel both structures apply in a theoretical sense; it's just a question of how to specify this in R.
> 
> 4. Finally, an easy one, can anyone comment on the strong correlation between the intercept and treatment parameters (-0.707)?  Is this a problem, or does it just reflect the nature of the model? It does not go away when I center the temperature data...
> 
> Thanks for your thoughts...
> Jonathan Henkelman
> 
> 
> 
> 	[[alternative HTML version deleted]]
> 
> _______________________________________________
> R-sig-ecology mailing list
> R-sig-ecology at r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology

-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
 Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%