[R-sig-ME] Ecological mixed model example

Murray Jorgensen maj at stats.waikato.ac.nz
Wed May 13 06:19:56 CEST 2009

I am trying to help a friend towards an appropriate lmer() analysis of
her data and help myself understand how to use lmer syntax in the right way.

She says "...here is what I think is a very standard situation in
ecology....

A normally distributed measure (say bird density) is collected for:

5 years of data (fixed factor),
2 seasons within each year (fixed factor),
3 months within each season (fixed factor)..."

The response is measured at the same six sites at every visit. What
follows is my interpretation of what she might want, which may not be
what she really wants, but this may not matter for the purpose of
creating an example.

I will simulate some data from a known model, then use lmer() to fit the
model and compare the estimated parameters with those used in the
simulation.

a)  whether I have simulated the data from my model correctly

and

b)  whether my lmer syntax is appropriate for my model.

The systematic part of my model is:

\mu + \beta_{is} + \gamma_{sm}

where i varies over years, s over seasons, and m over months.

(I assume that months are nested within seasons but crossed with years.)

Now let me decide how these coefficients should vary with site. I am
going to assume that the constant term and the year-season term vary
with site but that month-within-season does not.

The random part of the model needs to be generated from a 1 + 5*2 =
11-dimensional multivariate normal. I will assume 1 correlation
parameter, that between two seasons in the same year. All other
correlations are zero.

Values for fixed effects:

Suppose  \mu = 10

the beta_{i1} are   7, 8, 9, 6, 7.5
the beta_{i2} are   1, 3, 4, 2, 2.5

the gamma_{1m} are  0, 2, 1
the gamma_{2m} are  0, -2, -2

The s.d. parameters are  sigma_mu = 3, sigma_beta = 2, sigma = 0.1
The correlation parameter is 0.4.

Instead of 6 sites I will assume nsite = 60.

I have written the following to generate data and fit the model:

library(mvtnorm)
library(lme4)
V = diag(11)
for ( i in 2:10) {
V[i,i+1] = 0.4
V[i+1,i] = 0.4
}
sig = diag(c(3,rep(2,10)))
V = sig%*%V%*%sig
nsite = 60           #Why not?
sigma = 0.1
theta = c(10,7, 1, 8, 3, 9, 4, 6,2, 7.5,2.5)
P = rmvnorm(nsite,mean=theta,sigma=V)
a =  rep(1:10,rep(3,10))
TH =  cbind(rep(1,30),model.matrix(~factor(a)-1))
gg = rep(c(0,2,1,0,-2,-2),5)
y = as.vector(TH%*%t(P)) + rep(gg,nsite) + rnorm(30*nsite,0,sigma)
year = gl(5,2*3,30*nsite)
season = gl(2,3,30*nsite)
month = gl(3,1,30*nsite)
site = gl(nsite,30,30*nsite)
mod = lmer(y ~ year/season + season/month + (1 + year/season | site) )
summary(mod)

The output I get is

> library(mvtnorm)
> library(lme4)

Attaching package: 'Matrix'

The following object(s) are masked from package:stats :

xtabs

The following object(s) are masked from package:base :

colMeans,
colSums,
rcond,
rowMeans,
rowSums

> V = diag(11)
> for ( i in 2:10) {
+    V[i,i+1] = 0.4
+    V[i+1,i] = 0.4
+    }
> sig = diag(c(3,rep(2,10)))
> V = sig%*%V%*%sig

> nsite = 60
> sigma = 0.1
> theta = c(10,7, 1, 8, 3, 9, 4, 6,2, 7.5,2.5)
> P = rmvnorm(nsite,mean=theta,sigma=V)
>
> a =  rep(1:10,rep(3,10))
> TH =  cbind(rep(1,30),model.matrix(~factor(a)-1))
> gg = rep(c(0,2,1,0,-2,-2),5)
> y = as.vector(TH%*%t(P)) + rep(gg,nsite) + rnorm(30*nsite,0,sigma)
> year = gl(5,2*3,30*nsite)
> season = gl(2,3,30*nsite)
> month = gl(3,1,30*nsite)
> site = gl(nsite,30,30*nsite)
> mod = lmer(y ~ year/season + season/month + (1 + year/season | site) )
> summary(mod)
Linear mixed model fit by REML
Formula: y ~ year/season + season/month + (1 + year/season | site)
AIC  BIC logLik deviance REMLdev
1332 1717 -596.1     1153    1192
Random effects:
Groups   Name          Variance  Std.Dev.
Corr
site     (Intercept)   14.874181
3.85671
year2          8.955943 2.99265
-0.234
year3          8.751238 2.95825  -0.343
0.504
year4          8.671103 2.94467  -0.208  0.450
0.422
year5          7.388752 2.71823  -0.324  0.402  0.516
0.462
year1:season2  4.283068 2.06956  -0.156  0.638  0.322  0.517
0.358
year2:season2  5.046195 2.24637  -0.149 -0.469  0.223 -0.053
0.015 -0.308
year3:season2  5.490231 2.34312   0.078 -0.139 -0.418  0.243
-0.007  0.019 -0.294
year4:season2  5.353655 2.31380   0.135 -0.204 -0.056 -0.583
0.086 -0.254  0.009 -0.318
year5:season2  5.400134 2.32382  -0.053  0.047  0.068 -0.044
-0.472 -0.068  0.074 -0.018
Residual                0.010340
0.10169

-0.301

Number of obs: 1800, groups: site, 60

Fixed effects:
Estimate Std. Error t value
(Intercept)    17.473505   0.497980   35.09
year2           0.731252   0.386499    1.89
year3           1.965976   0.382058    5.15
year4          -1.014649   0.380309   -2.67
year5           0.688820   0.351081    1.96
season2        -6.255910   0.267479  -23.39
year2:season2   1.889409   0.451063    4.19
year3:season2   1.117068   0.400037    2.79
year4:season2   2.564384   0.448766    5.71
year5:season2   1.055936   0.415366    2.54
season1:month2  1.993234   0.008303  240.07
season2:month2 -2.029299   0.008303 -244.41
season1:month3  1.001748   0.008303  120.65
season2:month3 -2.027852   0.008303 -244.24

Correlation of Fixed Effects:
(Intr) year2  year3  year4  year5  seasn2 yr2:s2 yr3:s2
yr4:s2 yr5:s2 ssn1:2 ssn2:2 ssn1:3
year2
-0.235

year3       -0.343
0.504

year4       -0.208  0.450
0.422
year5       -0.324  0.402  0.516
0.462
season2     -0.157  0.637  0.322  0.516
0.358
year2:sesn2 -0.003 -0.680 -0.047 -0.341 -0.203
-0.790
year3:sesn2  0.163 -0.531 -0.531 -0.162 -0.245 -0.654
0.377
year4:sesn2  0.183 -0.515 -0.229 -0.696 -0.156 -0.765  0.575
0.343
year5:sesn2  0.062 -0.376 -0.159 -0.364 -0.572 -0.693  0.572  0.444
0.377
sesn1:mnth2 -0.008  0.000  0.000  0.000  0.000  0.016  0.000  0.000
0.000  0.000
sesn2:mnth2  0.000  0.000  0.000  0.000  0.000 -0.016  0.000  0.000
0.000  0.000  0.000
sesn1:mnth3 -0.008  0.000  0.000  0.000  0.000  0.016  0.000  0.000
0.000  0.000  0.500  0.000
sesn2:mnth3  0.000  0.000  0.000  0.000  0.000 -0.016  0.000  0.000
0.000  0.000  0.000  0.500  0.000
>

The variance estimates for the random effects look large, but maybe
because of the way I have parameterized the model I am estimating some
combination of sigma_mu and sigma_beta?

Cheers,

Murray Jorgensen

--
Dr Murray Jorgensen      http://www.stats.waikato.ac.nz/Staff/maj.html
Department of Statistics, University of Waikato, Hamilton, New Zealand
Email: maj at waikato.ac.nz                                Fax 7 838 4155
Phone  +64 7 838 4773 wk    Home +64 7 825 0441   Mobile 021 0200 8350