[R-sig-ME] Ecological mixed model example

Reinhold Kliegl reinhold.kliegl at gmail.com
Thu May 14 23:38:54 CEST 2009


The following parameterization recovers both the fixed effects and
also sigma_mu, sigma_beta, and sigma. The correlation does not look
bad either. There are probably also ways of forcing some of the
irrelevant correlations to zero.

Reinhold Kliegl

############################################

....

P = rmvnorm(nsite,mean=theta,sigma=V, method="chol")
...

# specifying seasons nested within year
ys  <- factor(paste(year, season, sep="_"))
cmat.ys <- matrix(c(          -1,  1, rep(0,8),
                            rep(0,2), -1,  1, rep(0,6),
    	                    rep(0,4), -1,  1, rep(0,4),
                            rep(0,6), -1,  1, rep(0,2),
	                    rep(0,8), -1,  1             ),  10,  5)
colnames(cmat.ys) <- c(".s2-s1|y1", ".s2-s1|y2", ".s2-s1|y3",
".s2-s1|y4",".s2-s1|y5")
contrasts(ys, 5) <- cmat.ys

# specifying months nested within season
# contrasting months 2 vs 1 and 3 vs 2 within season
sm  <- factor(paste(season, month, sep="_"))
cmat.sm <- matrix(c(-2/3,  1/3,  1/3,    0,     0,     0,
    	                          -1/3, -1/3,  2/3,    0,     0,     0,
	                              0,     0,     0, -2/3,  1/3,  1/3,
	                              0,     0,     0, -1/3, -1/3,  2/3  ),  6,  4)
colnames(cmat.sm) <- c(".m2-m1|s1", ".m3-m2|s1", ".m2-m1|s2", ".m3-m2|s2")
contrasts(sm, 4) <- cmat.sm

mod.2 = lmer(y ~ ys + sm + (1 | site) + (0 + ys | site) )
print(summary(mod.2), cor=FALSE)

Linear mixed model fit by REML
Formula: y ~ ys + sm + (1 | site) + (0 + ys | site)
  AIC  BIC logLik deviance REMLdev
 1377 1745 -621.4     1200    1243
Random effects:
 Groups   Name  Variance  Std.Dev. Corr
 site   (Intercept) 7.894963 2.80980
 site   ys1_1        7.401678 2.72060
          ys1_2        4.987042 2.23317   0.691
          ys2_1        3.228583 1.79683   0.126  0.256
          ys2_2        5.913934 2.43186  -0.177 -0.205  0.419
          ys3_1      11.414706 3.37857  -0.239 -0.242  0.018  0.651
          ys3_2      10.336715 3.21508   0.109  0.070  0.244  0.426
0.661
          ys4_1        4.113868 2.02827   0.392  0.180 -0.115 -0.159
-0.015  0.521
          ys4_2        4.801364 2.19120   0.391  0.108 -0.268 -0.215
-0.061  0.217  0.503
          ys5_1        2.867584 1.69339   0.186 -0.073 -0.356 -0.190
0.107  0.373  0.097  0.268
          ys5_2        3.745493 1.93533  -0.002 -0.107 -0.049  0.075
0.174  0.224 -0.069  0.046  0.241
 Residual             0.010422 0.10209
Number of obs: 1800, groups: site, 60

Fixed effects:
             Estimate Std. Error t value
(Intercept) 13.892119   0.375663   36.98
ys.s2-s1|y1 -4.447045   0.122316  -36.36
ys.s2-s1|y2 -4.164777   0.120216  -34.64
ys.s2-s1|y3 -3.161133   0.155749  -20.30
ys.s2-s1|y4 -3.047036   0.129606  -23.51
ys.s2-s1|y5 -3.863464   0.136543  -28.29
sm.m2-m1|s1  1.988407   0.008335  238.55
sm.m3-m2|s1 -0.981789   0.008335 -117.79
sm.m2-m1|s2 -1.990586   0.008335 -238.81
sm.m3-m2|s2 -0.008253   0.008335   -0.99

###########################################


On Wed, May 13, 2009 at 6:19 AM, Murray Jorgensen
<maj at stats.waikato.ac.nz> wrote:
> I am trying to help a friend towards an appropriate lmer() analysis of her
> data and help myself understand how to use lmer syntax in the right way.
>
> She says "...here is what I think is a very standard situation in
> ecology....
>
> A normally distributed measure (say bird density) is collected for:
>
>  5 years of data (fixed factor),
>  2 seasons within each year (fixed factor),
>  3 months within each season (fixed factor)..."
>
> The response is measured at the same six sites at every visit. What follows
> is my interpretation of what she might want, which may not be what she
> really wants, but this may not matter for the purpose of creating an
> example.
>
> I will simulate some data from a known model, then use lmer() to fit the
> model and compare the estimated parameters with those used in the
> simulation.
>
> I would be interested in comments about
>
> a)  whether I have simulated the data from my model correctly
>
> and
>
> b)  whether my lmer syntax is appropriate for my model.
>
>
> The systematic part of my model is:
>
>  \mu + \beta_{is} + \gamma_{sm}
>
> where i varies over years, s over seasons, and m over months.
>
> (I assume that months are nested within seasons but crossed with years.)
>
> Now let me decide how these coefficients should vary with site. I am
> going to assume that the constant term and the year-season term vary
> with site but that month-within-season does not.
>
> The random part of the model needs to be generated from a 1 + 5*2 =
> 11-dimensional multivariate normal. I will assume 1 correlation
> parameter, that between two seasons in the same year. All other
> correlations are zero.
>
> Values for fixed effects:
>
> Suppose  \mu = 10
>
>   the beta_{i1} are   7, 8, 9, 6, 7.5
>   the beta_{i2} are   1, 3, 4, 2, 2.5
>
>   the gamma_{1m} are  0, 2, 1
>   the gamma_{2m} are  0, -2, -2
>
> The s.d. parameters are  sigma_mu = 3, sigma_beta = 2, sigma = 0.1
> The correlation parameter is 0.4.
>
> Instead of 6 sites I will assume nsite = 60.
>
> I have written the following to generate data and fit the model:
>
> library(mvtnorm)
> library(lme4)
> V = diag(11)
> for ( i in 2:10) {
>  V[i,i+1] = 0.4
>  V[i+1,i] = 0.4
>  }
> sig = diag(c(3,rep(2,10)))
> V = sig%*%V%*%sig
> nsite = 60           #Why not?
> sigma = 0.1
> theta = c(10,7, 1, 8, 3, 9, 4, 6,2, 7.5,2.5)
> P = rmvnorm(nsite,mean=theta,sigma=V)
> a =  rep(1:10,rep(3,10))
> TH =  cbind(rep(1,30),model.matrix(~factor(a)-1))
> gg = rep(c(0,2,1,0,-2,-2),5)
> y = as.vector(TH%*%t(P)) + rep(gg,nsite) + rnorm(30*nsite,0,sigma)
> year = gl(5,2*3,30*nsite)
> season = gl(2,3,30*nsite)
> month = gl(3,1,30*nsite)
> site = gl(nsite,30,30*nsite)
> mod = lmer(y ~ year/season + season/month + (1 + year/season | site) )
> summary(mod)
>
>
> The output I get is
>
>> library(mvtnorm)
>> library(lme4)
>
>
>> V = diag(11)
>> for ( i in 2:10) {
>
> +    V[i,i+1] = 0.4
> +    V[i+1,i] = 0.4
> +    }
>>
>> sig = diag(c(3,rep(2,10)))
>> V = sig%*%V%*%sig
>
>> nsite = 60
>> sigma = 0.1
>> theta = c(10,7, 1, 8, 3, 9, 4, 6,2, 7.5,2.5)
>> P = rmvnorm(nsite,mean=theta,sigma=V)
>>
>> a =  rep(1:10,rep(3,10))
>> TH =  cbind(rep(1,30),model.matrix(~factor(a)-1))
>> gg = rep(c(0,2,1,0,-2,-2),5)
>> y = as.vector(TH%*%t(P)) + rep(gg,nsite) + rnorm(30*nsite,0,sigma)
>> year = gl(5,2*3,30*nsite)
>> season = gl(2,3,30*nsite)
>> month = gl(3,1,30*nsite)
>> site = gl(nsite,30,30*nsite)
>> mod = lmer(y ~ year/season + season/month + (1 + year/season | site) )
>> summary(mod)
>
> Linear mixed model fit by REML
> Formula: y ~ year/season + season/month + (1 + year/season | site)
>  AIC  BIC logLik deviance REMLdev
> 1332 1717 -596.1     1153    1192
> Random effects:
> Groups   Name          Variance  Std.Dev.
> Corr
> site     (Intercept)   14.874181
> 3.85671
>         year2          8.955943 2.99265
> -0.234
>         year3          8.751238 2.95825  -0.343
> 0.504
>         year4          8.671103 2.94467  -0.208  0.450
> 0.422
>         year5          7.388752 2.71823  -0.324  0.402  0.516
> 0.462
>         year1:season2  4.283068 2.06956  -0.156  0.638  0.322  0.517
> 0.358
>         year2:season2  5.046195 2.24637  -0.149 -0.469  0.223 -0.053
> 0.015 -0.308
>         year3:season2  5.490231 2.34312   0.078 -0.139 -0.418  0.243
> -0.007  0.019 -0.294
>         year4:season2  5.353655 2.31380   0.135 -0.204 -0.056 -0.583
> 0.086 -0.254  0.009 -0.318
>         year5:season2  5.400134 2.32382  -0.053  0.047  0.068 -0.044
> -0.472 -0.068  0.074 -0.018
> Residual                0.010340
> 0.10169
>
> -0.301
>
> Number of obs: 1800, groups: site, 60
>
> Fixed effects:
>               Estimate Std. Error t value
> (Intercept)    17.473505   0.497980   35.09
> year2           0.731252   0.386499    1.89
> year3           1.965976   0.382058    5.15
> year4          -1.014649   0.380309   -2.67
> year5           0.688820   0.351081    1.96
> season2        -6.255910   0.267479  -23.39
> year2:season2   1.889409   0.451063    4.19
> year3:season2   1.117068   0.400037    2.79
> year4:season2   2.564384   0.448766    5.71
> year5:season2   1.055936   0.415366    2.54
> season1:month2  1.993234   0.008303  240.07
> season2:month2 -2.029299   0.008303 -244.41
> season1:month3  1.001748   0.008303  120.65
> season2:month3 -2.027852   0.008303 -244.24
>
> Correlation of Fixed Effects:
>           (Intr) year2  year3  year4  year5  seasn2 yr2:s2 yr3:s2
> yr4:s2 yr5:s2 ssn1:2 ssn2:2 ssn1:3
> year2
> -0.235
>
> year3       -0.343
> 0.504
>
> year4       -0.208  0.450
> 0.422
> year5       -0.324  0.402  0.516
> 0.462
> season2     -0.157  0.637  0.322  0.516
> 0.358
> year2:sesn2 -0.003 -0.680 -0.047 -0.341 -0.203
> -0.790
> year3:sesn2  0.163 -0.531 -0.531 -0.162 -0.245 -0.654
> 0.377
> year4:sesn2  0.183 -0.515 -0.229 -0.696 -0.156 -0.765  0.575
> 0.343
> year5:sesn2  0.062 -0.376 -0.159 -0.364 -0.572 -0.693  0.572  0.444
> 0.377
> sesn1:mnth2 -0.008  0.000  0.000  0.000  0.000  0.016  0.000  0.000
> 0.000  0.000
> sesn2:mnth2  0.000  0.000  0.000  0.000  0.000 -0.016  0.000  0.000
> 0.000  0.000  0.000
> sesn1:mnth3 -0.008  0.000  0.000  0.000  0.000  0.016  0.000  0.000
> 0.000  0.000  0.500  0.000
> sesn2:mnth3  0.000  0.000  0.000  0.000  0.000 -0.016  0.000  0.000
> 0.000  0.000  0.000  0.500  0.000
>>
>
>
>
> The variance estimates for the random effects look large, but maybe
> because of the way I have parameterized the model I am estimating some
> combination of sigma_mu and sigma_beta?
>
>
> Cheers,
>
>
>
> Murray Jorgensen
>
>
> --
> Dr Murray Jorgensen      http://www.stats.waikato.ac.nz/Staff/maj.html
> Department of Statistics, University of Waikato, Hamilton, New Zealand
> Email: maj at waikato.ac.nz                                Fax 7 838 4155
> Phone  +64 7 838 4773 wk    Home +64 7 825 0441   Mobile 021 0200 8350
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>




More information about the R-sig-mixed-models mailing list