[R-sig-ME] Different random effects variances for outcomes and groups

Wed Sep 19 14:28:03 CEST 2007

Dear all,

I wonder if someone could give me some insight on coding lmer.  I have 
facial shape data on children in two groups at four time points (3,6,12 
and 24 months).  Each child has a set of coordinate positions measured 
on their face at each time point (the set of coordinates is the same 
across individuals and times).  Take coordinates 1 and 2 only for now 
(reproducible code at the bottom of this email for simulated data).

If I plot the trends for coordinates 1 and 2 for each individual over 
time, there is a different amount of variance amongst the individuals 
(at least in the intercept, and maybe in the slope) for the two 
coordinates and also within the two groups, with group 1 (cleft) having 
higher variation than group 0 (control).  I want to allow for these 
sources of variation in the model.  The other thing is that I would 
expect coordinate positions within an individual to be correlated so I 
also want to allow for this.  The model, therefore, would be (for 
coordinate r=1,2 measured on individual i at time t, group_i an 
indicator variable taking value one for group 1 and zero otherwise):

y_{ir}(t) = \beta_{0r} + \beta_{1r} group_i + \beta_{2r} t + \beta_{3r} 
group_i : t  + b_{ir0} + group_i * b_{ir1} + \epsilon_{ir}(t),

where \epsilon_{ir}(t) ~ N(0, \sigma2) and the random effect b_{irp} ~ 
N(0, \sigma_{rp}2), for p=0,1.   I think the following code is 
appropriate (model 1):

lmer2(y~-1+coord1+coord2+coord1:(time+group+time:group)+coord2:(time+group+time:group) 
+ (0+coord1+coord2+coord1:group+coord2:group|ID), data=simdata)

where coord1 and coord2 are indicator variables for coordinates 1 and 2, 
respectively, time is continuous and group is an indicator variable 
taking value one for the cleft group and zero for the controls.  Does 
the random effects part make sense?  I'm especially unsure about 
allowing correlations between all of the random effects terms, although 
I think that's it's appropriate under this parameterisation because each 
person has a value for both coordinates 1 and 2, and the group effect is 
additional. 

An alternative parameterisation is (model 2):

lmer2(y~-1+coord1+coord2+coord1:(time+group+time:group)+coord2:(time+group+time:group) 
+ (0+gp0:coord1+gp0:coord2|ID)+(0+coord1:group+coord2:group|ID), 
data=simdata),

where gp0 is an indicator variable taking value one if the individual is 
in group 0 and zero otherwise.  It seems to me that this should be 
equivalent to model 1, but it doesn't appear to be (perhaps this just 
comes down to fewer correlations estimated in model 2).

If a correlation between random effects is estimated to be 1 or -1, is 
this generally because the model is over-parameterised?

Reproducible code is below.

set.seed(100)
n.subj <- 200
n.times <- 3
n.coords <- 2
simdata <- data.frame(coord1=c(rep(1,n.subj*n.times), 
rep(0,n.subj*n.times)))
simdata$coord2 <- c(rep(0,n.subj*n.times), rep(1,n.subj*n.times))
simdata$coord <- ifelse(simdata$coord1==1, 1, 2)
simdata$ID <- rep(rep(1:n.subj, each=n.times),2)
simdata$time <- rep(1:n.times, n.subj*n.coords)
simdata$group <- rep(c(1,0,1,0), each=n.subj*n.times/2)
simdata$gp0 <- 1-simdata$group
simdata$y <- rep(NA, dim(simdata)[1])
randeff <- c(rep(rnorm(n.subj/2, 0, 15),each=n.times), 
rep(rnorm(n.subj/2, 0, 10), each=n.times), rep(rnorm(n.subj/2, 0, 25), 
each=n.times), rep(rnorm(n.subj/2, 0, 20), each=3))
for (i in 1:dim(simdata)[1]) simdata$y[i] <- rnorm(1, 
randeff[i]+simdata$time[i]+simdata$coord[i]+10*simdata$group[i], 16)

lmer2(y~-1+coord1+coord2+coord1:(time+group+time:group)+coord2:(time+group+time:group) 
+ (0+coord1+coord2+coord1:group+coord2:group|ID), data=simdata)
lmer2(y~-1+coord1+coord2+coord1:(time+group+time:group)+coord2:(time+group+time:group) 
+ (0+gp0:coord1+gp0:coord2|ID)+(0+coord1:group+coord2:group|ID), 
data=simdata),

Many thanks for any help.

Best regards,
Sarah

-- 
Sarah Barry, MSc
Department of Statistics
University of Glasgow
Tel: +44 (0)141 330 2474
Fax: +44 (0)141 330 4814
www.stats.gla.ac.uk/~sarah