[R-sig-ME] Another question about model specification
Douglas Bates
bates at stat.wisc.edu
Tue Mar 4 21:42:42 CET 2008
On Tue, Mar 4, 2008 at 11:46 AM, Martin Eklund
<martin.eklund at farmbio.uu.se> wrote:
> Hello again,
> Thank you very much for the quick reply on my earlier question!
> Unfortunately, I still have problems to fully understand how to
> specify models in lme4. Specifically, how do you specify interactions
> between factors?
> Consider the example data below (also attached for convenience).
> 'treatment' and 'family' are factors, 'treatment' is fixed and
> 'family' is random. Fitting the models
> (i) lmer(weight ~ treatment + (1|family))
> (ii) lmer(weight ~ treatment * (1|family))
> (iii) lmer(weight ~ (treatment + (1|family))^2)
> (iv) lmer(weight ~ treatment : (1|family))
> produce exactly the same results. I assume that this is the expected
> behavior, but to me it is a bit confusing.
I started writing a kind of a grumpy response to this inquiry because
the documentation for lmer specifically states that terms in the
formula are separated by '+' symbols, so exploring the possible
effects of other types of operators is not likely to be productive.
Nevertheless I will concede that specifying an interaction between a
fixed-effects factor and a random factor is not a transparent
operation. There are two ways that such an interaction can be
specified.
First, an interaction between a fixed factor and a random factor is a
random effect. That is, randomness propagates through interactions.
To specify an interaction between a fixed factor and a random factor
you must consider not only the form of the model matrix that results
but also the form of the variance-covariance matrix for the resulting
random effects. You can specify the interaction as
lmer(weight ~ treatment + (1| family) + (1|family:treatment))
or as
lmer(weight ~ treatment + (treatment | family))
In the first specification you will estimate two variance components,
one for family and one for the family:treatment interaction. This
assumes that the random effects for each level of the treatment within
a family are independent and have the same variance.
In the second specification you estimate a vector of random effects
for each family where the length of the vector is the number of
treatments. Let's say there are m treatments. Then you will estimate
a full m by m variance-covariance matrix for the random effects. This
becomes problematic when m is large.
That is, the first model is simpler than the second - in fact, it is a
submodel of the second model.
Here is an example using the (probably fictitious) Machines data from
the MEMSS package.
> data(Machines, package = "MEMSS")
> str(Machines)
'data.frame': 54 obs. of 3 variables:
$ Worker : Factor w/ 6 levels "1","2","3","4",..: 1 1 1 2 2 2 3 3 3 4 ...
$ Machine: Factor w/ 3 levels "A","B","C": 1 1 1 1 1 1 1 1 1 1 ...
$ score : num 52 52.8 53.1 51.8 52.8 53.1 60 60.2 58.4 51.1 ...
> (Mm1 <- lmer(score ~ Machine + (1|Worker) + (1|Worker:Machine), Machines))
Linear mixed model fit by REML
Formula: score ~ Machine + (1 | Worker) + (1 | Worker:Machine)
Data: Machines
AIC BIC logLik deviance REMLdev
227.7 239.6 -107.8 225.5 215.7
Random effects:
Groups Name Variance Std.Dev.
Worker (Intercept) 22.85892 4.78110
Worker:Machine (Intercept) 13.90845 3.72940
Residual 0.92465 0.96159
Number of obs: 54, groups: Worker, 6; Worker:Machine, 18
Fixed effects:
Estimate Std. Error t value
(Intercept) 52.356 2.486 21.062
MachineB 7.967 2.177 3.660
MachineC 13.917 2.177 6.393
Correlation of Fixed Effects:
(Intr) MachnB
MachineB -0.438
MachineC -0.438 0.500
> (Mm2 <- lmer(score ~ Machine + (Machine|Worker), Machines))
Linear mixed model fit by REML
Formula: score ~ Machine + (Machine | Worker)
Data: Machines
AIC BIC logLik deviance REMLdev
228.3 248.2 -104.2 216.6 208.3
Random effects:
Groups Name Variance Std.Dev. Corr
Worker (Intercept) 16.64051 4.07928
MachineB 34.54670 5.87764 0.484
MachineC 13.61398 3.68971 -0.365 0.297
Residual 0.92463 0.96158
Number of obs: 54, groups: Worker, 6
Fixed effects:
Estimate Std. Error t value
(Intercept) 52.356 1.681 31.151
MachineB 7.967 2.421 3.291
MachineC 13.917 1.540 9.037
Correlation of Fixed Effects:
(Intr) MachnB
MachineB 0.463
MachineC -0.374 0.301
> anova(Mm2, Mm1)
Data: Machines
Models:
Mm1: score ~ Machine + (1 | Worker) + (1 | Worker:Machine)
Mm2: score ~ Machine + (Machine | Worker)
Df AIC BIC logLik Chisq Chi Df Pr(>Chisq)
Mm1.p 6 237.47 249.40 -112.73
Mm2.p 10 236.61 256.50 -108.31 8.8515 4 0.06492
> If we pretend that
> 'family' is a fixed factor and use lm to fit the models above we get
> different results (or, to be precise, (ii) = (iii) ≠ (i) ≠ (iv)),
> which is the expected behavior to me. So, my question really boils
> down to: if I want to fit the model
>
> weight ~ treatment + (1|family) + (an interaction term between
> treatment and (1|family))
>
> how would I specify that? And in general, how are interactions
> specified?
>
> Thank you!
>
> Best regards,
>
> Martin.
>
> ==========
> R commands
> ==========
> data <- read.delim(file="/path/to/test.txt", sep="\t", header=TRUE)
> data$treatment <- as.factor(data$treatment)
> data$family <- as.factor(data$family)
> attach(data)
> lmer(weight ~ treatment + (1|family))
> lmer(weight ~ treatment * (1|family))
> lmer(weight ~ (treatment + (1|family))^2)
> lmer(weight ~ treatment : (1|family))
>
> ===========
> Example data
> ===========
> treatment family weight
> 2 1 0.099
> 3 1 0.099
> 1 1 0.105
> 1 1 0.112
> 1 1 0.113
> 1 1 0.114
> 1 1 0.119
> 3 1 0.121
> 1 1 0.123
> 1 1 0.124
> 3 1 0.130
> 3 1 0.130
> 2 1 0.142
> 2 1 0.143
> 2 1 0.144
> 2 1 0.164
> 2 2 0.086
> 1 2 0.097
> 1 2 0.097
> 1 2 0.105
> 1 2 0.106
> 2 2 0.107
> 1 2 0.109
> 2 2 0.113
> 2 2 0.114
> 1 2 0.114
> 3 2 0.115
> 2 2 0.132
> 3 2 0.141
> 2 2 0.131
>
> ￼
>
> ========================================
> Martin Eklund
> PhD Student
> Department of Pharmaceutical Biosciences
> Uppsala University, Sweden
> Ph: +46-18-4714281
> ========================================
>
>
>
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>
More information about the R-sig-mixed-models
mailing list