[R-sig-ME] Another question about model specification

Tue Mar 4 21:42:42 CET 2008

On Tue, Mar 4, 2008 at 11:46 AM, Martin Eklund
<martin.eklund at farmbio.uu.se> wrote:
> Hello again,

>  Thank you very much for the quick reply on my earlier question!
>  Unfortunately, I still have problems to fully understand how to
>  specify models in lme4. Specifically, how do you specify interactions
>  between factors?

>  Consider the example data below (also attached for convenience).
>  'treatment' and 'family' are factors, 'treatment' is fixed and
>  'family' is random. Fitting the models

>  (i)     lmer(weight ~ treatment + (1|family))
>  (ii)    lmer(weight ~ treatment * (1|family))
>  (iii)   lmer(weight ~ (treatment + (1|family))^2)
>  (iv)    lmer(weight ~ treatment : (1|family))

>  produce exactly the same results. I assume that this is the expected
>  behavior, but to me it is a bit confusing.

I started writing a kind of a grumpy response to this inquiry because
the documentation for lmer specifically states that terms in the
formula are separated by '+' symbols, so exploring the possible
effects of other types of operators is not likely to be productive.

Nevertheless I will concede that specifying an interaction between a
fixed-effects factor and a random factor is not a transparent
operation.  There are two ways that such an interaction can be
specified.

First, an interaction between a fixed factor and a random factor is a
random effect.  That is, randomness propagates through interactions.
To specify an interaction between a fixed factor and a random factor
you must consider not only the form of the model matrix that results
but also the form of the variance-covariance matrix for the resulting
random effects.  You can specify the interaction as

lmer(weight ~ treatment + (1| family) + (1|family:treatment))

or as

lmer(weight ~ treatment + (treatment | family))

In the first specification you will estimate two variance components,
one for family and one for the family:treatment interaction.  This
assumes that the random effects for each level of the treatment within
a family are independent and have the same variance.

In the second specification you estimate a vector of random effects
for each family where the length of the vector is the number of
treatments.  Let's say there are m treatments.  Then you will estimate
a full m by m variance-covariance matrix for the random effects.  This
becomes problematic when m is large.

That is, the first model is simpler than the second - in fact, it is a
submodel of the second model.

Here is an example using the (probably fictitious) Machines data from
the MEMSS package.

> data(Machines, package = "MEMSS")
> str(Machines)
'data.frame':	54 obs. of  3 variables:
 $ Worker : Factor w/ 6 levels "1","2","3","4",..: 1 1 1 2 2 2 3 3 3 4 ...
 $ Machine: Factor w/ 3 levels "A","B","C": 1 1 1 1 1 1 1 1 1 1 ...
 $ score  : num  52 52.8 53.1 51.8 52.8 53.1 60 60.2 58.4 51.1 ...
> (Mm1 <- lmer(score ~ Machine + (1|Worker) + (1|Worker:Machine), Machines))
Linear mixed model fit by REML
Formula: score ~ Machine + (1 | Worker) + (1 | Worker:Machine)
   Data: Machines
   AIC   BIC logLik deviance REMLdev
 227.7 239.6 -107.8    225.5   215.7
Random effects:
 Groups         Name        Variance Std.Dev.
 Worker         (Intercept) 22.85892 4.78110
 Worker:Machine (Intercept) 13.90845 3.72940
 Residual                    0.92465 0.96159
Number of obs: 54, groups: Worker, 6; Worker:Machine, 18

Fixed effects:
            Estimate Std. Error t value
(Intercept)   52.356      2.486  21.062
MachineB       7.967      2.177   3.660
MachineC      13.917      2.177   6.393

Correlation of Fixed Effects:
         (Intr) MachnB
MachineB -0.438
MachineC -0.438  0.500
> (Mm2 <- lmer(score ~ Machine + (Machine|Worker), Machines))
Linear mixed model fit by REML
Formula: score ~ Machine + (Machine | Worker)
   Data: Machines
   AIC   BIC logLik deviance REMLdev
 228.3 248.2 -104.2    216.6   208.3
Random effects:
 Groups   Name        Variance Std.Dev. Corr
 Worker   (Intercept) 16.64051 4.07928
          MachineB    34.54670 5.87764   0.484
          MachineC    13.61398 3.68971  -0.365  0.297
 Residual              0.92463 0.96158
Number of obs: 54, groups: Worker, 6

Fixed effects:
            Estimate Std. Error t value
(Intercept)   52.356      1.681  31.151
MachineB       7.967      2.421   3.291
MachineC      13.917      1.540   9.037

Correlation of Fixed Effects:
         (Intr) MachnB
MachineB  0.463
MachineC -0.374  0.301
> anova(Mm2, Mm1)
Data: Machines
Models:
Mm1: score ~ Machine + (1 | Worker) + (1 | Worker:Machine)
Mm2: score ~ Machine + (Machine | Worker)
      Df     AIC     BIC  logLik  Chisq Chi Df Pr(>Chisq)
Mm1.p  6  237.47  249.40 -112.73
Mm2.p 10  236.61  256.50 -108.31 8.8515      4    0.06492

>  If we pretend that
>  'family' is a fixed factor and use lm to fit the models above we get
>  different results (or, to be precise, (ii) = (iii) ≠ (i) ≠ (iv)),
>  which is the expected behavior to me. So, my question really boils
>  down to: if I want to fit the model
>
>  weight ~ treatment + (1|family) + (an interaction term between
>  treatment and (1|family))
>
>  how would I specify that? And in general, how are interactions
>  specified?
>
>  Thank you!
>
>  Best regards,
>
>  Martin.
>
>  ==========
>  R commands
>  ==========
>  data <- read.delim(file="/path/to/test.txt", sep="\t", header=TRUE)
>  data$treatment <- as.factor(data$treatment)
>  data$family <- as.factor(data$family)
>  attach(data)
>  lmer(weight ~ treatment + (1|family))
>  lmer(weight ~ treatment * (1|family))
>  lmer(weight ~ (treatment + (1|family))^2)
>  lmer(weight ~ treatment : (1|family))
>
>  ===========
>  Example data
>  ===========
>  treatment       family  weight
>  2       1       0.099
>  3       1       0.099
>  1       1       0.105
>  1       1       0.112
>  1       1       0.113
>  1       1       0.114
>  1       1       0.119
>  3       1       0.121
>  1       1       0.123
>  1       1       0.124
>  3       1       0.130
>  3       1       0.130
>  2       1       0.142
>  2       1       0.143
>  2       1       0.144
>  2       1       0.164
>  2       2       0.086
>  1       2       0.097
>  1       2       0.097
>  1       2       0.105
>  1       2       0.106
>  2       2       0.107
>  1       2       0.109
>  2       2       0.113
>  2       2       0.114
>  1       2       0.114
>  3       2       0.115
>  2       2       0.132
>  3       2       0.141
>  2       2       0.131
>
>  
>
>  ========================================
>  Martin Eklund
>  PhD Student
>  Department of Pharmaceutical Biosciences
>  Uppsala University, Sweden
>  Ph: +46-18-4714281
>  ========================================
>
>
>
>
>  _______________________________________________
>  R-sig-mixed-models at r-project.org mailing list
>  https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>