[R-sig-ME] R-sig-mixed-models Digest, Vol 136, Issue 41

Rune Haubo rune.haubo at gmail.com
Fri May 4 09:56:55 CEST 2018


On 2 May 2018 at 17:50, Maarten Jung <Maarten.Jung at mailbox.tu-dresden.de> wrote:
> Thank you for explaining this. This is *very* interesting.
> As far as I understand, m_zcp5 is the model Reinhold Kliegl uses in
> this RPub article[1] (actually m_zcp7 which should be identical). Also
> Barr et al. (2013)[2], Bates et al. (2015)[3] and Matuschek et al.
> (2017)[4] suggest similar models as the first step for model
> reduction. However, their categorical independent variables have only
> two levels and they work with crossed random effects.

I haven't read those articles recently enough, or in enough detail, to comment.

>
> cake3 <- cake
> cake3 <- subset(cake3, recipe != "C")
> cake3$recipe <- factor(cake3$recipe)
> contrasts(cake3$recipe) <- c(0.5, -0.5)  # Barr and Matuschek use effect coding
> m_zcp5 <- lmer_alt(angle ~ recipe  + (recipe || replicate), cake3)
> VarCorr(m_zcp5)
>  Groups      Name        Std.Dev.
>  replicate   (Intercept) 5.8077
>  replicate.1 re1.recipe1 1.8601
>  Residual                5.5366
>
> cake3$recipe_numeric <- ifelse(cake3$recipe == "A", 0.5, -0.5)
> m_zcp7 <- lmer(angle ~ recipe_numeric + (1|replicate) + (0 +
> recipe_numeric|replicate), cake3)
> VarCorr(m_zcp7)
>  Groups      Name           Std.Dev.
>  replicate   (Intercept)    5.8077
>  replicate.1 recipe_numeric 1.8601
>  Residual                   5.5366

So m_zcp5 and m_zcp7 are identical but I don't see how they are
meaningful in this context. Looking at the random-effect design matrix

image(getME(m_zcp5, "Z")) # identical to image(getME(m_zcp7, "Z"))

you can see that this model estimates a random main effect for
replicate (i.e. (1 | replicate)) and then a random _slope_ for recipe
at each replicate (i.e. recipe in '(recipe || replicate)' is treated
as numeric rather than as a factor). As far as I can tell this random
slope model is _unrelated_ to the models we have discussed previously,
where recipe is treated as a factor: it is a completely different model
and I don't see how it is relevant for this design. (Notice that
'recipe' is equivalent to 'recipe_numeric' in the fixed effects, but
not in the random effects!)
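
A quick check (just a sketch) that the two fits coincide: the
random-effects design matrices agree element-wise (only the column
names differ) and so do the log-likelihoods.

all.equal(unname(as.matrix(getME(m_zcp5, "Z"))),
          unname(as.matrix(getME(m_zcp7, "Z"))))  # TRUE
c(logLik(m_zcp5), logLik(m_zcp7))                 # identical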

>
> Besides that, Reinhold Kliegl reduces m_zcp5 to Model3b - i.e. (recipe
> || replicate) to (1 | replicate) + (1 | recipe:replicate).
> Whereas you, if I understand correctly, suggest reducing/comparing (0
> + recipe || replicate) to (1 | recipe:replicate).
> Why is that? Am I missing something?

If anything I would say that you should look at all relevant models
and choose the one that represents the best compromise between fit to
data and complexity :-) Likelihood ratio tests can be a helpful guide,
but take care not to formally compare/test models that are not nested.

Here is an example of a set of models and sequences in which they can
be compared with LR tests:

# Random main effect of replicate, no interaction:
fm1 <- lmer(angle ~ recipe + (1 | replicate), data=cake)
# Random interaction recipe:replicate; same variance across recipes; no main effect:
fm2 <- lmer(angle ~ recipe + (1 | recipe:replicate), data=cake)
# Random interaction with different variances across recipes; no main effect:
fm3 <- lmer(angle ~ recipe +
              (0 + dummy(recipe, "A") | replicate) +
              (0 + dummy(recipe, "B") | replicate) +
              (0 + dummy(recipe, "C") | replicate), data=cake)
# Random main effect and interaction with same variance across recipes:
fm4 <- lmer(angle ~ recipe + (1 | replicate) + (1 | recipe:replicate),
            data=cake)
# Random main effect and interaction with different variances across recipes:
fm5 <- lmer(angle ~ recipe + (1 | replicate) +
              (0 + dummy(recipe, "A") | replicate) +
              (0 + dummy(recipe, "B") | replicate) +
              (0 + dummy(recipe, "C") | replicate), data=cake)
# Multivariate structure that contains both main and interaction effects with
# different variances and correlations:
fm6 <- lmer(angle ~ recipe + (0 + recipe | replicate), data=cake)
# Same model, just re-parameterized:
# fm6b <- lmer(angle ~ recipe + (recipe | replicate), data=cake)
# fm6c <- lmer(angle ~ recipe + (1 | replicate) + (0 + recipe | replicate), data=cake)
# fm6d <- lmer(angle ~ recipe + (1 | replicate) + (recipe | replicate), data=cake)
# fm6e <- lmer(angle ~ recipe + (1 | recipe:replicate) + (recipe | replicate), data=cake)
# anova(fm6, fm6b, fm6c, fm6d, fm6e, refit=FALSE)  # fm6 = fm6b = fm6c = fm6d = fm6e

Note that in fm4 and fm5 the random main and interaction effects are
independent, but in fm6 they are not.
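
A quick way to see this (a sketch): compare the printed random-effects
structures; fm5 shows independent variance components with no 'Corr'
columns, while fm6 shows the three recipe variances together with their
pairwise correlations within replicate.

VarCorr(fm5)  # independent variance components, no correlations
VarCorr(fm6)  # 3 variances plus pairwise correlations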

No. parameters and log-likelihood/deviance of these models:
as.data.frame(anova(fm1, fm2, fm3, fm4, fm5, fm6,
                    refit=FALSE))[paste0("fm", 1:6), 1:5]
    Df      AIC      BIC    logLik deviance
fm1  5 1741.019 1759.011 -865.5097 1731.019
fm2  5 1776.967 1794.959 -883.4835 1766.967
fm3  7 1779.571 1804.760 -882.7857 1765.571
fm4  6 1741.067 1762.658 -864.5337 1729.067
fm5  8 1743.437 1772.224 -863.7185 1727.437
fm6 10 1741.003 1776.987 -860.5016 1721.003
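
To see how these numbers feed the LR tests indicated below (a sketch;
anova() reports the same quantities): comparing fm4 with fm6 uses the
drop in deviance on the difference in Df, here 1729.067 - 1721.003 =
8.064 on 10 - 6 = 4 degrees of freedom.

pchisq(1729.067 - 1721.003, df = 10 - 6, lower.tail = FALSE)  # approx. 0.09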

The following nesting structure indicates sequences in which the models can be
compared with LR tests (arrows indicate model simplification):
fm6 -> fm5 -> fm4 -> fm2
fm6 -> fm5 -> fm4 -> fm1
fm6 -> fm5 -> fm3 -> fm2

Note that fm3 and fm4 are not nested and simply represent different structures,
and so no formal LR test is available. The same is true for fm1 and fm2, as
well as for fm1 and fm3.
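
As an illustration (a sketch, output not shown), the first of these
sequences can be tested in a single call; refit = FALSE keeps the REML
fits, and the usual caveat applies that LR tests of variance parameters
on their boundary tend to be conservative:

anova(fm2, fm4, fm5, fm6, refit = FALSE)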

In addition to these models there are others which are not as easily
fitted with lmer (to the best of my knowledge), for example a version
of fm5 where the interaction random effect is specified with a common
covariance parameter on top of the 3 variances. Theoretically there are
many options here, but obtaining the fits is often not straightforward
and usually no single fit is uniquely better than the rest.

Cheers
Rune

>
> Cheers,
> Maarten
>
> [1] https://rpubs.com/Reinhold/22193
> [2] https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3881361/
> [3] https://arxiv.org/abs/1506.04967 vignettes here:
> https://github.com/dmbates/RePsychLing/tree/master/vignettes
> [4] https://arxiv.org/abs/1511.01864
>
> On Wed, May 2, 2018 at 11:56 AM, Rune Haubo <rune.haubo at gmail.com> wrote:
>> On 2 May 2018 at 00:27, Maarten Jung <Maarten.Jung at mailbox.tu-dresden.de> wrote:
>>> Sorry, I forgot that lmer() (unlike lmer_alt() from the afex package)
>>> does not convert factors to numeric covariates when using the
>>> double-bar notation!
>>> The model I was talking about would be:
>>>
>>> m_zcp5 <- lmer_alt(angle ~ recipe  + (recipe || replicate), cake)
>>> VarCorr(m_zcp5)
>>>  Groups      Name        Std.Dev.
>>>  replicate   (Intercept) 6.2359
>>>  replicate.1 re1.recipe1 1.7034
>>>  replicate.2 re1.recipe2 0.0000
>>>  Residual                5.3775
>>>
>>> This model seems to differ (and I don't really understand why) from
>>> m_zcp6 which I think is equivalent to your m_zcp4:
>>> m_zcp6 <- lmer_alt(angle ~ recipe  + (0 + recipe || replicate), cake)
>>> VarCorr(m_zcp6)
>>>  Groups      Name        Std.Dev.
>>>  replicate   re1.recipeA 5.0429
>>>  replicate.1 re1.recipeB 6.6476
>>>  replicate.2 re1.recipeC 7.1727
>>>  Residual                5.4181
>>>
>>> anova(m_zcp6, m_zcp5, refit = FALSE)
>>> Data: data
>>> Models:
>>> m_zcp6: angle ~ recipe + ((0 + re1.recipeA | replicate) + (0 + re1.recipeB |
>>> m_zcp6:     replicate) + (0 + re1.recipeC | replicate))
>>> m_zcp5: angle ~ recipe + ((1 | replicate) + (0 + re1.recipe1 | replicate) +
>>> m_zcp5:     (0 + re1.recipe2 | replicate))
>>>        Df    AIC    BIC  logLik deviance  Chisq Chi Df Pr(>Chisq)
>>> m_zcp6  7 1781.8 1807.0 -883.88   1767.8
>>> m_zcp5  7 1742.0 1767.2 -863.98   1728.0 39.807      0  < 2.2e-16 ***
>>>
>>
>> Yes, m_zcp4 and m_zcp6 are identical.
>>
>> For m_zcp5 I get:
>> m_zcp5 <- lmer_alt(angle ~ recipe  + (recipe || replicate), cake)
>> VarCorr(m_zcp5)
>>  Groups      Name        Std.Dev.
>>  replicate   (Intercept) 6.0528e+00
>>  replicate.1 re1.recipeB 5.8203e-07
>>  replicate.2 re1.recipeC 2.1303e+00
>>  Residual                5.4693e+00
>>
>> and if we change the reference level for recipe we get yet another result:
>> cake2 <- cake
>> cake2$recipe <- relevel(cake2$recipe, "C")
>> m_zcp5b <- lmer_alt(angle ~ recipe  + (recipe || replicate), cake2)
>> VarCorr(m_zcp5b)
>>  Groups      Name        Std.Dev.
>>  replicate   (Intercept) 6.5495e+00
>>  replicate.1 re1.recipeA 2.5561e+00
>>  replicate.2 re1.recipeB 1.0259e-07
>>  Residual                5.4061e+00
>> This instability indicates that something fishy is going on...
>>
>> The correlation parameters are needed in the "default" representation:
>> (recipe | replicate) and (0 + recipe | replicate) are equivalent
>> because the correlation parameters make the "appropriate adjustments",
>> but (recipe || replicate) is _not_ the same as (0 + recipe ||
>> replicate) with afex::lmer_alt. I might take it as far as to say that
>> (recipe | replicate) is meaningful because it is a re-parameterization
>> of (0 + recipe | replicate). On the other hand, while the diagonal
>> variance-covariance matrix parameterized by (0 + recipe || replicate)
>> is meaningful, a model with (recipe || replicate) using afex::lmer_alt
>> does _not_ make sense to me (and does not represent a diagonal
>> variance-covariance matrix).
>>
>>> Do m_zcp5 and Model3b estimate the same random effects in this case?
>>
>> Well, Model3b makes sense while m_zcp5 does not, but Model3b estimates
>> more random effects than the others:
>> Model3b <- lmerTest::lmer(angle ~ recipe + (1 | replicate) + (1 | recipe:replicate),
>>                           data=cake)
>> length(unlist(ranef(Model3b))) # 60
>> length(unlist(ranef(m_zcp4))) # 45 - same for m_zcp, m_zcp2, m_zcp6 and Model2
>>
>>> If not, what is the difference between m_zcp5 and Model3b (except for
>>> the fact that the variance depends on the
>>> recipe in m_zcp5) and which one is the more complex model?
>>
>> There is no unique 'complexity' ordering: for example, Model3b uses 2
>> random-effect variance-covariance parameters to represent 60 random
>> effects, while m_zcp4 (m_zcp2) uses 3 (6) random-effect
>> variance-covariance parameters to represent 45 random effects. But
>> usually the relevant 'complexity' scale is the number of parameters,
>> cf. likelihood ratio tests, AIC, BIC etc. There are corner-cases,
>> however; if x1 and x2 are continuous then (1 + x1 + x2 | group) and
>> '(1 + x1 | group) + (1 + x2 | group)' both use 6 random-effect
>> variance-covariance parameters, but the models represent different
>> structures and you can argue that the latter formulation is less
>> complex than the former since it avoids the correlation between x1 and
>> x2.
>>
>> Cheers,
>> Rune
>>
>>> I would be glad if you could elaborate on this and help me and the
>>> others understand these models.
>>>
>>> Cheers,
>>> Maarten
>>>


