[R-sig-ME] What is the appropriate zero-correlation parameter model for factors in lmer?

Tue May 22 09:45:36 CEST 2018

Ok, I figured out the answer to the question about fm2.

fm2 is indeed a very nice baseline for fm3 and fm4. So I distinguish
between min1LMM and min2LMM.

 fm1 = y ~ 1 + f + (1 | g)               # minimal LMM version 1  (min1LMM)
 fm2 = y ~ 1 + f + (1 | f:g)             # minimal LMM version 2  (min2LMM)
 fm3 = y ~ 1 + f + (0 + f || g)          # zcpLMM with 0 in RE (zcpLMM_RE0)
 fm4 = y ~ 1 + f + (1 | g) + (1 | f:g)   # LMM w/ f x g interaction (intLMM)
 fm5 = y ~ 1 + f + (1 | g) + (0 + f | g) # N/A
 fm6 = y ~ 1 + f + (1 + f |  g)          # maximal LMM (maxLMM)
 fm7 = y ~ 1 + f + (1 + f || g)          # zcpLMM with 1 in RE (zcpLMM_RE1)

(1) maxLMM_RE1 -> intLMM     -> min1LMM  # fm6 -> fm4 -> fm1
(2) maxLMM_RE1 -> intLMM     -> min2LMM  # fm6 -> fm4 -> fm2
(3) maxLMM_RE0 -> zcpLMM_RE0 -> min2LMM  # fm6 -> fm3 -> fm2
(4) maxLMM_RE1 -> zcpLMM_RE1 -> min1LMM  # fm6 -> fm7 -> fm1  (new sequence)

On Tue, May 22, 2018 at 12:21 AM, Reinhold Kliegl <reinhold.kliegl at gmail.com
> wrote:

> Sorry, I am somewhat late to this conversation. I am responding to this
> thread, because it fits my comment very well, but it was initially
> triggered by a previous thread, especially Rune Haubo's post here [1]. So I
> hope it is ok to continue here.
>
> I have a few comments and questions. For details I refer to an RPub I put
> up along with this post [2]. I start with a translation between Rune
> Haubo's fm's and the terminology I use in the RPub:
>
>  fm1 = y ~ 1 + f + (1 | g)            # minimal LMM (minLMM)
>  fm3 = y ~ 1 + f + (0 + f || g)       # zero-corr param LMM with 0 in RE
> (zcpLMM_RE0)
>  fm4 = y ~ 1 + f + (1 | g) + (1|f:g)  # LMM w/ fixed x random factor
> interaction (intLMM),
>  fm6 = y ~ 1 + f + (1 + f |  g)       # maximal LMM (maxLMM)
>  fm7 = y ~ 1 + f + (1 + f || g)       # zero-corr param LMM with 1 in RE
> (zcpLMM_RE1)
>
> Notes: f is a fixed factor, g is a group (random) factor; fm1 to fm6 are
> in Rune Haubo's post; fm7 is new (added by me). I have not used fm2 and fm5
> so far (see below).
>
> (I) The post was triggered by the question whether intLMM is nested under
> zcpLMM. I had included this LRT in my older RPub cited in the thread, but I
> stand corrected and agree with Rune Haubo that intLMM is not nested under
> zcpLMM. For example, in the new RPub, I show that slightly modified
> Machines data exhibit smaller deviance for intLMM than zcpLMM despite an
> additional model parameter in the latter. Thanks for the critical reading.
>
>
> (II) Here are Runo Haubo's sequences (left, resorted) augmented with my
> translation (right)
>
> (1) fm6 -> fm5 -> fm4 -> fm1  # maxLMM_RE1 -> fm5 -> intLMM     -> minLMM
> (2) fm6 -> fm5 -> fm4 -> fm2  # maxLMM_RE1 -> fm5 -> intLMM     -> fm2
> (3) fm6 -> fm5 -> fm3 -> fm2  # maxLMM_RE1 -> fm5 -> zcpLMM_RE0 -> fm2
>
> and here are sequences I came up with (left) augmented with translation
> into RH's fm's.
>
> (1) maxLMM_RE1 -> intLMM     -> minLMM  # fm6 -> fm4 -> fm1
> (3) maxLMM_RE0 -> zcpLMM_RE0            # fm6 -> fm3
> (4) maxLMM_RE1 -> zcpLMM_RE1 -> minLMM  # fm6 -> fm7 -> fm1  (new sequence)
>
>
> (III) I have questions about fm2 and fm5.
>    fm2: fm2 redefines the levels of the group factor (e.g., in the cake
> data there are 45 groups in fm2 compared to 15 in the other models). Why is
> fm2 nested under fm3 and fm6? Somehow it looks to me that you include an
> f:g interaction without the g main effect (relative to fm4). This looks
> like an interesting model; I would appreciate a bit more conceptual support
> for its interpretation in the model hierarchy.
>    fm5: fm5 specifies 4 variance components (VCs), but the factor has only
> 3 levels. So to me this looks like there is redundancy built into the
> model. In support of this intuition, for the cake data, one of the VCs is
> estimated with 0. However, in the Machine data the model was not
> degenerate. So I am not sure. In other words, if the factor levels are A,
> B, C, and the two contrasts are c1 and c2, I thought I can specify either
> (1 + c1 + c2) or (0 + A + B + C). fm5 specifies (1 + A + B + C) which is
> rank deficient in the fixed effect part, but not necessarily in the
> random-effect term. What am I missing here?
>
> [1] https://stat.ethz.ch/pipermail/r-sig-mixed-models/2018q2/026775.html
> [2] http://rpubs.com/Reinhold/391027
>
> Best,
> Reinhold Kliegl
>
>
> On Thu, May 17, 2018 at 12:43 PM, Maarten Jung <Maarten.Jung at mailbox.tu-
> dresden.de> wrote:
> >
> > Dear list,
> >
> > When one wants to specify a lmer model including variance components but
> no
> > correlation parameters for categorical predictors (factors) afaik one has
> > to convert the factors to numeric covariates or use lme4::dummy(). Until
> > recently I thought m2a (or equivalently m2b using the double-bar syntax)
> > would be the correct way to specify such a zero-correlation parameter
> model.
> >
> > But in this thread [1] Rune Haubo Bojesen Christensen pointed out that
> this
> > model does not make sense to him. Instead he suggests m3 as an
> appropriate
> > model.
> > I think this is a *highly relevant difference* for everyone who uses
> > factors in lmer and therefore I'm bringing up this issue again. But maybe
> > I'm mistaken and just don't get what is quite obvious for more
> experienced
> > mixed modelers.
> > Please note that the question is on CrossValidated [2] but some consider
> it
> > as off-topic and I don't think there will be an answer any time soon.
> >
> > So here are my questions:
> > How should one specify a lmm without correlation parameters for factors
> and
> > what are the differences between m2a and m3?
> > Is there a preferred model for model comparison with m4 (this model is
> also
> > discussed here [3])?
> >
> > library("lme4")
> > data("Machines", package = "MEMSS")
> >
> > d <- Machines
> > contrasts(d$Machine)  # default coding: contr.sum
> >
> > m1 <- lmer(score ~ Machine + (Machine | Worker), d)
> >
> > c1 <- model.matrix(m1)[, 2]
> > c2 <- model.matrix(m1)[, 3]
> > m2a <- lmer(score ~ Machine + (1 | Worker) + (0 + c1 | Worker) + (0 + c2
> |
> > Worker), d)
> > m2b <- lmer(score ~ Machine + (c1 + c2 || Worker), d)
> > VarCorr(m2a)
> >  Groups   Name        Std.Dev.
> >  Worker   (Intercept) 5.24354
> >  Worker.1 c1          2.58446
> >  Worker.2 c2          3.71504
> >  Residual             0.96256
> >
> > m3 <- lmer(score ~ Machine + (1 | Worker) + (0 + dummy(Machine, "A") |
> > Worker) +
> >                                             (0 + dummy(Machine, "B") |
> > Worker) +
> >                                             (0 + dummy(Machine, "C") |
> > Worker), d)
> > VarCorr(m3)
> >  Groups   Name                Std.Dev.
> >  Worker   (Intercept)         3.78595
> >  Worker.1 dummy(Machine, "A") 1.94032
> >  Worker.2 dummy(Machine, "B") 5.87402
> >  Worker.3 dummy(Machine, "C") 2.84547
> >  Residual                     0.96158
> >
> > m4 <- lmer(score ~ Machine + (1 | Worker) + (1 | Worker:Machine), d)
> >
> >
> > [1] https://stat.ethz.ch/pipermail/r-sig-mixed-models/2018q2/026775.html
> > [2] https://stats.stackexchange.com/q/345842/136579
> > [3] https://stats.stackexchange.com/q/304374/136579
> >
> > Best regards,
> > Maarten
> >
> >         [[alternative HTML version deleted]]
> >
> > _______________________________________________
> > R-sig-mixed-models at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>
>

	[[alternative HTML version deleted]]