[R-sig-ME] Removing random intercepts before random slopes

Jake Westfall
Sat Sep 1 16:52:10 CEST 2018


Hi Maarten,

I should point out that all of this, both what I said and what the Barr et
al. paper said, is contingent on the fixed predictor being
contrast/deviation coded, NOT treatment/dummy coded. This is sort of
mentioned in the Barr paper in footnote 12 (attached to the paragraph you
cited on p. 276), but it's not 100% clear, and I probably should have
mentioned it explicitly as well.

If you add `options(contrasts=c("contr.helmert", "contr.poly"))` to the top
of your script you'll see the expected results.
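For concreteness, here is a minimal sketch of that change applied to your
script (same MEMSS Machines data; the only substantive difference is the
contrasts option, which turns c1 and c2 into Helmert codes rather than
dummy codes):

options(contrasts = c("contr.helmert", "contr.poly"))

data("Machines", package = "MEMSS")
d <- Machines

mm <- model.matrix(~ 1 + Machine, d)  # columns 2 and 3 are now Helmert codes
c1 <- mm[, 2]
c2 <- mm[, 3]

# with deviation codes, dropping the random intercepts should leave the
# fixed-slope SEs close to those of the full model
summary(lmerTest::lmer(score ~ 1 + c1 + c2 + (1 + c1 + c2 | Worker), d))
summary(lmerTest::lmer(score ~ 1 + c1 + c2 + (0 + c1 + c2 | Worker), d))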

The reason the coding matters in this way is that if we're using
contrast/deviation codes and the design is approximately balanced, then
removing the random intercepts is the same as constraining all units to
have the same overall mean response -- visually, this just vertically
shifts each of the unit-specific regression lines (actually planes in this
case) so that they all cross X = 0 at the same Y -- but this shift has
little impact on any of the unit-specific slopes, and thus doesn't change
the random slope variance much. Since the random slope variance enters the
standard errors of the fixed slopes while the random intercept variance
does not (because the fixed slopes are effectively difference scores that
subtract out the unit-specific means), the standard errors are mostly
unchanged by this shift.
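To put one (rough) equation on that: for a balanced design with J units
and a centered predictor, the sampling variance of a fixed slope looks
approximately like

  Var(beta_hat) ~= tau^2_slope / J + sigma^2 / Sxx

where tau^2_slope is the random slope variance, sigma^2 is the residual
variance, and Sxx is the total sum of squares of the predictor, which
grows with the number of observations. The random intercept variance
appears nowhere in this expression. (This is only a back-of-the-envelope
sketch under balance, not the exact formula lmer uses.)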

As you point out, the residual variance does expand a little bit to soak
up some of the ignored random intercept variance. But this has very little
impact on the standard errors because, in the standard error expression,
the residual variance is divided by (something proportional to) the total
number of observations, so its contribution to the whole expression is
negligible except for tiny data sets (which, to some extent, the Machines
dataset is).
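To illustrate with hypothetical numbers (these are made up, not the
Machines estimates):

tau2_slope <- 6    # random slope variance (hypothetical)
sigma2     <- 1    # residual variance (hypothetical)
J <- 6; N <- 54    # Machines dimensions: 6 workers, 54 observations
sqrt(tau2_slope / J + sigma2 / N)      # rough SE with the original sigma2
sqrt(tau2_slope / J + 2 * sigma2 / N)  # doubling sigma2 barely moves it

Even doubling the residual variance changes the rough SE by under 1% here,
and with N in the thousands the change would be invisible.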

Now on the other hand, if we use treatment/dummy codes, removing the
random intercepts corresponds to a completely different constraint:
we constrain all units to have the same response *in one of the
experimental conditions*, and the random slopes are left to be whatever
they now need to be to fit the other experimental conditions. This can
change the unit-specific regression lines substantially, generally
increasing their variance, possibly by a lot, which in turn inflates the
standard errors of the fixed slopes (see the quick check below).
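Here's a quick way to see that directly, again with the Machines data from
above: compare the estimated random slope SDs from the no-intercept model
under the two codings (exact numbers will depend on the fit, but the
qualitative pattern should hold):

mm_t <- model.matrix(~ Machine, d,
                     contrasts.arg = list(Machine = "contr.treatment"))
mm_h <- model.matrix(~ Machine, d,
                     contrasts.arg = list(Machine = "contr.helmert"))
t1 <- mm_t[, 2]; t2 <- mm_t[, 3]
h1 <- mm_h[, 2]; h2 <- mm_h[, 3]

fit_t <- lme4::lmer(score ~ 1 + t1 + t2 + (0 + t1 + t2 | Worker), d)
fit_h <- lme4::lmer(score ~ 1 + h1 + h2 + (0 + h1 + h2 | Worker), d)

lme4::VarCorr(fit_t)  # dummy codes: slope SDs absorb the ignored
                      # intercept variance
lme4::VarCorr(fit_h)  # Helmert codes: slope SDs stay close to the full
                      # model's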

This is a lot easier to understand using pictures (and maybe a few
equations) rather than quickly typed words, but it's Saturday morning and I
want to do fun stuff, so... anyway, I hope this helps a little.

Finally, in answer to your follow-up question of why it might be
conceptually strange to have random slopes but not random intercepts:
maybe this image from the Gelman & Hill textbook, showing what that
implies for the unit-specific regression lines, will make it clearer. I
hope you agree that the middle panel is strange. The image is specifically
in a dummy coding context, but it's not much less strange even if we use
contrast/deviation codes.


Jake



On Sat, Sep 1, 2018 at 8:45 AM Maarten Jung <
Maarten.Jung using mailbox.tu-dresden.de> wrote:

> > thanks for your answer, makes sense to me. I think removing the random
> > intercepts should mostly increase the residual error and thus even
> > increase the SEs for the fixed effects. Is this correct?
>
> Fwiw: this quick test with the Machines data seems to support my
> speculation:
>
> data("Machines", package = "MEMSS")
> d <- Machines
> xtabs(~ Worker + Machine, d)  # balanced
>
> mm <- model.matrix(~ 1 + Machine, d)
> c1 <- mm[, 2]
> c2 <- mm[, 3]
>
> summary(lmerTest::lmer(score ~ 1 + c1 + c2 + (1 + c1 + c2 | Worker), d))
> # Fixed effects:
> #             Estimate Std. Error     df t value Pr(>|t|)
> # (Intercept)   52.356      1.681  5.000  31.151  6.4e-07 ***
> # c1             7.967      2.421  5.000   3.291 0.021693 *
> # c2            13.917      1.540  5.000   9.036 0.000277 ***
>
> summary(lmerTest::lmer(score ~ 1 + c1 + c2 + (0 + c1 + c2 | Worker), d))
> ### SEs increased:
> # Fixed effects:
> #             Estimate Std. Error      df t value Pr(>|t|)
> # (Intercept)  52.3556     0.6242 41.0000  83.880  < 2e-16 ***
> # c1            7.9667     3.5833  5.3172   2.223 0.073612 .
> # c2           13.9167     1.9111  6.2545   7.282 0.000282 ***
>
> summary(lmerTest::lmer(score ~ 1 + c1 + c2 + (1 + c1 + c2 || Worker), d))
> # Fixed effects:
> #             Estimate Std. Error     df t value Pr(>|t|)
> # (Intercept)   52.356      1.679  5.004  31.188 6.31e-07 ***
> # c1             7.967      2.426  5.002   3.284 0.021833 *
> # c2            13.917      1.523  5.004   9.137 0.000262 ***
>
> summary(lmerTest::lmer(score ~ 1 + c1 + c2 + (0 + c1 + c2 || Worker), d))
> ### SEs increased:
> # Fixed effects:
> #             Estimate Std. Error      df t value Pr(>|t|)
> # (Intercept)  52.3556     0.6242 41.0000  83.880  < 2e-16 ***
> # c1            7.9667     3.5833  5.3172   2.223 0.073612 .
> # c2           13.9167     1.9111  6.2545   7.282 0.000282 ***
>
> Still, I would be glad to hear any thoughts on this question:
>
> > Why exactly would it be conceptually strange to have random slopes
> > but not random intercepts?
> > Because intercepts often represent some kind of baseline and, say,
> > subjects will probably have different baselines (and thus a
> > corresponding variance component estimated as > 0) if their slopes
> > (i.e. effects) vary, or is there any other statistical reason why
> > most people remove the random slopes first?
>
