[R-sig-ME] Removing random intercepts before random slopes

D. Rizopoulos d.rizopoulos at erasmusmc.nl
Sat Sep 1 20:45:07 CEST 2018


When you use lmer() or lme(), you don't actually fit the mixed model directly, but rather the implied marginal model. That is, under maximum likelihood estimation the random effects are integrated out, and you obtain a marginal model of the form

Y ~ N(X beta, Z D Z' + sigma^2 I),

where X is the design matrix for the fixed effects beta, Z the design matrix of the random effects, D the covariance matrix of the random effects, and sigma^2 the measurement error variance.
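For instance, you can reconstruct this marginal covariance matrix from a fitted lme4 model. A minimal sketch using the sleepstudy data that ships with lme4; it assumes lme4's ordering of the random effects (terms nested within the levels of the grouping factor):

library(lme4)
fit <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)
Z  <- getME(fit, "Z")                    # random-effects design matrix
Db <- as.matrix(VarCorr(fit)$Subject)    # per-subject covariance block of D
J  <- nlevels(sleepstudy$Subject)
D  <- Matrix::bdiag(replicate(J, Db, simplify = FALSE))  # block-diagonal D
V  <- Z %*% D %*% t(Z) + sigma(fit)^2 * Matrix::Diagonal(nrow(sleepstudy))
# V is the implied marginal covariance Z D Z' + sigma^2 I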

Hence, it all boils down to what this marginal covariance matrix looks like. Typically, a bottom-up approach is used. That is, you start with random intercepts because this is the simplest structure, corresponding to compound symmetry (i.e., constant correlation over time); you continue by including random slopes, which allow correlations to decay with the time lag; and you can even include higher-order terms (e.g., with longitudinal data you can often include a spline of your time variable in the random effects). To judge whether you need to include an extra random effect, a likelihood ratio test is typically used.
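For example, such a test could look like this (a sketch using the sleepstudy data from lme4; keep in mind that this LRT is conservative because the null hypothesis puts variance parameters on the boundary of the parameter space):

library(lme4)
m1 <- lmer(Reaction ~ Days + (1 | Subject), sleepstudy, REML = FALSE)
m2 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy, REML = FALSE)
anova(m1, m2)   # likelihood ratio test for adding the random slope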

Now, if you're only interested in the marginal model, that is, in the fixed effects and their standard errors, it is perfectly fine to exclude the random intercepts even if you have random slopes in your model, because you can see this simply as a particular (perhaps more parsimonious) choice for your marginal covariance matrix.

However, if you're also interested in producing subject-specific predictions, which use the empirical Bayes estimates of the random effects obtained in a second step, then having slopes without intercepts will look strange in most cases (even though I know of a couple of examples in which this makes sense).
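For example, in lme4 these subject-specific predictions are based on the empirical Bayes estimates from ranef() (a minimal sketch):

fit <- lme4::lmer(Reaction ~ Days + (Days | Subject), lme4::sleepstudy)
lme4::ranef(fit)$Subject     # empirical Bayes estimates of the random effects
predict(fit)                 # subject-specific predictions (use the BLUPs)
predict(fit, re.form = NA)   # marginal predictions (random effects set to 0)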

I hope it helps.

Best,
Dimitris


-----Original Message-----
From: R-sig-mixed-models <r-sig-mixed-models-bounces using r-project.org> On Behalf Of Jake Westfall
Sent: Saturday, September 1, 2018 4:52 PM
To: Maarten.Jung using mailbox.tu-dresden.de
Cc: r-sig-mixed-models <r-sig-mixed-models using r-project.org>
Subject: Re: [R-sig-ME] Removing random intercepts before random slopes

Hi Maarten,

I should point out that all of this, both what I said and what the Barr et al. paper said, is contingent on the fixed predictor being contrast/deviation coded, NOT treatment/dummy coded. This is mentioned in passing in the Barr paper in footnote 12 (attached to the paragraph you cited on p. 276), but it's not 100% clear there, and I probably should have reminded you about it too.
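For reference, here is what the two codings look like for a three-level factor (plain base R):

contr.treatment(3)   # dummy codes: each column is an indicator vs. level 1
contr.helmert(3)     # Helmert codes: each column sums to zero (centered)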

If you add `options(contrasts=c("contr.helmert", "contr.poly"))` to the top of your script you'll see the expected results.
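Concretely, something like this sketch (I haven't rerun your script here, but with centered codes the no-intercept fit should give SEs close to the full fit):

options(contrasts = c("contr.helmert", "contr.poly"))
data("Machines", package = "MEMSS")
mm <- model.matrix(~ 1 + Machine, Machines)   # columns are now Helmert codes
c1 <- mm[, 2]; c2 <- mm[, 3]
summary(lmerTest::lmer(score ~ 1 + c1 + c2 + (1 + c1 + c2 | Worker), Machines))
summary(lmerTest::lmer(score ~ 1 + c1 + c2 + (0 + c1 + c2 | Worker), Machines))
# With these centered codes, dropping the random intercepts should leave
# the SEs of c1 and c2 largely unchanged, unlike the dummy-coded results.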

The reason the coding matters in this way is that if (and only if) we're using contrast/deviation codes and the design is approximately balanced, then removing the random intercepts is the same as constraining all units to have the same overall mean response -- visually, this just vertically shifts each of the unit-specific regression lines (actually planes in this case) so that they all intersect X=0 at the same Y -- but this shift doesn't have much impact on any of the unit-specific slopes, and thus doesn't change the random slope variance much. Since the random slope variance enters the standard errors of the fixed slopes while the random intercept variance does not (because the fixed slopes are effectively difference scores that subtract out the unit-specific means), the standard errors are mostly unchanged by this shift.

As you point out, the residual variance does expand a little to soak up some of the ignored random intercept variance. But this has very little impact on the standard errors because, in the standard error expression, the residual variance is divided by the total number of observations, so its contribution to the whole expression is negligible except for tiny data sets (which, to some extent, the Machines dataset is).
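In rough notation (my back-of-the-envelope approximation, assuming a balanced design with J units, n observations per unit, and a centered code x):

SE(fixed slope) is roughly sqrt( var(random slope)/J + sigma^2 / (J * n * var(x)) )

Note that the random intercept variance does not appear at all, and the residual term is divided by the total sample size J*n, which is why it matters so little.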

Now, on the other hand, if we use treatment/dummy codes, removing the random intercepts corresponds to a completely different constraint: specifically, we constrain all units to have the same response *in one of the experimental conditions*, and the random slopes are left to be whatever they need to be to fit the other experimental conditions. This can have a big impact on the unit-specific regression lines, generally increasing their variance, possibly by a lot, which in turn has a big impact on the standard errors of the fixed slopes.
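To see the difference at the level of the data, it may help to look at the per-worker cell means (a quick sketch with the Machines data quoted below):

data("Machines", package = "MEMSS")
with(Machines, tapply(score, list(Worker, Machine), mean))
# Under dummy codes, dropping the random intercepts forces all workers to
# share one mean in the baseline cell (Machine A); the worker-specific
# "slopes" must then absorb all between-worker differences in cells B and C.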

This is a lot easier to understand using pictures (and maybe a few equations) rather than quickly typed words, but it's Saturday morning and I want to do fun stuff, so... anyway, I hope this helps a little.

Finally, in answer to your follow-up question of why it might be conceptually strange to have random slopes but not random intercepts: maybe this image from the Gelman & Hill textbook, showing what that implies for the unit-specific regression lines, will make it clearer. I hope you agree that the middle panel is strange. The image is specifically in a dummy-coding context, but it's not much less strange even if we use contrast/deviation codes.


Jake



On Sat, Sep 1, 2018 at 8:45 AM Maarten Jung <Maarten.Jung using mailbox.tu-dresden.de> wrote:

> > thanks for your answer, makes sense to me. I think removing the random
> > intercepts should mostly increase the residual error and thus even
> > increase the SEs for the fixed effects. Is this correct?
>
> Fwiw: this quick test with the Machines data seems to support my
> speculation:
>
> data("Machines", package = "MEMSS")
> d <- Machines
> xtabs(~ Worker + Machine, d)  # balanced
>
> mm <- model.matrix(~ 1 + Machine, d)
> c1 <- mm[, 2]
> c2 <- mm[, 3]
>
> summary(lmerTest::lmer(score ~ 1 + c1 + c2 + (1 + c1 + c2 | Worker), d))
> # Fixed effects:
> #             Estimate Std. Error     df t value Pr(>|t|)
> # (Intercept)   52.356      1.681  5.000  31.151  6.4e-07 ***
> # c1             7.967      2.421  5.000   3.291 0.021693 *
> # c2            13.917      1.540  5.000   9.036 0.000277 ***
>
> summary(lmerTest::lmer(score ~ 1 + c1 + c2 + (0 + c1 + c2 | Worker), d))
> ### SEs increased:
> # Fixed effects:
> #             Estimate Std. Error      df t value Pr(>|t|)
> # (Intercept)  52.3556     0.6242 41.0000  83.880  < 2e-16 ***
> # c1            7.9667     3.5833  5.3172   2.223 0.073612 .
> # c2           13.9167     1.9111  6.2545   7.282 0.000282 ***
>
> summary(lmerTest::lmer(score ~ 1 + c1 + c2 + (1 + c1 + c2 || Worker), d))
> # Fixed effects:
> #             Estimate Std. Error     df t value Pr(>|t|)
> # (Intercept)   52.356      1.679  5.004  31.188 6.31e-07 ***
> # c1             7.967      2.426  5.002   3.284 0.021833 *
> # c2            13.917      1.523  5.004   9.137 0.000262 ***
>
> summary(lmerTest::lmer(score ~ 1 + c1 + c2 + (0 + c1 + c2 || Worker), d))
> ### SEs increased:
> # Fixed effects:
> #             Estimate Std. Error      df t value Pr(>|t|)
> # (Intercept)  52.3556     0.6242 41.0000  83.880  < 2e-16 ***
> # c1            7.9667     3.5833  5.3172   2.223 0.073612 .
> # c2           13.9167     1.9111  6.2545   7.282 0.000282 ***
>
> Still, I would be glad to hear any thoughts on this question:
>
> > Why exactly would it be conceptually strange to have random slopes
> > but not random intercepts?
> > Because intercepts often represent some kind of baseline and, say,
> > subjects will probably have different baselines (and thus a
> > corresponding variance component estimated as > 0) if their slopes
> > (i.e., effects) vary, or is there any other statistical reason why
> > most people remove the random slopes first?
>




