[R-sig-ME] Help with interpreting one fixed-effect coefficient

Sun Sep 26 19:39:42 CEST 2021

It kind of bugs me to see people get unduly fixated on interpreting regression coefficients. To me, it is like driving a car down the highway while intently focused on the instrument panel instead of where we are going. Let's see -- the tachometer looks OK and we're just slightly above the speed limit -- but did you notice that you are passing a truck and you're entering a construction zone?

Speaking of construction... for starters, the model is problematic. I can't imagine that those two factors don't interact; yet the model doesn't include interaction. Is that because the coefficients would be even harder to interpret? Because they will be.

I suggest looking instead at what the (improved) model predicts. That may be done via an expression like 

    new <- expand.grid(sex = c('boys', 'girls'), schgender = c('boy-only', 'girl-only', 'mixed'))

which constructs a data frame with all combinations of the factors. Then use `predict(model, newdata = new)` and you will see what the model predicts for all those combinations. It does not require much expertise or experience to interpret those. Moreover, they can be plotted so you can visualize the factor effects and their joint effects.
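
For anyone who wants to try this, here is a minimal sketch on simulated data. Everything in it is an assumption for illustration: the data are made up, plain `lm()` stands in for whatever model was actually fitted, and the factor is named `schgend` to match the coefficient labels in the original post.

```r
# Simulated stand-in data (not the poster's data). As with real single-sex
# schools, there are no boys in girl-only schools and no girls in boy-only ones.
set.seed(1)
dat <- data.frame(
  sex     = rep(c("boys", "boys", "girls", "girls"), each = 25),
  schgend = rep(c("boy-only", "mixed", "girl-only", "mixed"), each = 25)
)
# Put the reference levels first: boys, mixed schools.
dat$sex     <- factor(dat$sex, levels = c("boys", "girls"))
dat$schgend <- factor(dat$schgend, levels = c("mixed", "boy-only", "girl-only"))
dat$y <- rnorm(nrow(dat), mean = -0.2, sd = 1) +
  0.17 * (dat$sex == "girls") + 0.18 * (dat$schgend != "mixed")

# Hypothetical stand-in for the fitted model (additive, as in the original post).
model <- lm(y ~ sex + schgend, data = dat)

# All 2 x 3 combinations of the two factors, with the model's prediction for each.
new <- expand.grid(sex = levels(dat$sex), schgend = levels(dat$schgend))
new$pred <- predict(model, newdata = new)
print(new)
```

Two of the six rows (boys in girl-only schools, girls in boy-only schools) never occur in these data, so their predictions are extrapolations from the additive structure. Something like `interaction.plot(new$schgend, new$sex, new$pred)` would plot the result. If the model were an lme4 fit, `predict(model, newdata = new, re.form = NA)` would be the analogous call for population-level predictions.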

Or (forgive me for self-promotion) you could use a package like 'emmeans', 'effects', or 'ggeffects' to facilitate this kind of exploration.
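
A rough sketch of the 'emmeans' route, again under assumptions: the data are simulated, `lm()` is a stand-in for the real model, and the variable names `sex` and `schgend` are taken from the coefficient labels rather than from the poster's actual code. The block only does anything if 'emmeans' is installed.

```r
# Simulated stand-in data (not the poster's data).
set.seed(1)
dat <- data.frame(
  sex     = rep(c("boys", "boys", "girls", "girls"), each = 25),
  schgend = rep(c("boy-only", "mixed", "girl-only", "mixed"), each = 25)
)
dat$sex     <- factor(dat$sex, levels = c("boys", "girls"))
dat$schgend <- factor(dat$schgend, levels = c("mixed", "boy-only", "girl-only"))
dat$y <- rnorm(nrow(dat), mean = -0.2, sd = 1) +
  0.17 * (dat$sex == "girls") + 0.18 * (dat$schgend != "mixed")

# Hypothetical stand-in for the fitted model.
model <- lm(y ~ sex + schgend, data = dat)

if (requireNamespace("emmeans", quietly = TRUE)) {
  # Estimated marginal means for each sex within each school type.
  emm <- emmeans::emmeans(model, ~ sex | schgend)
  print(emm)

  # Girls-vs-boys comparison within each school type.
  print(emmeans::contrast(emm, method = "pairwise"))

  # The same estimates as a plain data frame, e.g. for plotting.
  emm_df <- as.data.frame(emm)
}
```

The output carries standard errors and confidence limits alongside the estimates, which a bare `predict()` call on new data does not show by default.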

Just my 2 cents worth.

Russ Lenth

-----Original Message-----

Date: Sun, 26 Sep 2021 09:39:25 +0300
From: Juho Kristian Ruohonen <juho.kristian.ruohonen using gmail.com>
To: Simon Harmel <sim.harmel using gmail.com>
Cc: r-sig-mixed-models <r-sig-mixed-models using r-project.org>
Subject: Re: [R-sig-ME] Help with interpreting one fixed-effect
	coefficient
Message-ID:
	<CAG_dBVep4WSVRaOwRkZLKF8zrVBZMZ-_4X=_X63sJw9C1ZEKfw using mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

In my view, your logic is slightly oversimplified (i.e. incorrect).
Regression models do not estimate coefficients by holding predictors
constant exclusively at the reference category. They do something more
general, namely estimate coefficients by holding predictors constant at any
value at which variation is observed in the values of the other predictors.
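
This can be checked numerically. The sketch below uses simulated data and ordinary least squares (`lm()`), both of which are assumptions for illustration and not the original poster's data or model. With only four occupied cells, the additive model reproduces the cell means exactly, so each coefficient equals a difference between two observed cell means.

```r
# Simulated stand-in data: no boys in girl-only schools, no girls in boy-only ones.
set.seed(1)
dat <- data.frame(
  sex     = rep(c("boys", "boys", "girls", "girls"), each = 25),
  schgend = rep(c("boy-only", "mixed", "girl-only", "mixed"), each = 25)
)
dat$sex     <- factor(dat$sex, levels = c("boys", "girls"))
dat$schgend <- factor(dat$schgend, levels = c("mixed", "boy-only", "girl-only"))
dat$y <- rnorm(nrow(dat), mean = -0.2, sd = 1) +
  0.17 * (dat$sex == "girls") + 0.18 * (dat$schgend != "mixed")

model <- lm(y ~ sex + schgend, data = dat)
b <- coef(model)

# Observed mean of y in one sex-by-school cell.
cell_mean <- function(s, g) mean(dat$y[dat$sex == s & dat$schgend == g])

# The girl-only coefficient is the girl-only vs. mixed difference among GIRLS,
# because girls are the only students observed in both kinds of school.
print(c(coef       = b[["schgendgirl-only"]],
        girls_diff = cell_mean("girls", "girl-only") - cell_mean("girls", "mixed")))

# The sex coefficient is the girls-vs-boys difference in MIXED schools,
# the only school type where both sexes are observed.
print(c(coef       = b[["sexgirls"]],
        mixed_diff = cell_mean("girls", "mixed") - cell_mean("boys", "mixed")))
```

The exact equality is specific to least squares in this four-cell layout; a mixed model would weight the observations differently.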

On Sun, Sep 26, 2021 at 9:03 AM Simon Harmel (sim.harmel using gmail.com) wrote:

> Dear Juho and other List Members,
>
> My problem is the logic of interpretation. Assuming no interaction, a
> categorical-predictors-only model, and aside from the intercept which
> captures the mean for reference categories (in this case, boys in the
> mixed schools), I have learned to interpret any main effect coef for a
> categorical predictor by thinking of that coef. as something that can
> differ from its reference category to affect "y" ***holding any other
> categorical predictor in the model at its reference category***.
>
> By this logic, "schgendboy-only" main effect coef should mean diff.
> bet. boys (held constant at the reference category) in boy-only vs.
> mixed schools (which shows "schgendboy-only" can differ from its
> reference category, i.e., mixed schools).
>
> By this logic, "sexgirls" main effect coef should mean diff. bet.
> girls vs. boys (which shows "sexgirls" can differ from its reference
> category, i.e., boys) in mixed schools (held constant at the reference
> category).
>
> Therefore, by this logic, "schgendgirl-only" main effect coef should
> mean diff. bet. boys (held constant at the reference category) in
> girl-only vs. mixed schools (which shows "schgendgirl-only" can differ
> from its reference category, i.e., mixed schools).
>
> My question is: is my logic of interpretation incorrect? Or are
> there exceptions to my logic of interpretation of which interpreting
> "schgendgirl-only" coef is one?
>
> Thank you very much,
> Simon
>
> On Sun, Sep 26, 2021 at 12:00 AM Juho Kristian Ruohonen
> <juho.kristian.ruohonen using gmail.com> wrote:
> >
> > Fellow student commenting here...
> >
> > As you suggest, schgendgirl-only can only ever apply to female students.
> > Strictly speaking, it's the estimated mean difference between a student of
> > any sex in a girls-only school and a similar student in a mixed school. But
> > since such comparisons are only observed between girls, the estimate is
> > necessarily informed by girl data only. So your intended interpretation of
> > the coefficient is correct.
> >
> >
> > On Sun, Sep 26, 2021 at 12:27 AM Simon Harmel (sim.harmel using gmail.com) wrote:
> >>
> >> Dear Colleagues,
> >>
> >> Apologies for crossposting (https://stats.stackexchange.com/q/545975/284623).
> >>
> >> I've two categorical moderators i.e., students' ***sex*** (`boys`,
> >> `girls`) and the ***school-gender system*** (`boy-only`, `girl-only`,
> >> `mixed`) in a model like: `y ~ sex + schoolgend`.
> >>
> >> My coefs are below. I can interpret three of the coefs but wonder how
> >> to interpret the third one from the top (.175)?
> >>
> >> Assume "intrcpt" represents the boys' mean in mixed schools.
> >>
> >>                    Estimate
> >> (Intercept)          -0.189
> >> schgendboy-only       0.180
> >> schgendgirl-only      0.175
> >> sexgirls              0.168
> >> My interpretations of the coefficients are as follows:
> >>
> >>      "(Intercept)": mean of y for boys in mixed schools = -.189
> >>  "schgendboy-only": diff. bet. boys in boy-only vs. mixed schools = +.180
> >> "schgendgirl-only": diff. bet. ???????????????????????????? = +.175
> >>         "sexgirls": diff. bet. girls vs. boys in mixed schools = +.168
> >>
> >> If my interpretation logic for all other coefs is correct, then, this
> >> third coef. must mean:
> >>
> >> diff. bet. boys in girl-only vs. mixed schools = +.175! (which makes no sense!)
> >>
> >> ps. I know I will end up interpreting +.175 as: diff. bet. girls in
> >> girl-only vs. mixed schools BUT this doesn't follow the interpretation
> >> logic for other coefs PLUS there are no labels in the output to show
> >> what's what!
> >>
> >> Many thanks,
> >> Simon