[R-sig-ME] [External] Re: Help with interpreting one fixed-effect coefficient

John Fox j|ox @end|ng |rom mcm@@ter@c@
Mon Sep 27 18:26:03 CEST 2021


Dear Simon,

I believe that Russ's point is that the fact that the additive model 
allows you to estimate nonsensical quantities like a mean for girls in 
all-boys' schools implies a problem with the model. Why not do as I 
suggested and define two dichotomous factors: sex of student 
(male/female) and type of school (coed, same-sex)? The four combinations 
of levels then make sense.
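
A minimal sketch of that recoding, on made-up data rather than the tutorial data set. The level names are taken from the output quoted below; the data frame d and the variable name schtype are only illustrative assumptions:

```r
# Toy data mimicking the structure discussed in this thread (NOT the real
# 'tutorial' data): single-sex schools contain students of one sex only.
d <- data.frame(
  schgend = factor(c("mixedsch", "mixedsch", "boysch", "girlsch",
                     "mixedsch", "mixedsch", "boysch", "girlsch"),
                   levels = c("mixedsch", "boysch", "girlsch")),
  sex     = factor(c("boy", "girl", "boy", "girl",
                     "boy", "girl", "boy", "girl"),
                   levels = c("boy", "girl"))
)

# Original coding: a 3 x 2 layout with two structurally empty cells
# (girls in boysch, boys in girlsch).
orig_tab <- table(d$schgend, d$sex)

# Recoding: collapse boysch and girlsch into a single "same-sex" level.
# 'schtype' is a hypothetical name, not a column of the original data.
d$schtype <- factor(ifelse(d$schgend == "mixedsch", "coed", "same-sex"),
                    levels = c("coed", "same-sex"))

# New coding: a 2 x 2 layout in which every cell is observed.
new_tab <- table(d$schtype, d$sex)

print(orig_tab)
print(new_tab)
```

With all four cells populated, a fixed-effects part such as schtype * sex should no longer be rank deficient, so no columns would need to be dropped from the model matrix.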

Best,
  John

On 2021-09-27 12:09 p.m., Simon Harmel wrote:
> Thanks, Russ! There is one thing that I still don't understand. We
> have two completely empty cells (boys in girl-only & girls in boy-only
> schools). Then, how are the means of those empty cells computed (what
> data is used in their place in the additive model)?
> 
> Let's simplify the model for clarity:
> 
> library(R2MLwiN)
> library(emmeans)
> 
> Form3 <- normexam ~ schgend + sex ## + standlrt + (standlrt | school)
> model3 <- lm(Form3, data = tutorial)
> 
> emmeans(model3, pairwise~sex+schgend)$emmeans
> 
>   sex  schgend   emmean     SE   df lower.CL upper.CL
>   boy  mixedsch -0.2160 0.0297 4055  -0.2742 -0.15780
>   girl mixedsch  0.0248 0.0304 4055  -0.0348  0.08437
>   boy  boysch    0.0234 0.0437 4055  -0.0623  0.10897
>   girl boysch    0.2641 0.0609 4055   0.1447  0.38360  <-- how computed?
>   boy  girlsch  -0.0948 0.0502 4055  -0.1931  0.00358  <-- how computed?
>   girl girlsch   0.1460 0.0267 4055   0.0938  0.19829
>
> On Sun, Sep 26, 2021 at 8:22 PM Lenth, Russell V
> <russell-lenth using uiowa.edu> wrote:
>>
>> By the way, returning to the topic of interpreting coefficients, you ought to have fun with the ones from the model I just fitted:
>>
>> Fixed effects:
>>                 Estimate Std. Error t value
>> (Intercept)    -0.18882    0.05135  -3.677
>> standlrt        0.55442    0.01994  27.807
>> schgendboysch   0.17986    0.09915   1.814
>> schgendgirlsch  0.17482    0.07877   2.219
>> sexgirl         0.16826    0.03382   4.975
>>
>> One curious thing you'll notice is that there are no coefficients for the interaction terms. Why? Because those terms were "thrown out" of the model, and so they are not shown. I think it is unwise to not show what was thrown out (e.g., lm would have shown them as NAs), because in fact what we see is but one of infinitely many possible solutions to the regression equations. This is the solution where the two interaction coefficients are constrained to zero. There is another equally reasonable one where the coefficients for schgendboysch and schgendgirlsch are constrained to zero, and the two interaction effects would then be non-zero. And infinitely more where all 7 coefficients are non-zero, and there are two linear constraints among them.
>>
>> Of course, since the particular solution shown consists of all the main effects, with the interactions constrained to zero, it does demonstrate that the additive model *could* have been used to obtain the same estimates and standard errors, and you can see that by comparing the results (and ignoring the invalid ones from the additive model). But it is just a lucky coincidence that it worked out this way, and the additive model did lead us down a primrose path containing silly results among the correct ones.
>>
>> Russ
>>
>> -----Original Message-----
>> From: Lenth, Russell V
>> Sent: Sunday, September 26, 2021 7:43 PM
>> To: Simon Harmel <sim.harmel using gmail.com>
>> Cc: r-sig-mixed-models using r-project.org
>> Subject: RE: [External] Re: [R-sig-ME] Help with interpreting one fixed-effect coefficient
>>
>> I guess correctness is in the eyes of the beholder. But I think this illustrates the folly of the additive model. Having additive effects suggests a belief that you can vary one factor more or less independently of the other. In his comments, John Fox makes a good point that escaped my earlier cursory view of the original question, that you don't have data on girls attending all-boys' schools, nor boys attending all-girls' schools; yet the model that was fitted estimates a mean response for both those situations. That's a pretty clear testament to the failure of that model – and also why the coefficients don't make sense. And finally why we have estimates of 15 comparisons (some of which are aliased with one another), when only 6 of them make sense.
>>
>> If, instead, a model with interaction were fitted, it would be a rank-deficient model because two cells are empty. Perhaps there is some sort of nesting structure that could be used to work around that. However, it doesn't matter much because emmeans assesses estimability, and the two combinations I mentioned above would be flagged as non-estimable. One could then more judiciously use the contrast function to test meaningful contrasts across this irregular array of cell means. Or, even if you injudiciously ask for all pairwise comparisons, you will see 6 estimable ones and 9 non-estimable ones. See output below.
>>
>> Russ
>>
>> ----- Interactive model -----
>>
>>> Form <- normexam ~ 1 + standlrt + schgend * sex + (standlrt | school)
>>> model <- lmer(Form, data = tutorial, REML = FALSE)
>> fixed-effect model matrix is rank deficient so dropping 2 columns / coefficients
>>>
>>> emmeans(model, pairwise~schgend+sex)
>>
>> ... messages deleted ...
>>
>> $emmeans
>>   schgend  sex    emmean     SE  df asymp.LCL asymp.UCL
>>   mixedsch boy  -0.18781 0.0514 Inf   -0.2885   -0.0871
>>   boysch   boy  -0.00795 0.0880 Inf   -0.1805    0.1646
>>   girlsch  boy    nonEst     NA  NA        NA        NA
>>   mixedsch girl -0.01955 0.0521 Inf   -0.1216    0.0825
>>   boysch   girl   nonEst     NA  NA        NA        NA
>>   girlsch  girl  0.15527 0.0632 Inf    0.0313    0.2792
>>
>> Degrees-of-freedom method: asymptotic
>> Confidence level used: 0.95
>>
>> $contrasts
>>   contrast                     estimate     SE  df z.ratio p.value
>>   mixedsch boy - boysch boy     -0.1799 0.0991 Inf  -1.814  0.4565
>>   mixedsch boy - girlsch boy     nonEst     NA  NA      NA      NA
>>   mixedsch boy - mixedsch girl  -0.1683 0.0338 Inf  -4.975  <.0001
>>   mixedsch boy - boysch girl     nonEst     NA  NA      NA      NA
>>   mixedsch boy - girlsch girl   -0.3431 0.0780 Inf  -4.396  0.0002
>>   boysch boy - girlsch boy       nonEst     NA  NA      NA      NA
>>   boysch boy - mixedsch girl     0.0116 0.0997 Inf   0.116  1.0000
>>   boysch boy - boysch girl       nonEst     NA  NA      NA      NA
>>   boysch boy - girlsch girl     -0.1632 0.1058 Inf  -1.543  0.6361
>>   girlsch boy - mixedsch girl    nonEst     NA  NA      NA      NA
>>   girlsch boy - boysch girl      nonEst     NA  NA      NA      NA
>>   girlsch boy - girlsch girl     nonEst     NA  NA      NA      NA
>>   mixedsch girl - boysch girl    nonEst     NA  NA      NA      NA
>>   mixedsch girl - girlsch girl  -0.1748 0.0788 Inf  -2.219  0.2287
>>   boysch girl - girlsch girl     nonEst     NA  NA      NA      NA
>>
>> Degrees-of-freedom method: asymptotic
>> P value adjustment: tukey method for comparing a family of 6 estimates
>>
>>
>> ---------------------------------------------------------
>> From: Simon Harmel <sim.harmel using gmail.com>
>> Sent: Sunday, September 26, 2021 3:08 PM
>> To: Lenth, Russell V <russell-lenth using uiowa.edu>
>> Cc: r-sig-mixed-models using r-project.org
>> Subject: [External] Re: [R-sig-ME] Help with interpreting one fixed-effect coefficient
>>
>> Dear Russ and the List Members,
>>
>> If we use Russ' great package (emmeans), we see that, although meaningless, "schgendgirl-only" can be interpreted using the logic I mentioned here: https://stat.ethz.ch/pipermail/r-sig-mixed-models/2021q3/029723.html .
>>
>> That is, "schgendgirl-only" can meaninglessly mean ***the difference between boys in girl-only vs. mixed schools***, just as it can meaningfully mean ***the difference between girls in girl-only vs. mixed schools***.
>>
>> Russ, have I used emmeans correctly?
>>
>> Simon
>>
>> Here is reproducible code:
>>
>> library(R2MLwiN) # For the dataset
>> library(lme4)
>> library(emmeans)
>>
>> data("tutorial")
>>
>> Form <- normexam ~ 1 + standlrt + schgend + sex + (standlrt | school)
>> model <- lmer(Form, data = tutorial, REML = FALSE)
>>
>> emmeans(model, pairwise~schgend+sex)$contrasts
>>
>> contrast                     estimate     SE  df z.ratio p.value
>> mixedsch boy - boysch boy    -0.17986 0.0991 Inf -1.814  0.4565
>> mixedsch boy - girlsch boy   -0.17482 0.0788 Inf -2.219  0.2287   <--This coef. equals
>> mixedsch boy - mixedsch girl -0.16826 0.0338 Inf -4.975  <.0001
>> mixedsch boy - boysch girl   -0.34813 0.1096 Inf -3.178  0.0186
>> mixedsch boy - girlsch girl  -0.34308 0.0780 Inf -4.396  0.0002
>> boysch boy - girlsch boy      0.00505 0.1110 Inf  0.045  1.0000
>> boysch boy - mixedsch girl    0.01160 0.0997 Inf  0.116  1.0000
>> boysch boy - boysch girl     -0.16826 0.0338 Inf -4.975  <.0001
>> boysch boy - girlsch girl    -0.16322 0.1058 Inf -1.543  0.6361
>> girlsch boy - mixedsch girl   0.00656 0.0928 Inf  0.071  1.0000
>> girlsch boy - boysch girl    -0.17331 0.1255 Inf -1.381  0.7388
>> girlsch boy - girlsch girl   -0.16826 0.0338 Inf -4.975  <.0001
>> mixedsch girl - boysch girl  -0.17986 0.0991 Inf -1.814  0.4565
>> mixedsch girl - girlsch girl -0.17482 0.0788 Inf -2.219  0.2287   <--This coef.
>> boysch girl - girlsch girl    0.00505 0.1110 Inf  0.045  1.0000
>>
>>
> 
> _______________________________________________
> R-sig-mixed-models using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> 
-- 
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://socialsciences.mcmaster.ca/jfox/


