[R] Predicted values from glm() when linear predictor is NA.
John Fox
j|ox @end|ng |rom mcm@@ter@c@
Thu Jul 28 04:04:20 CEST 2022
Dear Rolf,
The coefficient of TrtTime:LifestageL1 isn't estimable (as you explain)
and by setting it to NA, glm() effectively removes it from the model. An
equivalent model is therefore
> fit2 <- glm(cbind(Dead,Alive) ~ TrtTime + Lifestage +
+ I((Lifestage == "Egg + L1")*TrtTime) +
+ I((Lifestage == "L1 + L2")*TrtTime) +
+ I((Lifestage == "L3")*TrtTime),
+ family=binomial, data=demoDat)
Warning message:
glm.fit: fitted probabilities numerically 0 or 1 occurred
> cbind(coef(fit, complete=FALSE), coef(fit2))
[,1] [,2]
(Intercept) -0.91718302 -0.91718302
TrtTime 0.88846195 0.88846195
LifestageEgg + L1 -45.36420974 -45.36420974
LifestageL1 14.27570572 14.27570572
LifestageL1 + L2 -0.30332697 -0.30332697
LifestageL3 -3.58672631 -3.58672631
TrtTime:LifestageEgg + L1 8.10482459 8.10482459
TrtTime:LifestageL1 + L2 0.05662651 0.05662651
TrtTime:LifestageL3 1.66743472 1.66743472
There is no problem computing fitted values for the model, specified
either way. That the fitted values when Lifestage == "L1" all round to 1
on the probability scale is coincidental -- that is, a consequence of
the data.
I hope this helps,
John
On 2022-07-27 8:26 p.m., Rolf Turner wrote:
>
> I have a data frame with a numeric ("TrtTime") and a categorical
> ("Lifestage") predictor.
>
> Level "L1" of Lifestage occurs only with a single value of TrtTime,
> explicitly 12, whence it is not possible to estimate a TrtTime "slope"
> when Lifestage is "L1".
>
> Indeed, when I fitted the model
>
> fit <- glm(cbind(Dead,Alive) ~ TrtTime*Lifestage, family=binomial,
> data=demoDat)
>
> I got:
>
>> as.matrix(coef(fit))
>> [,1]
>> (Intercept) -0.91718302
>> TrtTime 0.88846195
>> LifestageEgg + L1 -45.36420974
>> LifestageL1 14.27570572
>> LifestageL1 + L2 -0.30332697
>> LifestageL3 -3.58672631
>> TrtTime:LifestageEgg + L1 8.10482459
>> TrtTime:LifestageL1 NA
>> TrtTime:LifestageL1 + L2 0.05662651
>> TrtTime:LifestageL3 1.66743472
>
> That is, TrtTime:LifestageL1 is NA, as expected.
>
> I would have thought that fitted or predicted values corresponding to
> Lifestage = "L1" would thereby be NA, but this is not the case:
>
>> predict(fit)[demoDat$Lifestage=="L1"]
>> 26 65 131
>> 24.02007 24.02007 24.02007
>>
>> fitted(fit)[demoDat$Lifestage=="L1"]
>> 26 65 131
>> 1 1 1
>
> That is, the predicted values on the scale of the linear predictor are
> large and positive, rather than being NA.
>
> What this amounts to, it seems to me, is saying that if the linear
> predictor in a Binomial glm is NA, then "success" is a certainty.
> This strikes me as being a dubious proposition. My gut feeling is that
> misleading results could be produced.
>
> Can anyone explain to me a rationale for this behaviour pattern?
> Is there some justification for it that I am not currently seeing?
> Any other comments? (Please omit comments to the effect of "You are as
> thick as two short planks!". :-) )
>
> I have attached the example data set in a file "demoDat.txt", should
> anyone want to experiment with it. The file was created using dput() so
> you should access it (if you wish to do so) via something like
>
> demoDat <- dget("demoDat.txt")
>
> Thanks for any enlightenment.
>
> cheers,
>
> Rolf Turner
>
>
> ______________________________________________
> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
--
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://socialsciences.mcmaster.ca/jfox/
More information about the R-help
mailing list