[R] Predicted values from glm() when linear predictor is NA.
Jeff Newmiller
jdnewm|| @end|ng |rom dcn@d@v|@@c@@u@
Thu Jul 28 07:31:14 CEST 2022
But "disappearing" is not what NA is supposed to do normally. Why is it being treated that way here?
On July 27, 2022 7:04:20 PM PDT, John Fox <jfox using mcmaster.ca> wrote:
>Dear Rolf,
>
>The coefficient of TrtTime:LifestageL1 isn't estimable (as you explain) and by setting it to NA, glm() effectively removes it from the model. An equivalent model is therefore
>
>> fit2 <- glm(cbind(Dead,Alive) ~ TrtTime + Lifestage +
>+ I((Lifestage == "Egg + L1")*TrtTime) +
>+ I((Lifestage == "L1 + L2")*TrtTime) +
>+ I((Lifestage == "L3")*TrtTime),
>+ family=binomial, data=demoDat)
>Warning message:
>glm.fit: fitted probabilities numerically 0 or 1 occurred
>
>> cbind(coef(fit, complete=FALSE), coef(fit2))
> [,1] [,2]
>(Intercept) -0.91718302 -0.91718302
>TrtTime 0.88846195 0.88846195
>LifestageEgg + L1 -45.36420974 -45.36420974
>LifestageL1 14.27570572 14.27570572
>LifestageL1 + L2 -0.30332697 -0.30332697
>LifestageL3 -3.58672631 -3.58672631
>TrtTime:LifestageEgg + L1 8.10482459 8.10482459
>TrtTime:LifestageL1 + L2 0.05662651 0.05662651
>TrtTime:LifestageL3 1.66743472 1.66743472
>
>There is no problem computing fitted values for the model, specified either way. That the fitted values when Lifestage == "L1" all round to 1 on the probability scale is coincidental -- that is, a consequence of the data.
>
>I hope this helps,
> John
>
>On 2022-07-27 8:26 p.m., Rolf Turner wrote:
>>
>> I have a data frame with a numeric ("TrtTime") and a categorical
>> ("Lifestage") predictor.
>>
>> Level "L1" of Lifestage occurs only with a single value of TrtTime,
>> explicitly 12, whence it is not possible to estimate a TrtTime "slope"
>> when Lifestage is "L1".
>>
>> Indeed, when I fitted the model
>>
>> fit <- glm(cbind(Dead,Alive) ~ TrtTime*Lifestage, family=binomial,
>> data=demoDat)
>>
>> I got:
>>
>>> as.matrix(coef(fit))
>>> [,1]
>>> (Intercept) -0.91718302
>>> TrtTime 0.88846195
>>> LifestageEgg + L1 -45.36420974
>>> LifestageL1 14.27570572
>>> LifestageL1 + L2 -0.30332697
>>> LifestageL3 -3.58672631
>>> TrtTime:LifestageEgg + L1 8.10482459
>>> TrtTime:LifestageL1 NA
>>> TrtTime:LifestageL1 + L2 0.05662651
>>> TrtTime:LifestageL3 1.66743472
>>
>> That is, TrtTime:LifestageL1 is NA, as expected.
>>
>> I would have thought that fitted or predicted values corresponding to
>> Lifestage = "L1" would thereby be NA, but this is not the case:
>>
>>> predict(fit)[demoDat$Lifestage=="L1"]
>>> 26 65 131
>>> 24.02007 24.02007 24.02007
>>>
>>> fitted(fit)[demoDat$Lifestage=="L1"]
>>> 26 65 131
>>> 1 1 1
>>
>> That is, the predicted values on the scale of the linear predictor are
>> large and positive, rather than being NA.
>>
>> What this amounts to, it seems to me, is saying that if the linear
>> predictor in a Binomial glm is NA, then "success" is a certainty.
>> This strikes me as being a dubious proposition. My gut feeling is that
>> misleading results could be produced.
>>
>> Can anyone explain to me a rationale for this behaviour pattern?
>> Is there some justification for it that I am not currently seeing?
>> Any other comments? (Please omit comments to the effect of "You are as
>> thick as two short planks!". :-) )
>>
>> I have attached the example data set in a file "demoDat.txt", should
>> anyone want to experiment with it. The file was created using dput() so
>> you should access it (if you wish to do so) via something like
>>
>> demoDat <- dget("demoDat.txt")
>>
>> Thanks for any enlightenment.
>>
>> cheers,
>>
>> Rolf Turner
>>
>>
>> ______________________________________________
>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
--
Sent from my phone. Please excuse my brevity.
More information about the R-help
mailing list