[R] Predicted values from glm() when linear predictor is NA.

Thu Jul 28 07:31:14 CEST 2022

But "disappearing" is not what NA is supposed to do normally. Why is it being treated that way here?

On July 27, 2022 7:04:20 PM PDT, John Fox <jfox using mcmaster.ca> wrote:
>Dear Rolf,
>
>The coefficient of TrtTime:LifestageL1 isn't estimable (as you explain) and by setting it to NA, glm() effectively removes it from the model. An equivalent model is therefore
>
>> fit2 <- glm(cbind(Dead,Alive) ~ TrtTime + Lifestage +
>+               I((Lifestage == "Egg + L1")*TrtTime) +
>+               I((Lifestage == "L1 + L2")*TrtTime) +
>+               I((Lifestage == "L3")*TrtTime),
>+             family=binomial, data=demoDat)
>Warning message:
>glm.fit: fitted probabilities numerically 0 or 1 occurred
>
>> cbind(coef(fit, complete=FALSE), coef(fit2))
>                                  [,1]         [,2]
>(Intercept)                -0.91718302  -0.91718302
>TrtTime                     0.88846195   0.88846195
>LifestageEgg + L1         -45.36420974 -45.36420974
>LifestageL1                14.27570572  14.27570572
>LifestageL1 + L2           -0.30332697  -0.30332697
>LifestageL3                -3.58672631  -3.58672631
>TrtTime:LifestageEgg + L1   8.10482459   8.10482459
>TrtTime:LifestageL1 + L2    0.05662651   0.05662651
>TrtTime:LifestageL3         1.66743472   1.66743472
>
>There is no problem computing fitted values for the model, specified either way. That the fitted values when Lifestage == "L1" all round to 1 on the probability scale is coincidental -- that is, a consequence of the data.
>
>I hope this helps,
> John
>
>On 2022-07-27 8:26 p.m., Rolf Turner wrote:
>> 
>> I have a data frame with a numeric ("TrtTime") and a categorical
>> ("Lifestage") predictor.
>> 
>> Level "L1" of Lifestage occurs only with a single value of TrtTime,
>> explicitly 12, whence it is not possible to estimate a TrtTime "slope"
>> when Lifestage is "L1".
>> 
>> Indeed, when I fitted the model
>> 
>>      fit <- glm(cbind(Dead,Alive) ~ TrtTime*Lifestage, family=binomial,
>>                 data=demoDat)
>> 
>> I got:
>> 
>>> as.matrix(coef(fit))
>>>                                    [,1]
>>> (Intercept)                -0.91718302
>>> TrtTime                     0.88846195
>>> LifestageEgg + L1         -45.36420974
>>> LifestageL1                14.27570572
>>> LifestageL1 + L2           -0.30332697
>>> LifestageL3                -3.58672631
>>> TrtTime:LifestageEgg + L1   8.10482459
>>> TrtTime:LifestageL1                 NA
>>> TrtTime:LifestageL1 + L2    0.05662651
>>> TrtTime:LifestageL3         1.66743472
>> 
>> That is, TrtTime:LifestageL1 is NA, as expected.
>> 
>> I would have thought that fitted or predicted values corresponding to
>> Lifestage = "L1" would thereby be NA, but this is not the case:
>> 
>>> predict(fit)[demoDat$Lifestage=="L1"]
>>>        26       65      131
>>> 24.02007 24.02007 24.02007
>>> 
>>> fitted(fit)[demoDat$Lifestage=="L1"]
>>>   26  65 131
>>>    1   1   1
>> 
>> That is, the predicted values on the scale of the linear predictor are
>> large and positive, rather than being NA.
>> 
>> What this amounts to, it seems to me, is saying that if the linear
>> predictor in a Binomial glm is NA, then "success" is a certainty.
>> This strikes me as being a dubious proposition.  My gut feeling is that
>> misleading results could be produced.
>> 
>> Can anyone explain to me a rationale for this behaviour pattern?
>> Is there some justification for it that I am not currently seeing?
>> Any other comments?  (Please omit comments to the effect of "You are as
>> thick as two short planks!". :-) )
>> 
>> I have attached the example data set in a file "demoDat.txt", should
>> anyone want to experiment with it.  The file was created using dput() so
>> you should access it (if you wish to do so) via something like
>> 
>>      demoDat <- dget("demoDat.txt")
>> 
>> Thanks for any enlightenment.
>> 
>> cheers,
>> 
>> Rolf Turner
>> 
>> 
>> ______________________________________________
>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.

-- 
Sent from my phone. Please excuse my brevity.