[R] Binomial glms with very small numbers

Spencer Graves spencer.graves at pdf.com
Thu Jan 15 02:56:30 CET 2004


      Yes, but "glm" maximizes the binomial likelihood assuming that
log(p/(1-p)) is a linear function of the explanatory variables.  
Therefore, you don't have to transform the 0's and 1's yourself.  There 
are cases where a particular combination of potential explanatory 
variables will clearly separate mortalities from survivors.  I don't 
know what the algorithm does with such cases, but it should send a 
slope essentially to infinity.  However, if you don't have this case, 
"glm" should do what you want. 
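
      A minimal sketch of the above, using invented data (the names 
"dose", "dead", "n", "ind" and "sep" are all hypothetical, not from 
your data.frame).  It fits the same model once with grouped counts and 
once with one 0/1 row per individual, then tries a perfectly separated 
toy case: 

```r
# Hypothetical toy data: 6 groups of 3 individuals each;
# "dose" is an invented explanatory variable.
dose <- c(1, 2, 3, 4, 5, 6)
dead <- c(0, 1, 0, 2, 3, 3)   # deaths out of 3 in each group
n    <- rep(3, 6)

# Grouped form: the response is a two-column matrix (deaths, survivors).
# Groups with 0 or 3 deaths need no special treatment.
fit_grouped <- glm(cbind(dead, n - dead) ~ dose, family = binomial)

# Binary form: one row per individual, response coded 0 or 1.
ind <- data.frame(
  dose = rep(dose, times = n),
  died = unlist(lapply(seq_along(dose), function(i)
    rep(c(1, 0), times = c(dead[i], n[i] - dead[i]))))
)
fit_binary <- glm(died ~ dose, family = binomial, data = ind)

# The two likelihoods differ only by a constant, so the
# coefficient estimates should agree up to numerical tolerance.
all.equal(coef(fit_grouped), coef(fit_binary), tolerance = 1e-4)

# Perfectly separated toy data: every death has a higher dose than
# every survivor.  glm() issues warnings here, suppressed for brevity.
sep <- data.frame(dose = 1:6, died = c(0, 0, 0, 1, 1, 1))
fit_sep <- suppressWarnings(glm(died ~ dose, family = binomial, data = sep))
coef(fit_sep)["dose"]   # a very large slope estimate
```

      In the separated case the warnings and the size of the slope 
(and of its standard error in "summary(fit_sep)") are the things to 
look at; the estimate itself should not be taken at face value. 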

      hope this helps.  spencer graves

Patrick Connolly wrote:

>On Wed, 14-Jan-2004 at 05:15PM -0800, Spencer Graves wrote:
>
>|>       The advisability of using "glm" with mortality depends not on
>|> the size of sample groups but on the assumption of independence:
>|> Whether you have 3 individuals per group or 30 or 1, is it
>
>I think we can assume independence.  What concerned me more was the
>fact that there will be rather a lot of 0s and 1s, corresponding to
>-Inf and Inf on the transformed scale.  Only half the possible values
>(namely, 1 & 2) will be usable in the fitting.  On second thoughts,
>since the response can be given as a binary, perhaps I was
>unnecessarily concerned.
>
>
>|> plausible to assume that all individuals represented in your
>|> data.frame have independent chances of survival given the
>|> potentially explanatory variables?  If the answer is "yes", then
>|> "glm" is appropriate.  If the answer is "no", then some other tool
>|> may be preferable.  However, "glm" is quick and easy in R, and I
>|> might start with that, even if I felt the assumption of
>|> independence was violated.  If I found nothing there, I would not
>|> likely find anything with techniques that handled more
>|> appropriately the violations of independence.
>
>Thanks for that suggestion.
>
>|> 
>|>       Similarly, I can't see how it would matter whether potentially 
>|> explanatory variables were continuous or categorical, as long as a 
>|> categorical variable were appropriately coded as a factor (or 
>|> "character", which is then treated as a factor) if it has more than 2 
>|> levels. 
>
>I didn't think it would make a difference but I included it in case
>someone more knowledgeable had reasons why it did.
>
>Thanks.
>
>  
>



