[R] what does glm do when there are zero events in a logistic regression?

Sh.G. Sun rhelpforsun at gmail.com
Wed Nov 23 16:31:21 CET 2005


Sorry for my stupid mistakes and thanks for your reply.

I am studying the occurrence of rare events. Although I 
collected thousands of observations, some groups have 0 
events. I think it is too crude to drop those zero-event groups.

I have read some books about logistic regression and searched the r-help 
mailing list, but I did not find enough information about "separation". Would 
you be so kind as to give me some suggestions on "separation" and the 
"better algorithms"?

Thanks!

Sh.G. Sun


Prof Brian Ripley wrote:
> On Tue, 22 Nov 2005, S. Sun wrote:
> 
>> I have a question about the glm.
> 
> Not really: your question is about understanding logistic regressions.
> 
>> When the number of events for an observation is 0,
>> the logit function on it is Inf. I wonder how glm solves it.
> 
> Note that logit(0)  = -Inf whereas logit(1) = Inf.
> 
> It is the fitted probabilities which are passed to logit, not the 
> empirical proportions.  Logistic regression is often applied to 
> Bernoulli trials with 0/1 proportions, with nothing to `solve'.
> 
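
A minimal sketch of the two points above, using only base R, where qlogis() is the logit function. The small 0/1 data set (x, y) and the names fit01 and p_hat are made up for illustration and are not from the original question.

```r
# logit at the boundaries of the probability scale
logit_at_0 <- qlogis(0)   # -Inf
logit_at_1 <- qlogis(1)   #  Inf
print(c(logit_at_0, logit_at_1))

# Hypothetical Bernoulli data: every observed "proportion" is 0 or 1,
# and the 0s and 1s overlap along x (so there is no separation).
x <- 1:8
y <- c(0, 0, 1, 0, 1, 1, 0, 1)
fit01 <- glm(y ~ x, family = binomial)

# The logit link is applied to the fitted probabilities, which lie
# strictly between 0 and 1, so no infinite values are involved.
p_hat <- fitted(fit01)
print(range(p_hat))
print(range(qlogis(p_hat)))
```

The observed 0/1 values only enter through the likelihood; logit is never evaluated at them.
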
> So the issue only arises if the MLE would give 0 (or 1) fitted values, 
> and it cannot in a logistic regression.  You have here an example in 
> which the MLE does not exist and the log-likelihood does not attain its 
> maximum. Such situations are known as `separation' and it is well-known 
> that there are better algorithms for such problems.
> 
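
As a rough illustration of why the maximum is not attained, the sketch below uses the data from the example in this thread. The names p_grid, loglik_A, est and p_adj are introduced here for illustration only. The last part is an assumption: the reply does not say which algorithms are meant, and Firth-type penalised likelihood is shown only as one commonly cited option.

```r
treat  <- factor(c("A", "B", "C", "D", "E"))
events <- c(0, 7, 10, 15, 17)
trials <- rep(50, 5)

# Log-likelihood contribution of group A (0 events in 50 trials) as a
# function of its probability p, up to a constant: 50 * log(1 - p).
p_grid   <- 10^(-(1:8))
loglik_A <- trials[1] * log1p(-p_grid)
print(cbind(p = p_grid, logit = qlogis(p_grid), loglik = loglik_A))
# loglik_A keeps increasing towards 0 as logit(p) decreases, so no finite
# value of the logit maximises it.

# glm() stops at a large negative value rather than at a true maximum.
fit <- glm(cbind(events, trials - events) ~ treat, family = binomial)
est <- coef(summary(fit))["(Intercept)", c("Estimate", "Std. Error")]
print(est)             # the intercept is logit(p_A) under the default coding
print(fitted(fit)[1])  # very small, but not exactly 0

# Assumption: Firth-type penalised likelihood (Jeffreys-prior penalty) as
# one possible remedy. For this one-factor model it reduces to adding 1/2
# to the events and to the non-events in each group.
p_adj <- (events + 0.5) / (trials + 1)
print(round(qlogis(p_adj), 3))  # finite logits for all five groups
```

For models with continuous covariates there is no such closed form; CRAN packages such as logistf and brglm offer penalised fits of this kind.
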
>> An example:
>> Treat Events Trials
>> A     0      50
>> B     7      50
>> C     10     50
>> D     15     50
>> E     17     50
>>
>> Program:
>>
>> treat <- factor(c("A", "B", "C", "D", "E"))
>> events <- c(0, 7, 10, 15, 17)
>> trials <- rep(50, 5)
>> glm(cbind(events, trials-events)~treat, family=binomial)
>>
>> What's wrong with it? And are there better ideas?
> 
> Nothing is `wrong with it'.  It finds fitted values which are very close 
> to the observed values.  You have chosen an inappropriate model and an 
> inappropriate parametrization (see ?relevel).
> 
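
One possible reading of the pointer to ?relevel, sketched under that assumption: with the default coding, A (the group with no events) is the reference level, so every coefficient is a contrast against it. Choosing another reference level confines the problem to a single coefficient. The names treat_B, fit_A, fit_B, se_A and se_B are introduced here for illustration.

```r
treat  <- factor(c("A", "B", "C", "D", "E"))
events <- c(0, 7, 10, 15, 17)
trials <- rep(50, 5)

# Default parametrisation: A is the reference level.
fit_A <- glm(cbind(events, trials - events) ~ treat, family = binomial)

# Same model, but with B as the reference level.
treat_B <- relevel(treat, ref = "B")
fit_B   <- glm(cbind(events, trials - events) ~ treat_B, family = binomial)

se_A <- coef(summary(fit_A))[, "Std. Error"]
se_B <- coef(summary(fit_B))[, "Std. Error"]

# With A as reference all standard errors are enormous; with B as
# reference only the coefficient for A (named "treat_BA") is affected.
print(round(cbind(Estimate = coef(fit_A), SE = se_A), 2))
print(round(cbind(Estimate = coef(fit_B), SE = se_B), 2))

# The fitted probabilities are the same in both parametrisations.
print(all.equal(fitted(fit_A), fitted(fit_B), tolerance = 1e-6))
```

The fit itself is unchanged by the choice of reference level; only the interpretability of the coefficient table differs.
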
> I presume you did think something is wrong, but you did not tell us what.
> Please do read the posting guide and try to provide us with enough 
> information to help you.  Also, please do sign your messages indicating 
> who you are and what your background is.  In cases like this the best 
> advice is to suggest asking your supervisor (if you have one) or to read 
> the literature (but what specifically depends on your background).
>



