[Rd] Apparent bug in binomial model in GLM (PR#13434)

Tue Jan 6 16:08:21 CET 2009

soren.faurby at biology.au.dk wrote:
> Full_Name: Søren Faurby
> Version: 2.4.1 and 2.7.2
> OS: 
> Submission from: (NULL) (192.38.46.92)
> 
> 
> There appears to be a bug in the estimation of significance in the binomial model
> in GLM. This bug apparently appears when the correlation between two variables
> is too strong.
> 
> Such as this dummy example
> c(0,0,0,0,0,1,1,1,1,1)->a
> a->b
> m1<-glm(a~b, binomial)
> summary(m1)
> 
> It is sufficient that all 1's correspond to 1's, as in this example
> 
> c(0,0,0,0,0,1,1,1,1,1)->a
> c(0,0,0,0,1,1,1,1,1,1)->c
> m1<-glm(a~c, binomial)
> summary(m1)

That's not a bug, just the way things work. When the fit diverges (the
estimates run off towards infinity), as seen by the huge Std. Error, Wald
tests (z) are unreliable. (Notice that the log OR in an a vs. c table is
infinite whichever way you turn it, because one cell of the table is
empty.) The likelihood ratio test (as in drop1(m1, test="Chisq")) is
somewhat less unreliable, but in these small examples it is still quite
some distance from the table-based approaches of fisher.test(a,c) and
chisq.test(a,c).
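
To make the comparison concrete, here is a minimal sketch that puts the
Wald, likelihood ratio and table-based p-values for the second example side
by side. The variable x stands in for the report's c (to avoid masking the
c() function), the names se_x, p_wald, p_lrt, p_fisher and p_chisq are
introduced only for illustration, and the likelihood ratio p-value is
computed by hand from the deviances rather than extracted from drop1().

```r
# Data from the second example; `x` stands in for the report's `c`
a <- c(0, 0, 0, 0, 0, 1, 1, 1, 1, 1)
x <- c(0, 0, 0, 0, 1, 1, 1, 1, 1, 1)

# One cell of the 2x2 table is empty, so the sample log odds ratio is infinite
print(table(a, x))

# glm() may warn when fitted probabilities approach 0 or 1
m1 <- suppressWarnings(glm(a ~ x, family = binomial))

# Wald test: standard error and p-value for the slope
coefs <- coef(summary(m1))
se_x <- coefs["x", "Std. Error"]
p_wald <- coefs["x", "Pr(>|z|)"]

# Likelihood ratio test against the intercept-only model (1 df),
# the same comparison that drop1(m1, test = "Chisq") makes
p_lrt <- pchisq(m1$null.deviance - m1$deviance, df = 1, lower.tail = FALSE)

# Table-based tests; chisq.test() warns about small expected counts
p_fisher <- fisher.test(a, x)$p.value
p_chisq <- suppressWarnings(chisq.test(a, x))$p.value

print(c(se = se_x, wald = p_wald, lrt = p_lrt,
        fisher = p_fisher, chisq = p_chisq))
```

With these data the Wald p-value should come out close to 1 alongside a
very large standard error, while the likelihood ratio p-value is
considerably smaller than either of the table-based ones.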

> 
> I hope that this message is understandable. 
> 
> Kind regards, Søren
> 

-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)              FAX: (+45) 35327907