[R] warning associated with Logistic Regression
Peter Dalgaard
p.dalgaard at biostat.ku.dk
Sun Jan 25 18:24:23 CET 2004
David Firth <d.firth at warwick.ac.uk> writes:
> On Sunday, Jan 25, 2004, at 13:59 Europe/London, Guillem Chust wrote:
>
> > Hi All,
> >
> > When I tried to do logistic regression (with a high maximum number of
> > iterations) I got the following warning message:
> >
> > Warning message:
> > fitted probabilities numerically 0 or 1 occurred in: (if
> > (is.empty.model(mt)) glm.fit.null else glm.fit)(x = X, y = Y,
> >
> > From what I found in the R-help archives, it seems that this happens
> > when the dataset exhibits complete separation.
>
> Yes, correct.
Sufficient but not necessary. It can also happen purely through
numerical roundoff if the effect is strong enough. (I have an example
with age and prevalent menarche: for nearly all women this happens
between the ages of 10 and 18, so if you have a couple of 40-year-olds
in your data set, they'll get a fitted p of 1. It happens even more
easily if you throw in a cubic term.)
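
The roundoff case can be sketched with a small made-up data set. The
counts below are invented for illustration and are not the menarche
data mentioned above; the variable names and the warning-collecting
handler are likewise assumptions of this sketch. The outcome overlaps
between ages 12 and 16, so there is no separation, yet the two
40-year-olds should still end up with a fitted probability within
rounding error of 1:

```r
# Invented grouped data: 20 girls at each age 10..18, plus two 40-year-olds.
# 'yes' = number who have reached menarche. Not real data.
age <- c(10:18, 40)
n   <- c(rep(20, 9), 2)
yes <- c(0, 0, 1, 4, 10, 16, 19, 20, 20, 2)

# Collect warnings in 'msgs' instead of letting them scroll past.
msgs <- character(0)
fit <- withCallingHandlers(
  glm(cbind(yes, n - yes) ~ age, family = binomial),
  warning = function(w) {
    msgs <<- c(msgs, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)

print(msgs)            # warnings raised during the fit
print(fit$converged)   # whether the IRLS iterations converged
print(coef(fit))       # intercept and slope estimates
print(deviance(fit))   # residual deviance

# glm.fit warns when any fitted value lies within 10 * .Machine$double.eps
# of 0 or 1; list the ages of the rows that cross that threshold.
eps <- 10 * .Machine$double.eps
print(age[fitted(fit) > 1 - eps | fitted(fit) < eps])
```

If this behaves as intended, the warning comes from the age-40 row
alone, while the coefficients and the residual deviance look like those
of an ordinary converged fit.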
> > However, p-values tend to 1
>
> The reported p-values cannot be trusted: the asymptotic theory on
> which they are based is not valid in such circumstances.
>
> > , and
> > residual deviance tends to 0.
This, however, is a clear sign that the fit has diverged, and in that
case (but not necessarily otherwise) the asymptotic theory is invalid.
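
For contrast, a completely separated toy data set shows the pattern
described in the original question. The data are again invented, and x
and y are just illustrative names. A minimal sketch:

```r
# Toy data with complete separation: y is 0 for x <= 5 and 1 for x > 5.
x <- 1:10
y <- as.numeric(x > 5)

# Warnings are expected here, so they are suppressed to keep the output short.
fit <- suppressWarnings(glm(y ~ x, family = binomial))

print(deviance(fit))        # residual deviance
print(coef(summary(fit)))   # estimates, standard errors, Wald z and p-values
```

Here the residual deviance should come out at essentially 0, with very
large estimates and standard errors and Wald p-values close to 1, which
is the signature of a diverged fit rather than of roundoff alone.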
--
O__ ---- Peter Dalgaard Blegdamsvej 3
c/ /'_ --- Dept. of Biostatistics 2200 Cph. N
(*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907