[R] warning associated with Logistic Regression
David Firth
d.firth at warwick.ac.uk
Sun Jan 25 18:02:57 CET 2004
On Sunday, Jan 25, 2004, at 13:59 Europe/London, Guillem Chust wrote:
> Hi All,
>
> When I tried to do logistic regression (with high maximum number of
> iterations) I got the following warning message
>
> Warning message:
> fitted probabilities numerically 0 or 1 occurred in: (if
> (is.empty.model(mt)) glm.fit.null else glm.fit)(x = X, y = Y,
>
> As I checked from the Archive R-Help mails, it seems that this happens
> when the dataset exhibits complete separation.
Yes, correct.
> However, p-values tend to 1
The reported p-values cannot be trusted: the asymptotic theory on which
they are based is not valid in such circumstances.
> , and
> residual deviance tends to 0.
Yes, this happens under complete separation: the model fits the
observed 0/1 data perfectly.
> My questions then are:
> -Is the converged model correct?
Well, "converged" is not really the right word to use -- the iterative
algorithm has diverged. At least one of the coefficients has its MLE
at infinity (or minus infinity). In that sense what you see reported
(i.e. large values of estimated log odds-ratios, which approximate
infinity) is correct. Still more correct would be estimates reported
as Inf or -Inf, but the algorithm is not programmed to detect such
divergence.
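
To make the divergence concrete, here is a minimal sketch in plain Python
(standard library only). It is not R's glm.fit; it is a hand-written
Newton-Raphson iteration for a one-parameter logistic model,
logit(p_i) = beta * x_i, fitted to a small made-up dataset that is
completely separated at x = 0. The data, the function name newton_path
and the choice of 10 iterations are all assumptions made for
illustration. Each iteration pushes the estimate further out, the
deviance heads towards 0, and the Wald p-value drifts towards 1, which
matches the symptoms described above.

```python
import math

def sigmoid(eta):
    return 1.0 / (1.0 + math.exp(-eta))

def newton_path(x, y, n_iter):
    """Run n_iter Newton-Raphson steps for logit(p_i) = beta * x_i.

    Returns one tuple per iteration:
    (beta, residual deviance, Wald standard error, two-sided Wald p-value).
    """
    beta = 0.0
    path = []
    for _ in range(n_iter):
        p = [sigmoid(beta * xi) for xi in x]
        score = sum(xi * (yi - pi) for xi, yi, pi in zip(x, y, p))
        info = sum(xi * xi * pi * (1.0 - pi) for xi, pi in zip(x, p))
        beta += score / info  # Newton step for the log-likelihood

        # Summaries evaluated at the updated estimate.
        p = [sigmoid(beta * xi) for xi in x]
        info = sum(xi * xi * pi * (1.0 - pi) for xi, pi in zip(x, p))
        # For 0/1 data the deviance is -2 * log-likelihood; log1p keeps
        # the computation stable when fitted probabilities approach 0 or 1.
        deviance = 2.0 * sum(
            math.log1p(math.exp(-(beta * xi if yi == 1 else -beta * xi)))
            for xi, yi in zip(x, y)
        )
        se = 1.0 / math.sqrt(info)
        z = beta / se
        p_value = math.erfc(abs(z) / math.sqrt(2.0))
        path.append((beta, deviance, se, p_value))
    return path

# Hypothetical, completely separated data: y = 1 exactly when x > 0.
x = [-2.0, -1.0, 1.0, 2.0]
y = [0, 0, 1, 1]

path = newton_path(x, y, n_iter=10)
for i, (beta, deviance, se, p_value) in enumerate(path, start=1):
    print(f"iter {i:2d}  beta={beta:7.3f}  deviance={deviance:.6f}  "
          f"se={se:9.3f}  p={p_value:.3f}")
```

The iteration count is kept small on purpose: run long enough, the
fitted probabilities round to exactly 0 or 1 in floating point and the
information sum becomes 0, which is the situation the warning refers to.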
> or
> -Can I limit the number of iterations in order to avoid this warning?
Yes, probably, but this is not a sensible course of action. The
iterations are iterations of an algorithm to compute the MLE. The MLE
is not finite-valued, and the warning is a clue to that.
If you *really* want finite parameter estimates, the answer is not to
use maximum likelihood as the method of estimation. Various
alternatives exist, mostly based on penalizing the likelihood [one such
is in the brlr package, but there are others]. As a general principle,
surely it's better to maximize a different criterion (e.g. a penalized
likelihood, with a purposefully chosen penalty function) rather than
stop the MLE algorithm prematurely and arbitrarily?
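
As a rough illustration of the penalized-likelihood idea, the sketch
below (plain Python, standard library only) maximizes the log-likelihood
minus a simple quadratic penalty, (lam / 2) * beta^2, for the
one-parameter model logit(p_i) = beta * x_i. This quadratic (ridge)
penalty is swapped in purely because it is easy to write down; it is
not the penalty used by brlr. The dataset, the function name
fit_ridge_logistic and the value lam = 0.5 are assumptions chosen for
the example.

```python
import math

def sigmoid(eta):
    return 1.0 / (1.0 + math.exp(-eta))

def fit_ridge_logistic(x, y, lam, max_iter=50, tol=1e-10):
    """Newton-Raphson for the penalized criterion
    loglik(beta) - (lam / 2) * beta**2, with logit(p_i) = beta * x_i.

    Returns (beta, iterations used, converged flag).
    """
    beta = 0.0
    for iteration in range(1, max_iter + 1):
        p = [sigmoid(beta * xi) for xi in x]
        # The penalty adds -lam * beta to the score and +lam to the information.
        score = sum(xi * (yi - pi) for xi, yi, pi in zip(x, y, p)) - lam * beta
        info = sum(xi * xi * pi * (1.0 - pi) for xi, pi in zip(x, p)) + lam
        step = score / info
        beta += step
        if abs(step) < tol:
            return beta, iteration, True
    return beta, max_iter, False

# Hypothetical, completely separated data: y = 1 exactly when x > 0.
x = [-2.0, -1.0, 1.0, 2.0]
y = [0, 0, 1, 1]
lam = 0.5  # illustrative penalty weight, not a recommendation

beta_hat, n_used, converged = fit_ridge_logistic(x, y, lam)
# Allowing many more iterations does not change the answer.
beta_more, _, _ = fit_ridge_logistic(x, y, lam, max_iter=500)

print(f"converged={converged} after {n_used} iterations, beta={beta_hat:.6f}")
print(f"with max_iter=500: beta={beta_more:.6f}")
```

Because the penalized criterion has a finite maximizer, the estimate
here is determined by the criterion itself and not by where the
iterations happen to be stopped.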
I hope this helps!
David
Professor David Firth
Dept of Statistics
University of Warwick
Coventry CV4 7AL
United Kingdom
Email: d.firth at warwick.ac.uk
Voice: +44 (0)247 657 2581
Fax: +44 (0)247 652 4532