[R] binomial glm warnings revisited
Spencer Graves
spencer.graves at pdf.com
Wed Oct 8 19:54:37 CEST 2003
This seems to me to be a special case of the general problem of a
parameter on a boundary. Another example is a variance component that
is zero. For this latter problem, Pinheiro and Bates (2000)
Mixed-Effects Models in S and S-Plus (Springer, sec. 2.4.1) present
simulation results showing that a 50-50 mixture of chi-square(0) and
chi-square(1), for example, provides an excellent approximation to the
actual sampling distribution of 2*log(likelihood ratio).
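
To make the mixture concrete, here is a minimal sketch in R of how a
p-value would be computed under it. The function name mixture_pvalue
and the value of lrt are hypothetical, chosen only for illustration;
chi-square(0) is taken to be a point mass at zero. Whether this
reference distribution suits a given boundary problem has to be
checked case by case.

```r
# Hypothetical helper: p-value for a likelihood-ratio statistic under a
# 50:50 mixture of chi-square(0) (a point mass at zero) and chi-square(1).
mixture_pvalue <- function(lrt) {
  stopifnot(is.numeric(lrt), all(lrt >= 0))
  # The point mass at zero only contributes when lrt is exactly 0,
  # where the upper-tail probability is 1. For lrt > 0 only the
  # chi-square(1) half of the mixture has any mass in the tail.
  ifelse(lrt == 0, 1, 0.5 * pchisq(lrt, df = 1, lower.tail = FALSE))
}

lrt   <- 2.5                                      # made-up 2*log(likelihood ratio)
naive <- pchisq(lrt, df = 1, lower.tail = FALSE)  # usual chi-square(1) p-value
mixed <- mixture_pvalue(lrt)                      # p-value under the mixture
print(c(naive = naive, mixed = mixed))
```

For any positive statistic the mixture p-value is half the usual
chi-square(1) p-value, so the naive test is conservative when the
mixture is the right reference distribution.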
Recent discussions of this and related questions on this list and
elsewhere produced the following list of articles that may be helpful:
Donald Andrews (2001) "Testing When a Parameter Is on the Boundary
of the Maintained Hypothesis", Econometrica, 69: 683-734.
Donald Andrews (2000) "Inconsistency of the Bootstrap When a
Parameter Is on the Boundary of the Parameter Space", Econometrica, 68:
388-405.
Donald Andrews (1999) "Estimation When a Parameter Is on a
Boundary", Econometrica, 67: 1341-1383.
P. J. Rousseeuw and A. Christmann (2003) "Robustness against
separation and outliers in logistic regression", Computational
Statistics & Data Analysis, 43: 315-332.
Unfortunately, I have not had time to review these, so I can't
comment further.
hope this helps. spencer graves
Tord Snall wrote:
>Dear all,
>
>Last autumn there was some discussion on the list of the warning
>Warning message:
>fitted probabilities numerically 0 or 1 occurred in: (if
>(is.empty.model(mt)) glm.fit.null else glm.fit)(x = X, y = Y,
>
>when fitting binomial GLMs with many 0 and few 1.
>
>Parts of replies:
>"You should be able to tell which coefficients are infinite -- the
>coefficients and their standard errors will be large. When this happens the
>standard errors and the p-values reported by summary.glm() for those
>variables are useless."
>"My guess is that the deviances and coefficients are entirely ok. I'd
>expect that problems in the general area that Thomas mentions to reveal
>themselves as a failure to converge."
>
>I have this problem with my data. In a GLM, I have 269 zeroes and only 1 one:
>
>summary(dbh)
>Coefficients:
>            Estimate Std. Error z value Pr(>|z|)
>(Intercept)   0.1659     3.8781   0.043    0.966
>dbh          -0.5872     0.5320  -1.104    0.270
>
>
>
>> drop1(dbh, test = "Chisq")
>Single term deletions
>Model:
>MPext ~ dbh
>       Df Deviance     AIC    LRT Pr(Chi)
><none>      9.9168 13.9168
>dbh     1  13.1931 15.1931 3.2763 0.07029 .
>
>I now wonder, is the drop1() function output 'reliable'?
>
>If so, are the estimates from MASS confint() then also 'reliable'? It gives
>the same warning.
>
>Waiting for profiling to be done...
>                2.5 %      97.5 %
>(Intercept) -6.503472 -0.77470556
>abund       -1.962549 -0.07496205
>There were 20 warnings (use warnings() to see them)
>
>
>Thanks in advance for your reply.
>
>
>Sincerely,
>Tord
>
>
>
>
>-----------------------------------------------------------------------
>Tord Snäll
>Avd. f växtekologi, Evolutionsbiologiskt centrum, Uppsala universitet
>Dept. of Plant Ecology, Evolutionary Biology Centre, Uppsala University
>Villavägen 14
>SE-752 36 Uppsala, Sweden
>Tel: 018-471 28 82 (int +46 18 471 28 82) (work)
>Tel: 018-25 71 33 (int +46 18 25 71 33) (home)
>Fax: 018-55 34 19 (int +46 18 55 34 19) (work)
>E-mail: Tord.Snall at ebc.uu.se
>Check this: http://www.vaxtbio.uu.se/resfold/snall.htm!
>