[R] binomial glm warnings revisited
Spencer Graves
spencer.graves at pdf.com
Wed Oct 8 21:57:15 CEST 2003
Thanks, Peter: You are absolutely correct. Thanks again for the
correction. Spencer Graves
Peter Dalgaard BSA wrote:
>Spencer Graves <spencer.graves at pdf.com> writes:
>
>> This seems to me to be a special case of the general problem of
>>a parameter on a boundary.
>
>Umm, no...
>
>>>I have this problem with my data. In a GLM, I have 269 zeroes and
>>>only 1 one:
>
>I don't think that necessarily gets you a parameter estimate on the
>boundary. Only if the single "1" has a predictor value smaller or
>larger than all the others should that happen.
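A small simulated illustration of that distinction (the data and
variable names here are invented, not the poster's):

## One success whose predictor value lies inside the range of the
## failures: no separation, the MLE is finite, glm() fits quietly.
y <- c(rep(0, 9), 1)
x <- c(1:9, 5.5)
coef(glm(y ~ x, family = binomial))

## One success whose predictor value is larger than all the failures:
## complete separation, the slope estimate diverges, and glm() warns
## that fitted probabilities numerically 0 or 1 occurred.
x2 <- c(1:9, 20)
coef(glm(y ~ x2, family = binomial))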
>
>>>summary(dbh)
>>>Coefficients:
>>> Estimate Std. Error z value Pr(>|z|)
>>>(Intercept) 0.1659 3.8781 0.043 0.966
>>>dbh -0.5872 0.5320 -1.104 0.270
>>>
>>>>drop1(dbh, test = "Chisq")
>>>>
>>>Single term deletions
>>>Model:
>>>MPext ~ dbh
>>>       Df Deviance     AIC    LRT Pr(Chi)
>>><none>       9.9168 13.9168
>>>dbh     1   13.1931 15.1931 3.2763 0.07029 .
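(For reference, the LRT value is just the deviance difference,
13.1931 - 9.9168 = 3.2763, referred to a chi-square on 1 df, which
gives the quoted p-value of about 0.07.)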
>>>
>>>I now wonder, is the drop1() function output 'reliable'?
>>>
>>>If so, are the estimates from MASS confint() also 'reliable'? It
>>>gives the same warning.
>
>>>                 2.5 %      97.5 %
>>>(Intercept) -6.503472 -0.77470556
>>>abund       -1.962549 -0.07496205
>>>There were 20 warnings (use warnings() to see them)
>
>During profiling, you may be pushing one of the parameters near the
>extremes and getting a model where the fitted p's are very close to
>0/1. That's not necessarily a sign of unreliability -- the procedure
>is to set one parameter to a sequence of fixed values and optimize
>over the other, and it might just be the case that the optimizations
>have been wandering a bit far from the optimum. (I'd actually be more
>suspicious about the fact that the name of the predictor suddenly
>changed....)
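The profiling step Peter describes can be mimicked by hand; a rough
sketch with simulated data (all names here are invented):

## Profile deviance for the slope: fix the slope at each value on a
## grid, refit only the intercept via an offset, record the deviance.
set.seed(1)
x <- rnorm(270)
y <- rbinom(270, 1, plogis(-4 + x))   # rare events, as in the example
full <- glm(y ~ x, family = binomial)
grid <- seq(coef(full)[2] - 2, coef(full)[2] + 2, length.out = 41)
prof <- sapply(grid, function(b)
    glm(y ~ 1, offset = b * x, family = binomial)$deviance)
## Slopes whose deviance lies within qchisq(0.95, 1) of the minimum
## span (approximately) the 95% interval that confint() reports.
range(grid[prof - full$deviance <= qchisq(0.95, 1)])

Fits near the ends of the grid are exactly where the fitted
probabilities can get pushed toward 0 or 1 and trigger the warnings.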
>
>However, if you have only one "1" you are effectively asking whether
>one observation has a different mean than the other 269, and you have
>to consider the sensitivity to the distribution of the predictor. As
>far as I can see, you end up with the test of the null hypothesis
>beta==0 being essentially equivalent to a two-sample t test between
>the mean of the "0" group and that of the "1" group, so with only one
>observation in one of the groups, the normal approximation of the test
>hinges quite strongly on a normal distribution of the predictor
>itself.
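A quick numerical check of that equivalence (again a made-up sketch;
with a single observation in one group the t test is only computable
with var.equal = TRUE):

## Wald z test for the slope vs. a pooled two-sample t test on the
## predictor between the outcome groups.
set.seed(2)
x <- rnorm(270)
y <- sample(c(rep(0, 269), 1))        # exactly one "1", as discussed
summary(glm(y ~ x, family = binomial))$coefficients["x", ]
t.test(x[y == 0], x[y == 1], var.equal = TRUE)

The two statistics come out broadly similar in size, and, as noted
above, their adequacy hinges on the distribution of the predictor.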