Under dispersion; Was: [R] binomial glm warnings revisited

Tord Snall tord.snall at ebc.uu.se
Thu Oct 9 14:30:57 CEST 2003


Dear all,

>
>> >I have this problem with my data. In a GLM, I have 269 zeroes and
>> >only 1 one:
>
>
>During profiling, you may be pushing one of the parameters near the
>extremes and get a model where the fitted p's are very close to 0/1.

I just want to clarify that the warning was already given when I fitted the
glm():

> dbh<- glm(MPext ~ dbh, maxit = 100, family = "binomial", data = valkdat)
Warning message: 
fitted probabilities numerically 0 or 1 occurred in: (if (

(As you can see, I had to increase maxit for the algorithm to converge.)

A summary:
> summary(dbh)
Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)   0.1659     3.8781   0.043    0.966
dbh          -0.5872     0.5320  -1.104    0.270
    Null deviance: 13.1931  on 269  degrees of freedom
Residual deviance:  9.9168  on 268  degrees of freedom
AIC: 13.917
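
A minimal sketch of where the warning may come from, using only the coefficients printed above. Two things here are assumptions and not taken from the post: the dbh values in `dbh_grid` are hypothetical (the real range in valkdat is not shown), and the threshold `10 * .Machine$double.eps` is what glm.fit() uses for this check as far as I can tell.

```r
# Coefficients copied from the summary() output in this post.
b0 <- 0.1659
b1 <- -0.5872

# Assumed threshold below which glm.fit() treats a fitted probability as 0.
eps <- 10 * .Machine$double.eps

# Hypothetical tree diameters; NOT the values in valkdat.
dbh_grid <- c(5, 20, 40, 60, 80)
p_hat <- plogis(b0 + b1 * dbh_grid)
print(data.frame(dbh = dbh_grid, p_hat = p_hat, below_eps = p_hat < eps))

# Diameter at which the fitted probability would reach the threshold.
dbh_cross <- (qlogis(eps) - b0) / b1
print(dbh_cross)
```

With a slope this steep, any sufficiently large tree gets a fitted probability that is indistinguishable from 0 in double precision, which would be enough to trigger the warning at the converged fit.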

> drop1(dbh, test = "Chisq")
       Df Deviance     AIC     LRT Pr(Chi)  
<none>      9.9168 13.9168                  
dbh     1  13.1931 15.1931  3.2763 0.07029 .

And then the CI:
> confint(dbh)
Waiting for profiling to be done...
                2.5 %      97.5 %
(Intercept) -6.458119 10.12380773
dbh         -2.253015 -0.05047997
There were 17 warnings (use warnings() to see them)

BUT, note the under dispersion. I GUESS it is because I have surveyed a
moss on marked trees on three occasions (with two years in between). The
response 1 means that the moss has disappeared, and dbh is tree diameter.
(This corresponds to revisiting patients who have a disease, and whose weight
is unchanged between the visits. H0: weight does not affect the chance of
recovery from the disease.)

Here is a version with quasibinomial:

> dbh<- glm(MPext ~ dbh, maxit = 100, family = "quasibinomial", data = valkdat)

Note, no warning.
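
A small check, on simulated data, of what changes between the two families. The variables `x` and `y` are hypothetical stand-ins for dbh and MPext. As far as I can tell, glm.fit() only performs the "fitted probabilities numerically 0 or 1" check when the family is named "binomial", so the missing warning may simply reflect that and not a different fit.

```r
set.seed(42)
n <- 270
x <- rnorm(n, mean = 20, sd = 5)          # hypothetical predictor
y <- rbinom(n, 1, plogis(1 - 0.1 * x))    # hypothetical 0/1 response

fit_b <- glm(y ~ x, family = binomial)
fit_q <- glm(y ~ x, family = quasibinomial)

# Estimates and fitted probabilities agree between the two fits.
same_coef   <- isTRUE(all.equal(coef(fit_b), coef(fit_q)))
same_fitted <- isTRUE(all.equal(fitted(fit_b), fitted(fit_q)))
print(c(same_coef = same_coef, same_fitted = same_fitted))

# Only the family name (and the dispersion used for inference) differs.
fam_names <- c(fit_b$family$family, fit_q$family$family)
print(fam_names)
```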

> summary(dbh)
            Estimate Std. Error t value Pr(>|t|)  
(Intercept)   0.1659     1.7179   0.097   0.9231  
dbh          -0.5872     0.2357  -2.491   0.0133 *
(Dispersion parameter for quasibinomial family taken to be 0.1962275)
    Null deviance: 13.1931  on 269  degrees of freedom
Residual deviance:  9.9168  on 268  degrees of freedom
AIC: NA
Number of Fisher Scoring iterations: 11
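
For reference, the dispersion that summary() reports for a quasi family is the Pearson chi-square statistic divided by the residual degrees of freedom. The sketch below reproduces that on simulated data; `x` and `y` are hypothetical and are not the moss data.

```r
set.seed(7)
n <- 270
x <- rnorm(n, mean = 20, sd = 5)          # hypothetical predictor
y <- rbinom(n, 1, plogis(1 - 0.1 * x))    # hypothetical 0/1 response

fit_q <- glm(y ~ x, family = quasibinomial)

# Pearson chi-square divided by residual degrees of freedom.
phi_manual  <- sum(residuals(fit_q, type = "pearson")^2) / df.residual(fit_q)
phi_summary <- summary(fit_q)$dispersion
print(c(manual = phi_manual, summary = phi_summary))
```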

> confint(dbh)
Waiting for profiling to be done...
                2.5 %     97.5 %
(Intercept) -2.970644  3.9019555
dbh         -1.158646 -0.2131936

> drop1(dbh, test = "Chisq")
       Df Deviance scaled dev.   Pr(Chi)    
<none>      9.9168                          
dbh     1  13.1931     16.6966 4.386e-05 ***

Note, no warning.

I guess that this quasibinomial model is more reliable than the binomial one.
Now I can trust the standard errors of the estimates too, can't I?
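
The numbers in the two outputs above appear to be linked by the estimated dispersion alone. A short arithmetic check, using only values printed in this post:

```r
# Values copied from the outputs in this post.
phi      <- 0.1962275                            # quasibinomial dispersion
se_binom <- c(intercept = 3.8781, dbh = 0.5320)  # binomial standard errors
lrt      <- 3.2763                               # binomial LRT for dropping dbh

# Quasibinomial standard errors = binomial standard errors * sqrt(dispersion).
se_quasi <- se_binom * sqrt(phi)
print(round(se_quasi, 4))

# Scaled deviance = deviance difference / dispersion.
scaled_dev <- lrt / phi
print(scaled_dev)
```

These match the quasibinomial standard errors (1.7179 and 0.2357) and the scaled deviance (16.6966) up to rounding. So the smaller standard errors rest entirely on the dispersion estimate, and are only as trustworthy as that single number.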

(Under dispersion has not been discussed on the list except for a reply by
Prof. Ripley on a Poisson model question.)


>That's not necessarily a sign of unreliability -- the procedure is to
>set one parameter to a sequence of fixed values and optimize over the
>other, and it might just be the case that the optimizations have been
>wandering a bit far from the optimum. (I'd actually be more suspicious
>about the fact that the name of the predictor suddenly changed....)
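
The profiling procedure described in the quoted text can be sketched by hand with an offset: fix the slope at each value on a grid and re-estimate only the intercept. This is a simplified illustration on simulated data (`x`, `y`, and the grid width are assumptions), not the exact algorithm behind confint().

```r
set.seed(3)
n <- 270
x <- rnorm(n, mean = 20, sd = 5)          # hypothetical predictor
y <- rbinom(n, 1, plogis(1 - 0.1 * x))    # hypothetical 0/1 response

fit   <- glm(y ~ x, family = binomial)
b_hat <- unname(coef(fit)["x"])

# Fix the slope at each grid value; optimize over the intercept only.
slope_grid <- seq(b_hat - 0.15, b_hat + 0.15, length.out = 31)
prof_dev <- sapply(slope_grid, function(b) {
  deviance(glm(y ~ 1 + offset(b * x), family = binomial))
})

# Rough 95% interval: slopes whose deviance lies within qchisq(0.95, 1)
# of the deviance at the maximum likelihood estimate.
inside <- prof_dev - deviance(fit) <= qchisq(0.95, df = 1)
print(range(slope_grid[inside]))
```

Each grid point is a separate optimization, so warnings during profiling can come from fits far from the optimum and need not reflect the final model.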

:D 

>
>However, if you have only one "1" you are effectively asking whether
>one observation has a different mean than the other 269, and you have
>to consider the sensitivity to the distribution of the predictor. As
>far as I can see, you end up with the test of the null hypothesis
>beta==0 being essentially equivalent to a two sample t test between
>the mean of the "0" group and that of the "1" group, so with only one
>observation in one of the groups, the normal approximation of the test
>hinges quite strongly on a normal distribution of the predictor
>itself.
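
A sketch of that equivalence on simulated data with a single "1". It uses the score test of beta==0 as the stand-in for "the test" (my choice, not stated in the quoted text), and `x` and `y` are hypothetical. Both the score statistic and the pooled two-sample t statistic turn out to be functions of the same correlation between predictor and response.

```r
set.seed(11)
n <- 270
x <- rnorm(n, mean = 20, sd = 5)   # hypothetical predictor
y <- integer(n)
y[1] <- 1L                         # a single "1", as in the post

r <- cor(x, y)

# Score test of beta == 0 in the logistic regression of y on x.
U <- sum((y - mean(y)) * (x - mean(x)))
V <- mean(y) * (1 - mean(y)) * sum((x - mean(x))^2)
score_chisq <- U^2 / V

# Pooled two-sample t statistic; with one observation in the "1" group
# the pooled variance is just the variance of the "0" group.
x0 <- x[y == 0]
t_stat <- (x[y == 1] - mean(x0)) / (sd(x0) * sqrt(1 + 1 / length(x0)))

# Both statistics depend on the data only through r^2.
print(c(score = score_chisq, n_r2 = n * r^2))
print(c(t_sq = t_stat^2, from_r = (n - 2) * r^2 / (1 - r^2)))
```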

Thanks for this interesting point of view.


Sincerely,
Tord

>
>-- 
>   O__  ---- Peter Dalgaard             Blegdamsvej 3  
>  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N   
> (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
>~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907
>

-----------------------------------------------------------------------
Tord Snäll
Avd. f växtekologi, Evolutionsbiologiskt centrum, Uppsala universitet
Dept. of Plant Ecology, Evolutionary Biology Centre, Uppsala University
Villavägen 14			
SE-752 36 Uppsala, Sweden
Tel: 018-471 28 82 (int +46 18 471 28 82) (work)
Tel: 018-25 71 33 (int +46 18 25 71 33) (home)
Fax: 018-55 34 19 (int +46 18 55 34 19) (work)
E-mail: Tord.Snall at ebc.uu.se
Check this: http://www.vaxtbio.uu.se/resfold/snall.htm!



