Under dispersion; Was: [R] binomial glm warnings revisited

Thu Oct 9 15:01:37 CEST 2003

Tord Snall <tord.snall at ebc.uu.se> writes:

>     Null deviance: 13.1931  on 269  degrees of freedom
> Residual deviance:  9.9168  on 268  degrees of freedom
> AIC: 13.917
...

> BUT, note the underdispersion. I GUESS it is because I have surveyed a
> moss on marked trees on three occasions (with two years in between). The
> response 1 means that the moss has disappeared, and dbh is tree diameter.
> (This corresponds to revisiting patients who have a disease, and whose weight
> is unchanged between the visits. H0: weight does not affect the chance of
> recovery from the disease)

Don't trust deviances as measures of dispersion with binary data! 

> Here is a version with quasibinomial:
> 
...
> 
> Note, no warning.
> 
> I guess that this quasibinomial model is more reliable than the binomial.
> Now I can trust the SE of the Estim. too, can't I? 

No, neither.

With binary data, the residual deviance is purely a function of the
fitted parameters. It is the difference in -2 log L between a "perfect
fit" and the fitted model. A perfect fit has probability 0 where the
observation is "0" and probability 1 where it is "1", so L == 1
identically in that case and the deviance is simply -2 log L of the
fitted model. Now consider the likelihood for the "complete toss-up",
i.e. intercept and slope both equal to 0, so that all probabilities
are 0.5. With 270 observations (269 null degrees of freedom plus one
for the intercept), the likelihood in that case is 0.5^270, i.e. a
constant. Take logarithms and notice that the model deviance plus the
change in deviance from the model to the "toss-up" model is constant
(2*270*log(2), about 374.3, to be precise). So what appears to be a
measure of residual error is really just a measure of how far the
fitted probabilities are from 0.5!
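
A quick way to convince yourself is a small R sketch on simulated
data. Everything in it is an assumption made for illustration (the
sample size of 270, the made-up `dbh` values and the coefficients);
it is not the original survey. It computes the residual deviance once
from the responses and once from the fitted probabilities alone, which
should agree for a logit-link fit up to convergence tolerance, and
compares both with the deviance of the "toss-up" model.

```r
# Simulated stand-in data (hypothetical; not the original moss survey).
set.seed(1)
n   <- 270
dbh <- runif(n, 10, 60)
y   <- rbinom(n, 1, plogis(-2 + 0.03 * dbh))

fit <- glm(y ~ dbh, family = binomial)
p   <- fitted(fit)

# Residual deviance from its definition: -2 log L, since the
# saturated ("perfect fit") model has log L = 0 for binary data.
dev_from_y <- -2 * sum(y * log(p) + (1 - y) * log(1 - p))

# The same number computed without the responses at all. This relies
# on the likelihood equations of the logit link, sum((y - p) * x) = 0.
dev_from_p <- -2 * sum(p * log(p) + (1 - p) * log(1 - p))

# Deviance of the "toss-up" model with all probabilities equal to 0.5.
dev_tossup <- 2 * n * log(2)

print(c(glm    = deviance(fit),
        from_y = dev_from_y,
        from_p = dev_from_p,
        tossup = dev_tossup))
```

The first three numbers should coincide, and they get smaller the
further the fitted probabilities move away from 0.5, regardless of how
well the model describes the data.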

-- 
   O__  ---- Peter Dalgaard             Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907