Under dispersion; Was: [R] binomial glm warnings revisited
Peter Dalgaard BSA
p.dalgaard at biostat.ku.dk
Thu Oct 9 15:01:37 CEST 2003
Tord Snall <tord.snall at ebc.uu.se> writes:
> Null deviance: 13.1931 on 269 degrees of freedom
> Residual deviance: 9.9168 on 268 degrees of freedom
> AIC: 13.917
...
> BUT, note the underdispersion. I GUESS it is because I have surveyed a
> moss on marked trees on three occasions (with two years in between). The
> response 1 means that the moss has disappeared, and dbh is tree diameter.
> (This corresponds to revisiting patients who have a disease, and whose weight
> is unchanged between the visits. H0: weight does not affect the chance of
> recovery from the disease)
Don't trust deviances as measures of dispersion with binary data!
> Here is a version with quasibinomial:
>
...
>
> Note, no warning.
>
> I guess that this quasibinomial model is more reliable than the binomial.
> Now I can trust the SE of the Estim. too, can't I?
No. Neither one.
With binary data, the deviance is purely a function of the fitted
parameters. It is the difference in -2 log L between a "perfect fit"
and the fitted model. A perfect fit has probability 0 where the
observation is "0" and probability 1 where it is "1", so L == 1 (and
log L == 0) identically in that case, and the deviance is simply
-2 log L of the fitted model. Now consider the likelihood for the
"complete toss-up", i.e. intercept and slope both equal to 0, so all
probabilities are 0.5. The likelihood in that case is 0.5^270 (269
null degrees of freedom means 270 observations), i.e. a constant that
does not depend on the data. Take logarithms and notice that the model
deviance plus the change in deviance from the model to the "toss-up"
model is constant (2*270*log(2), to be precise). So what appears to be
a measure of residual error is really just a measure of how far the
fitted probabilities are from 0.5!
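
The argument above can be checked numerically. What follows is only a
minimal sketch, written in Python rather than R and run on simulated
data (not the moss survey; the sample size of 270 is borrowed from the
quoted output, and the covariate, true coefficients and seed are
arbitrary assumptions). It fits a logistic regression with a
hand-written Newton-Raphson loop, then computes the deviance twice: once
from the responses, and once from the fitted probabilities alone. The
helper names fit_logistic and deviance are made up for illustration.

```python
import math
import random


def fit_logistic(x, y, iters=25):
    """Newton-Raphson for logit(p) = a + b*x; returns (a, b)."""
    a = b = 0.0
    for _ in range(iters):
        p = [1.0 / (1.0 + math.exp(-(a + b * xi))) for xi in x]
        # Score vector (gradient of the log-likelihood).
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum((yi - pi) * xi for xi, yi, pi in zip(x, y, p))
        # Fisher information matrix (2 x 2, symmetric).
        w = [pi * (1.0 - pi) for pi in p]
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        # Newton step: solve H * delta = g by explicit 2 x 2 inversion.
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


def deviance(y, p):
    """Binary deviance: -2 log L, because the saturated log L is 0."""
    return -2.0 * sum(
        math.log(pi if yi == 1 else 1.0 - pi) for yi, pi in zip(y, p)
    )


# Simulated binary data (assumed values, for illustration only).
rng = random.Random(1)
n = 270
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
y = [
    1 if rng.random() < 1.0 / (1.0 + math.exp(-(-0.5 + 0.8 * xi))) else 0
    for xi in x
]

a, b = fit_logistic(x, y)
eta = [a + b * xi for xi in x]
p = [1.0 / (1.0 + math.exp(-e)) for e in eta]

# Deviance computed the usual way, from the observed responses.
dev_from_data = deviance(y, p)
# The same quantity computed without y: at the MLE the score equations
# give sum(y * eta) == sum(p * eta), so y drops out of the formula.
dev_from_fit = -2.0 * sum(
    pi * ei + math.log(1.0 - pi) for pi, ei in zip(p, eta)
)
# "Complete toss-up": all probabilities 0.5, giving 2 * n * log(2).
dev_tossup = deviance(y, [0.5] * n)

print("deviance from data:       ", dev_from_data)
print("deviance from fitted only:", dev_from_fit)
print("toss-up deviance:         ", dev_tossup, "=", 2 * n * math.log(2))
```

The first two printed numbers should agree up to rounding, which is the
point: once the model is fitted, the responses add nothing to the
deviance, so it cannot serve as a yardstick for over- or
underdispersion. The third number is fixed by n alone.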
--
O__ ---- Peter Dalgaard Blegdamsvej 3
c/ /'_ --- Dept. of Biostatistics 2200 Cph. N
(*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907