[Rd] possible bug, anova.glm(), family="gaussian" (PR#579)
Peter Dalgaard BSA
p.dalgaard@biostat.ku.dk
25 Jun 2000 19:42:36 +0200
rnassar@duke.edu writes:
> anova(f.glm.1,f.glm.2,test="F")
> # Resid. Df Resid. Dev Df Deviance F Pr(>F)
> # A + S + A:S 4 3.500
> # A + S 5 4.625 -1 0.000 0.000 1
>
> anova(f.lm.1,f.lm.2)
> # Res.Df Res.Sum Sq Df Sum Sq F value Pr(>F)
> # 1 4 3.500
> # 2 5 4.625 -1 -1.125 1.2857 0.3202
Hum.. It is being done on purpose. anova.glmlist has this nugget
inside:
table <- data.frame(resdf, resdev, c(NA, -diff(resdf)), c(NA,
pmax(0, -diff(resdev))))
This is obviously inconsistent with the lm version, but I wonder which
one is wrong? There may be a point in requiring models to be in
increasing order. However, something is clearly wrong with that too:
> anova(f.glm.2,f.glm.1,test="F")
...
Resid. Df Resid. Dev Df Deviance F Pr(>F)
A + S 5 4.6250
A + S + A:S 4 3.5000 1 1.1250 1.2857 0.3745
Notice the p value! This comes from an F with 2 df in the denominator
instead of 4. Comes out of this bit:
if (!is.null(test)) {
bigmodel <- object[[(rdf <- order(resdf)[1])]]
dispersion <- summary(bigmodel, dispersion =
dispersion)$dispersion
df.dispersion <- if (dispersion == 1)
Inf
else rdf
which is clearly wrong since rdf will be an integer between 1 and the
number of models compared...
--
O__ ---- Peter Dalgaard Blegdamsvej 3
c/ /'_ --- Dept. of Biostatistics 2200 Cph. N
(*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard@biostat.ku.dk) FAX: (+45) 35327907
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !) To: r-devel-request@stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._