[Rd] possible bug, anova.glm(), family="gaussian" (PR#579)

Sun, 25 Jun 2000 15:00:20 -0400 (EDT)

Many thanks for your reply.

For some reason I usually think about the models in decreasing order, and 
tend to list them that way in the argument to anova.  But as you noted 
that is not all of the problem.  Regarding the order: if there is a 
reason for requiring increasing order, then might it not be good to 
issue an error message or warning when the order used is wrong?

Thank you again!

Rashid

On 25 Jun 2000, Peter Dalgaard BSA wrote:

> Date: 25 Jun 2000 19:42:36 +0200
> From: Peter Dalgaard BSA <p.dalgaard@biostat.ku.dk>
> To: rnassar@duke.edu
> Cc: r-devel@stat.math.ethz.ch, R-bugs@biostat.ku.dk
> Subject: Re: [Rd] possible bug, anova.glm(), family="gaussian" (PR#579)
> 
> rnassar@duke.edu writes:
> 
> > anova(f.glm.1,f.glm.2,test="F")
> 
> > #             Resid. Df Resid. Dev Df Deviance     F Pr(>F)
> > # A + S + A:S         4      3.500                         
> > # A + S               5      4.625 -1    0.000 0.000      1
> > 
> > anova(f.lm.1,f.lm.2)
> 
> > #   Res.Df Res.Sum Sq Df Sum Sq F value Pr(>F)
> > # 1      4      3.500                         
> > # 2      5      4.625 -1 -1.125  1.2857 0.3202
> 
> Hum.. It is being done on purpose. anova.glmlist has this nugget
> inside:
> 
>     table <- data.frame(resdf, resdev, c(NA, -diff(resdf)), c(NA, 
>         pmax(0, -diff(resdev))))
> 
> This is obviously inconsistent with the lm version, but I wonder which
> one is wrong? There may be a point in requiring models to be in
> increasing order. However, something is clearly wrong with that too:
> 
> > anova(f.glm.2,f.glm.1,test="F")
> ...
>             Resid. Df Resid. Dev Df Deviance      F Pr(>F)
> A + S               5     4.6250                          
> A + S + A:S         4     3.5000  1   1.1250 1.2857 0.3745
> 
> Notice the p value! This comes from an F with 2 df in the denominator
> instead of 4. Comes out of this bit:
> 
>     if (!is.null(test)) {
>         bigmodel <- object[[(rdf <- order(resdf)[1])]]
>         dispersion <- summary(bigmodel, dispersion =
>         dispersion)$dispersion
>         df.dispersion <- if (dispersion == 1) 
>             Inf
>         else rdf
> 
> which is clearly wrong since rdf will be an integer between 1 and the
> number of models compared...
> 
> -- 
>    O__  ---- Peter Dalgaard             Blegdamsvej 3  
>   c/ /'_ --- Dept. of Biostatistics     2200 Cph. N   
>  (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
> ~~~~~~~~~~ - (p.dalgaard@biostat.ku.dk)             FAX: (+45) 35327907
> 
> 
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-devel-request@stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._