[Rd] Bug in glm.fit() or plot.lm() (PR#778)

ripley@stats.ox.ac.uk ripley@stats.ox.ac.uk
Thu, 28 Dec 2000 22:54:49 +0100 (MET)


Here's what I think is going on.  

plot.lm() and everything I could see like it calls weights(object) and
resid(object).  The problem is that there is a resid.glm so one gets
deviance residuals, but there is not a weights.glm so weights.lm is used.  
The solution is to add weights.glm:

weights.glm <- function(object, type = c("prior", "working"), ...)
{
    type <- match.arg(type)
    if(type == "prior") object$prior.weights else object$weights
}

As far as I can see one either wants the default weights here and deviance
or Pearson residuals, or object$weights and the working residuals.  The
fitted object has the latter pair, which is sensible for inheriting from lm
(and compatible with S), but both plot.lm and your example use residuals()
which is not the same thing.

It seems reasonably clear from the labels that plot.lm is intended to use
standardized deviance residuals, but I am not at all sure that is the best
alternative. Davison & Snell (1991 Cox Festschrift) seem to suggest that
standardized Pearson residuals would be better, and offer other
alternatives.

I've added weights.glm.  We badly need to document the "glm" class
and subtle points like this.



On Tue, 19 Dec 2000 murdoch@stats.uwo.ca wrote:

> Here's a bug one of my students noticed.
> 
> When you call plot() on a glm object, plot.lm gets called.  The second
> plot it shows is supposed to give a normal QQ plot of the standard
> deviance residuals, but it doesn't.  The glm object created by glm.fit
> returns something (the IRLS weights?) in fit$weights which plot.lm
> takes as observation weights, so  you get strange residuals in the QQ
> plot.
> 
> For example, here I fit a simulated Poisson model where the model
> perfectly fine, but the plot.lm QQ plot makes the residuals look huge,
> ranging between -45 and 40.
> 
> >set.seed(1)
> >x <- seq(4,5,len=1000)
> >y <- rpois(1000,exp(x))
> >fit <- glm(y ~ x, family=poisson)
> >plot(fit, which=2)
> 
> If I extract the residuals myself and plot them, I see that the model
> is fine:
> 
> >qqnorm(residuals(fit))
> 
> The problem is that the weights are big; plot.lm should be using
> prior.weights, or glm should be returning the big weights under a
> different name:
> 
> > summary(fit$weights)
>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
>   54.35   69.90   89.90   93.76  115.60  148.70 
> 
> > summary(fit$prior.weights)
>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
>       1       1       1       1       1       1 

Match the residuals and weights is the story.

> 
> Duncan Murdoch
> 
> --please do not edit the information below--
> 
> Version:
>  platform = i386-pc-mingw32
>  arch = x86
>  os = Win32
>  system = x86, Win32
>  status = 
>  major = 1
>  minor = 2.0
>  year = 2000
>  month = 12
>  day = 15
>  language = R
> 
> Windows 9x 4.10 (build 1998)  
> 
> Search Path:
>  .GlobalEnv, package:ctest, Autoloads, package:base
> 
> -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
> r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
> Send "info", "help", or "[un]subscribe"
> (in the "body", not the subject !)  To: r-devel-request@stat.math.ethz.ch
> _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
> 

-- 
Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-devel-request@stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._