[Rd] glm start/offset bugs (PR#1422)
Prof Brian D Ripley
ripley@stats.ox.ac.uk
Fri, 29 Mar 2002 06:25:11 +0000 (GMT)
On Fri, 29 Mar 2002 presnell@stat.ufl.edu wrote:
>
> --fupGvOGOQM
> Content-Type: text/plain; charset=us-ascii
> Content-Description: message body and .signature
> Content-Transfer-Encoding: 7bit
>
>
> There's a simple bug in the handling of the start and offset arguments
> in glm and glm.fit. The bug exists in the latest development version
> of R (version information below), but it appears that glm.R has not
> been touched much lately, so the bug affects at least the most recent
> stable release of R.
>
> Here is a simple example:
>
> > data(ships, package=MASS)
> > ships.glm <- glm(incidents ~ type + year + period + offset(log(service)),
> + family=poisson, data=ships, subset=service != 0)
> > glm(incidents ~ type + year + period + offset(log(service)),
> + family=poisson, data=ships, subset=service != 0, start=coef(ships.glm))
>
> or more simply
>
> > update(ships.glm, start=coef(ships.glm))
>
> The problem is caused by a bad initialization of etastart in glm.fit()
> when an offset is present. Fixing this and retrying the example above
> then reveals another bug, this time in glm(), that is tickled only
> when non-NULL values are given to both the offset and start arguments.
>
> I've attached to the bottom of this message a simple patch to
> src/library/base/R/glm.R that I believe correctly repairs these bugs.
>
> Essentially the same change is needed in Berwin Turlach's Wed, 27 Feb
> 2002 version of glm.fit() reported on the R bug tracking site as
> "Models/1331". (The change to glm() is still required as well of
> course.) As an aside, I'm wondering if there has been any thought
> given to adopting Berwin's patches. Though I must admit that I
> haven't taken the time to check carefully through his changes, I know
> Berwin to be a very careful and skillful programmer, and I suspect
> that he is the only person to have looked very closely at those
> particular details of glm.fit in recent months (years?). Also, some
> of the simpler changes that he made seem to be needed just to cleanup
> the code. Would it be useful if a number of us began using his
> version as an informal test? After discovering this bug earlier
> today, I have already begun to do so and so far have not encountered
> any problems.
Yes, thought has been given, and the code has been looked at. (People
do look at R code you know, even if they don't jump in to change it
after a few hour's use.)
Given the time pressures, this is not going to get looked at for 1.5.0. I
have long thought that given the history of problems maintaining glm()
that it might be easier to rewrite it for scratch. Having .null versions
serves no useful purpose, for example.
--
Brian D. Ripley, ripley@stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272860 (secr)
Oxford OX1 3TG, UK Fax: +44 1865 272595
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !) To: r-devel-request@stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._