[Rd] glm offset and interaction bugs (PR#4941)
charlie at stat.umn.edu
Tue Nov 4 02:14:00 MET 2003
Full_Name: Charles J. Geyer
Version: 1.8.0
OS: i686-pc-linux-gnu (Suse 8.2)
Submission from: (NULL) (134.84.86.22)
Two bugs (perhaps related, perhaps independent) are revealed by the same
Poisson regression with an offset:
mydata <- read.table(url("http://www.stat.umn.edu/geyer/5931/mle/seeds.txt"))
out.fubar <- glm(seedlings ~ burn01 + vegtype * burn02 +
    offset(log(totalseeds)), data = mydata, family = poisson)
summary(out.fubar)
out.barfu <- glm(seedlings ~ burn01 + vegtype * burn02,
    offset = log(totalseeds), data = mydata, family = poisson)
summary(out.barfu)
out.ok <- glm(seedlings ~ vegtype * burn02 + burn01,
    offset = log(totalseeds), data = mydata, family = poisson)
summary(out.ok)
As far as I can tell from reading the documentation, these should produce
the same results. They don't. The first (fubar) regression reports a
regression coefficient for the offset term, and it is bogus. That's not what
offset() is supposed to do: an offset enters the linear predictor with its
coefficient fixed at 1, so nothing should be estimated for it. Note that
offset() works properly in
out <- glm(seedlings ~ vegtype + burn01 + burn02 + offset(log(totalseeds)),
    data = mydata, family = poisson)
summary(out)
So it is only partially bogus -- very dangerous for users who are less
than hyperalert.
The difference between out.barfu and out.ok, which differ only in the order
of the terms, shows that "+" in formulas is noncommutative, which is very
mind-bending.
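For anyone who wants to test this without the seeds.txt file, here is a
minimal sketch of the comparison on simulated data. The data frame `sim`,
the fit names, and the helper `same.coef` are made up for illustration;
only the column names follow the report. On a version of R without these
bugs, one would expect the first two lines printed to be TRUE (all three
fits agree once coefficients are matched by name) and the last to be FALSE
(no coefficient is reported for the offset).

```r
# Simulated stand-in for seeds.txt (hypothetical data, not the real file)
set.seed(42)
n <- 60
sim <- data.frame(
    vegtype    = factor(rep(c("A", "B", "C"), each = 20)),
    burn01     = rep(c(0, 0, 1, 1), times = 15),
    burn02     = rep(c(0, 1), times = 30),
    totalseeds = sample(50:200, n, replace = TRUE)
)
eta <- with(sim, log(totalseeds) - 2 + 0.3 * burn01 - 0.4 * burn02 +
    0.2 * (vegtype == "B"))
sim$seedlings <- rpois(n, exp(eta))

# Offset as a formula term, as an argument, and with the terms reordered
fit.term <- glm(seedlings ~ burn01 + vegtype * burn02 +
    offset(log(totalseeds)), data = sim, family = poisson)
fit.arg <- glm(seedlings ~ burn01 + vegtype * burn02,
    offset = log(totalseeds), data = sim, family = poisson)
fit.swap <- glm(seedlings ~ vegtype * burn02 + burn01,
    offset = log(totalseeds), data = sim, family = poisson)

# Compare coefficients by name, since term order changes their position
same.coef <- function(a, b)
    isTRUE(all.equal(coef(a), coef(b)[names(coef(a))], tolerance = 1e-6))

print(same.coef(fit.term, fit.arg))
print(same.coef(fit.arg, fit.swap))
print(any(grepl("offset", names(coef(fit.term)))))
```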
The regression in out.ok is OK. It checks by hand.
For a more complete explanation (if more is wanted), including
the printout from these summary commands on my machine and the
check of out.ok "by hand", see
http://www.stat.umn.edu/geyer/5931/mle/seed2.Rnw
http://www.stat.umn.edu/geyer/5931/mle/seed2.pdf
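One way such a check "by hand" might look (a sketch on simulated data, and
not necessarily the check done in seed2.Rnw): for a Poisson model with log
link, the means are mu = totalseeds * exp(X beta), and at the maximum
likelihood estimate the score t(X) %*% (y - mu) should be essentially zero.
The data frame `sim` below is hypothetical.

```r
# Hypothetical data with the same column names as in the report
set.seed(1)
n <- 60
sim <- data.frame(
    vegtype    = factor(rep(c("A", "B", "C"), each = 20)),
    burn01     = rep(c(0, 0, 1, 1), times = 15),
    burn02     = rep(c(0, 1), times = 30),
    totalseeds = sample(50:200, n, replace = TRUE)
)
sim$seedlings <- rpois(n, with(sim, totalseeds * exp(-2 + 0.3 * burn01)))

fit <- glm(seedlings ~ vegtype * burn02 + burn01,
    offset = log(totalseeds), data = sim, family = poisson)

# Rebuild the means from the coefficients, with the offset entering
# the linear predictor with coefficient exactly 1
X <- model.matrix(fit)
mu <- sim$totalseeds * exp(drop(X %*% coef(fit)))

# Score of the Poisson log likelihood; should be near zero at the MLE
score <- drop(crossprod(X, sim$seedlings - mu))
print(max(abs(score)))

# The hand-computed means should match what glm reports
print(all.equal(unname(mu), unname(fitted(fit))))
```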