[R] Offset in glm poisson using R vs Exposure in Stata

Ben Bolker bbolker at gmail.com
Tue Nov 16 23:47:23 CET 2010



On 11/16/2010 03:08 PM, Columbine Caroline Waring wrote:

> Officially I tried:

**A**
>> glm(count~md+ms+rf+sg+offset(log(Eff)), family=poisson,data=DepthHabGen)
>> glm(count~md+ms+rf+sg, offset=(log(Eff)), family=poisson,data=DepthHabGen)
> (which of course are the same as each other)
> 

**B**
>> glm(count~md+ms+rf+sg, offset=(Eff), family=poisson,data=DepthHabGen)
>> glm(count~md+ms+rf+sg+offset(Eff), family=poisson,data=DepthHabGen)
> (which are also the same as each other, yet wrong compared to the
> Stata model)
> 
> Additionally, given the text you found on stata website, which I am
> familiar with, I also tried:

**C**
>> glm(count~md+ms+rf+sg, offset=(exp(Eff)), family=poisson,data=DepthHabGen)
>> glm(count~md+ms+rf+sg+offset(exp(Eff)), family=poisson,data=DepthHabGen)
> (which still might be the solution however R issues the following response:
> Error: no valid set of coefficients has been found: please supply
> starting value)

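  In my opinion, B and C are just wrong.  With the log link the offset
is added to the linear predictor as is, so B multiplies the expected
count by exp(Eff) rather than by Eff.  C goes in the wrong direction
(exp rather than log), and it's not surprising that glm has hiccups
when the expected count ends up depending on a doubly-exponentiated
version of the Eff variable.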

  So I think all the other stuff about specifying starting values is
essentially a red herring.
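
  To make the difference between A, B and C concrete, here is a
minimal sketch on made-up data (the data frame d and the values in it
are hypothetical stand-ins, not the DepthHabGen data).  It checks that
the two ways of writing A agree, and prints the factor that each form
applies to the expected count at one illustrative effort value.

```r
## Hypothetical stand-in data; 'Eff' plays the role of the effort variable.
set.seed(1)
n <- 500
d <- data.frame(md = runif(n), Eff = runif(n, 1, 5))
d$count <- rpois(n, d$Eff * exp(0.5 + 1 * d$md))

## Form A written both ways: offset() in the formula vs. the offset= argument.
a1 <- glm(count ~ md + offset(log(Eff)), family = poisson, data = d)
a2 <- glm(count ~ md, offset = log(Eff), family = poisson, data = d)
same_A <- isTRUE(all.equal(coef(a1), coef(a2)))

## Factor multiplying exp(X b) under each form, at an illustrative Eff = 3:
Eff0 <- 3
mult <- c(A = exp(log(Eff0)),   # equals Eff0: a rate per unit effort
          B = exp(Eff0),        # exponential in effort
          C = exp(exp(Eff0)))   # doubly exponential in effort
print(same_A)
print(signif(mult, 3))
```

  Only A makes the expected count proportional to effort, which is
what an exposure term is meant to do.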

  I still don't know what Stata is doing but in your position I would
make up some data where I knew the answer and try it in both R and
Stata.  For example:


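set.seed(1001)
## two predictors, crossed to give a 100 x 100 grid (10,000 rows)
md <- runif(100)
ms <- runif(100)
dat <- expand.grid(md=md,ms=ms)
## effort (exposure), deliberately correlated with md
dat$eff <- runif(nrow(dat))+2*dat$md
## linear predictor: true coefficients 2 and -2, plus the log(effort) offset
dat$eta <- with(dat,2*md-2*ms+log(eff))
dat$y <- with(dat,rpois(nrow(dat),exp(eta)))

## the md and ms estimates should come out close to 2 and -2
m1 <- glm(y~md+ms+offset(log(eff)),data=dat, family="poisson")
summary(m1)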

  I have purposely set up the offset here so that it is strongly
correlated with md, and will screw things up if it is not accounted for
properly.  I made the data set quite large so that it is clear that the
model is accurately retrieving the coefficients (2 and -2) assigned to
the predictors.
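
  As a rough follow-up check, one could fit the same kind of simulated
data with the offset left out, and with the raw (unlogged) effort as
the offset, and compare the coefficients side by side.  This is only a
sketch: the sample size, the object names (fits, cf) and the choice of
wrong models are my own assumptions for illustration.

```r
set.seed(1001)
n <- 5000
dat <- data.frame(md = runif(n), ms = runif(n))
dat$eff <- runif(n) + 2 * dat$md                      # effort, correlated with md
dat$y <- rpois(n, dat$eff * exp(2 * dat$md - 2 * dat$ms))

fits <- list(
  log_offset = glm(y ~ md + ms + offset(log(eff)), data = dat, family = poisson),
  no_offset  = glm(y ~ md + ms,                    data = dat, family = poisson),
  raw_offset = glm(y ~ md + ms + offset(eff),      data = dat, family = poisson)
)

## one column of coefficients per model; true values are md = 2, ms = -2
cf <- sapply(fits, coef)
print(round(cf, 2))
```

  Only the log_offset column should recover an md coefficient near 2;
in the other two columns the md estimate should be pulled away from
it, because eff is correlated with md.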

  cheers
    Ben



