[R] poisson regression with robust error variance ('eyestudy
Paul Johnson
pauljohn32 at gmail.com
Thu May 8 22:35:38 CEST 2008
Ted Harding said:
> I can get the estimated RRs from
> RRs <- exp(summary(GLM)$coef[,1])
> but do not see how to implement confidence intervals based
> on "robust error variances" using the output in GLM.
Thanks for the link to the data. Here's my best guess. If you use
the following approach, with the HC0 type of robust standard errors in
the "sandwich" package (thanks to Achim Zeileis), you get "almost" the
same numbers as that Stata output gives. The estimated b's from the
glm match exactly, but the robust standard errors are a bit off.
### Paul Johnson 2008-05-08
### sandwichGLM.R
system("wget http://www.ats.ucla.edu/stat/stata/faq/eyestudy.dta")
library(foreign)
dat <- read.dta("eyestudy.dta")
### Ach, stata codes factor contrasts backwards
dat$carrot0 <- ifelse(dat$carrot==0,1,0)
dat$gender1 <- ifelse(dat$gender==1,1,0)
glm1 <- glm(lenses~carrot0, data=dat, family=poisson(link=log))
summary(glm1)
library(sandwich)
vcovHC(glm1)
sqrt(diag(vcovHC(glm1)))
sqrt(diag(vcovHC(glm1, type="HC0")))
### Result:
# > summary(glm1)
# Call:
# glm(formula = lenses ~ carrot0, family = poisson(link = log),
# data = dat)
# Deviance Residuals:
# Min 1Q Median 3Q Max
# -1.1429 -0.9075 0.3979 0.3979 0.7734
# Coefficients:
# Estimate Std. Error z value Pr(>|z|)
# (Intercept) -0.8873 0.2182 -4.066 4.78e-05 ***
# carrot0 0.4612 0.2808 1.642 0.101
# ---
# Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# (Dispersion parameter for poisson family taken to be 1)
# Null deviance: 67.297 on 99 degrees of freedom
# Residual deviance: 64.536 on 98 degrees of freedom
# AIC: 174.54
# Number of Fisher Scoring iterations: 5
# > sqrt(diag(vcovHC(glm1, type="HC0")))
# (Intercept) carrot0
# 0.1673655 0.1971117
# >
### Compare against
## http://www.ats.ucla.edu/stat/stata/faq/relative_risk.htm
## robust standard errors are:
## .1682086 .1981048
glm2 <- glm(lenses~carrot0 +gender1 +latitude, data=dat,
family=poisson(link=log))
sqrt(diag(vcovHC(glm2, type="HC0")))
### Result:
# > summary(glm2)
#Call:
# glm(formula = lenses ~ carrot0 + gender1 + latitude, family =
poisson(link = log),
# data = dat)
# Deviance Residuals:
# Min 1Q Median 3Q Max
# -1.2332 -0.9316 0.2437 0.5470 0.9466
# Coefficients:
# Estimate Std. Error z value Pr(>|z|)
# (Intercept) -0.65212 0.69820 -0.934 0.3503
# carrot0 0.48322 0.28310 1.707 0.0878 .
# gender1 0.20520 0.27807 0.738 0.4605
# latitude -0.01001 0.01898 -0.527 0.5980
# ---
# Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# (Dispersion parameter for poisson family taken to be 1)
# Null deviance: 67.297 on 99 degrees of freedom
# Residual deviance: 63.762 on 96 degrees of freedom
# AIC: 177.76
# Number of Fisher Scoring iterations: 5
# > sqrt(diag(vcovHC(glm2, type="HC0")))
# (Intercept) carrot0 gender1 latitude
# 0.49044963 0.19537743 0.18481067 0.01275001
### UCLA site has:
## .4929193 .1963616 .1857414 .0128142
So, either there is some "rounding error" or Stata is not using
exactly the same algorithm as vcovHC in sandwich.
--
Paul E. Johnson
Professor, Political Science
1541 Lilac Lane, Room 504
University of Kansas
More information about the R-help
mailing list