[R] Comparing Negative Binomial Regression in Stata and R. Constants differ?
Paul E. Johnson
pauljohn at ku.edu
Thu Dec 4 13:57:25 CET 2003
I looked for examples of count data that might interest the students and
found this project about dropout rates in Los Angeles High Schools. It
is discussed in the UCLA stats help pages for the Stata users:
http://www.ats.ucla.edu/stat/stata/library/count.htm
and
http://www.ats.ucla.edu/stat/stata/library/longutil.htm
To replicate those results, I used R's excellent foreign package to
bring the lahigh data in, then did
poisReg1 <- glm(daysabs ~ gender + mathnce + langnce,
                family = poisson(link = log), data = lahigh)
library(MASS)
negbinReg1 <- glm.nb(daysabs ~ gender + mathnce + langnce, link = log, data = lahigh)
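For reference, the import and fitting steps described above might look like the sketch below. The file name lahigh.dta and the helper fit_lahigh are hypothetical (the post does not give them). read.dta converts Stata value labels to R factors by default (convert.factors = TRUE), which is one plausible reason the R output further down labels the coefficient "gendermale".

```r
# Hypothetical helper: 'path' is assumed to point at a local copy of the
# lahigh data in Stata format; the file name is not given in the post.
fit_lahigh <- function(path = "lahigh.dta") {
  # convert.factors = TRUE is read.dta's default: labelled Stata
  # variables come in as R factors rather than as numeric codes.
  lahigh <- foreign::read.dta(path, convert.factors = TRUE)

  # Same two models as in the post.
  pois <- glm(daysabs ~ gender + mathnce + langnce,
              family = poisson(link = log), data = lahigh)
  nb <- MASS::glm.nb(daysabs ~ gender + mathnce + langnce,
                     link = log, data = lahigh)

  list(data = lahigh, poisson = pois, negbin = nb)
}
```

Calling str(fit_lahigh()$data$gender) on a real copy of the file would show whether gender arrived as a factor or as a number.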
The coefficient estimates are just about the same, except for the
intercept. Below I have pasted the Negative Binomial results I got from
R along with the Stata results that they report. The Stata output
reports alpha, which is the same as 1/theta from the R glm.nb output.
Except for minor differences in the standard errors, only the intercept
estimates differ markedly.
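The alpha = 1/theta correspondence can be checked directly from the figures in the two outputs pasted below. This is only a sketch using the reported (rounded) values; the variable names are made up for illustration. It also compares the log likelihoods, since R prints 2 x log-likelihood while Stata prints the log likelihood itself.

```r
# Values copied from the outputs in this post (rounded as printed).
theta_r      <- 0.7762        # "Theta" from glm.nb
alpha_stata  <- 1.288383      # "alpha" from nbreg
two_loglik_r <- -1761.7460    # "2 x log-likelihood" from glm.nb
loglik_stata <- -880.87312    # "Log likelihood" from nbreg

alpha_from_theta <- 1 / theta_r
loglik_r         <- two_loglik_r / 2

print(alpha_from_theta)   # compare with alpha_stata
print(loglik_r)           # compare with loglik_stata

# Both agree to within the rounding of the printed output.
print(abs(alpha_from_theta - alpha_stata) < 1e-3)
print(abs(loglik_r - loglik_stata) < 1e-3)
```

If both comparisons hold, the two programs appear to have reached the same maximum of the likelihood, so the intercept gap would more likely reflect parameterization than estimation.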
Can anybody explain this?
-----------------------------------------------------------
Stata:
nbreg daysabs gender mathnce langnce

Negative binomial regression                      Number of obs   =       316
                                                  LR chi2(3)      =     20.74
                                                  Prob > chi2     =    0.0001
Log likelihood = -880.87312                       Pseudo R2       =    0.0116

------------------------------------------------------------------------------
 daysabs |      Coef.   Std. Err.       z     P>|z|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
  gender |  -.4311844   .1396656     -3.087   0.002      -.704924   -.1574448
 mathnce |   -.001601     .00485     -0.330   0.741     -.0111067    .0079048
 langnce |  -.0143475   .0055815     -2.571   0.010     -.0252871    -.003408
   _cons |   3.147254   .3211669      9.799   0.000      2.517778    3.776729
---------+--------------------------------------------------------------------
/lnalpha |   .2533877   .0955362                         .0661402    .4406351
---------+--------------------------------------------------------------------
   alpha |   1.288383   .1230871     10.467   0.000      1.068377    1.553694
------------------------------------------------------------------------------
Likelihood ratio test of alpha=0:  chi2(1) = 1334.20   Prob > chi2 = 0.0000

Here is the R glm.nb output:
Deviance Residuals:
    Min       1Q   Median       3Q      Max
-1.9785  -1.0627  -0.4147   0.2865   2.8193

Coefficients:
             Estimate Std. Error z value Pr(>|z|)
(Intercept)  2.716069   0.234174  11.598  < 2e-16 ***
gendermale  -0.431185   0.139516  -3.091  0.00200 **
mathnce     -0.001601   0.005300  -0.302  0.76259
langnce     -0.014348   0.005372  -2.671  0.00756 **
---
Signif. codes:  0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1

(Dispersion parameter for Negative Binomial(0.7762) family taken to be 1)

    Null deviance: 378.43  on 315  degrees of freedom
Residual deviance: 356.93  on 312  degrees of freedom
AIC: 1771.7

Number of Fisher Scoring iterations: 1

Correlation of Coefficients:
           (Intercept) gendermale mathnce
gendermale -0.40
mathnce    -0.28       -0.09
langnce    -0.43        0.19      -0.69

              Theta:  0.7762
          Std. Err.:  0.0742
 2 x log-likelihood:  -1761.7460
----------------------------------------------------------
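A quick arithmetic check on the reported numbers, offered as a sketch rather than a diagnosis: the gap between the two intercepts matches the magnitude of the gender coefficient. That is what one would expect if Stata used gender as a numeric variable coded 1/2 while R used a 0/1 dummy for the factor level "male". The 1/2 coding is an assumption here, not something the post confirms. The second half of the block uses simulated data (not lahigh) and a Poisson fit from base R to show that recoding a binary predictor from 0/1 to 1/2 shifts only the intercept, by minus the slope.

```r
# Part 1: values copied from the two outputs in this post.
stata_cons  <- 3.147254    # _cons from nbreg
r_intercept <- 2.716069    # (Intercept) from glm.nb
gender_coef <- -0.431185   # gendermale from glm.nb

gap <- stata_cons - r_intercept
print(gap)                               # intercept difference
print(abs(gap + gender_coef) < 1e-6)     # gap equals -gender_coef

# Part 2: toy illustration on simulated data (assumed, not lahigh).
set.seed(1)
male <- rbinom(200, 1, 0.5)              # 0/1 dummy
y    <- rpois(200, exp(1 - 0.4 * male))  # counts with a log link

fit01 <- glm(y ~ male, family = poisson)      # dummy coded 0/1
gender12 <- male + 1                          # same variable coded 1/2
fit12 <- glm(y ~ gender12, family = poisson)

slope <- unname(coef(fit01)[2])
shift <- unname(coef(fit12)[1] - coef(fit01)[1])

print(unname(coef(fit12)[2]) - slope)    # slopes agree (about 0)
print(shift + slope)                     # intercept shift is -slope (about 0)
```

Tabulating gender in the Stata file (or reading it into R with factor conversion turned off) would show whether the 1/2 coding actually applies to this data set.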
--
Paul E. Johnson email: pauljohn at ukans.edu
Dept. of Political Science http://lark.cc.ukans.edu/~pauljohn
University of Kansas Office: (785) 864-9086
Lawrence, Kansas 66045 FAX: (785) 864-5700