[R] Comparing Negative Binomial Regression in Stata and R. Constants differ?

Paul E. Johnson pauljohn at ku.edu
Thu Dec 4 13:57:25 CET 2003


I looked for examples of count data that might interest the students and 
found this project about dropout rates in Los Angeles High Schools.  It 
is discussed in the UCLA stats help pages for the Stata users:
http://www.ats.ucla.edu/stat/stata/library/count.htm
and
See: http://www.ats.ucla.edu/stat/stata/library/longutil.htm

To replicate those results, I used R's excellent foreign package to 
bring the lahigh data in, then did
poisReg1 <- glm(daysabs~gender+ 
mathnce+langnce,family=poisson(link=log), data=lahigh)
library(MASS)
negbinReg1 <- glm.nb(daysabs~gender+ mathnce+langnce,link=log, data=lahigh)

The parameter estimates of the coefficients are the just about the same, 
except for the intercept estimates.  Below I pasted in the Negative 
Binomial results I got from R along with the Stata results that they 
report.  In the Stata output, they report alpha, same as 1/theta from 
the R glm.nb output.  Except for minor differences in standard errors, 
only the intercept estimates markedly differ.

Can anybody explain this?

-----------------------------------------------------------
Stata:

nbreg daysabs gender mathnce langnce

Negative binomial regression                      Number of obs   
=        316
                                                  LR chi2(3)      =      
20.74
                                                  Prob > chi2     =     
0.0001
Log likelihood = -880.87312                       Pseudo R2       =     
0.0116
------------------------------------------------------------------------------
 daysabs |      Coef.   Std. Err.       z     P>|z|       [95% Conf. 
Interval]
---------+--------------------------------------------------------------------
  gender  |  -.4311844   .1396656     -3.087   0.002       -.704924   
-.1574448
 mathnce |   -.001601     .00485     -0.330   0.741      -.0111067    
.0079048
 langnce  |  -.0143475   .0055815     -2.571   0.010      -.0252871    
-.003408
   _cons   |   3.147254   .3211669      9.799   0.000       2.517778    
3.776729
---------+--------------------------------------------------------------------
/lnalpha  |   .2533877   .0955362                          .0661402    
.4406351
---------+--------------------------------------------------------------------
   alpha  |   1.288383   .1230871     10.467   0.000       1.068377    
1.553694
------------------------------------------------------------------------------

Likelihood ratio test of alpha=0:    chi2(1) =  1334.20   Prob > chi2 = 
0.0000



Here is the R glm.nb output:

Deviance Residuals:
    Min       1Q   Median       3Q      Max 
-1.9785  -1.0627  -0.4147   0.2865   2.8193 

Coefficients:
             Estimate Std. Error z value Pr(>|z|)   
(Intercept)  2.716069   0.234174  11.598  < 2e-16 ***
gendermale  -0.431185   0.139516  -3.091  0.00200 **
mathnce     -0.001601   0.005300  -0.302  0.76259   
langnce     -0.014348   0.005372  -2.671  0.00756 **
---
Signif. codes:  0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1

(Dispersion parameter for Negative Binomial(0.7762) family taken to be 1)

    Null deviance: 378.43  on 315  degrees of freedom
Residual deviance: 356.93  on 312  degrees of freedom
AIC: 1771.7

Number of Fisher Scoring iterations: 1

Correlation of Coefficients:
           (Intercept) gendermale mathnce
gendermale -0.40                        
mathnce    -0.28       -0.09            
langnce    -0.43        0.19      -0.69 


              Theta:  0.7762
          Std. Err.:  0.0742

 2 x log-likelihood:  -1761.7460  

----------------------------------------------------------



-- 
Paul E. Johnson                       email: pauljohn at ukans.edu
Dept. of Political Science            http://lark.cc.ukans.edu/~pauljohn
University of Kansas                  Office: (785) 864-9086
Lawrence, Kansas 66045                FAX: (785) 864-5700




More information about the R-help mailing list