[R] Question regarding the discrepancy between count model parameter estimates between "pscl" and "MASS"
Nick Livingston
nlivingston at ymail.com
Fri Aug 29 05:57:26 CEST 2014
I have sought consultation online and in person, to no avail. I hope someone
on here might have some insight. Any feedback would be most welcome.
I am attempting to plot predicted values from a two-component hurdle model
(logistic [suicide attempt yes/no] and negative binomial count [number of
attempts thereafter]). To do so, I estimated each component separately using
glm (MASS). While I am able to reproduce hurdle results for the logit
portion in glm, estimates for the negative binomial count component are
different.
Call:
hurdle(formula = Suicide. ~ Age + gender + Victimization * FamilySupport |
Age + gender + Victimization * FamilySupport, dist = "negbin", link =
"logit")
Pearson residuals:
Min 1Q Median 3Q Max
-0.9816 -0.5187 -0.4094 0.2974 5.8820
Count model coefficients (truncated negbin with log link):
Estimate Std. Error z value
Pr(>|z|)
(Intercept) -0.29150 0.33127 -0.880 0.3789
Age 0.17068 0.07556 2.259 0.0239
*
gender 0.28273 0.31614 0.894 0.3712
Victimization 1.08405 0.18157 5.971 2.36e-09
***
FamilySupport 0.33629 0.29302 1.148 0.2511
Victimization:FamilySupport -0.96831 0.46841 -2.067 0.0387 *
Log(theta) 0.12245 0.54102 0.226 0.8209
Zero hurdle model coefficients (binomial with logit link):
Estimate Std. Error z value
Pr(>|z|)
(Intercept) -0.547051 0.215981 -2.533 0.01131
*
Age -0.154493 0.063994 -2.414
0.01577 *
gender -0.030942 0.284868 -0.109 0.91350
Victimization 1.073956 0.338015 3.177 0.00149
**
FamilySupport -0.380360 0.247530 -1.537 0.12439
Victimization\:FamilySupport -0.813329 0.399905 -2.034 0.04197 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Theta: count = 1.1303
Number of iterations in BFGS optimization: 23
Log-likelihood: -374.3 on 25 Df
> summary(logistic)
Call:
glm(formula = SuicideBinary ~ Age + gender = Victimization * FamilySupport,
family = "binomial")
Deviance Residuals:
Min 1Q Median 3Q Max
-1.9948 -0.8470 -0.6686 1.1160 2.0805
Coefficients:
Estimate Std. Error z value
Pr(>|z|)
(Intercept) -0.547051 0.215981 -2.533 0.01131 *
Age -0.154493 0.063994 -2.414 0.01577
*
gender -0.030942 0.284868 -0.109 0.91350
Victimization 1.073956 0.338014 3.177 0.00149
**
FamilySupport -0.380360 0.247530 -1.537 0.12439
Victimization:FamilySupport -0.813329 0.399904 -2.034 0.04197 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 452.54 on 359 degrees of freedom
Residual deviance: 408.24 on 348 degrees of freedom
(52 observations deleted due to missingness)
AIC: 432.24
Number of Fisher Scoring iterations: 4
> summary(Count1)
Call:
glm(formula = NegBinSuicide ~ Age + gender + Victimization * FamilySupport,
family = negative.binomial(theta = 1.1303))
Deviance Residuals:
Min 1Q Median 3Q Max
-1.6393 -0.4504 -0.1679 0.2350 2.1676
Coefficients:
Estimate Std. Error t value
Pr(>|t|)
(Intercept) 0.60820 0.13779 4.414 2.49e-05
***
Age 0.08836 0.04189 2.109 0.0373
*
gender 0.10983 0.17873 0.615 0.5402
Victimization 0.73270 0.10776 6.799 6.82e-10
***
FamilySupport 0.10213 0.15979 0.639 0.5241
Victimization:FamilySupport -0.60146 0.24532 -2.452 0.0159 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for Negative Binomial(1.1303) family taken to be
0.4549082)
Null deviance: 76.159 on 115 degrees of freedom
Residual deviance: 35.101 on 104 degrees of freedom
(296 observations deleted due to missingness)
AIC: 480.6
Number of Fisher Scoring iterations: 15
Alternatively, if there is a simpler way to plot hurdle regression output, or if anyone is award of another means of estimating NB models (I haven't had much luck with vglm from VGAM either), I would be happy to hear about that as well. I'm currently using the "visreg"
package for plotting.
Thanks!
More information about the R-help
mailing list