[R-sig-ME] Trouble running full ZINB or ZANB model: NaNs produced and errors in optim()
J Straka
jstraka at uvic.ca
Thu May 3 22:19:19 CEST 2012
Hello,
I’m attempting to run zero-inflated and hurdle models using the pscl()
package, but keep encountering problems with running my
fully-parameterized model and I’m not sure why.
My data consist of measurements of seed set (the response, which is
zero-inflated and overdispersed) for two treatments and a control (B,
F, and O) at three elevations (low, mid, and high). At each elevation
I also measured average temperature (AvgTemp) and abundance of insects
(N). My goal is to construct a model that best describes seed set
using the four predictors above (and interactions, if necessary).
I find that I have no problems running a partial model with only my
two categorical predictors as follows (just an example using hurdle(),
but results are the same for zeroinfl()):
H1B <- hurdle(AVInt ~ Elev + Treatment | Elev + Treatment, dist =
"negbin", link = "logit", data = erytseeds)
It’s when I try to add the two continuous predictors that things begin
to get messy;
H4B <- hurdle(AVInt ~ Elev*Treatment*AvgTemp*N | Elev*Treatment*AvgTemp*N)
This results in the error:
Warning message:
In sqrt(diag(object$vcov)) : NaNs produced
Error in optim(fn = loglikfun, gr = gradfun, par = c(start$count, start$zero, :
non-finite value supplied by optim
Even when I run the model using one continuous predictor and one
categorical predictor, or both continuous predictors, I end up with
NAs in the summary table of the count, but not in the zero-inflated
portion of the model. For example:
>summary(H9B)
Call:
zeroinfl(AVInt ~ AvgTemp+N | AvgTemp+N, data = erytseeds, dist =
"negbin", link = "logit")
Pearson residuals:
Min 1Q Median 3Q Max
-0.5807 -0.4566 -0.2791 -0.2791 5.5813
Count model coefficients (negbin with log link):
Estimate Std. Error z value Pr(>|z|)
(Intercept) -3.770120 0.452221 -8.337 <2e-16 ***
AvgTemp 0.404550 0.173458 2.332 0.0197 *
N 0.001211 NA NA NA
Log(theta) 1.308387 NA NA NA
Zero-inflation model coefficients (binomial with logit link):
Estimate Std. Error z value Pr(>|z|)
(Intercept) -4.814e+01 8.115e+00 -5.932 2.99e-09 ***
AvgTemp 4.039e+00 7.158e-01 5.642 1.68e-08 ***
N 3.010e-03 4.863e-04 6.189 6.04e-10 ***
---
Theta = 3.7002
Number of iterations in BFGS optimization: 31
Log-likelihood: -456.4 on 7 Df
> summary(H10B)
Call:
zeroinfl(AVInt ~ Treatment+N | Treatment+N, data = erytseeds, dist =
"negbin", link = "logit")
Pearson residuals:
Min 1Q Median 3Q Max
-0.6608 -0.4684 -0.3565 -0.2956 6.1730
Count model coefficients (negbin with log link):
Estimate Std. Error z value Pr(>|z|)
(Intercept) 0.9555134 NA NA NA
TreatmentF 0.1407240 0.1844811 0.763 0.446
TreatmentO 0.0430166 0.1647207 0.261 0.794
N 0.0009843 NA NA NA
Log(theta) 1.3049021 NA NA NA
Zero-inflation model coefficients (binomial with logit link):
Estimate Std. Error z value Pr(>|z|)
(Intercept) 0.0685048 1.0056156 0.068 0.94569
TreatmentF -0.3822091 0.3631982 -1.052 0.29264
TreatmentO -1.1624582 0.3551596 -3.273 0.00106 **
N 0.0009120 0.0004773 1.911 0.05603 .
---
Theta = 3.6873
Number of iterations in BFGS optimization: 31
Log-likelihood: -458.1 on 9 Df
Reading other posts, I know that my values are all >= 0, and there are
no missing values that could be causing this, but there ARE lots of
zeroes. So my questions are as follows:
1) Why am I getting all these NaNs, and why can’t I run the full
(saturated) model, including continuous as well as categorical
predictors and interactions? How can I begin the process of model
simplification if I can’t get the full model to run?
2) If this is a result of the many zeroes in my data set, is there a
way to work around this?
Any advice would be greatly appreciated.
Jason Straka
University of Victoria
More information about the R-sig-mixed-models
mailing list