[R-sig-ME] Question glmmADMB

Mon Jun 19 19:10:18 CEST 2017

 I did look at Wouter's data some because they sent it to me the last
time around.

The key observation is that this is the marginal distribution of the
outcome variable:

   0    1    2    3
 596  105   18    1
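
As a rough check on what this table implies, the sketch below compares the observed counts with the frequencies expected under a plain Poisson distribution with the same mean. It deliberately ignores the covariates and the Subject random effect, so it is only a marginal sanity check, not a substitute for fitting the model; the variable names are made up for illustration.

```r
## Marginal counts from the table above (0, 1, 2, 3 mistakes)
counts <- c("0" = 596, "1" = 105, "2" = 18, "3" = 1)
values <- as.numeric(names(counts))

n <- sum(counts)                           # total number of observations
mean_mistakes <- sum(values * counts) / n  # overall mean number of mistakes

## Expected frequencies under a plain Poisson with the same mean
## (assumption: covariates and the random effect are ignored here)
expected <- n * dpois(values, lambda = mean_mistakes)

round(rbind(observed = counts, expected = expected), 1)
```

With a mean of 0.2 mistakes per observation, a plain Poisson already predicts roughly 589 zeros out of 720, against 596 observed, so the large number of zeros is about what a low mean would produce.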

  Mollie is right that these data are almost certainly *not*
zero-inflated. Indeed, when analyzing a similar kind of data in the
past (Pasch et al. 2013, http://www.jstor.org/stable/10.1086/673263 )
we found that we needed to reduce the data to binary (0 vs. >0),
although that was to overcome a specific technical problem (some
treatments with all zeros). Furthermore, trying to assess variation in
the level of zero-inflation across groups is almost certainly too
optimistic.  This worked fine for me:

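library(glmmTMB)
form <- Num_Mistakes ~ (Context*Outcome*Language) + (1|Subject)
## zero-inflated Poisson with a single (intercept-only) zero-inflation term
SLIP1_model <- glmmTMB(form, ziformula = ~1,
                   data = SLIP_1_error_data, family = "poisson")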

but gave a zero-inflation parameter of -18 (on the logit scale),
corresponding to a zero-inflation probability of about 10^{-8}, which
further indicates that zero-inflation is not necessary here.  (The huge
standard error on the zero-inflation parameter is a technical issue
caused by the flatness of the likelihood surface in this extreme case.)
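
For reference, the conversion from the zero-inflation parameter to a probability is just the inverse logit. A minimal sketch, assuming the usual logit link for the zero-inflation part and using the rounded value of -18 quoted above (the variable names are illustrative):

```r
## The zero-inflation coefficient is on the logit scale
zi_logit <- -18               # approximate estimate quoted above
zi_prob <- plogis(zi_logit)   # inverse logit: 1 / (1 + exp(-x))
zi_prob
```

This gives a probability between 1e-8 and 2e-8, i.e. on the order of 10^{-8}. With a fitted model, the estimate itself should be available from the zero-inflation component of the fixed effects (in glmmTMB, something like fixef(SLIP1_model)$zi).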

 cheers
   Ben Bolker

On Mon, Jun 19, 2017 at 6:47 AM, Mollie Brooks <mollieebrooks at gmail.com> wrote:
> Dear Wouter,
>
> Unfortunately we can’t see your data and code because attachments are removed from emails sent to this list. I’m guessing your convergence problems could be caused by overfitting. Lacking the code and data, I have some questions...
>
> Is subject the same as participant? Do you have multiple observations for some subjects (i.e. participants) or did aggregating remove the repeated measures? If there are not multiple observations per subject, then you do not need the random effect of subject. Make sure you aggregated the data in a logical way.
>
> How many observations do you have in total? Do you have 10 to 20 per term in the model? Are there observations representing all of the interactions? If not, you may need to simplify the model. I would avoid 3-way interactions (i.e.  Context * Outcome * Language).
>
> You don’t necessarily need to have a zero-inflated model. It’s possible that the zeros can be explained by a low mean. See Warton, D. I. (2005). Many zeros does not mean zero inflation: comparing the goodness-of-fit of parametric models to multivariate abundance data. Environmetrics, 16(3), 275–289. http://doi.org/10.1002/env.702
>
> I would start with a negative binomial (NB) distribution and see if that converges. This could be done with any of the packages you mention. If it does converge, then try zero-inflated Poisson (ZIP) or zero-inflated negative binomial (ZINB) models (mgcv can do NB and ZIP; glmmTMB can do NB, ZIP, and ZINB). Then do model selection using AIC (only for models fit by the same package unless you know they calculate the likelihood in the same way).
>
> cheers,
> Mollie
>
> ———————————
> Mollie E. Brooks, Ph.D.
> Postdoctoral Researcher
> National Institute of Aquatic Resources
> Technical University of Denmark
>
>> On 16 Jun 2017, at 11:02, Wouter Broos <Wouter.Broos at UGent.be> wrote:
>>
>> Dear Professor Bolker,
>>
>>
>> At the moment, I am working on a data set that contains
>> information on 'number of errors' that are made by participants. Thus
>> far, I used poisson regression to create generalized mixed effects
>> models of the data. In order to use poisson regression I aggregated the
>> data set by participant and by category. There are eight different
>> categories in total (I've added the data for ease of reference). There
>> are three main factors that determine the category: 1. Outcome: The
>> error that is made can be lexical or non-lexical / 2. Context: The
>> context of the block where the error was made can be mixed or
>> non-lexical / 3. Language: The language in which the error is made can
>> be the first language (L1) or the second language (L2). So, the
>> dependent variable in the model is 'Number of Errors' and the fixed
>> factors are 'Outcome', 'Context', and 'Language'. All factors
>> interact with one another. One potential problem with the data set is
>> that there is an imbalance in the number of '0 number of errors per
>> category' and the '1/2/3 number of errors'. However, the zeros in my
>> data set can be explained in only one way: no mistake was made by that
>> participant in that category. So my first question is: do I need to add
>> the zero-inflation component to the model?
>>
>>
>>
>> I tried using the glmmADMB package that can take zero-inflation into
>> account in order to see whether there are differences with the normal
>> poisson model and the zero-inflated model. However, when I ran the
>> model, I got the error:
>>
>>
>> *Parameters were estimated, but standard errors were not: the most
>> likely problem is that the curvature at MLE was zero or negative*
>>
>> *Error in glmmadmb(Num_Mistakes ~ (Context * Outcome) + (1 | Subject),  : *
>>
>> *  The function maximizer failed (couldn't find parameter file)
>> Troubleshooting steps include (1) run with 'save.dir' set and inspect
>> output files; (2) change run parameters: see '?admbControl';(3) re-run
>> with debug=TRUE for more information on failure mode*
>>
>> *In addition: Warning message:*
>>
>> *running command 'C:\WINDOWS\system32\cmd.exe /c glmmadmb -maxfn 500
>> -maxph 5 -noinit -shess' had status 1*
>>
>>
>>
>> I tried leaving out one or two factors, googled the problem and tried
>> several 'solutions' but nothing works. So my second question is: how can
>> I solve this problem? I also used other packages where zero-inflation
>> could also be inserted (glmmTMB and mgcv). When I run the glmmTMB model,
>> zero-inflation works (but only for the interaction Context*Outcome, not
>> when I try to include Language) but the standard errors for the
>> zero-inflated model are rather large (and huge for the interaction)
>> leading to p-values of 1. An additional problem with glmmTMB is that I
>> cannot use all three factors because there are 'extreme or very small
>> eigenvalues'. The R-script is added as an attachment to the e-mail so
>> that you can see what I did. Would you be willing to help me out or do
>> you have any suggestions as to what I can do next? Thank you.
>>
>>
>>
>> Kind regards,
>>
>> Wouter Broos
>>
>> _______________________________________________
>> R-sig-mixed-models at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models