[R-sig-ME] Error message in running CLMM models from Ordinal package: "optimizer nlminb failed to converge"

OZOLS Davis davis.ozols at unifr.ch
Wed Feb 28 11:11:36 CET 2018

Dear Rune, 

Thank you very much for your thorough answer; it is of great help. I
realised that the random effect structure specified in my models is very
complicated (model fm5 here) and could be the cause of all my problems:

fm5 <- clmm(value.statement ~ preexisting.belief * engagement +
            (preexisting.belief * engagement | id) + (engagement | item),
            data = ucl.ordered, Hess = TRUE)
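
To make "very complicated" concrete, here is a rough sketch that counts the variance and covariance parameters implied by the two random-effect terms in fm5. It assumes default treatment coding and an unstructured covariance matrix per term, and it uses placeholder labels for the factor levels (only the number of levels, 3 and 2, comes from the str() output quoted below). The helper n_cov is made up for this illustration.

```r
# Factor levels: only the counts (3 and 2) come from the post; labels are placeholders
design <- expand.grid(
  preexisting.belief = factor(c("level1", "level2", "level3")),
  engagement         = factor(c("high", "low"))
)

# Number of random effects per grouping level = columns of the design matrix
q_id   <- ncol(model.matrix(~ preexisting.belief * engagement, design))
q_item <- ncol(model.matrix(~ engagement, design))

# An unstructured q x q covariance matrix has q * (q + 1) / 2 free parameters
n_cov <- function(q) q * (q + 1) / 2

c(id = n_cov(q_id), item = n_cov(q_item), total = n_cov(q_id) + n_cov(q_item))
# id = 21, item = 3, total = 24
```

Under these assumptions the id term alone carries 21 variance/covariance parameters, against a single variance for (1 | id).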

I think my reliance on such structures comes from my incomplete
understanding of the thinking behind random effects. In the literature on
linear mixed models, Barr, Levy, Scheepers & Tily (2013) and, to some
degree, Matuschek, Kliegl, Vasishth, Baayen & Bates (2017) argue that
intercept-only models inflate the Type I error rate, and that random
slopes should therefore be considered when building linear mixed models.
This was the approach I took with my data. Of course, those authors
discuss linear mixed models, whereas I am using the CLMM framework. In
addition, my fixed effects include interactions, which are rarely used as
examples in the mixed-model literature (I assume because of the
complexity?).

My approach to choosing the random effect structure starts with the
maximal random effect structure that would make sense for the design. In
this experiment I investigate aspects of confirmation bias, and my
hypothesis is that participants behave differently depending on their
pre-existing beliefs and their engagement with those beliefs. Therefore I
started with the random effect structure (preexisting.belief*engagement |
id) and (engagement | item); however, I would also have been happy with
the random effect structure (preexisting.belief | id). I then use the
anova() function to compare fm5 with the simplest alternative, the random
intercept model you suggested as fm1:

fm1 <- clmm( value.statement ~ preexisting.belief * engagement + (1 |id) +
(1 | item), data=ucl.ordered)

This gives a p-value of <2.2e-16, which I understood as a better fit for
the random slope model. I then do the same comparison against the more
complex alternatives such as fm3 and fm4. In all cases fm5 shows the
better fit: lower AIC, higher log-likelihood and a small p-value. I took
this as an indication that, in terms of model selection, there is enough
evidence to prefer the random effect structure in fm5. Now it could be
that, due to the complexity of the random structure in fm5 as well as of
my fixed effects, this is not the best approach and could actually give
me misleading results. Is my approach flawed in assuming such a complex
random effect structure for this design (which is itself complex) and
following the steps described? Is there a rule of thumb for how complex
one should go with these models?
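
The comparison step described above can be sketched as follows. Since ucl.ordered is not available here, the sketch uses the wine data that ships with the ordinal package as a stand-in, so the variables rating, temp, contact and judge replace the ones in this thread, and the model names m_int and m_big are made up for the illustration. It shows only the mechanics of a likelihood-ratio comparison of two nested clmm fits, not a recommendation for any particular structure.

```r
library(ordinal)

# Stand-in data shipped with the ordinal package (not the ucl.ordered data)
data(wine)

# Random-intercept model, analogous in spirit to fm1
m_int <- clmm(rating ~ temp + contact + (1 | judge), data = wine)

# Adds one independent random interaction term, analogous to (1 | item:engagement)
m_big <- clmm(rating ~ temp + contact + (1 | judge) + (1 | judge:temp),
              data = wine)

# Likelihood-ratio comparison of the two nested fits (AIC, logLik, p-value)
cmp <- anova(m_int, m_big)
print(cmp)
```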

Finally, I am a bit confused about the difference between specifying
(1 | item:engagement) and (engagement | item) as a random effect. If I assume
that the results of the experiment are affected by how engagement
functions with certain items (e.g. for some items engagement will vary
more and for some less and that will affect the behaviour of items)
wouldn't the (engagement | item) specification be more appropriate?
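
One way to look at the two specifications is through the covariance they imply for a single item's effects at the two engagement levels. The sketch below uses made-up variance values and assumes treatment coding with one reference level. It follows the description in the quoted reply: with (1 | item) + (1 | item:engagement) the item-level variance is the same at both engagement levels, whereas (engagement | item) allows it to differ.

```r
# Hypothetical variance components, chosen only for illustration
s2_item <- 1.0   # variance of (1 | item)
s2_int  <- 0.5   # variance of (1 | item:engagement)

# Implied covariance of one item's effects at the two engagement levels under
# (1 | item) + (1 | item:engagement): both diagonal entries are equal
V_indep <- s2_item * matrix(1, 2, 2) + s2_int * diag(2)

# Under (engagement | item): random intercept and slope with a general
# 2 x 2 covariance matrix (values again hypothetical)
Sigma <- matrix(c(1.0, 0.3,
                  0.3, 0.8), 2, 2)
Z <- rbind(ref = c(1, 0), other = c(1, 1))   # treatment coding
V_slope <- Z %*% Sigma %*% t(Z)

diag(V_indep)   # 1.5 1.5 : same variance at both levels
diag(V_slope)   # 1.0 2.4 : variance depends on the engagement level
```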

Once more thank you for your help and sorry for the long questions.
Best regards,

On 27/02/18 10:15, "Rune Haubo" <rune.haubo at gmail.com> wrote:

>It is difficult to say why the bigger model converges while the
>submodels do not. Perhaps the surprising part is why the bigger model
>converges rather than why the simpler ones don't given the rather
>complicated variance structure in the models.
>I don't see how the design can demand a variance structure this
>complicated and I would not assume that there is support in the data
>for these complicated structures over simpler alternatives. With
>models as (computationally) complicated as mixed models I think it is
>good advice to start small and build on model structures until
>additional structures are no longer supported by the data. This
>doesn't mean that the inferential process has to run bottom-up as
>In a situation like this I would start with something like
>fm1 <- clmm( value.statement ~ preexisting.belief * engagement + (1 |
>id) + (1 | item), data=ucl.ordered)
>or even simpler in both fixed and random structures if this gives rise
>to any problems. Focusing on the random structure, I would expand with
>additional _independent_ terms:
>fm2 <- clmm( value.statement ~ preexisting.belief * engagement + (1 |
>id) + (1 | item) + (1 | id:engagement) + (1 | item:engagement),
>data=ucl.ordered)
>and continue to add terms like (1 | id:preexisting.belief) and (1 |
>id:preexisting.belief:engagement) if these terms are necessary or
>should be assessed (and if the model remains identifiable). Unless I
>need random slopes I very rarely move on to
>multivariate/vector-valued/correlated random-effect terms such as
>'(engagement | item)', but some feel that such terms make a lot of
>sense. But if you do want vector-valued random effects, my suggestion
>is to look at how much support the data offers for such structures (in
>my experience they are usually not supported). For instance you could
>compare (focusing on random structures for item):
>fm3 <- clmm( value.statement ~ preexisting.belief * engagement + (1 |
>item) + (1 | item:engagement)  + (1 | id), data=ucl.ordered)
>fm4 <- clmm( value.statement ~ preexisting.belief * engagement +
>(engagement | item)  + (1 | id), data=ucl.ordered)
>fm1, fm3 and fm4 represent increasingly complex random-effect
>structures for item and my advice is to not work with models which are
>more complex than what the data can reasonably support. In practice
>this means that if I fit fm1 and fm3 and run anova(fm3, fm1) and the
>p-value is not small-ish, I stick with fm1. If, instead there is
>support for fm3 over fm1 it can make sense to move on to consider fm4.
>Finally, let me digress momentarily on what fm3 and fm4 mean:
>Scientifically fm3 represents a random interaction between item and
>engagement which means that random effects for item depend on the
>level of engagement. This is a classical design-of-experiments line of
>thinking and a reasonable thing to consider. fm4 represents a
>particular interaction between engagement and item in which the
>_variance_ of the random effects for item (in addition to the
>random-effects for item themselves) depend on the level of engagement,
>but is there a particular scientific reason why the item-level
>_variance_ should depend on engagement?
>Hope this helps
>On 26 February 2018 at 15:07, OZOLS Davis via R-sig-mixed-models
><r-sig-mixed-models at r-project.org> wrote:
>> Dear list,
>> I have a question about model convergence in the clmm() function
>>implemented in the ordinal package. More specifically, what might
>>cause the error "optimizer nlminb failed to converge" that I am
>>getting from clmm()?
>> I am new to mixed model analysis so I will try to explain all the steps
>>I have taken in case there might be something wrong in my approach to
>>the analysis.
>> I have a data set of 2200 observations with 6 variables: 115
>>participants, 24 items and a design that has as a response variable an
>>ordered scale from 1 to 10.
>>> head(data.ord)
>>                  id item value.statement    quant preexisting.belief engagement
>> 1 R_ysTGuC676siU2Pf   I1               3 Baseline                low       high
>> 2 R_ysTGuC676siU2Pf   I2               2 Baseline                low       high
>> 3 R_ysTGuC676siU2Pf  I26               2     Most                low       high
>> 4 R_ysTGuC676siU2Pf  I40               3    Every                low       high
>> 5 R_ysTGuC676siU2Pf   I4               7 Baseline          undecided       high
>> 6 R_ysTGuC676siU2Pf  I10               4 Baseline          undecided        low
>> I investigate the interaction of three factors on the response variable:
>> quant(4 levels)* preexisting.belief(3 levels)* engagement(2 levels)
>> this is the summary of my data:
>>> str(data.ord)
>> 'data.frame': 2200 obs. of  6 variables:
>>  $ id                 : Factor w/ 115 levels
>>  $ item               : Factor w/ 24 levels
>>  $ value.statement   : Ord.factor w/ 10 levels
>>  $ quant              : Factor w/ 4 levels
>>  $ preexisting.belief : Factor w/ 3 levels
>>  $ engagement         : Factor w/ 2 levels
>> I plan to do my analysis by fitting four clmm models with random
>>intercept and random slope structures for both participants and items. I
>>choose the exact random effect structure based on theoretical
>>assumptions in my hypothesis as well as backward model selection
>>criterion discussed by Matuschek, Kliegl, Vasishth, Baayen & Bates
>>(2017) and Barr, Levy, Scheepers & Tily (2013). Due to the complexity of
>>my design it is not possible to fit the full three way interaction as a
>>random slope so I choose (1 + preexisting.belief*engagement |id) for
>>participants and (1 + engagement |item) for items - the choice is
>>motivated by theoretical assumptions as well as comparison of various
>>random effect models (with full interaction in fixed effects) using the
>>anova() function. I then proceed to fit the four clmm models to test my
>>fixed effects, starting with the null model and then adding all the
>>interaction terms in a step wise fashion.
>> While the more complex models like model 2 and 3 are able to converge:
>>> cm.2 <- clmm(value.statement~preexisting.belief*engagement +
>> +                       (1 + preexisting.belief*engagement |id) + (1 +
>>engagement |item),
>> +                 data = ucl.ordered, Hess = TRUE)
>> running the summary() function gives me:
>> max.grad = 9.78e-03 and cond.H = 2.3e+04
>>> cm.3 <- clmm(value.statement~quant*preexisting.belief*engagement +
>> +                       (1 + preexisting.belief*engagement |id) + (1 +
>>engagement |item),
>> +                 data = ucl.ordered, Hess = TRUE)
>> running the summary() function gives me:
>> max.grad = 1.28e-01 and cond.H = 1.7e+04
>> I find that the simpler model and even the null model fail to converge:
>>> cm.null <- clmm(value.statement~1 +
>> +                        (1 + preexisting.belief*engagement |id) + (1 +
>>engagement |item),
>> +                  data = ucl.ordered, Hess = TRUE)
>> Error: optimizer nlminb failed to converge
>>> cm.1 <- clmm(value.statement~quant +
>> +                        (1 + preexisting.belief*engagement |id) + (1 +
>>engagement |item),
>> +                  data = ucl.ordered, Hess = TRUE)
>> Error: optimizer nlminb failed to converge
>> I have tried looking for potential solutions to this on the r-sig-mm
>>list as well as other online resources and have tried some suggestions.
>>Using the "ucminf" optimizer does not work and produces error message:
>>"cannot use ucminf optimizer for this model, using nlminb instead". I
>>have tried changing the maxIter and maxLineIter parameters under
>>clmm.control to 200 and that has also resulted in no improvement. I am
>>puzzled by the fact that the error persists only for the simpler models.
>>My first guess was that the complexity of my design is too much for clmm
>>to handle with only 2200 observations, however, if that were the case
>>wouldn't models 3 and 4 also fail to converge?
>> I would greatly appreciate any help on these errors. I am also happy to
>>share the full data (in private correspondence) if that might be of help
>> Thank you in advance,
>> Davis Ozols
>> PhD Student,
>> University of Fribourg
>> CH-1700 Fribourg, Switzerland
>> Tel: +41 26 300 79 09
>> Fax: +41 26 300 97 87
