[R-sig-ME] Convergence issues running clmm in ordinal package

Thu Apr 21 10:16:49 CEST 2011

On 20 April 2011 13:41, Karen Lamb <k.lamb at sphsu.mrc.ac.uk> wrote:
> Hi all,
>
> I have a data set containing around 8148 individuals nested within approximately 2548 areas (DZID). I have an ordinal response (Num20) with 5 categories (0=None, 1=1 to 3, 2=4 to 11, 3=12 to 19, 4=20 and over) and have been trying to fit multilevel models using the clmm() function in the ordinal package to examine the statistical significance of individual (e.g. Sex, Car, Age, Limitill, Nssec3) and area level predictors (e.g. TotalPA) that I have.
>
> The data looks like this:
>
> DZID  Sex  Car  Age  Limitill  Nssec3  TotalPA  Num20
> 2688    1    1   44         1       1        2      4
> 2688    2    1   42         3       2        2      4
> 2692    1    1   77         1       1        1      0
> 2692    1    1   57         3       1        1      4
> 2692    2    1   52         3       1        1      4
> 2692    2    1   16         3      99        1      4
> 2672    1    2   28         2       1        4      4
> 2692    1    2   86         1       1        1      0
>  864    1    1   22         3       1        0      4
>  864    2    1   21         2       3        0      4
>
> etc.
>
>> summary(Num20)
>   0    1    2    3    4 NA's
> 2103 1009 1895  869 2244   28
>
>> summary(DZID)
>    689    1634    2376    1598       4    1681    1760     683     906    2521      34     698
>     29      28      26      25      19      19      19      18      18      16      15      15
>   1173    2108    2272     430    1263    1269    1456    1538      17      40     284    1630
>     14      14      14      13      13      13      13      13      12      12      12      12
>   2202    2595      31    1164    1340    1775    2146    2502     359     605    1305    1319
>     12      12      11      11      11      11      11      11      10      10      10      10
>   1354    1396    1606    1960    1969    2014    2063    2214    2228    2459     115     644
>     10      10      10      10      10      10      10      10      10      10       9       9
>    717     726     843    1003    1027    1470    1642    1748    1896    2160    2227    2360
>      9       9       9       9       9       9       9       9       9       9       9       9
>   2423    2438    2439    2555    2601     111     199     254     258     321     331     428
>      9       9       9       9       9       8       8       8       8       8       8       8
>    459     490     583     604     775     919     968    1049    1123    1144    1309    1525
>      8       8       8       8       8       8       8       8       8       8       8       8
>   1604    1667    1688    1725    1802    1804    1830    1876    1889    1903    1922    1991
>      8       8       8       8       8       8       8       8       8       8       8       8
>   2053    2128 (Other)    NA's
>      8       8    6869     212
>
> I first of all used clm() to assess the association between individual level variables and the ordinal response and had no problem using this function. I then tried to fit a simple random intercept only ordinal multilevel model with DZID as a random effect to assess whether or not there is significant area level variability in the model. Unfortunately I experienced convergence issues:
>
>> mod.a <- clmm(Num20~1, random=DZID, na.action=na.omit)
> Warning message:
> clmm may not have converged:
>  optimizer 'ucminf' terminated with max|gradient|: 0.000501749472956048

Note that this is a warning and not an error message, and that it says
clmm *may* not have converged: whether the optimizer terminated close
enough to the optimum is essentially up to you. You get the warning
because 5e-4 is larger than 1e-5, which is the default maximum
absolute gradient criterion (the grtol control option in the ucminf
optimizer). However, 5e-4 should be small enough for most
applications, so I would trust the results in this case.

If you change the optimizer and use, e.g., method = "nlminb" or
"optim", I expect you will get essentially the same parameter
estimates. You could also (using the default ucminf optimizer) tighten
the maximum absolute gradient convergence criterion by appending
control=clmm.control(grtol=1e-6) to your clmm call and see if it gets
closer to the optimum.

The main message is that you probably do not need to worry in this
case, but if you do, there are control options you can change.
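
As a rough sketch of the two checks suggested above, the following
refits one model with a tighter gradient tolerance and with a
different optimizer, then compares the fits. It is an illustration
under assumptions, not the analysis from this thread: it uses the wine
data shipped with the ordinal package instead of the poster's data,
and it assumes a current release of ordinal, where the interface used
in this thread (a location formula plus a random argument) is provided
by clmm2() and clmm2.control() rather than clmm() and clmm.control().

```r
library(ordinal)
data(wine, package = "ordinal")  # example data shipped with the package

# Reference fit: default optimizer (ucminf) and default tolerances
fit_default <- clmm2(rating ~ temp + contact, random = judge, data = wine)

# Same model, tighter maximum absolute gradient criterion passed on to ucminf
fit_tight <- clmm2(rating ~ temp + contact, random = judge, data = wine,
                   control = clmm2.control(grtol = 1e-6))

# Same model, different optimizer
fit_nlminb <- clmm2(rating ~ temp + contact, random = judge, data = wine,
                    control = clmm2.control(method = "nlminb"))

# If the optimum is well determined, these should agree closely
logliks <- c(default = fit_default$logLik,
             tight   = fit_tight$logLik,
             nlminb  = fit_nlminb$logLik)
print(logliks)
print(cbind(default = fit_default$coefficients,
            tight   = fit_tight$coefficients,
            nlminb  = fit_nlminb$coefficients))
```

If the log-likelihoods and estimates agree to the precision you care
about, the warning from the default fit can reasonably be ignored.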
>
>> summary(mod.a)
> Cumulative Link Mixed Model fitted with the Laplace approximation
>
> Call:
> clmm(location = Num20 ~ 1, random = DZID, na.action = na.omit)
>
> Random effects:
>           Var   Std.Dev
> DZID 0.2638869 0.5136992
>
> No location coefficients
>
> No scale coefficients
>
> Threshold coefficients:
>    Estimate Std. Error z value
> 0|1 -1.1123      NaN        NaN
> 1|2 -0.5100      NaN        NaN
> 2|3  0.5009      NaN        NaN
> 3|4  1.0096      NaN        NaN
>
> log-likelihood: -12154.35
> AIC: 24318.70
> Condition number of Hessian: NaN
> (239 observations deleted due to missingness)
> Warning message:
> In summary.clmm(mod.a) :
>  Variance-covariance matrix of the parameters is not defined

If you want standard errors, p-values etc., you should add 'Hess =
TRUE' to your clmm call; without it the Hessian is not computed, which
is why the standard errors above are NaN. (I am aware that a more
informative warning message would be nice.)
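
For illustration, here is a minimal sketch of a fit with the Hessian
requested, so that summary() can report standard errors. The same
assumptions apply as for any example outside the poster's own session:
the wine data shipped with ordinal stands in for the data in this
thread, and clmm2() is assumed to be the current name of the interface
used here.

```r
library(ordinal)
data(wine, package = "ordinal")  # example data shipped with the package

# Hess = TRUE requests the Hessian at the optimum, from which the
# variance-covariance matrix and the standard errors are derived
fit_hess <- clmm2(rating ~ temp + contact, random = judge, data = wine,
                  Hess = TRUE)

fit_summary <- summary(fit_hess)
print(fit_summary)
```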
>
> I experience similar convergence issues if trying to include a fixed effect in the model:
>
>> mod.b <- clmm(Num20~Sex, random=DZID, na.action=na.omit)
> Warning message:
> clmm may not have converged:
>  optimizer 'ucminf' terminated with max|gradient|: 0.000355076487956175
>
>> summary(mod.b)
> Cumulative Link Mixed Model fitted with the Laplace approximation
>
> Call:
> clmm(location = Num20 ~ Sex, random = DZID, na.action = na.omit)
>
> Random effects:
>           Var   Std.Dev
> DZID 0.2642084 0.5140121
>
> Location coefficients:
>     Estimate Std. Error z value Pr(>|z|)
> Sex2 -0.1616      NaN        NaN NA
>
> No scale coefficients
>
> Threshold coefficients:
>    Estimate Std. Error z value
> 0|1 -1.2055      NaN        NaN
> 1|2 -0.6029      NaN        NaN
> 2|3  0.4096      NaN        NaN
> 3|4  0.9193      NaN        NaN
>
> log-likelihood: -12146.78
> AIC: 24305.56
> Condition number of Hessian: NaN
> (239 observations deleted due to missingness)
> Warning message:
> In summary.clmm(mod.b) :
>  Variance-covariance matrix of the parameters is not defined
>
> I found that someone on the list had experienced a similar problem (https://stat.ethz.ch/pipermail/r-sig-mixed-models/2011q1/005119.html) and followed guidance proposed by Rune Haubo. That is, adding an nAGQ argument to the function and/or changing maxIter:
>
> mod.a <- clmm(Num20~1, random=DZID, na.action=na.omit, nAGQ = 10)
> Warning message:
> clmm may not have converged:
>  optimizer 'ucminf' terminated with max|gradient|: 0.000104892430129334
>
> and
>
>> mod.a <- clmm(Num20~1, random=DZID, na.action=na.omit, nAGQ = 10, control = clmm.control(maxIter = 200,
> + maxLineIter = 200))
> Warning message:
> clmm may not have converged:
>  optimizer 'ucminf' terminated with max|gradient|: 0.000104892430129334

Note that this is exactly the same maximum absolute gradient as in the
previous fit, which indicates that the optimizer took the same path to
the optimum and that maxIter and maxLineIter never came into play.

>
> I tried several values for maxIter and maxLineIter and still experience these convergence issues.
>
> Am I using clmm() incorrectly? Is there a problem due to the fact that I have such a large number of areas to consider in the model? Is there a limit to the number of higher level units that clmm() can deal with? It may be that the higher level variation is not statistically significant. However, I wanted to assess this in the model as I have area level variables. Is there another ordinal multilevel regression approach that anyone can suggest would be suitable for this analysis?

From what you showed us, I don't think there is anything to worry
about with your data. There is no limit to the number of observations
or random effect levels that clmm can cope with - you may run out of
memory at some point or other things can come into play, but that is
not directly related to clmm. So the number of areas in your data does
not seem to be a problem.

I hope I got around to all your questions, but please follow up if I
missed something or you experience additional issues.

Cheers,
Rune

>
> Any suggestions would be greatly appreciated!
>
> Cheers,
> Karen
>
>
> --
> Dr Karen Lamb
> Statistician/Career Development Fellow
> Neighbourhoods and Health
> MRC Social and Public Health Sciences Unit
> 4 Lilybank Gardens
> Glasgow
> G12 8RZ
>
> Tel: 0141 357 3949
> www.sphsu.mrc.ac.uk
>

-- 
Rune Haubo Bojesen Christensen

PhD Student, M.Sc. Eng.
Phone: (+45) 45 25 33 63
Mobile: (+45) 30 26 45 54

DTU Informatics, Section for Statistics
Technical University of Denmark, Build. 305, Room 122,
DK-2800 Kgs. Lyngby, Denmark