[R-sig-ME] Convergence issues running clmm in ordinal package
Rune Haubo
rhbc at imm.dtu.dk
Thu Apr 21 10:16:49 CEST 2011
On 20 April 2011 13:41, Karen Lamb <k.lamb at sphsu.mrc.ac.uk> wrote:
> Hi all,
>
> I have a data set containing around 8148 individuals nested within approximately 2548 areas (DZID). I have an ordinal response (Num20) with 5 categories (0=None, 1=1 to 3, 2=4 to 11, 3=12 to 19, 4=20 and over) and have been trying to fit multilevel models using the clmm() function in the ordinal package to examine the statistical significance of individual (e.g. Sex, Car, Age, Limitill, Nssec3) and area level predictors (e.g. TotalPA) that I have.
>
> The data looks like this:
>
> DZID Sex Car Age Limitill Nssec3 TotalPA Num20
> 2688 1 1 44 1 1 2 4
> 2688 2 1 42 3 2 2 4
> 2692 1 1 77 1 1 1 0
> 2692 1 1 57 3 1 1 4
> 2692 2 1 52 3 1 1 4
> 2692 2 1 16 3 99 1 4
> 2672 1 2 28 2 1 4 4
> 2692 1 2 86 1 1 1 0
> 864 1 1 22 3 1 0 4
> 864 2 1 21 2 3 0 4
>
> etc.
>
>> summary(Num20)
> 0 1 2 3 4 NA's
> 2103 1009 1895 869 2244 28
>
>> summary(DZID)
> 689 1634 2376 1598 4 1681 1760 683 906 2521 34 698
> 29 28 26 25 19 19 19 18 18 16 15 15
> 1173 2108 2272 430 1263 1269 1456 1538 17 40 284 1630
> 14 14 14 13 13 13 13 13 12 12 12 12
> 2202 2595 31 1164 1340 1775 2146 2502 359 605 1305 1319
> 12 12 11 11 11 11 11 11 10 10 10 10
> 1354 1396 1606 1960 1969 2014 2063 2214 2228 2459 115 644
> 10 10 10 10 10 10 10 10 10 10 9 9
> 717 726 843 1003 1027 1470 1642 1748 1896 2160 2227 2360
> 9 9 9 9 9 9 9 9 9 9 9 9
> 2423 2438 2439 2555 2601 111 199 254 258 321 331 428
> 9 9 9 9 9 8 8 8 8 8 8 8
> 459 490 583 604 775 919 968 1049 1123 1144 1309 1525
> 8 8 8 8 8 8 8 8 8 8 8 8
> 1604 1667 1688 1725 1802 1804 1830 1876 1889 1903 1922 1991
> 8 8 8 8 8 8 8 8 8 8 8 8
> 2053 2128 (Other) NA's
> 8 8 6869 212
>
> I first of all used clm() to assess the association between individual level variables and the ordinal response and had no problem using this function. I then tried to fit a simple random intercept only ordinal multilevel model with DZID as a random effect to assess whether or not there is significant area level variability in the model. Unfortunately I experienced convergence issues:
>
>> mod.a <- clmm(Num20~1, random=DZID, na.action=na.omit)
> Warning message:
> clmm may not have converged:
> optimizer 'ucminf' terminated with max|gradient|: 0.000501749472956048
Note that this is a warning and not an error message, and that it says
clmm *may* not have converged: whether the optimizer terminated
close enough to the optimum is essentially up to you. You get the
warning because 5e-4 is larger than 1e-5, which is the default
maximum absolute gradient criterion (the grtol control option in the
ucminf optimizer). However, 5e-4 should be small enough for most
applications, so I would trust the results in this case.
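
To make the comparison concrete, here is a small sketch in plain R that
only uses the number from your warning and the default criterion quoted
above; the variable names are made up for illustration:

```r
# Maximum absolute gradient reported in the warning for mod.a
max_grad <- 0.000501749472956048
# Default criterion quoted above (the grtol control option)
grtol <- 1e-5

# TRUE: the gradient is above the criterion, which triggers the warning
exceeds <- max_grad > grtol
# How far above the criterion the gradient is (roughly 50 times)
ratio <- max_grad / grtol

print(exceeds)
print(ratio)
```
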
If you change the optimizer and use, e.g., method = "nlminb" or "optim",
I expect you will get essentially the same parameter estimates. You
could also (using the default ucminf optimizer) tighten the maximum
absolute gradient convergence criterion by appending
control=clmm.control(grtol=1e-6) to your clmm call and see if it gets
closer to the optimum.
The main message is that you probably do not need to worry in this
case, but if you do, there are control options you can change.
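
As a rough sketch of those two options, assuming Num20 and DZID are
available exactly as in your original call and that you use the same
version of ordinal as in this thread. I am also assuming here that the
optimizer is selected through the method argument of clmm.control;
please check ?clmm.control for your version. The object names mod.a1
and mod.a2 are just for illustration:

```r
library(ordinal)

# Option 1: keep the default ucminf optimizer but tighten the
# maximum absolute gradient criterion from 1e-5 to 1e-6.
mod.a1 <- clmm(Num20 ~ 1, random = DZID, na.action = na.omit,
               control = clmm.control(grtol = 1e-6))

# Option 2: switch optimizer (assumption: passed via clmm.control).
mod.a2 <- clmm(Num20 ~ 1, random = DZID, na.action = na.omit,
               control = clmm.control(method = "nlminb"))

# Compare estimates and log-likelihood with the original mod.a
summary(mod.a1)
summary(mod.a2)
```

If the estimates and the log-likelihood agree with mod.a to several
digits, the original fit was close enough to the optimum.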
>
>> summary(mod.a)
> Cumulative Link Mixed Model fitted with the Laplace approximation
>
> Call:
> clmm(location = Num20 ~ 1, random = DZID, na.action = na.omit)
>
> Random effects:
> Var Std.Dev
> DZID 0.2638869 0.5136992
>
> No location coefficients
>
> No scale coefficients
>
> Threshold coefficients:
> Estimate Std. Error z value
> 0|1 -1.1123 NaN NaN
> 1|2 -0.5100 NaN NaN
> 2|3 0.5009 NaN NaN
> 3|4 1.0096 NaN NaN
>
> log-likelihood: -12154.35
> AIC: 24318.70
> Condition number of Hessian: NaN
> (239 observations deleted due to missingness)
> Warning message:
> In summary.clmm(mod.a) :
> Variance-covariance matrix of the parameters is not defined
If you want standard errors, p-values etc., you should add 'Hess =
TRUE' to your clmm call; the Hessian is not computed by default, which
is why they show up as NaN. (I am aware that a more informative warning
message would be nice.)
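
For example, a minimal sketch of the refit, again assuming the same
variables and the same version of ordinal as in your call above; only
Hess = TRUE is new:

```r
library(ordinal)

# Same random-intercept model as before, but ask clmm to compute the
# Hessian so that summary() can report standard errors.
mod.a <- clmm(Num20 ~ 1, random = DZID, na.action = na.omit,
              Hess = TRUE)
summary(mod.a)
```
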
>
> I experience similar convergence issues if trying to include a fixed effect in the model:
>
>> mod.b <- clmm(Num20~Sex, random=DZID, na.action=na.omit)
> Warning message:
> clmm may not have converged:
> optimizer 'ucminf' terminated with max|gradient|: 0.000355076487956175
>
>> summary(mod.b)
> Cumulative Link Mixed Model fitted with the Laplace approximation
>
> Call:
> clmm(location = Num20 ~ Sex, random = DZID, na.action = na.omit)
>
> Random effects:
> Var Std.Dev
> DZID 0.2642084 0.5140121
>
> Location coefficients:
> Estimate Std. Error z value Pr(>|z|)
> Sex2 -0.1616 NaN NaN NA
>
> No scale coefficients
>
> Threshold coefficients:
> Estimate Std. Error z value
> 0|1 -1.2055 NaN NaN
> 1|2 -0.6029 NaN NaN
> 2|3 0.4096 NaN NaN
> 3|4 0.9193 NaN NaN
>
> log-likelihood: -12146.78
> AIC: 24305.56
> Condition number of Hessian: NaN
> (239 observations deleted due to missingness)
> Warning message:
> In summary.clmm(mod.b) :
> Variance-covariance matrix of the parameters is not defined
>
> I found that someone on the list had experienced a similar problem (https://stat.ethz.ch/pipermail/r-sig-mixed-models/2011q1/005119.html) and followed guidance proposed by Rune Haubo. That is, adding an nAGQ argument to the function and/or changing maxIter:
>
> mod.a <- clmm(Num20~1, random=DZID, na.action=na.omit, nAGQ = 10)
> Warning message:
> clmm may not have converged:
> optimizer 'ucminf' terminated with max|gradient|: 0.000104892430129334
>
> and
>
>> mod.a <- clmm(Num20~1, random=DZID, na.action=na.omit, nAGQ = 10, control = clmm.control(maxIter = 200,
> + maxLineIter = 200))
> Warning message:
> clmm may not have converged:
> optimizer 'ucminf' terminated with max|gradient|: 0.000104892430129334
Note that this is exactly the same maximum absolute gradient as
before, which indicates that the optimizer took the same path to the
optimum and that maxIter and maxLineIter never came into play.
>
> I tried several values for maxIter and maxLineIter and still experience these convergence issues.
>
> Am I using clmm() incorrectly? Is there a problem due to the fact that I have such a large number of areas to consider in the model? Is there a limit to the number of higher level units that clmm() can deal with? It may be that the higher level variation is not statistically significant. However, I wanted to assess this in the model as I have area level variables. Is there another ordinal multilevel regression approach that anyone can suggest would be suitable for this analysis?
From what you showed us, I don't think there is anything to worry
about with your data. There is no limit to the number of observations
or random effect levels that clmm can cope with; you may run out of
memory at some point, or other things can come into play, but that is
not directly related to clmm. So the number of areas in your data does
not seem to be a problem.
I hope I got around to all your questions, but please follow up if I
missed something or you experience additional issues.
Cheers,
Rune
>
> Any suggestions would be greatly appreciated!
>
> Cheers,
> Karen
>
>
> --
> Dr Karen Lamb
> Statistician/Career Development Fellow
> Neighbourhoods and Health
> MRC Social and Public Health Sciences Unit
> 4 Lilybank Gardens
> Glasgow
> G12 8RZ
>
> Tel: 0141 357 3949
> www.sphsu.mrc.ac.uk
>
--
Rune Haubo Bojesen Christensen
PhD Student, M.Sc. Eng.
Phone: (+45) 45 25 33 63
Mobile: (+45) 30 26 45 54
DTU Informatics, Section for Statistics
Technical University of Denmark, Build. 305, Room 122,
DK-2800 Kgs. Lyngby, Denmark