[R-sig-ME] Is it ok to use lmer() for an ordered categorical (5 levels) response variable?

Phillip Alday
Tue Mar 5 13:04:04 CET 2019

Hi Nicolas,

How much you can get away with when bending the assumptions depends in some ways
on how well the resulting model fits your data. If the resulting model
is a poor fit, then it's not a great model for performing inference. The
other problem with bending assumptions is that a lot of 'error
statistics' (standard errors, t-values, and basically anything related
to significance testing) aren't guaranteed to do what they are supposed
to do. (In your case, the good behavior of your residuals suggests that
this won't be a huge problem, but there are no promises.)

You can get around this a bit by doing things like cross-validation or
other inferential steps based on how well the model generalizes to /
predicts new data instead of significance testing of coefficients or
linear hypotheses.
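
As a rough illustration of the cross-validation idea, here is a minimal
sketch. It runs on simulated data, not on your data, and every name in it
(d, score, year, id, cv_rmse) is an assumption made up for the example.
Whole participants are held out in each fold, and the out-of-sample RMSE of
an lmer() model with a year effect is compared to one without it:

```r
# Sketch only: simulated data; all object names are illustrative.
library(lme4)

set.seed(1)
n_id <- 200
d <- expand.grid(id = factor(seq_len(n_id)), year = 0:3)
u <- rnorm(n_id)                                  # participant intercepts
latent <- 0.4 * d$year + u[as.integer(d$id)] + rlogis(nrow(d))
d$score <- findInterval(latent, c(-1, 0, 1, 2))   # ordinal score 0..4

# Assign whole participants to folds so that test participants are new
k <- 5
fold_of_id <- sample(rep(seq_len(k), length.out = n_id))
d$fold <- fold_of_id[as.integer(d$id)]

cv_rmse <- function(form) {
  sapply(seq_len(k), function(f) {
    train <- droplevels(d[d$fold != f, ])
    test  <- d[d$fold == f, ]
    fit   <- lmer(form, data = train)
    # re.form = NA gives population-level predictions (no random effects),
    # which is all that is available for participants not seen in training
    pred  <- predict(fit, newdata = test, re.form = NA)
    sqrt(mean((test$score - pred)^2))
  })
}

rmse_year <- cv_rmse(score ~ year + (1 | id))
rmse_null <- cv_rmse(score ~ 1 + (1 | id))
c(year = mean(rmse_year), null = mean(rmse_null))
```

If the model with the predictor of interest does not predict held-out
participants any better than the null model, that is informative in a way
that does not lean on the t-values.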

John Kruschke has written about this issue at some length and seems
convinced that it's (almost) always a bad idea to bend the
metric/continuous assumption when dealing with ordinal data:



The latter is largely a link/"press release" for the associated paper:

Liddell, T. M., & Kruschke, J. K. (2018). Analyzing ordinal data with
metric models: What could possibly go wrong? Journal of Experimental
Social Psychology, 79, 328–348. doi:10.1016/j.jesp.2018.08.009

Finally, have you tried other link and threshold functions in clmm?
Those can make a huge difference!
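
A minimal sketch of what trying these could look like, again on simulated
data with made-up names (d, score, year, id, specs), so treat it as an
illustration and not as a recipe for your model. It refits the same clmm()
under a few link/threshold combinations and ranks them by AIC:

```r
# Sketch only: simulated data; all object names are illustrative.
library(ordinal)

set.seed(1)
n_id <- 200
d <- expand.grid(id = factor(seq_len(n_id)), year = 0:3)
u <- rnorm(n_id)                                  # participant intercepts
latent <- 0.4 * d$year + u[as.integer(d$id)] + rlogis(nrow(d))
d$score <- factor(findInterval(latent, c(-1, 0, 1, 2)),
                  levels = 0:4, ordered = TRUE)   # clmm() needs a factor

# A small grid; clmm() also accepts links "cloglog", "loglog", "cauchit"
# and thresholds "symmetric", "symmetric2"
specs <- expand.grid(link = c("logit", "probit"),
                     threshold = c("flexible", "equidistant"),
                     stringsAsFactors = FALSE)
specs$AIC <- NA_real_

for (i in seq_len(nrow(specs))) {
  fit <- clmm(score ~ year + (1 | id), data = d,
              link = specs$link[i], threshold = specs$threshold[i])
  specs$AIC[i] <- AIC(fit)
  # summary(fit) also reports cond.H, the Hessian condition number
}

specs[order(specs$AIC), ]
```

Structured thresholds such as "equidistant" estimate fewer parameters than
"flexible", which is one reason they can sometimes help when a model is
poorly identified.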


On 5/3/19 11:00 am, Nicolas Deguines wrote:
> Hello everyone,
> I am investigating how engagement in a citizen science program can change
> participants' behavior in terms of implementing gardening techniques
> benefitting biodiversity.
> There are 2362 participants, distributed into 7 cohorts (= year in which
> they joined the program), and I have repeated gardening technique
> information for multiple years for each participant.
> So I need to use mixed modeling.
> One of the response variables is a score that can take 5 values: 0, 1, 2,
> 3, or 4. It's ordered, it's not continuous (there are 5 levels).
> I would analyze this with a cumulative link mixed model (using clmm() from
> the ordinal package), but the Hessian condition number I obtained with such
> a model is > 5.0e+06, i.e. the assumption is violated (simplifying my
> initial full model did not help at all).
> As an alternative, I am wondering if I could treat this response variable
> as a continuous one in an lmer() model.
> When I do:
> - Normality of model residuals is nicely met
> - Homoscedasticity of model residuals is met as well.
> => is meeting these two assumptions enough to validate the use of an
> lmer() model for an ordered categorical response variable?
> In one of Douglas Bates' presentations (slide 3 of Jan. 2011, Madison:
> http://lme4.r-forge.r-project.org/slides/2011-01-11-Madison/5GLMM.pdf), it
> is said that
> "When using LMMs we assume that the response being modeled is on a
> continuous scale.
> Sometimes we can bend this assumption a bit if the response is an ordinal
> response with a moderate to large number of levels.
> For example, [...a response variable taking] integer values on the scale of
> 1 to 10."
> => is 5 levels too few to be treated as continuous? Or would it be ok given
> residuals behave nicely?
> I would appreciate any help and thoughts on this.
> I checked that this was not treated in a previous post and I hope I did not
> miss it (sorry if I did).
> Best,
> Nicolas Deguines
> ----------------------------------
> Postdoctoral Research Associate
> Laboratoire Ecologie, Systématique et Evolution
> Université Paris Sud, Orsay, France
> Website: http://nicolasdeguines.weebly.com/