[R-sig-ME] Is it ok to use lmer() for an ordered categorical (5 levels) response variable?

Nicolas Deguines
Wed Mar 6 15:00:32 CET 2019

Hello Phillip and all,

Thanks a lot Phillip for your very interesting and useful answer, and for
the paper from Liddell & Kruschke. It helps a lot.

About trying other link and threshold functions in clmm: no huge difference
in my case unfortunately. I tried different combinations of each.
'equidistant' did do better, but the improvement was far from enough.
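
For anyone wanting to run the same kind of comparison, here is a minimal
sketch of looping over link and threshold choices in clmm(). The data are
simulated and the names (d, participant, year, score) are hypothetical,
not the variables of the actual study; AIC is used only as one convenient
way to put the fits side by side.

```r
library(ordinal)  # provides clmm()

set.seed(1)
# Hypothetical data: 100 participants, 4 yearly records each
d <- data.frame(
  participant = factor(rep(1:100, each = 4)),
  year        = rep(0:3, times = 100)
)
u      <- rnorm(100)[as.integer(d$participant)]   # participant intercepts
latent <- 0.3 * d$year + u + rlogis(nrow(d))      # latent continuous score
d$score <- cut(latent, breaks = c(-Inf, -1, 0, 1, 2, Inf),
               labels = 0:4, ordered_result = TRUE)

# All combinations of two links and two threshold structures
grid <- expand.grid(link      = c("logit", "probit"),
                    threshold = c("flexible", "equidistant"),
                    stringsAsFactors = FALSE)
grid$AIC <- NA_real_

for (i in seq_len(nrow(grid))) {
  fit <- clmm(score ~ year + (1 | participant), data = d,
              link = grid$link[i], threshold = grid$threshold[i])
  grid$AIC[i] <- AIC(fit)
}

grid[order(grid$AIC), ]  # lower AIC = better trade-off of fit and complexity
```

summary() on each fit also reports the Hessian condition number, which is
the quantity that was problematic in my case.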

I computed density plots for my response variable as observed and as
predicted from my lmer() model (similar to what Liddell and Kruschke do in
Figure 6): the linear mixed model fits the data pretty well.
=> so I'd be inclined to trust the results from my lmer() models in the
present case (but Liddell and Kruschke did show very clear cases where a
linear model fit the ordinal data very poorly).
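
A rough sketch of this kind of observed-versus-predicted check, for
anyone who wants to reproduce it. Everything here is an assumption for
illustration: the data are simulated, the names (d, participant, year,
score) are hypothetical, and rounding simulated values to the nearest
level in 0..4 is just one simple way to put the continuous predictions
back on the ordinal scale (it is not necessarily what Liddell and
Kruschke do).

```r
library(lme4)  # provides lmer() and a simulate() method for fitted models

set.seed(2)
# Hypothetical data: 100 participants, 4 yearly records each
d <- data.frame(
  participant = factor(rep(1:100, each = 4)),
  year        = rep(0:3, times = 100)
)
u <- rnorm(100)[as.integer(d$participant)]        # participant intercepts
d$score <- pmin(4, pmax(0, round(2 + 0.2 * d$year + u + rnorm(nrow(d)))))

fit <- lmer(score ~ year + (1 | participant), data = d)

# Simulate 200 new response vectors from the fitted model,
# then round and clamp them to the observed 0..4 scale
sims       <- simulate(fit, nsim = 200)
sim_scores <- pmin(4, pmax(0, round(unlist(sims))))

# Proportion of records at each level: observed vs. model-simulated
obs_prop <- prop.table(table(factor(d$score,    levels = 0:4)))
sim_prop <- prop.table(table(factor(sim_scores, levels = 0:4)))
round(rbind(observed = obs_prop, simulated = sim_prop), 3)
```

Large gaps between the two rows at any level would suggest that the
linear model misrepresents the ordinal response.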

Meanwhile, I thought of another alternative for analyzing this response
variable and I would be curious to read what people may think about it.
Before presenting that alternative, I need to say more about that 5-level
response variable.
It is a score built by Muratet and Fontaine (2015)* to assess the
naturalness of a given private backyard (it is shown to be correlated with
higher abundance of butterflies).
In the backyard: fallow area, nettles (*Urtica dioica*), ivy (*Hedera helix*),
and brambles (*Rubus spp.*) are each scored one if present, and the
naturalness index was computed as the sum of these scores.
=> it results in a 5-level ordinal variable because it can go from 0 to 4,
and each increase of 1 means a backyard with more features of 'naturalness'.
I thus wonder whether this could be modelled using a glmer() with family =
binomial, feeding the model two columns: cbind(sum of 1's, sum of
0's) (see the R documentation for family{stats}, in the Details: "*As a
two-column integer matrix: the first column gives the number of successes
and the second the number of failures.*")
I will try it and see how well the model fits the data. But I would be
interested in getting a theoretical opinion.
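
To make the proposal concrete, here is a minimal sketch of the two-column
binomial specification. The data are simulated and the names (d,
participant, year, score) are hypothetical; the score is treated as the
number of features present out of 4.

```r
library(lme4)  # provides glmer()

set.seed(3)
# Hypothetical data: 100 participants, 4 yearly records each
d <- data.frame(
  participant = factor(rep(1:100, each = 4)),
  year        = rep(0:3, times = 100)
)
u   <- rnorm(100, sd = 0.8)[as.integer(d$participant)]  # participant intercepts
eta <- -0.5 + 0.2 * d$year + u                          # linear predictor (logit)
d$score <- rbinom(nrow(d), size = 4, prob = plogis(eta)) # features present, 0..4

# Response as cbind(successes, failures): features present vs. absent
fit <- glmer(cbind(score, 4 - score) ~ year + (1 | participant),
             data = d, family = binomial)

fixef(fit)  # fixed effects on the logit scale
```

Note that this specification treats the four features as four trials
sharing a single presence probability per record, which is an assumption
that may or may not hold for fallow area, nettles, ivy, and brambles.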

I hope this can help others too.

Best regards,
Nicolas Deguines


Postdoctoral Research Associate
Laboratoire Ecologie, Systématique et Evolution
Université Paris Sud, Orsay, France
Website: http://nicolasdeguines.weebly.com/

On Tue, 5 Mar 2019 at 13:04, Phillip Alday <phillip.alday using mpi.nl> wrote:

> Hi Nicolas,
> How much you can get away with bending the assumptions depends in some ways
> on how well the resulting model fits your data. If the resulting model
> is a poor fit, then it's not a great model for performing inference. The
> other problem with bending assumptions is that a lot of 'error
> statistics' (standard errors, t-values, and basically anything related
> to significance testing) aren't guaranteed to do what they are supposed
> to do. (In your case, the good behavior of your residuals suggests that
> this won't be a huge problem, but there are no promises.)
> You can get around this a bit by doing things like cross-validation or
> other inferential steps based on how well the model generalizes to /
> predicts new data instead of significance testing of coefficients or
> linear hypotheses.
> John Kruschke has written about this issue at some length and seems
> convinced that it's (almost) always a bad idea to bend the
> metric/continuous assumption when dealing with ordinal data:
> http://doingbayesiandataanalysis.blogspot.com/2017/12/which-movie-is-rated-better-dont-treat.html
> http://doingbayesiandataanalysis.blogspot.com/2018/09/analyzing-ordinal-data-with-metric.html
> The latter is largely a link/"press release" for the associated paper:
> Liddell, T. M., & Kruschke, J. K. (2018). Analyzing ordinal data with
> metric models: What could possibly go wrong? Journal of Experimental
> Social Psychology, 79, 328–348. doi:10.1016/j.jesp.2018.08.009
> Finally, have you tried other link and threshold functions in clmm?
> Those can make a huge difference!
> Phillip
> On 5/3/19 11:00 am, Nicolas Deguines wrote:
> > Hello everyone,
> >
> > I am investigating how engagement in a citizen science program can change
> > participants' behavior in terms of implementing gardening techniques
> > benefitting biodiversity.
> > There are 2362 participants, distributed into 7 cohorts (= year in which
> > they joined the program), and I have repeated gardening technique
> > information for multiple years for each participant.
> > So I need to use mixed modeling.
> >
> > One of the response variables is a score that can take 5 values: 0, 1, 2,
> > 3, or 4. It's ordered, it's not continuous (there are 5 levels).
> > I would analyze this with a cumulative link mixed model (using clmm() from
> > the ordinal package), but the Hessian condition number I obtained with
> > such a model is 5.0e+06, i.e. the assumption is violated (simplifying my
> > initial full model did not help at all).
> >
> > As an alternative, I am wondering if I could treat this response variable
> > as a continuous one in an lmer() model.
> > When I do:
> > - Normality of model residuals is nicely met
> > - Homoscedasticity of model residuals is met as well.
> > => is meeting these two assumptions enough to validate the use of an
> > lmer() model for an ordered categorical response variable?
> >
> > In one of Douglas Bates' presentations (slide 3 of Jan. 2011, Madison:
> > http://lme4.r-forge.r-project.org/slides/2011-01-11-Madison/5GLMM.pdf), it
> > is said that
> > "When using LMMs we assume that the response being modeled is on a
> > continuous scale.
> > Sometimes we can bend this assumption a bit if the response is an ordinal
> > response with a moderate to large number of levels.
> > For example, [...a response variable taking] integer values on the scale of
> > 1 to 10."
> > => is 5 levels too few to be treated as continuous? Or would it be ok given
> > that the residuals behave nicely?
> >
> > I would appreciate any help and thoughts on this.
> > I checked that this was not treated in a previous post and I hope I did not
> > miss it (sorry if I did).
> >
> > Best,
> > Nicolas Deguines
> > ----------------------------------
> > Postdoctoral Research Associate
> > Laboratoire Ecologie, Systématique et Evolution
> > Université Paris Sud, Orsay, France
> > Website: http://nicolasdeguines.weebly.com/
> >
> > _______________________________________________
> > R-sig-mixed-models using r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> >
