[R-sig-ME] How to use mixed-effects models on multinomial data

Jonathan Baron baron at psych.upenn.edu
Thu May 28 16:24:50 CEST 2009

I had already replied to Linda Mortensen, but Emmanuel Charpentier's
reply gives me the courage to say to the whole list roughly what I
said before, plus a little more.

The assumption that the steps 0-1, 1-2, ... 4-5 are equally spaced on
the underlying variable of interest may indeed be incorrect, but so
may the assumption that the difference between 200 and 300 msec in
reaction time is equivalent to the difference between 300 and 400 msec
(etc.).
Failure of the assumptions will lead to some additional error, but, as
argued by Dawes and Corrigan (Psych. Bull., 1974), not much.  (And you
can look at the residuals as a function of the predictions to see how
bad the situation is.)  In general, in my experience (for what that is
worth), you lose far less power by assuming equal spacing than you
lose by using a more "conservative" model that treats the dependent
measure as ordinal only.
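
A minimal sketch of that residual check, on simulated data.  The
thresholds, the predictor name x and the use of plain lm() are all
assumptions made for illustration; with an lmer fit, fitted() and
resid() can be used in the same way.

```r
# Simulated data: a latent continuous variable cut into scores 0..5 at
# unequal thresholds, so the scores are NOT equally spaced on the latent scale.
set.seed(1)
n <- 500
x <- rnorm(n)                                       # hypothetical predictor
latent <- 1.0 * x + rnorm(n)                        # underlying variable of interest
cuts <- c(-Inf, -2.0, -1.2, -0.2, 0.4, 1.8, Inf)    # assumed unequal thresholds
score <- as.integer(cut(latent, breaks = cuts)) - 1L  # observed score, 0..5

# Treat the score as an equally spaced numeric response.
fit <- lm(score ~ x)

# Residuals as a function of the predictions: a roughly flat smooth
# suggests the equal-spacing assumption costs little; clear curvature
# suggests it is doing damage.
pred <- fitted(fit)
res  <- resid(fit)
plot(pred, res, xlab = "Fitted value", ylab = "Residual")
lines(lowess(pred, res), col = "red")

# Numeric summary: mean residual within quintile bins of the fitted values.
bins <- cut(pred, breaks = quantile(pred, probs = seq(0, 1, 0.2)),
            include.lowest = TRUE)
bin_means <- tapply(res, bins, mean)
print(round(bin_means, 3))
```

Bin means that drift systematically away from zero at the low or high
end of the fitted values would be the sign that the spacing matters.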

Occasionally you may have a theoretical reason for NOT treating the
dependent measure as equally spaced (e.g., when doing conjoint
analysis), or for treating it as equally spaced (e.g., when testing
additive factors in reaction time).

In the former sort of case, it might be appropriate to fit a model to
each subject using some other method, then look at the coefficients
across subjects.  (This is what I did routinely before lmer.)
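
A rough sketch of that two-stage approach, on simulated data.  Here
lm() merely stands in for whatever per-subject method is appropriate,
and the variable names (subject, x, y) are hypothetical.

```r
# Simulated data: 20 subjects, 30 trials each, with subject-specific slopes.
set.seed(2)
n_subj <- 20
n_trials <- 30
d <- data.frame(
  subject = factor(rep(seq_len(n_subj), each = n_trials)),
  x = rnorm(n_subj * n_trials)
)
subj_slope <- rnorm(n_subj, mean = 0.5, sd = 0.3)
d$y <- subj_slope[as.integer(d$subject)] * d$x + rnorm(nrow(d))

# Stage 1: fit one model per subject and keep that subject's slope for x.
slopes <- sapply(split(d, d$subject),
                 function(s) coef(lm(y ~ x, data = s))[["x"]])

# Stage 2: look at the coefficients across subjects, here with a
# one-sample t test of the slopes against zero.
tt <- t.test(slopes)
print(tt)
```

Each subject contributes one number to the second stage, so the test
has n_subj - 1 degrees of freedom regardless of the number of trials.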


On 05/28/09 14:35, Emmanuel Charpentier wrote:
> On Wednesday 27 May 2009 at 18:08 +0200, Linda Mortensen wrote:
> > Dear list members,
> >  
> > In the past, I have used the lmer function to model data sets with
> > crossed random effects (i.e., of subjects and items) and with either a
> > continuous response variable (reaction times) or a binary response
> > variable (correct vs. incorrect response). For the reaction time data,
> > I use the formula:
> > lmer(response ~ predictor1 * predictor2 ....  + (1 + predictor1 *
> > predictor2 .... | subject) + (1 + predictor1 * predictor2 .... |
> > item), data)

I think that the second random effect term should be (0 + ...), since
there is already an intercept in the first one.

> > I'm currently working on a data set for which the response variable is
> > number of correct items with accuracy ranging from 0 to 5. So, here
> > the response variable is not binomial but multinomial.

> This approximation may be too rough with only 5 items, though.
> Furthermore, depending on your beliefs on the cognitive model involved
> in giving a "correct" response, the distance between 0 and 1 correct
> response(s) may be close to or very different from the distance between
> 4 and 5 correct responses, which is exactly what a proportional odds
> model (polr) tries to explain away.

Jonathan Baron, Professor of Psychology, University of Pennsylvania
Home page: http://www.sas.upenn.edu/~baron
Editor: Judgment and Decision Making (http://journal.sjdm.org)
