I'm hardly an expert here, but if I may awkwardly contribute a few things...
> On Wednesday, 27 May 2009 at 18:08 +0200, Linda Mortensen wrote:
> > Dear list members,
> >
> > In the past, I have used the lmer function to model data sets with
> > crossed random effects (i.e., of subjects and items) and with either a
> > continuous response variable (reaction times) or a binary response
> > variable (correct vs. incorrect response). For the reaction time data,
> > I use the formula:
> > lmer(response ~ predictor1 * predictor2 .... + (1 + predictor1 *
> > predictor2 .... | subject) + (1 + predictor1 * predictor2 .... |
> > item), data)
> > And for the binomial data, I use the formula:
> > lmer(response ~ predictor1 * predictor2 .... + (1 + predictor1 *
> > predictor2 .... | subject) + (1 + predictor1 * predictor2 .... |
> > item), data, family="binomial").
> >
> > I'm currently working on a data set for which the response variable is
> > number of correct items with accuracy ranging from 0 to 5. So, here
> > the response variable is not binomial but multinomial.
You didn't ask about this, but are you sure you want the full set of
interactions among all your predictors? I'd expect that to overfit, and
you won't be able to tell which terms are driving the regression. I
think you want to start with just the linear terms and test the addition
of theoretically interesting interactions using anova() on the fitted
models. The lmer output won't give you useful significance information
on the fixed effects from a single fit alone...
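For instance, here's a minimal sketch of that model-comparison strategy.
The data frame and predictor names (d, predictor1, predictor2) are
hypothetical stand-ins, simulated here just so the code runs:

```r
library(lme4)  # assumes lme4 is installed

## Simulated stand-in data (hypothetical names, illustration only)
set.seed(1)
d <- expand.grid(subject = factor(1:10), item = factor(1:20))
d$predictor1 <- rnorm(nrow(d))
d$predictor2 <- rnorm(nrow(d))
d$response   <- 500 + 20 * d$predictor1 + rnorm(nrow(d), sd = 30)

## Fit with ML (REML = FALSE) so the likelihood-ratio test is valid
m0 <- lmer(response ~ predictor1 + predictor2
           + (1 | subject) + (1 | item), data = d, REML = FALSE)
m1 <- lmer(response ~ predictor1 * predictor2
           + (1 | subject) + (1 | item), data = d, REML = FALSE)

anova(m0, m1)  # likelihood-ratio test of the added interaction term
```

The point of nesting the models this way is that the chi-square from
anova() tells you whether the interaction earns its keep, which the
t-values in a single summary() won't.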
As for the response variable, each "number of correct responses" is just
the sum of 5 simple binomial responses to the same item, right? (Not a
multinomial.) If you still have the actual individual trials handy in
your data file, you should use those, instead of summing them before
calling lmer. The family="binomial" setting assumes each row is a single
trial, not an aggregate of trials (a sum or average). I learned this one
by frustrating trial and error...
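Concretely, the trial-level layout looks something like this (simulated
data with hypothetical names; the thread uses the 2009-era call
lmer(..., family = "binomial"), which in current lme4 is spelled
glmer()):

```r
library(lme4)  # assumes lme4 is installed

## Simulated trial-level data: one row per individual trial, with a
## 0/1 `correct` response (hypothetical names, illustration only).
set.seed(1)
d <- expand.grid(subject = factor(1:10),
                 item    = factor(1:20),
                 trial   = 1:5)
d$predictor1 <- rnorm(nrow(d))
d$correct <- rbinom(nrow(d), size = 1,
                    prob = plogis(0.5 * d$predictor1))

## Each row is one trial, so the binomial family applies directly;
## don't collapse the 5 trials into a 0-5 accuracy score first.
m <- glmer(correct ~ predictor1 + (1 | subject) + (1 | item),
           data = d, family = binomial)
```

With 5 trials per subject-item cell, this data frame has five rows
where the summed version would have one, and the 0/1 column is what
the binomial family expects.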
> Treating it as a "pure class" variable loses the (essential) ordering
> information. Unless this ordering information (which seems to an
> ignorant outsider to be the most important information about your
> subjects) is essentially irrelevant to your problem, I'd rather use
> your number of correct items as a "rough" measure of a numeric
> variable, and accept, as a first approximation, its non-continuity as
> part of the experimental error.
Do you mean the sequential order of the 5 trials making up the set of
responses? That is, the fact that it's a repeated measure? Or do you mean
the fact that it's not a continuous measure, but instead is the sum of
individual responses from (presumably) a fixed distribution? It seems as if
the first issue is not important, as Dr. Mortensen is collapsing across
trials. (If it were important, trial would be a predictor.) And the second
issue should be taken care of by use of family=binomial and the implicit
logit transformation, right? This is not the same thing as a true
multinomial response, such as a Likert scale, where the difference
between a 1 and a 2 may really differ from the difference between a 2
and a 3. Many psychological models of choice assume an underlying scalar
response probability that is sampled multiple times. Using the binomial
option to lmer lets you recover this underlying probability. (Right?)
And the interactions are properly scaled, so you don't get flaky
significance due to ceiling/floor effects in accuracy.
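To make the ceiling-effect point concrete: the logit link stretches the
probability scale out near 0 and 1, so a constant effect on the log-odds
scale corresponds to a shrinking effect in raw accuracy near ceiling
(base R, no extra packages needed):

```r
## A fixed +1 shift on the logit (log-odds) scale:
plogis(qlogis(0.50) + 1)  # ~0.73: big accuracy change near 0.5
plogis(qlogis(0.95) + 1)  # ~0.98: small accuracy change near ceiling
```

That's why a real interaction can look spuriously present (or absent) in
raw accuracy when one condition sits near ceiling, but is properly
scaled on the logit scale.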
-Harlan
--
Harlan D. Harris
New York University
Department of Psychology