[R-sig-ME] psychometric function fitting with lmer?

Fri Oct 29 22:05:31 CEST 2010

The info below is helpful; more comments below.

> -----Original Message-----
> From: mike.lwrnc at gmail.com [mailto:mike.lwrnc at gmail.com] On Behalf Of Mike
> Lawrence
> Sent: Friday, October 29, 2010 3:41 PM
> To: Doran, Harold
> Cc: r-sig-mixed-models at r-project.org
> Subject: Re: [R-sig-ME] psychometric function fitting with lmer?
> 
> On Fri, Oct 29, 2010 at 3:29 PM, Doran, Harold <HDoran at air.org> wrote:
> >   First, I don't know how one uses OLS
> >  to fit a probit model.
> 
> I've seen it done. Folks usually collapse responses to
> means-per-value-on-the-x-axis  then either use a computationally
> intensive search algorithm to minimize the squared error on the
> proportion scale, or fit a simple linear function on the probit scale
> (when they encounter means of 1 or 0, they "tweak" these values by
> either dropping that data entirely or adding/subtracting some
> arbitrary value).

I think you're right; this seems like an inadvisable way to estimate model parameters. I'll try to ignore this part of the email and focus on the info below. I will most likely explode if I try to figure out a) how this was done and b) why someone would do it this way when there are well-known ways to estimate model parameters in such cases.

> Regardless, I suspect we both agree that these are inadvisable ways of
> dealing with this data, but I'm not sure we are on the same page with
> respect to the underlying paradigm motivating the data analysis.
> Whereas the paper you provided appears to be discussing data derived
> from questionnaires with different items, etc, I was thinking (and I
> apologize for failing to be more clear on this earlier) of data
> derived from studies of temporal order judgement and other
> psychophysical discrimination studies. Here's an example that I
> happened to find while searching google for an article not behind a
> pay-wall:
> 
> http://www.psych.ut.ee/~jyri/en/Murd-Kreegipuu-Allik_Perception2009.pdf
> 
> In such studies, individuals are provided two stimuli and asked "which
> one is more X", where the stimuli are manipulated to explore a variety
> of values for the difference of X between them. For example, in
> temporal order judgements, we ask which of two successive stimuli came
> first, right or left, then plot proportion of "right first" responses,
> accumulated over many trials, as a function of the amount of time by
> which the right stimulus led the left stimulus (SOA, or stimulus
> onset asynchrony, where negative values mean the right stimulus
> followed the left stimulus). 

OK, I'm with you so far, though this kind of thing is a bit away from my field of study. So, let me simplify for the sake of argument: we have binary responses at this point, where 1 = respondent answers 'right' and 0 = respondent answers 'left'. You also have some observed characteristics of these individuals; call them x.

Now, you use the terms "likely" and "unlikely" below in something of a colloquial sense, but we can actually quantify this in a model such as:

Pr(1|\theta, \beta) = 1/[1 + exp(\beta - \theta)]

This gives the conditional probability that an individual with \theta (an aptitude of some form) will choose the answer "right", also conditional on \beta, which is a characteristic of the item/task itself. Now, you state that you have other observed characteristics, such as "time". You can further condition on these observed characteristics to get the conditional probabilities directly, which seems to be what you are after. If this is right, then the methods in the paper I linked are directly related to this problem, just based on a different data set. It is a general modeling strategy you can employ with lmer.
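
As a minimal sketch of the conditional probability above, in base R: the helper names p_right and p_right_plogis and the example values are my own illustration, not something taken from the linked paper.

```r
# Rasch-type conditional probability of a "right" (1) response.
# theta: person parameter; beta: item/task parameter (both on the logit scale).
# p_right is a hypothetical helper name used only for illustration.
p_right <- function(theta, beta) {
  1 / (1 + exp(beta - theta))
}

# The same quantity written with R's built-in logistic CDF.
p_right_plogis <- function(theta, beta) plogis(theta - beta)

theta <- c(-2, 0, 2)  # illustrative person values
beta  <- 0            # illustrative task value
cbind(theta, p = p_right(theta, beta))
```

When theta equals beta the probability is 0.5, and it rises as theta - beta grows.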

What I don't understand is which bias or variance you are trying to get at. Unbiasedness is the property that \beta - E[\hat{\beta}] = 0, which would not hold if the parameter estimate were biased. Maybe I am still a bit unclear on the issue.
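
One possible reading, and it is only an assumption on my part: if "bias" and "variability" mean the location and spread of the fitted curve (its 50% point and the inverse of its slope) rather than properties of an estimator, then a probit fit with intercept b0 and slope b1 on soa can be re-expressed on that scale. The sketch below uses made-up coefficients (b0, b1 and the millisecond units are illustrative, not estimates from any data) to show the algebra: pnorm(b0 + b1 * soa) equals pnorm((soa - pss) / sigma) with pss = -b0 / b1 and sigma = 1 / b1.

```r
# Hypothetical probit-scale coefficients for one participant/condition
# (illustrative numbers only, not estimates from any data set).
b0 <- -0.3   # intercept on the probit scale
b1 <- 0.02   # slope per millisecond of SOA (units assumed)

# Re-express pnorm(b0 + b1 * soa) as pnorm((soa - pss) / sigma).
pss   <- -b0 / b1  # SOA at which P("right first") = 0.5
sigma <- 1 / b1    # spread of the underlying Gaussian; larger = less sensitive

soa      <- seq(-100, 100, by = 25)
p_linear <- pnorm(b0 + b1 * soa)         # linear-predictor form
p_repar  <- pnorm((soa - pss) / sigma)   # location/spread form

cbind(soa, p_linear, p_repar)
```

With group and cue in the model, the intercept and soa slope for each cell would first have to be assembled from the relevant coefficients before applying the same two formulas.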

> This typically yields a sigmoidal
> function where people are unlikely to say "right-first" when the left
> stimulus leads by a lot (large negative SOA values) and very likely to
> say "right-first" when the right stimulus leads by a lot (large
> positive SOA values). The place where this function crosses 50% is
> termed the point of subjective simultaneity (PSS) and the slope of the
> function indexes the participants' sensitivity (shallow slopes
> indicate poor sensitivity, sharp slopes indicate good sensitivity).
> Researchers are often then interested in how various experimental
> manipulations affect these two characteristics of performance.

Now, if the slope of the curve matters, and it often does, then lmer cannot be used to estimate such a model, because the model we demonstrate (the Rasch model) assumes all items/tasks have a constant slope. But other models extend the conditional probability above and can do this. I believe you can accomplish this using the LTM package.
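
To make the "extended" conditional probability concrete, here is a minimal base-R sketch of the usual two-parameter logistic form, in which a slope parameter a multiplies (theta - beta). The names p_2pl and slope_at_midpoint and all numeric values are illustrative assumptions; this only evaluates the curve and does not fit anything.

```r
# Two-parameter logistic (2PL) extension of the Rasch form.
# a is a slope (discrimination) parameter; a = 1 recovers the
# constant-slope Rasch probability 1 / (1 + exp(beta - theta)).
# p_2pl is a hypothetical helper name used only for illustration.
p_2pl <- function(theta, beta, a = 1) {
  1 / (1 + exp(-a * (theta - beta)))
}

theta   <- seq(-3, 3, by = 0.5)
shallow <- p_2pl(theta, beta = 0, a = 0.5)  # poor sensitivity
steep   <- p_2pl(theta, beta = 0, a = 2)    # good sensitivity

# The curve crosses 50% at theta == beta, and its slope there is a / 4.
slope_at_midpoint <- function(a) a / 4

cbind(theta, shallow, steep)
```

In this form beta fixes where the curve crosses 50% and a fixes how sharply it rises there, which are the two features the quoted message cares about.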

> 
> 
> > Second,
> >  why are you treating the observed data as a parameter estimate? Why don't you
> >  actually estimate the model parameters (i.e., the item parameters), which are
> >  asymptotically unbiased under certain estimation conditions. You can do this in a number of
> >  ways in R, lme4 can do this using lmer as described here:
> >
> >  http://www.jstatsoft.org/v20/i02
> >
> >  Or you can use JML methods for Rasch in the MiscPsycho package or you can use
> >  MML methods in the LTM package. What you seem to be doing is treating the eICC
> >  as some kind of parameter for the item; but this is not reasonable I don't
> >  think.
> >
> >> This
> >> > fitting is typically done within each individual and condition of
> >> > interest separately, then the resulting parameters are submitted to 2
> >> > ANOVAs: one for bias, one for variability. I wonder if this analysis
> >> > might be achieved more efficiently using a single mixed effects model,
> >> > but I'm having trouble figuring out how to approach coding this.
> >>
> >
> >
> >  I'm not sure I can help you here as I am unclear on what you are doing
> >  exactly. Maybe if we elaborate a bit on what you are trying to do above, we
> >  can do this part next.
> >
> >>
> >> Below
> >> > is an example of data similar to that collected in this sort of
> >> > research, where individuals fall into two groups (variable "group"),
> >> > and are tested under two conditions (variable "cue") across a set of
> >> > values from a continuous variable (variable "soa"), with each cue*soa
> >> > combination tested repeatedly within each individual. A model like
> >> >
> >> > fit = lmer(
> >> >     formula = response ~ (1|id) + group*cue*soa
> >> >     , family = binomial( link='probit' )
> >> >     , data = a
> >> > )
> >> >
> >> > employs the probit link, but of course yields estimates for the slope
> >> > and intercept of a linear model on the probit scale, and I'm not sure
> >> > how (if it's even possible) to convert the conclusions drawn on this
> >> > scale to conclusions about the bias and variability parameters of
> >> > interest.
> >> >
> >> > Thoughts?
> >> >
> >
> >> > _______________________________________________
> >> > R-sig-mixed-models at r-project.org mailing list
> >> > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> >