[R-sig-ME] by-item random intercepts

Jake Westfall jake.a.westfall at gmail.com
Thu Sep 14 04:21:44 CEST 2017


Hi Chunyun,

> As I mentioned, the stimuli in my experiments are single-digit arithmetic
> problems. Unlike stimuli such as English words, there are only 100
> single-digit arithmetic problems for each operation and all of them were
> included in my experiment.


If you've really exhaustively sampled all possible stimuli that could have
appeared in your study, then I would argue that it doesn't make conceptual
sense to analyze the stimuli as random effects.

> I could adopt a fixed-effect approach and use 100 dummy variables to
> account for the item-based clustering but this would be practically
> impossible.


Is it? Have you tried it? Adding fixed effects usually increases the
computational burden *far* less than adding random effects. So while this
analysis might be a bit unwieldy, is it actually infeasible?

If the answer is yes, then a reasonable alternative is to simply ignore the
stimulus effects altogether. Practically speaking, the result is usually
much the same as explicitly adding stimulus fixed effects to the model. The
reason is that ignoring the stimulus effects (vs. adding them as fixed)
mainly just serves to throw the stimulus variance into the residual
variance, but unless your experiment is quite tiny, the residual variance
probably already contributes *very* little to the standard errors of the
fixed effect parameter estimates of interest. (Getting more into the
mathematical weeds, the residual variance enters the squared standard
error *roughly* as var(resid)/n, where n is the number of rows -- this
term is probably already tiny unless your experiment is tiny, and it
should remain tiny even if you increase var(resid) by a lot.)
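
As a rough sanity check on that claim, you could fit the model with the
items as fixed effects and with the items ignored entirely, and compare
the standard error of the IV estimate directly (same sketch-level
assumptions and variable names as above):

library(lme4)

# item as fixed effects vs. item ignored entirely
m_fixed  <- lmer(DV ~ IV + factor(item) + (1 + IV | sub), data = DT)
m_ignore <- lmer(DV ~ IV + (1 + IV | sub), data = DT)

# the residual SD should be larger in m_ignore (it absorbs the item
# variance), but the SE of the IV estimate should barely move
c(sigma_fixed = sigma(m_fixed), sigma_ignore = sigma(m_ignore))
c(se_fixed  = summary(m_fixed)$coefficients["IV", "Std. Error"],
  se_ignore = summary(m_ignore)$coefficients["IV", "Std. Error"])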

Note, however, that the above assumes the stimulus effects are at most
weakly correlated with the other regressors. That assumption is likely to
hold in an experimental context, but to the extent that it is false,
omitting the stimulus effects can also bias the other fixed effect
parameter estimates.
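
Here is a toy simulation of that caveat (all names and numbers invented
for illustration; it drops the subject effects and uses plain lm() just
to isolate the confounding):

set.seed(1)
n_item <- 100
n_sub  <- 30
dat <- expand.grid(item = 1:n_item, sub = 1:n_sub)

item_eff <- rnorm(n_item)
# a regressor whose item means track the item effects, i.e. the stimulus
# effects are NOT just weakly correlated with the regressor
dat$IV <- 0.5 * item_eff[dat$item] + rnorm(nrow(dat))
dat$DV <- 1 + 0.3 * dat$IV + item_eff[dat$item] + rnorm(nrow(dat))

coef(lm(DV ~ IV, data = dat))["IV"]                # well above the true 0.3
coef(lm(DV ~ IV + factor(item), data = dat))["IV"] # close to 0.3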

> Should I use a nested structure instead of the crossed one I have mentioned
> above? For example, if each participant contributed multiple observations
> on each item, should I nest the by-item random intercept under subject?


I don't see why you would do that.

Jake


On Wed, Sep 13, 2017 at 9:02 PM, Chunyun Ma <mcypsy at gmail.com> wrote:

> Hello all,
>
> I am facing a dilemma of whether or not I should include by-item random
> intercepts in my model. Here are the details of my problem.
>
> I have a dataset of repeated measures in which participants solved
> single-digit arithmetic problems (e.g., 4x5, 2+7) and their response
> latencies were recorded.
>
> The dependent variable is response latency. The independent variables
> include characteristics of the stimuli (i.e., level 1) and of the
> participants (i.e., level 2).
>
> I set up the structure of random effects following recommendations from
> Barr et al. (2013). For simplicity, let's say the model contains one IV.
>
> DV_ti = gamma_00 + gamma_10 * IV_ti + u_0i + u_1i * IV_ti + I_0i + r_ti
>
> gamma_00, gamma_10 are fixed effects
> u_0i is the by-subject random intercept
> u_1i is the by-subject random slope
> I_0i is the by-item random intercept
> r_ti is the residual
>
> I used lme4 to test the model
> lmer(DV ~ IV + (1 + IV | sub) + (1 | item), data = DT)
>
> As I mentioned, the stimuli in my experiments are single-digit arithmetic
> problems. Unlike stimuli such as English words, there are only 100
> single-digit arithmetic problems for each operation and all of them were
> included in my experiment. So here is my dilemma:
>
> On one hand, a random by-item intercept would allow me to account for the
> fact that there are repeated observations on each item and they are not
> independent of each other.
> On the other hand, a random by-item intercept implies that there exist more
> items which were not included in my experiment. However, this is not the
> case: I have included all single-digit arithmetic problems in my experiment.
>
> I could adopt a fixed-effect approach and use 100 dummy variables to
> account for the item-based clustering but this would be practically
> impossible.
>
> To reiterate my question:
> Should I include a random by-item intercept given this special feature of
> my dataset?
> A few follow-up questions:
> What is the consequence of including or excluding this random effect? How
> are Type I error and power affected?
> Should I use a nested structure instead of the crossed one I have mentioned
> above? For example, if each participant contributed multiple observations
> on each item, should I nest the by-item random intercept under subject?
>
> Thank you very much!
>
> Chunyun
>
