[R-sig-ME] by-item random intercepts

Thu Sep 14 04:24:02 CEST 2017

Er, small correction, I meant that the residual variance enters the
standard error as roughly sqrt(var(resid)/n), not var(resid)/sqrt(n) :p
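
In case concrete numbers help, here is a minimal base-R sketch of that
scaling. Everything in it is an assumption made for illustration: the row
count, the two residual variances, and the helper name resid_se_term (which
is not an lme4 function). It only shows how the corrected term behaves, not
the full standard error from a fitted model.

```r
# Rough contribution of the residual variance to the standard error,
# using the corrected form sqrt(var(resid) / n).
# resid_se_term is a hypothetical helper, not part of lme4.
resid_se_term <- function(var_resid, n) sqrt(var_resid / n)

# Assumed design: 50 participants x 100 items = 5000 rows.
n <- 50 * 100

# Assumed residual variances (arbitrary units):
var_small <- 1.0  # item effects modeled explicitly
var_large <- 4.0  # item variance thrown into the residual

se_small <- resid_se_term(var_small, n)
se_large <- resid_se_term(var_large, n)

# Quadrupling the residual variance only doubles this term,
# and both values stay small when n is in the thousands.
print(c(se_small = se_small, se_large = se_large))
```

With a few thousand rows, even a large increase in var(resid) moves this
term very little in absolute terms, which is the point of the argument below.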

Jake

On Wed, Sep 13, 2017 at 9:21 PM, Jake Westfall <jake.a.westfall at gmail.com>
wrote:

> Hi Chunyun,
>
>> As I mentioned, the stimuli in my experiments are single-digit arithmetic
>> problems. Unlike stimuli such as English words, there are only 100
>> single-digit arithmetic problems for each operation and all of them were
>> included in my experiment.
>
>
> If you've really exhaustively sampled all possible stimuli that could have
> appeared in your study, then I would argue that it doesn't make conceptual
> sense to analyze the stimuli as random effects.
>
>> I could adopt a fixed-effect approach and use 100 dummy variables to
>> account for the item-based clustering but this would be practically
>> impossible.
>
>
> Is it? Have you tried it? Adding fixed effects usually increases the
> computational burden *far* less than adding random effects. So while this
> analysis might be a bit unwieldy, is it actually infeasible?
>
> If the answer is yes, then a reasonable alternative is to simply ignore
> the stimulus effects altogether. Practically speaking, the result is
> usually much the same as explicitly adding stimulus fixed effects to the
> model. The reason is that ignoring the stimulus effects (vs. adding them
> as fixed) mainly just serves to throw the stimulus variance into the
> residual variance, but unless your experiment is quite tiny, the residual
> variance probably already contributes *very* little to the standard errors
> of the fixed effect parameter estimates of interest. (Getting more into the
> mathematical weeds, the residual variance enters the standard error
> *roughly* as var(resid)/sqrt(n), where n is the number of rows -- this term
> is probably already tiny unless your experiment is tiny, and it should
> remain tiny even if you increase var(resid) by a lot.)
>
> Note, however, that the above assumes that the stimulus effects are at
> best weakly correlated with the other regressors. That assumption is likely
> true in an experimental context, but to the extent that it is false,
> omitting the stimulus effects could also alter the other fixed effect
> parameter estimates.
>
>> Should I use a nested structure instead of the crossed one I have mentioned
>> above? For example, if each participant contributed multiple observations
>> on each item, should I nest the by-item random intercept under subject?
>
>
> I don't see why you would do that.
>
> Jake
>
>
> On Wed, Sep 13, 2017 at 9:02 PM, Chunyun Ma <mcypsy at gmail.com> wrote:
>
>> Hello all,
>>
>> I am facing a dilemma of whether or not I should include by-item random
>> intercepts in my model. Here are the details of my problem.
>>
>> I have a repeated-measures dataset in which participants solved
>> single-digit arithmetic problems (e.g., 4x5, 2+7) and their response
>> latencies were recorded.
>>
>> The dependent variable is response latency. The independent variables
>> include characteristics of the stimuli (i.e., level 1) and of the
>> participants (i.e., level 2).
>>
>> I set up the structure of random effects following recommendations from
>> Barr et al. (2013). For simplicity, let's say the model contains one IV.
>>
>> DVti = gamma00 + gamma10*IVti + u0i + u1i*IVti + I0i + rti
>>
>> gamma00, gamma10 are fixed effects
>> u0i is the by-subject random intercept
>> u1i is the by-subject random slope
>> I0i is the by-item random intercept
>> rti is the residual
>>
>> I used lme4 to fit the model:
>> lmer(DV ~ IV + (1 + IV | sub) + (1 | item), data = DT)
>>
>> As I mentioned, the stimuli in my experiments are single-digit arithmetic
>> problems. Unlike stimuli such as English words, there are only 100
>> single-digit arithmetic problems for each operation and all of them were
>> included in my experiment. So here is my dilemma:
>>
>> On the one hand, a random by-item intercept would allow me to account for
>> the fact that there are repeated observations on each item and they are
>> not independent of each other.
>> On the other hand, a random by-item intercept implies that there exist
>> more items that were not included in my experiment. However, this is not
>> the case. I have included all single-digit arithmetic problems in my
>> experiment.
>>
>> I could adopt a fixed-effect approach and use 100 dummy variables to
>> account for the item-based clustering but this would be practically
>> impossible.
>>
>> To reiterate my question:
>> Should I include a random by-item intercept given the special feature of
>> my dataset?
>> A few follow-up questions:
>> What is the consequence of including/excluding this random effect? How are
>> type-I error and power affected?
>> Should I use a nested structure instead of the crossed one I have
>> mentioned above? For example, if each participant contributed multiple
>> observations on each item, should I nest the by-item random intercept
>> under subject?
>>
>> Thank you very much!
>>
>> Chunyun
>>
>> _______________________________________________
>> R-sig-mixed-models at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>
>
>
