[R-sig-ME] Specify the appropriate model for an Event Related Potentials (ERPs) study: what should I do with trial order (and other terms)

Paolo Canal paolo.canal at iusspavia.it
Tue Nov 8 12:18:05 CET 2016

Dear Mixed-Group,
I have acquired my data from one Experiment using a rather common 
paradigm in psycholinguistics. The experiment aimed at investigating the 
electro-physiological correlates of reading Typical (e.g., /chair/) vs 
Atypical (e.g., /foot rest/) members of a number (N=85) of semantic 
categories (e.g., /a kind of Furniture/). In particular, we were 
interested in looking at differences associated with Education level 
(University N=24 vs non-University students N=23) and with three 
individual-level predictors. My issue is how to deal with some factors 
that clearly improve the fit of the model but make the interpretation 
too "complicated".

The two main factors of interest are thus Typicality (categorical, Typical 
vs Atypical) and Education (categorical, Hi vs Low Education). I already 
know that the choice of treating these factors as dichotomous is 
questionable but, I believe, defensible: in fact, although the measure 
of Typicality is actually continuous (a proportion varying from 0 to 1) 
it is paired within each semantic category, because when we selected the 
materials we took the pair of exemplars that showed the largest 
difference in Typicality, so within each category it is the difference in 
typicality that actually matters. Treating Education as categorical is 
less defensible, but we wanted to compare the predictive power of this 
variable with that of continuous variables representing a set of 
abilities (three cognitive measures, one of which is moderated by years 
of education and age), possibly to show that some brain mechanisms are 
better described when accounting for individual variation rather than 
group differences.

I used lmer in lme4 to analyze the effect of my independent variables on 
the average EEG voltage (continuous) from a set of EEG channels in two 
different time-windows of interest (I know GAMM would be even more 
appropriate than LMM, as what I am dealing with here are time-series, 
but I am not yet ready to try).

I first determined the random effect structure, selecting three grouping 
factors (subject, semantic category and channel) which are clusters of 
repeated measures: for each item I have several subjects, for each 
subject I have several items and for each channel I have several items 
and subjects (perhaps channel might be nested in subject and item rather 
than stand alone, any hints?). For each grouping factor, I allowed 
intercepts to vary (e.g., 1|subject). Moreover, because I wanted to be 
conservative and the data tolerate it (no convergence failures, no 
variance estimates of 0 or 1, no excessively high correlations between 
terms), I included a set of terms to adjust by-subject and by-item 
slopes. I allowed 
by-subject and by-item slope adjustments for Typicality (as it varies 
within subjects and within semantic category) and by-item slope 
adjustments for Education level.
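
In lme4 syntax, the random-effect structure described above could be written roughly as in the sketch below. The data are simulated toy data so that the example runs, and all column names (voltage, typicality, education, subject, category, channel) are placeholder assumptions rather than the actual variable names.

```r
library(lme4)

set.seed(1)
# Toy data standing in for the real data set (all names are assumptions).
d <- expand.grid(subject    = factor(1:16),
                 category   = factor(1:12),
                 typicality = factor(c("typical", "atypical")),
                 channel    = factor(c("Cz", "CPz", "Pz")))
# Education is a between-subject factor: first half "hi", second half "low".
d$education <- factor(ifelse(as.integer(d$subject) <= 8, "hi", "low"))
# Simulated voltage: typicality effect plus subject, category and channel noise.
d$voltage <- 0.5 * (d$typicality == "atypical") +
  rnorm(16)[as.integer(d$subject)] +
  rnorm(12)[as.integer(d$category)] +
  rnorm(3)[as.integer(d$channel)] +
  rnorm(nrow(d))

m_full <- lmer(voltage ~ typicality * education +
                 # by-subject intercept and Typicality slope
                 (1 + typicality | subject) +
                 # by-item intercept, Typicality slope and Education slope
                 (1 + typicality + education | category) +
                 # channel as a stand-alone grouping factor
                 (1 | channel),
               data = d, REML = FALSE)

# The nested alternative asked about above would replace (1 | channel)
# with (1 | subject:channel).
print(VarCorr(m_full))
```

On toy data this small, lmer may report a singular fit; that reflects the simulation, not the structure itself.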

Things get more complicated when thinking of the influence of two 
variables that actually account for a lot of variation in the data: 
frequency of use of words and trial order. The first variable is also 
theoretically important and I want to include it as a fixed effect; the 
second variable improves model fit but, because it makes the results 
less straightforward to interpret, I would prefer not to include it in 
the fixed part of the model.

This brings me to the fixed effect structure and the actual questions to 
the list:

The initial design was very simple (2x2 plus covariates). The strategy 
was to fit the simple model Typicality + Frequency and evaluate whether 
adding the interaction between Education (or the three covariates) and 
Typicality leads to a relevant increase in likelihood, always using the 
same random structure (the complex one).
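
As a minimal sketch of this comparison, assuming simulated toy data and placeholder column names (none of them are the real variable names), the two nested models can be fitted by maximum likelihood and compared with a likelihood-ratio test. Note that typicality * education adds both the Education main effect and the interaction, so the test has 2 degrees of freedom.

```r
library(lme4)

set.seed(2)
# Toy data (all column names are assumptions).
d <- expand.grid(subject    = factor(1:16),
                 category   = factor(1:12),
                 typicality = factor(c("typical", "atypical")))
d$education <- factor(ifelse(as.integer(d$subject) <= 8, "hi", "low"))
# Word frequency is a property of each category-by-typicality item.
item <- interaction(d$category, d$typicality)
d$frequency <- rnorm(nlevels(item))[as.integer(item)]
d$voltage <- 0.5 * (d$typicality == "atypical") + 0.3 * d$frequency +
  rnorm(16)[as.integer(d$subject)] +
  rnorm(12)[as.integer(d$category)] +
  rnorm(nrow(d))

# Same random structure in both models; only the fixed part differs.
# REML = FALSE because likelihoods of models with different fixed effects
# are only comparable under maximum likelihood.
m0 <- lmer(voltage ~ typicality + frequency +
             (1 + typicality | subject) +
             (1 + typicality + education | category),
           data = d, REML = FALSE)
m1 <- lmer(voltage ~ typicality * education + frequency +
             (1 + typicality | subject) +
             (1 + typicality + education | category),
           data = d, REML = FALSE)

cmp <- anova(m0, m1)  # likelihood-ratio test of m1 against m0
print(cmp)
```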

Now I am not so sure this is appropriate and I have a list of doubts:
- Am I allowed to use the same complex random structure to compare the 
likelihood of models that have "simpler" fixed effects? In principle I 
guess it is correct to have the same random structure across comparisons.
- I am not interested in the effect of serial presentation (trial 
order), as it increases the order of the highest interaction. Is it 
appropriate to use it in the random structure only, or should I always 
discuss it in interaction with my factors of interest?
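
To make the second question concrete, the two specifications could be sketched as follows. This is only an illustration on simulated toy data: the column names (trial, trial_z and the rest) are assumptions, and Education, Frequency and channel are left out for brevity.

```r
library(lme4)

set.seed(3)
# Toy data (all column names are assumptions).
d <- expand.grid(subject    = factor(1:16),
                 category   = factor(1:12),
                 typicality = factor(c("typical", "atypical")))
# Random presentation order within each subject, then standardised.
d$trial <- ave(seq_len(nrow(d)), d$subject,
               FUN = function(i) sample(length(i)))
d$trial_z <- as.numeric(scale(d$trial))
# Simulated voltage with a subject-specific drift over trials.
drift <- rnorm(16, mean = -0.3, sd = 0.2)
d$voltage <- 0.5 * (d$typicality == "atypical") +
  drift[as.integer(d$subject)] * d$trial_z +
  rnorm(16)[as.integer(d$subject)] +
  rnorm(12)[as.integer(d$category)] +
  rnorm(nrow(d))

# Option A: trial order only as a by-subject random slope.
# Without a fixed trial_z term, the average slope is constrained to zero.
m_a <- lmer(voltage ~ typicality +
              (1 + typicality + trial_z | subject) +
              (1 + typicality | category),
            data = d, REML = FALSE)

# Option B: trial order as an additive fixed covariate (no interaction
# with the factors of interest) plus the same by-subject random slope.
m_b <- lmer(voltage ~ typicality + trial_z +
              (1 + typicality + trial_z | subject) +
              (1 + typicality | category),
            data = d, REML = FALSE)

print(anova(m_a, m_b))
```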

Thanks for any help
