[R-sig-ME] Specify the appropriate model for an Event Related Potentials (ERPs) study: what should I do with trial order (and other terms)

Thu Nov 10 13:17:20 CET 2016

Dear Paolo,

Using Subject (or Participant, if you want to avoid some ambiguity in a linguistic context) and Item as grouping factors is fairly standard in the ERP literature, as is the use of LMMs with the mean amplitude in a given time window as the dependent variable (at least when mixed models are used; many colleagues seem reluctant to abandon ANOVA despite Clark 1973, Judd, Westfall and Kenny 2012 and many other papers emphasising the advantages of explicit regression over ANOVA). GAMMs over the whole time course of the ERP are still relatively new and not widely used, although people like Harald Baayen and Stefanie Nickels are working on this. Beyond the increased computational complexity, GAMMs also suffer from the whole *additive* bit, which can be addressed, but is difficult for cognitive neuroscience, where the interactions are often the most interesting bits.

I hesitate to use Channel as a grouping variable, although this is the approach taken by e.g. Payne et al (2015), because the distribution of effects for channels is not multivariate normal (the assumed distribution in lme4) for most references. Indeed, we know that Channel effects vary systematically (this is the whole notion of "topography" in EEG), and I personally feel that we should actually model channel effects parametrically using a suitable coordinate system, such as the one that the 10-20 system is actually based on (angular deviations from the apical electrode). However, this is again much more complex. Including Channel as a categorical fixed effect is also not particularly satisfying, as this will add n_chans-1 coefficients for the main effect of channel as well as many interaction terms. You could potentially have regions of interest (ROIs) / topographical factors (left-right, anterior-posterior) in your model fixed effects and then either ignore channel (as is actually done for the traditional rmANOVA analysis of ERP data) or include an intercept-only random-effect term for channel under the assumption that there is a multivariate normal distribution of effects within a given ROI. However, this assumption will generally only hold for high-density setups with topographically small ROIs. Larger ROIs will of course show systematic variation as you move from one edge to another. And you will also run into problems if the number of channels within each ROI is small, as this will bias your random-effect estimates: remember that random effects are *variance* components and, like all variance estimates, they require several observed levels for accurate estimation. (One rule of thumb I've heard is 10ish.) And as Payne et al saw in their own data, the channel factor typically doesn't help with model fit anyway and can hurt convergence, so I would just leave it out completely if you don't want to model it parametrically.
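
As a rough illustration of the ROI / topographical-factor option, here is a minimal base-R sketch that derives a left-right and an anterior-posterior factor from 10-20 style channel labels. The channel list and the three-way anterior/central/posterior split are assumptions made up for the example, not a recommendation for any particular montage; the only convention relied on is that odd suffixes are left, even suffixes are right and "z" is midline.

```r
# 10-20 style labels: the letter codes the region, the suffix the side
# (odd = left hemisphere, even = right hemisphere, "z" = midline).
# This channel list is hypothetical.
channels <- c("F3", "Fz", "F4", "C3", "Cz", "C4", "P3", "Pz", "P4", "O1", "O2")

suffix <- substring(channels, nchar(channels))  # last character of each label
prefix <- sub("[0-9z]+$", "", channels)         # leading letter(s)

laterality <- ifelse(suffix == "z", "midline",
              ifelse(suffix %in% c("1", "3", "5", "7", "9"), "left", "right"))

# Assumed (hypothetical) split along the anterior-posterior axis.
ap_lookup <- c(F = "anterior", C = "central", P = "posterior", O = "posterior")

roi <- data.frame(
  channel          = channels,
  laterality       = laterality,
  anteriority      = unname(ap_lookup[prefix]),
  stringsAsFactors = FALSE
)
print(roi)
```

A lookup table like this could be joined onto the trial-level data by channel (e.g. with merge()), after which laterality and anteriority can enter the fixed effects in place of Channel.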

I'm not sure why you mentioned "semantic category" in your random-effect structure. In my experience, semantic category is typically something for which we care about the individual levels (of which there are not that many in any one experiment) and so is better modelled by fixed effects. (In other words, we care about the differences in processing between Furniture and People, not just that different categories show differences.) Items are good random effects, semantic categories are not. And it's not a problem if each item only belongs to certain semantic categories. lme4 can handle such nesting structures. If you only have a few semantic categories, then you'll also run into computational / statistical trouble with treating them as random effects (see the previous paragraph).

In short, I would propose the following model structure:

mean_voltage ~ 1 + typicality * education * frequency * semantic_category + (1+... | subject) + (1+ ... | item)

Your particular choice of which slopes to include for each random-effect grouping term is a difficult one, as has been highlighted by Baayen et al (2008), Barr et al (2013), Barr (2013) and the recent set of Bates preprints on parsimonious mixed models, as well as a number of threads on this mailing list. Generally, I start off with main effects: if that model converges, great; if not, I reduce further. In my experience with EEG studies on language, interactions in the random-effects structure just lead to overly complex models that take a long time to compute, fail to converge or show other signs of being degenerate. In other words, I would consider the following RE structure for your data:

(1 + typicality + frequency + semantic_category | subject) + (1 + education | item)

I left a lot out of the RE structure for Item because, assuming that each Item represents a single lemma / word, then it doesn't have different frequencies / categories / typicalities and so it doesn't make sense to consider a variable effect for something that is constant within the grouping unit. Similarly for education and subject.
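
To make the structure concrete, here is a minimal sketch of fitting it with lme4::lmer on simulated data. Everything about the data is made up for illustration (24 subjects, 40 items, only two semantic categories, arbitrary effect sizes), and the column names simply mirror the formula above; a singular-fit message is to be expected here because no slope variance is simulated.

```r
library(lme4)

set.seed(1)

# Hypothetical design: education varies between subjects; typicality,
# frequency and semantic category vary between items.
subjects <- data.frame(
  subject   = factor(1:24),
  education = factor(rep(c("high", "low"), each = 12))
)
items <- data.frame(
  item              = factor(1:40),
  typicality        = factor(rep(c("typical", "atypical"), times = 20)),
  semantic_category = factor(rep(c("Furniture", "People"), each = 20)),
  frequency         = as.numeric(scale(rnorm(40)))  # made-up, centred
)

# Fully crossed: every subject sees every item once (Cartesian product).
erp <- merge(subjects, items, by = NULL)

# Simulated response with by-subject and by-item intercepts only.
subj_int <- rnorm(24, sd = 1)
item_int <- rnorm(40, sd = 0.5)
erp$mean_voltage <- -0.8 * (erp$typicality == "atypical") +
  0.3 * erp$frequency +
  subj_int[as.integer(erp$subject)] +
  item_int[as.integer(erp$item)] +
  rnorm(nrow(erp), sd = 2)

# Slopes only for predictors that vary within the grouping unit.
fit <- lmer(
  mean_voltage ~ 1 + typicality * education * frequency * semantic_category +
    (1 + typicality + frequency + semantic_category | subject) +
    (1 + education | item),
  data = erp
)

print(VarCorr(fit))
print(isSingular(fit))  # TRUE would suggest reducing the RE structure
```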

If you don't model semantic category explicitly, then your item random effect should absorb the variance due to it.  You just won't have an explicit term in the model to point to that only describes the effect of semantic category (as item-level variance will cover a whole host of other effects related to the differences between words).

(For posterity -- I think we discussed some of these issues previously on r-help: https://stat.ethz.ch/pipermail/r-help/2015-September/432561.html )

To address some of your explicit questions more directly:

> - Am I allowed to use the same complex random structure to compare the 
> likelihood of models that have "simpler" fixed effects? In principle I 
> guess it is correct to have the same random structure across comparisons.

Not quite. You should not have random slopes for effects that are not in the fixed-effect model structure, because the mixed-model formulation used by lme4 assumes a zero mean for the random effects. In other words, lme4 random effects are estimates of how much the different grouping factors lead to variance around the population-level estimates delivered by the fixed effects.
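
One way to read this in practice is sketched below on simulated data (the design, names and effect sizes are all hypothetical). Both models share a random-effect structure containing only slopes whose fixed effects appear in both models, so there is no by-item education slope because the simpler model has no education term. The models are fitted with ML (REML = FALSE) so that the likelihoods are comparable across different fixed-effect structures.

```r
library(lme4)

set.seed(2)

# Hypothetical crossed design: 24 subjects x 40 items.
subjects <- data.frame(
  subject   = factor(1:24),
  education = factor(rep(c("high", "low"), each = 12))
)
items <- data.frame(
  item       = factor(1:40),
  typicality = factor(rep(c("typical", "atypical"), times = 20)),
  frequency  = as.numeric(scale(rnorm(40)))  # made-up, centred
)
erp <- merge(subjects, items, by = NULL)
erp$mean_voltage <- -0.8 * (erp$typicality == "atypical") +
  0.3 * erp$frequency +
  rnorm(24)[as.integer(erp$subject)] +
  rnorm(40, sd = 0.5)[as.integer(erp$item)] +
  rnorm(nrow(erp), sd = 2)

# Simpler fixed effects: no education term, hence no education slope.
m0 <- lmer(mean_voltage ~ typicality + frequency +
             (1 + typicality | subject) + (1 | item),
           data = erp, REML = FALSE)

# Adds education and its interaction with typicality; same RE structure.
m1 <- lmer(mean_voltage ~ typicality * education + frequency +
             (1 + typicality | subject) + (1 | item),
           data = erp, REML = FALSE)

# Likelihood-ratio test of the two nested models.
lrt <- anova(m0, m1)
print(lrt)
```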

> - I am not interested in the effect of serial presentation (trial 
> order), as it increases the order of the highest interaction. Is it 
> appropriate to use it in the random structure only, or should I always 
> discuss it in interaction with my factors of interest?

No, for the reason above. But you could include the order of serial presentation as a non-interacting / main-effect-only fixed effect. Also, if you did the usual thing and counterbalanced presentation order (e.g. via several different pseudo-random presentation orders/lists) across participants, then the usual assumption is that any effects of presentation order cancel out across participants. The item grouping factor will also absorb some of this variance.
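
A minimal sketch of the main-effect-only option, again on simulated data: trial_order, trial_z and all effect sizes are hypothetical names and values introduced for the example. Trial order is standardised and enters the fixed effects on its own, without interacting with the factors of interest and without a random slope.

```r
library(lme4)

set.seed(3)

# Hypothetical crossed design: 24 subjects x 40 items.
subjects <- data.frame(
  subject   = factor(1:24),
  education = factor(rep(c("high", "low"), each = 12))
)
items <- data.frame(
  item       = factor(1:40),
  typicality = factor(rep(c("typical", "atypical"), times = 20)),
  frequency  = as.numeric(scale(rnorm(40)))  # made-up, centred
)
erp <- merge(subjects, items, by = NULL)

# Each subject gets their own random presentation order (1..40).
erp$trial_order <- ave(seq_len(nrow(erp)), erp$subject,
                       FUN = function(i) sample(length(i)))
erp$trial_z <- as.numeric(scale(erp$trial_order))

erp$mean_voltage <- -0.8 * (erp$typicality == "atypical") +
  0.3 * erp$frequency -
  0.2 * erp$trial_z +
  rnorm(24)[as.integer(erp$subject)] +
  rnorm(40, sd = 0.5)[as.integer(erp$item)] +
  rnorm(nrow(erp), sd = 2)

# Trial order as an additive covariate only.
fit <- lmer(mean_voltage ~ typicality * education + frequency + trial_z +
              (1 + typicality | subject) + (1 | item),
            data = erp)
print(fixef(fit))
```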

Best,
Phillip

> On 8 Nov 2016, at 21:48, Paolo Canal <paolo.canal at iusspavia.it> wrote:
> 
> Dear Mixed-Group,
> I have acquired my data from one Experiment using a rather common 
> paradigm in psycholinguistics. The experiment aimed at investigating the 
> electro-physiological correlates of reading Typical (e.g., /chair/) vs 
> Atypical (e.g., /foot rest/) members of a number (N=85) of semantic 
> categories (e.g., /a kind of Furniture/). In particular, we were 
> interested in looking at differences associated with Education level 
> (University N=24 vs non-University students N=23), and three 
> individual predictors. My issue is how to deal with some factors that 
> are absolutely important in allowing for a better fit of the model, but 
> make interpretations too "complicated".
> 
> The two main factors of interest are thus Typicality (categorical, Typical 
> vs Atypical) and Education (categorical, Hi vs Low Education). I already 
> know that the choice of taking these factors as dichotomic is 
> questionable, but I believe, defensible: in fact, although the measure 
> of Typicality is actually continuous (a proportion varying from 0 to 1) 
> it is paired within each semantic category, because when we selected the 
> materials we took the pair of exemplars that showed the largest 
> difference in Typicality, so within each category is the difference in 
> typicality that actually matters. Treating Education as categorical is 
> less defensible, but in some way we wanted to compare the predictive 
> power of this variable with more continuous variables representing a set 
> of abilities (3 cognitive measures, one of which is moderated by years of 
> education and age), in some way to possibly show that some brain 
> mechanisms are better described when accounting for individual variation 
> rather than group differences.
> 
> I used lmer in lme4 to analyze the effect of my independent variables on 
> the average EEG voltage (continuous) from a set of EEG channels in two 
> different time-windows of interest (I know GAMM would be even more 
> appropriate than LMM, as what I am dealing with here are time-series, 
> but I am not yet ready to try).
> 
> I first determined the random effect structure, selecting three grouping 
> factors (subject, semantic category and channel) which are clusters of 
> repeated measures: for each item I have several subjects, for each 
> subject I have several items and for each channel I have several items 
> and subjects (perhaps channel might be nested in subject and item rather 
> than stand alone, any hints?). For each grouping factor, I allowed 
> intercepts to vary (e.g., 1|subject). Moreover, because I wanted to be 
> conservative and data are rather malleable (no convergence failure, no 
> variance = 0 or 1, not too high correlations between terms) I included a 
> set of terms to adjust by-subject and by-item slopes. I allowed 
> by-subject and by-item slope adjustments for Typicality (as it varies 
> within subjects and within semantic category) and by-item slope 
> adjustments for Education level.
> 
> Things get more complicated when thinking of the influence of two 
> variables that actually account for a lot of variation in the data: 
> frequency of use of words and trial order. The first variable is also 
> theoretically important and I want to include it as fixed effect; the 
> second variable increases models' fit but because it makes the results 
> less straightforward to interpret, I would not like to include in the 
> fixed part of the model.
> 
> This brings me to the fixed effect structure and the actual questions to 
> the list:
> 
> The initial design was very simple (2X2 plus covariates). The strategy 
> was to fit the simple model Typicality + Frequency and evaluate if 
> adding the interaction between Education (or the three covariates) and 
> Typicality leads to a relevant increase in likelihood, always using 
> the same random structure (the complex one).
> 
> Now I am not so sure this is appropriate and I have a list of doubts:
> - Am I allowed to use the same complex random structure to compare the 
> likelihood of models that have "simpler" fixed effects? In principle I 
> guess it is correct to have the same random structure across comparisons.
> - I am not interested in the effect of serial presentation (trial 
> order), as it increases the order of the highest interaction. Is it 
> appropriate to use it in the random structure only, or should I always 
> discuss it in interaction with my factors of interest?
> 
> Thanks for any help
> Paolo