[R] Appropriate specification of random effects structure for EEG/ERP data: including Channels or not?

Thu Sep 24 15:12:45 CEST 2015

There is actually a fair amount of ERP literature using mixed-effects
modelling, though you may have to branch out from the traditional
psycholinguistics journals a bit (even just more "neurolinguistics" or
language studies published in "psychology" would get you more!). But
even within the traditional psycholinguistics journals there is a wealth
of literature; see for example the 2008 special issue on mixed models of
the Journal of Memory and Language.

I would NOT encode the channels/ROIs/other topographic measures as
random effects (grouping variables). If you think about the traditional
ANOVA analysis of ERPs, you'll recall that ROI or some other topographic
measure (laterality, sagittality) is included in the main effects and
interactions. As a rule of thumb, this corresponds to a fixed effect in
mixed-effects models. More specifically, you generally care about the
particular levels of the topographic measure (i.e. you care whether an
ERP component is located left-anterior or what not), and this is
what fixed effects test. Random effects are more useful when you only
care about the variance introduced by a particular term but not the
specific levels (e.g. participants or items -- we don't care about a
particular participant, but we do care about how much variance there is
between participants, i.e. how the population of participants looks). 
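
To make the distinction concrete, here is a minimal sketch in lme4
syntax. Everything in it is an assumption for illustration: the data are
simulated, the sizes are tiny, and the names (d, amplitude, form) are
made up rather than taken from your analysis. The topographic factors go
into the fixed part; only subjects and items appear as grouping
variables.

```r
# Simulated single-trial amplitudes; sizes and effect values are arbitrary.
set.seed(1)
d <- expand.grid(
  subject   = factor(1:8),
  item      = factor(1:12),
  Agreement = c("agree", "violate"),
  Longitude = c("Anterior", "Central", "Parietal"),
  Medial    = c("Left", "Midline", "Right")
)
violate <- as.numeric(d$Agreement == "violate")

# Subject- and item-specific intercepts and agreement slopes.
s_int <- rnorm(8, sd = 1.0)[as.integer(d$subject)]
s_slp <- rnorm(8, sd = 0.5)[as.integer(d$subject)]
i_int <- rnorm(12, sd = 0.5)[as.integer(d$item)]
i_slp <- rnorm(12, sd = 0.3)[as.integer(d$item)]

# The violation effect is built in as a parietal negativity.
d$amplitude <- s_int + i_int + violate * (s_slp + i_slp) -
  1.0 * violate * (d$Longitude == "Parietal") + rnorm(nrow(d))

# Topography is in the fixed effects (we care about its specific levels);
# subject and item are grouping variables (we care about their variance).
form <- amplitude ~ Agreement * Longitude * Medial +
  (1 + Agreement | subject) + (1 + Agreement | item)

# Fit only if lme4 is available.
if (requireNamespace("lme4", quietly = TRUE)) {
  fit <- lme4::lmer(form, data = d)
  print(lme4::fixef(fit))
}
```

With real data the covariates would be added to the fixed part in the
same way; the random part would stay as it is.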

Or, another thought: You may have seen ANOVA by-subjects and by-items,
but I bet you've never seen an ANOVA by-channels. ANOVA "implicitly"
collapses the channels within ROIs and you can do the same with mixed
models. (That's an awkward statement technically, but it should help
with the intuition.)
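
The "collapsing" step can be made explicit. A minimal sketch in base R,
assuming long-format data with one row per subject x condition x
channel; the data frame, the channel-to-ROI table and all values are
invented for illustration.

```r
# Hypothetical assignment of channels to the two topographic factors.
chan_roi <- data.frame(
  channel   = c("F3", "F7", "Fz", "F4", "F8", "P3", "P7", "Pz", "P4", "P8"),
  Longitude = rep(c("Anterior", "Parietal"), each = 5),
  Medial    = rep(c("Left", "Left", "Midline", "Right", "Right"), times = 2)
)

# Simulated amplitudes: 4 subjects x 2 conditions x 10 channels.
set.seed(2)
d <- expand.grid(subject = 1:4,
                 Agreement = c("agree", "violate"),
                 channel = chan_roi$channel,
                 stringsAsFactors = FALSE)
d <- merge(d, chan_roi, by = "channel")
d$amplitude <- rnorm(nrow(d))

# Average over the channels inside each ROI, as the ANOVA pipeline does.
roi_means <- aggregate(amplitude ~ subject + Agreement + Longitude + Medial,
                       data = d, FUN = mean)
head(roi_means)
```

The other option is to keep one row per channel and let Longitude and
Medial in the fixed effects do the grouping, without averaging first.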

There is another, related important point -- "nuisance parameters"
aren't necessarily random effects. So even if you're not interested in
the per-electrode distribution of the ERP component, that doesn't mean
those should automatically be random effects. It *might* make sense to
add a channel (as in per-electrode) random effect, if you care to model
the variation within a given ROI (as you have done), but I haven't seen
that yet. It is somewhat rare to include a per-channel fixed effect,
just because you lose a lot of information that way and introduce more
parameters into the model, but you could include a more fine-grained
notion of sagittal / lateral location based on e.g. the 10-20 system and
make that into an ordered factor. (Or you could be extreme and even use
the spherical coordinates that the 10-20 is based on and have continuous
measures of electrode placement!) The big problem with including
"channel" as a random-effect grouping variable is that the channels
would have a very complicated covariance structure (because adjacent
electrodes are very highly correlated with each other) and I'm not sure
how to model this in a straightforward way with lme4.
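
As a rough illustration of the ordered-factor idea, the sketch below
derives a front-to-back and a left-to-right ordering from standard
10-20 style channel labels. The channel list, the variable names and the
signed lateral index are my own assumptions for the example, and labels
outside this simple prefix/suffix scheme (e.g. T7/T8 or mastoids) would
need separate handling.

```r
# Example labels; replace with the montage actually recorded.
channels <- c("Fz", "F3", "F4", "F7", "F8", "Cz", "C3", "C4",
              "Pz", "P3", "P4", "O1", "O2")

# Letter prefix codes front-back position, suffix codes left-right.
prefix <- sub("([0-9]+|z)$", "", channels)
suffix <- substring(channels, nchar(prefix) + 1)

# Front-to-back ordering of the prefixes.
sagittal <- factor(prefix,
                   levels = c("Fp", "AF", "F", "FC", "C", "CP", "P", "PO", "O"),
                   ordered = TRUE)

# Odd numbers are left, even numbers right, "z" is the midline; larger
# numbers lie further from the midline. Left is coded as negative.
n <- suppressWarnings(as.integer(suffix))  # NA for "z"
lat_index <- ifelse(is.na(n), 0L,
                    ifelse(n %% 2L == 1L, -((n + 1L) %/% 2L), n %/% 2L))
lateral <- factor(lat_index, ordered = TRUE)

data.frame(channel = channels, sagittal, lateral)
```

Note that R uses polynomial contrasts (contr.poly) for ordered factors
by default, so the coefficients would be linear, quadratic, ... trends
across positions rather than level-by-level differences.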

More generally, in considering your random effects structure, you should
look at Barr et al (2013, "Random effects structure for confirmatory
hypothesis testing: Keep it maximal") and the recent reply by Bates et
al (arXiv, "Parsimonious Mixed Models"). You should also read the GLMM
FAQ section on testing random effects -- opinions differ on this, and
not everyone thinks that testing them via likelihood-ratio tests makes
sense.
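
For reference, the likelihood-ratio arithmetic behind the first anova()
table quoted below can be reproduced by hand. This is a sketch in base R
using only the numbers from that table. The halved p-value is the
boundary correction commonly suggested when testing a single variance
component (a variance cannot be negative, so the naive chi-square test
is conservative); it is shown as an illustration, not a recommendation.

```r
# Values copied from the quoted anova() output (modnoch vs fullmod).
logLik_noch    <- -484621
logLik_full    <- -484366
chisq_reported <- 508.73

# LRT statistic: twice the difference in log-likelihood. It differs
# slightly from the reported value because the printed logLik is rounded.
chisq_from_loglik <- 2 * (logLik_full - logLik_noch)

# Naive p-value on 1 df, and the halved boundary-corrected version.
p_naive    <- pchisq(chisq_reported, df = 1, lower.tail = FALSE)
p_boundary <- 0.5 * p_naive

c(chisq = chisq_from_loglik, p_naive = p_naive, p_boundary = p_boundary)
```

Either way the test only tells you that the channel term soaks up
variance, not whether it is the right way to model channels.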

That wasn't my most coherent response, but maybe it's still useful. And
for questions like this on mixed models, do check out the R Special
Interest Group on Mixed Models (the r-sig-mixed-models mailing list). :-)

Best,
Phillip

On Thu, 2015-09-24 at 12:00 +0200, r-help-request at r-project.org wrote:
> Message: 4
> Date: Wed, 23 Sep 2015 12:46:46 +0200
> From: Paolo Canal <paolo.canal at iusspavia.it>
> To: r-help at r-project.org
> Subject: [R] Appropriate specification of random effects structure for
>         EEG/ERP data: including Channels or not?
> Message-ID: <56028316.2050004 at iusspavia.it>
> Content-Type: text/plain; charset="UTF-8"
> 
> Dear r-help list,
> 
> I work with EEG/ERP data and this is the first time I am using LMM to
> analyze my data (using lme4).
> The experimental design is a 2X2: one manipulated factor is agreement,
> the other is noun (agreement being within subjects and items, and noun
> being within subjects and between items).
> 
> The data matrix is 31 subjects * 160 items * 33 channels. In ERP
> research, the distribution of the EEG amplitude differences (in a time
> window of interest) are important, and we care about knowing whether a
> negative difference is occurring in Parietal or Frontal electrodes. At
> the same time information from single channel is often too noisy and
> channels are organized in topographic factors for evaluating differences
> in distribution. In the present case I have assigned each channel to one
> of three levels of two factors, i.e., Longitude (Anterior, Central,
> Parietal) and Medial (Left, Midline, Right): for instance, one channel
> is Anterior and Left. With traditional ANOVAs channels from the same
> level of topographic factors are averaged before variance is evaluated
> and this also has the benefit of reducing the noise picked up by the
> electrodes.
> 
> I have troubles in deciding the random structure of my model. Very few
> examples on LMM on ERP data exist (e.g., Newman, Tremblay, Nichols,
> Neville & Ullman, 2012) and little detail is provided about the
> treatment of channel. I feel it is a tricky term but very important to
> optimize fit. Newman et al say "data from each electrode within an ROI
> were treated as repeated measures of that ROI". In Newman et al, the
> ROIs are the 9 regions deriving from Longitude X Medial (Anterior-Left,
> Anterior-Midline, Anterior-Right, Central-Left ... and so on), so in a
> way they treated each ROI separately and not according to the relevant
> dimensions of Longitude and Medial.
> 
> We used the following specifications in lmer:
> 
> [fixed effects specification: µV ~ Agreement * Noun * Longitude * Medial
> * (cov1 + cov2 + cov3 + cov4)] (the terms within brackets are a series
> of individual covariates, most of which are continuous variables)
>
> [random effects specification: (1+Agreement*Type of Noun | subject) +
> (1+Agreement | item) + (1|longitude:medial:channel)]
> 
> What I care the most about is the last term
> (1|longitude:medial:channel). I chose this specification because I
> thought that allowing each channel to have different intercepts in the
> random structure would affect the estimation of the topographic fixed
> effects (Longitude and Medial) in which channel is nested. Unfortunately
> a reviewer commented that since "channel is not included in the fixed
> effects I would probably leave that out".
> 
> But each channel is a repeated measure of the eeg amplitude inside the
> two topographic factors, and random terms do not have to be in the fixed
> structure, otherwise we would also include subjects and items in the
> fixed effects structure. So I kind of feel that including channels as
> random effect is correct, and having them nested in longitude:medial
> allows to relax the assumption that the effect in the EEG has always the
> same longitude:medial distribution. But I might be wrong.
> 
> I thus tested differences in fit (ML) with anova() between
> (1|longitude:medial:channel) and the same model without the term, and a
> third model with the model with a simpler (1|longitude:medial).
> 
> Fullmod vs Nochannel:
> 
>          Df    AIC    BIC  logLik deviance  Chisq Chi Df Pr(>Chisq)
> modnoch 119 969479 970653 -484621   969241
> fullmod 120 968972 970156 -484366   968732 508.73      1 < 2.2e-16 ***
> 
> Differences in fit is remarkable (no variance components with estimates
> close to zero; no correlation parameters with values close to ±1).
> 
> Fullmod vs SimplerMod:
> 
>             Df    AIC    BIC  logLik deviance  Chisq Chi Df Pr(>Chisq)
> fullmod    120 968972 970156 -484366   968732
> simplermod 120 969481 970665 -484621   969241      0      0          1
> 
> Here the number of parameters to estimate in fullmod and simplermod is
> the same but the increase in fit is very consistent (-509 BIC). So I
> guess although the chisquare is not significant we do have a strong
> increase in fit. As I understand this, a model with better fit will find
> more accurate estimates, and I would be inclined to keep the fullmod
> random structure.
> 
> But perhaps I am missing something or I am doing something wrong. Which
> is the correct random structure to use?
> 
> Feedbacks are very much appreciated. I often find answers in the list,
> and this is the first time I post a question.
> Thanks,
> Paolo