[R] Appropriate specification of random effects structure for EEG/ERP data: including Channels or not?

Wed Sep 23 12:46:46 CEST 2015

Dear r-help list,

I work with EEG/ERP data, and this is the first time I am using linear 
mixed models (LMMs) to analyze my data (with lme4).
The experimental design is 2x2: one manipulated factor is agreement, 
the other is noun (agreement varies within subjects and within items; 
noun varies within subjects and between items).

The data matrix is 31 subjects * 160 items * 33 channels. In ERP 
research, the scalp distribution of EEG amplitude differences (in a time 
window of interest) is important: we care about whether a negative 
difference occurs at Parietal or Frontal electrodes. At the same time, 
information from a single channel is often too noisy, so channels are 
grouped into topographic factors for evaluating differences in 
distribution. In the present case I have assigned each channel to one 
of three levels of each of two factors, Longitude (Anterior, Central, 
Parietal) and Medial (Left, Midline, Right); for instance, one channel 
is Anterior and Left. With traditional ANOVAs, channels from the same 
level of the topographic factors are averaged before variance is 
evaluated, which also has the benefit of reducing the noise picked up by 
the electrodes.

I am having trouble deciding on the random structure of my model. Very 
few examples of LMMs on ERP data exist (e.g., Newman, Tremblay, Nichols, 
Neville & Ullman, 2012), and little detail is provided about the 
treatment of channel. I feel it is a tricky term but very important for 
optimizing fit. Newman et al. say "data from each electrode within an ROI 
were treated as repeated measures of that ROI". In Newman et al., the 
ROIs are the 9 regions obtained by crossing Longitude X Medial 
(Anterior-Left, Anterior-Midline, Anterior-Right, Central-Left ... and so 
on), so in a way they treated each ROI separately and not according to 
the relevant dimensions of Longitude and Medial.

We used the following specifications in lmer:

[fixed effects specification: µV ~ Agreement * Noun * Longitude * Medial 
* (cov1 + cov2 + cov3 + cov4)] (the terms within parentheses are a series 
of individual covariates, most of which are continuous variables)

[random effects specification: (1+Agreement*Noun | subject) + 
(1+Agreement | item) + (1|longitude:medial:channel)]
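
A minimal sketch of how the two bracketed specifications combine into a single lme4 formula. The column names (uV, Agreement, Noun, Longitude, Medial, cov1 to cov4, subject, item, channel) and the data frame name `erp` are assumptions for illustration, not the actual names used in the analysis, and the topographic factors are spelled the same way in the fixed and random parts because R names are case-sensitive.

```r
# Sketch only: variable names and the data frame `erp` are assumed,
# not taken from the original analysis.
full_formula <- uV ~ Agreement * Noun * Longitude * Medial *
  (cov1 + cov2 + cov3 + cov4) +
  (1 + Agreement * Noun | subject) +   # by-subject intercept and slopes
  (1 + Agreement | item) +             # by-item intercept and slope
  (1 | Longitude:Medial:channel)       # intercept per channel within region

# Variables the formula expects to find in the data
print(all.vars(full_formula))

# Fit by maximum likelihood (REML = FALSE) so that models can be compared
# with anova(). The fit only runs if lme4 is installed and a data frame
# called `erp` (hypothetical) exists in the workspace.
if (requireNamespace("lme4", quietly = TRUE) && exists("erp")) {
  fullmod <- lme4::lmer(full_formula, data = erp, REML = FALSE)
}
```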

What I care most about is the last term, 
(1|longitude:medial:channel). I chose this specification because I 
thought that allowing each channel to have its own intercept in the 
random structure would affect the estimation of the topographic fixed 
effects (Longitude and Medial) in which channel is nested. Unfortunately, 
a reviewer commented that since "channel is not included in the fixed 
effects I would probably leave that out".

But each channel is a repeated measure of the EEG amplitude within the 
two topographic factors, and random terms do not have to appear in the 
fixed structure; otherwise we would also have to include subjects and 
items in the fixed effects structure. So I feel that including channels 
as a random effect is correct, and that nesting them in longitude:medial 
allows us to relax the assumption that the effect in the EEG always has 
the same longitude:medial distribution. But I might be wrong.
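
To make the nesting concrete, here is a small sketch that counts the levels of the candidate grouping factors. The assignment of channels to regions below is invented purely for illustration (33 hypothetical channel labels cycled over the 9 regions). When every channel has a unique label and belongs to exactly one region, longitude:medial:channel has one level per channel, so it groups the data exactly as a plain channel factor would, whereas longitude:medial has only 9 levels.

```r
# Hypothetical layout: 33 uniquely labelled channels spread over the
# 3 x 3 grid of regions. The channel-to-region mapping is made up.
regions <- expand.grid(
  longitude = c("Anterior", "Central", "Parietal"),
  medial    = c("Left", "Midline", "Right")
)
idx  <- rep_len(seq_len(nrow(regions)), 33)
chan <- data.frame(
  channel   = sprintf("ch%02d", 1:33),
  longitude = regions$longitude[idx],
  medial    = regions$medial[idx]
)

# Number of groups implied by each grouping factor
n_region  <- nlevels(interaction(chan$longitude, chan$medial, drop = TRUE))
n_nested  <- nlevels(interaction(chan$longitude, chan$medial, chan$channel,
                                 drop = TRUE))
n_channel <- nlevels(factor(chan$channel))

print(c(region = n_region, nested = n_nested, channel = n_channel))
# region = 9, nested = 33, channel = 33
```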

I therefore tested differences in fit (ML) with anova() between the 
model with (1|longitude:medial:channel), the same model without that 
term, and a third model with the simpler term (1|longitude:medial).

Fullmod vs Nochannel:

         Df    AIC    BIC  logLik deviance  Chisq Chi Df Pr(>Chisq)
modnoch 119 969479 970653 -484621   969241
fullmod 120 968972 970156 -484366   968732 508.73      1  < 2.2e-16 ***

The difference in fit is remarkable (no variance components with 
estimates close to zero; no correlation parameters with values close to 
±1).
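
As a sanity check on the Fullmod vs Nochannel comparison, the likelihood-ratio statistic can be recomputed from the printed deviances. This sketch uses only the rounded values shown in the anova() output, so it reproduces the reported Chisq only up to rounding.

```r
# Values copied from the printed anova() output (rounded as shown there)
dev_modnoch <- 969241; df_modnoch <- 119
dev_fullmod <- 968732; df_fullmod <- 120

chisq <- dev_modnoch - dev_fullmod   # drop in deviance from adding the term
ddf   <- df_fullmod - df_modnoch     # one extra variance parameter
p     <- pchisq(chisq, df = ddf, lower.tail = FALSE)

print(chisq)            # 509, vs. the reported 508.73 (rounding of deviances)
print(ddf)              # 1
print(p < 2.2e-16)      # TRUE, consistent with the printed p-value
```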

Fullmod vs SimplerMod:

            Df    AIC    BIC  logLik deviance Chisq Chi Df Pr(>Chisq)
fullmod    120 968972 970156 -484366   968732
simplermod 120 969481 970665 -484621   969241     0      0          1

Here the number of parameters to estimate in fullmod and simplermod is 
the same, but the improvement in fit is substantial (BIC is lower by 509 
for fullmod). So I guess that although the chi-square test is not 
significant, we do have a strong increase in fit. As I understand it, a 
model with better fit will yield more accurate estimates, so I would be 
inclined to keep the fullmod random structure.
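
The Fullmod vs SimplerMod comparison can be unpacked with a little arithmetic on the printed table. This is a sketch using only the rounded values shown above. Because both models have the same Df, the penalty terms in AIC and BIC cancel and both criteria differ by exactly the difference in deviance. With zero extra degrees of freedom there is nothing for a likelihood-ratio test to test, which is presumably why anova() prints Chisq 0, Chi Df 0 and p = 1 here.

```r
# Rows copied from the printed anova() output (rounded as shown there)
full_row    <- c(Df = 120, AIC = 968972, BIC = 970156, deviance = 968732)
simpler_row <- c(Df = 120, AIC = 969481, BIC = 970665, deviance = 969241)

# Difference simplermod - fullmod for every column
delta <- simpler_row - full_row
print(delta)
# Df 0, AIC 509, BIC 509, deviance 509

# AIC is deviance + 2 * Df, so equal Df means equal penalties
aic_check <- c(full    = full_row[["deviance"]]    + 2 * full_row[["Df"]],
               simpler = simpler_row[["deviance"]] + 2 * simpler_row[["Df"]])
print(aic_check)
# full 968972, simpler 969481, matching the AIC column
```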

But perhaps I am missing something or I am doing something wrong. Which 
is the correct random structure to use?

Feedback is very much appreciated. I often find answers on the list, 
and this is the first time I have posted a question.
Thanks,
Paolo
