[R-sig-ME] lmer specification for a maximal random-effects structure with one between- and two within-subject factors

Gabriel Baud-Bovy baud-bovy.gabriel at hsr.it
Sun Oct 6 02:53:47 CEST 2013


Hi r-sig-me,

I am analyzing the results of an experiment with one between-subject
factor (group, 3 levels: yo, yf, ef) and two within-subject factors
(vf, 2 levels: vf and novf; task, 3 levels: stand, ltstat, ltmov).
Each group includes 15 uniquely identified subjects. All factors are
coded with sum contrasts (contr.sum).
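
For concreteness, the coding is set up roughly as follows (a sketch; variable
names as in the data frame CP0vf used below, and the contrast matrices in my
actual data have named columns, which is why labels such as taskltstat-stand
appear in the output):

# sketch: explicit factor levels and sum contrasts for all factors
# (capitalization of the level labels in the actual data may differ, e.g. noVF)
CP0vf$group <- factor(CP0vf$group, levels = c("yo", "yf", "ef"))
CP0vf$vf    <- factor(CP0vf$vf,    levels = c("vf", "novf"))
CP0vf$task  <- factor(CP0vf$task,  levels = c("stand", "ltstat", "ltmov"))
contrasts(CP0vf$group) <- contr.sum(3)
contrasts(CP0vf$vf)    <- contr.sum(2)
contrasts(CP0vf$task)  <- contr.sum(3)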

Barr et al. (2013) advise keeping the random-effects structure maximal.
Moreover, they write that "In general, within-unit treatments require both
the by-unit intercepts and slopes in the random effects specification,
whereas between-unit treatments require only the by-unit random intercepts."

Barr, D. J., Levy, R., Scheepers, C., & Tily, H. J. (2013). Random effects
structure for confirmatory hypothesis testing: Keep it maximal. Journal of
Memory and Language, 68(3), 255–278.

I have two questions:

First, how would you specify the model if you were trying to follow Barr et
al.'s (2013) advice to have a maximal random-effects covariance structure
(granted that it still has to be identifiable) for such a design?
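
My best guess at a literal reading of "maximal" would be by-subject random
slopes for both within-subject factors and their interaction, but no random
slope for the between-subject factor group, i.e. something like the sketch
below (whether the resulting 6-by-6 variance-covariance matrix is identifiable
presumably depends on having replicates within each subject-by-condition cell):

library(lme4)
# maximal in the sense of Barr et al.: by-subject intercept plus by-subject
# slopes for task, vf and their interaction; group is between-subject, so it
# only gets the by-subject random intercept
fit.max <- lmer(log(e.area) ~ group*task*vf + (1 + task*vf | su0), CP0vf)
VarCorr(fit.max)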

Second, I could not find any discussion of the design matrix for random
effects when there are two or more within-subject factors. It seems that it
would make sense in cases like this to define one intercept for each subject
plus one random effect for each treatment relative to the baseline, but the
syntax to achieve this is awkward. It seems that the default leads to design
matrices with too many random effects. See the examples below.
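
(One can at least inspect the random-effects design that a given formula
implies without fitting it, e.g. with lFormula() from lme4:)

# per-subject model-matrix columns implied by the bar terms, without fitting
lf <- lFormula(log(e.area) ~ group*task*vf +
                 (1|su0) + (task|su0) + (vf|su0), CP0vf)
lf$reTrms$cnms     # column names generated by each (...|su0) term
nrow(lf$reTrms$Zt) # total number of random-effect columns in Z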

Following Barr, my initial attempt was:

 > fit <- lmer(log(e.area) ~ group*task*vf + (1|su0) + (task|su0) + (vf|su0), CP0vf)
In checkZrank(reTrms$Zt, n = n, control, nonSmall = 1e+06) :
  number of observations <= rank(Z); variance-covariance matrix will be unidentifiable.
 > VarCorr(fit)
 Groups   Name              Std.Dev.  Corr
 su0      (Intercept)       0.21461
 su0.1    (Intercept)       0.36124
          taskltstat-stand  0.16478    0.501
          taskltdyn-stand   0.12364   -0.949 -0.676
 su0.2    (Intercept)       0.19043
          vfnovf-vf         0.10934   -0.224
 Residual                   0.17567

This model yields a warning message and includes more random effects
than I wanted (three "intercepts"!).
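
If the aim is a single by-subject intercept with correlated task and vf
slopes, I suppose the terms have to be combined into a single bar term, e.g.:

# one intercept column per subject plus the task and vf slope columns,
# all within a single term, so only one (Intercept) appears in VarCorr
fit2 <- lmer(log(e.area) ~ group*task*vf + (1 + task + vf | su0), CP0vf)
VarCorr(fit2)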

My first attempt to remove the extra intercepts and simplify the covariance
structure was not successful:

 > fit <- lmer(log(e.area) ~ group*task*vf + (1|su0) + (task-1|su0) + (vf-1|su0), CP0vf)
which yielded the warning message:
In checkZrank(reTrms$Zt, n = n, control, nonSmall = 1e+06) :
  number of observations <= rank(Z); variance-covariance matrix will be unidentifiable.
 > VarCorr(fit)
 Groups   Name         Std.Dev.  Corr
 su0      (Intercept)  0.30722
 su0.1    taskstand    0.38166
          taskltstat   0.43823    0.804
          taskltdyn    0.22527    0.949  0.888
 su0.2    vfVF         0.11602
          vfnoVF       0.10147   -1.000
 Residual              0.17508

This yields the same warning message, and looking at the structure of the
covariance matrix I see that the number of random effects has not changed
(only that the coding is now done with an indicator (dummy) matrix instead
of the contrasts).
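
(The switch from contrast coding to indicator coding, with no change in the
number of columns, can be seen directly from the corresponding model matrices:)

# with an intercept, task contributes 2 contrast columns; without one, it
# contributes 3 indicator columns -- the column space is the same either way
head(model.matrix(~ task,     CP0vf))
head(model.matrix(~ task - 1, CP0vf))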

The following model, however, does not give any warning message:

 > fit <- lmer(log(e.area) ~ group*task*vf + (task|su0) + (vf|su0), CP0vf)
 > VarCorr(fit)
 Groups   Name              Std.Dev.  Corr
 su0      (Intercept)       0.39050
          taskltstat-stand  0.16468    0.492
          taskltdyn-stand   0.12581   -0.891 -0.659
 su0.1    (Intercept)       0.27859
          vfnovf-vf         0.11002   -0.129
 Residual                   0.17198

However, I find it difficult to interpret the fact that there are two
"intercepts". Any suggestion would be appreciated.

It seems to me that a model with one random intercept for the subject plus
one random effect for each treatment relative to the baseline is easier to
interpret. The syntax to get it is, however, not straightforward and requires
coding dummy variables by hand (an alternative based on model.matrix() is
sketched after the output below).

 > tmp <- CP0vf
 > tmp$vf.novf <- ifelse(tmp$vf=="noVF", 1, 0)
 > tmp$task.ltstat <- ifelse(tmp$task=="ltstat", 1, 0)
 > tmp$task.ltmov <- ifelse(tmp$task=="ltmov", 1, 0)
 > fit <- lmer(log(e.area) ~ group*task*vf + (1|su0) + (task.ltstat-1|su0) +
 +             (task.ltmov-1|su0) + (vf.novf-1|su0), tmp)
 > VarCorr(fit)
 Groups   Name         Std.Dev.
 su0      (Intercept)  0.43475
 su0.1    task.ltstat  0.23228
 su0.2    task.ltmov   0.37742
 su0.3    vf.novf      0.19578
 Residual              0.20493
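
(As an aside, the indicators can also be pulled out of model.matrix() instead
of being written with ifelse(), which avoids typos in the level labels; the
column names below are only illustrative and depend on the actual level labels
in the data:)

# build the 0/1 indicators from the factors themselves (hypothetical column
# names -- check colnames() against the actual level labels first)
task.ind <- model.matrix(~ task - 1, tmp)
vf.ind   <- model.matrix(~ vf - 1, tmp)
colnames(task.ind); colnames(vf.ind)
tmp$task.ltstat <- task.ind[, "taskltstat"]
tmp$task.ltmov  <- task.ind[, "taskltmov"]
tmp$vf.novf     <- vf.ind[, "vfnoVF"]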

Or, to allow for some correlations among the random effects:

 > fit <- lmer(log(e.area) ~ group*task*vf + (task.ltstat-1|su0) +
 +             (task.ltmov-1|su0) + (vf|su0), tmp)
 > VarCorr(fit)
 Groups   Name         Std.Dev.  Corr
 su0      task.ltstat  0.23284
 su0.1    task.ltmov   0.20882
 su0.2    (Intercept)  0.43813
          vfnovf-vf    0.10022    0.090
 Residual              0.20410

 > fit <- lmer(log(e.area) ~ group*task*vf + (task|su0) + (vf.novf-1|su0), tmp)
 > VarCorr(fit)
 Groups   Name              Std.Dev.  Corr
 su0      (Intercept)       0.45837
          taskltstat-stand  0.16109    0.413
          taskltdyn-stand   0.12072   -0.698 -0.674
 su0.1    vf.novf           0.20669
 Residual                   0.17932
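
For what it is worth, these alternatives can be put side by side with a
likelihood-based comparison, although I realize this is exactly the kind of
data-driven selection that Barr et al. argue against (fit.a and fit.b are
placeholder names for two of the fits above, each of which I actually stored
as fit):

# anova() on two merMod fits refits them with ML before comparing and
# reports AIC, BIC and a likelihood-ratio test for nested pairs
anova(fit.a, fit.b)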

I am a bit confused about how to reason about these different possibilities,
and I wonder what to do in the spirit of Barr et al. (2013), who suggest not
trying to model the random-effects structure in a data-driven way but keeping
it maximal. To quote them:

"The generalization performance of LMEMs including data-driven random 
effects structures
strongly depends upon modeling criteria and sample size, yielding 
reasonable results on
moderately-sized samples when conservative criteria are used, but with 
little or no power
advantage over maximal models."

Gabriel




-- 
---------------------------------------------------------------------
Gabriel Baud-Bovy               tel.: (+39) 02 2643 4839 (office)
UHSR University                       (+39) 02 2643 3429 (laboratory)
via Olgettina, 58                     (+39) 02 2643 4891 (secretary)
20132 Milan, Italy               fax: (+39) 02 2643 4892
