[R-sig-ME] random slopes versus nested random intercepts

Sun Feb 2 00:37:26 CET 2014

Dear all,

I have some problems wrapping my head around how to specify the random
effects structure of my data.

The data are response times (rt) from participants (variable: code) who have
worked on two different tasks in two sessions (i.e., session is the variable
coding the task). All participants have data for both sessions and I know
that session influences the rts:

> with(dl, table(session, code))
       code
session AD0210 AF0402 AL2301 AM0107 
      0    270    270    270    270 
      1    250    250    250    250 
[...]

> aggregate(rt ~ session, dl, mean)
  session       rt
1       0 2747.435
2       1 2240.403

Hence I want to model (1) the effect of session, (2) the variance introduced
by the participants, and (3) potential differences in the effect of session
across participants.

For (1) I can add a fixed effect of session and for (2) a random intercept
for participants (i.e., code).
For (3) I see two different possibilities:
 - (a) by introducing random session slopes for the participant random
intercept or 
 - (b) a (crossed) random intercept for each participant:session interaction
(i.e., basically a nested random effect).

Although possibility (a) (random slopes) seems to more closely capture the
intention (allow the session effect to vary across participants), (b) should
essentially do the same (allowing a custom intercept for each
participant:session level). However, when comparing the models, version (a)
fits better:

dl <- read.table("http://pastebin.com/raw.php?i=DDx5devK", header = TRUE)
require(lme4)
options(contrasts=c('contr.sum', 'contr.poly'))
dl$session <- factor(dl$session)

m_a <- lmer(rt ~ session + (session|code), dl)
m_b <- lmer(rt ~ session + (1|code) + (1|code:session), dl)

anova(m_a, m_b, refit = FALSE)
## Data: dl
## Models:
## m_b: rt ~ session + (1 | code) + (1 | code:session)
## m_a: rt ~ session + (session | code)
##     Df    AIC    BIC  logLik deviance  Chisq Chi Df Pr(>Chisq)  
## m_b  5 224156 224193 -112073   224146                           
## m_a  6 224154 224199 -112071   224142 3.6329      1    0.05665 .
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Can someone explain why (a) provides a better fit? Am I wrong in my
assumption that both models should in principal capture the same variation?

Thanks,
Henrik

Some extra information follows, but the main question is above:

The reason why I am trying both (a) and (b) is that my ultimate goal is to
compare two further fixed effects, value and encoding. These additional
effects should also be allowed to vary across participants, sessions, and
their interaction (i.e., the effect of e.g., encoding can be different for
each participant, in each session, and the session:encoding effect should
also be allowed to be different for each participants). Achieving this seems
to me nicer via b than a:

a <- lmer(rt ~ session + value + encoding + session:encoding + session:value
+ (session + encoding + value + session:encoding + session:value|code), dl,
control = lmerControl(optCtrl = list(maxfun = 500000)))

b <- lmer(rt ~ session + value + encoding + session:encoding + session:value
+ (value + encoding|code) + (value + encoding|code:session), dl, control =
lmerControl(optCtrl = list(maxfun = 500000)))

## anova(a, b, refit = FALSE)
## Data: dl
## Models:
## b: rt ~ session + value + encoding + session:encoding + session:value + 
## b:     (value + encoding | code) + (value + encoding | code:session)
## a: rt ~ session + value + encoding + session:encoding + session:value + 
## a:     (session + encoding + value + session:encoding + session:value | 
## a:         code)
##   Df    AIC    BIC  logLik deviance  Chisq Chi Df Pr(>Chisq)  
## b 29 223217 223434 -111580   223159                           
## a 45 223222 223559 -111566   223132 27.054     16    0.04089 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Note that this last analysis throws some warnings of nonconvergence.

> sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=German_Germany.1252  LC_CTYPE=German_Germany.1252
LC_MONETARY=German_Germany.1252
[4] LC_NUMERIC=C                    LC_TIME=German_Germany.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] lme4_1.1-4   Matrix_1.1-2

loaded via a namespace (and not attached):
[1] grid_3.0.2      lattice_0.20-24 MASS_7.3-29     minqa_1.2.1
nlme_3.1-113    Rcpp_0.10.6     splines_3.0.2  
[8] tools_3.0.2