[R-sig-ME] questions about parsimonious mixed models
Wing Yee Chow
wingyeechow.zoey at gmail.com
Wed Oct 26 16:13:44 CEST 2016
Hi there,
I tried to follow the suggestions in Bates et al.'s parsimonious mixed model
paper (link <https://arxiv.org/abs/1506.04967> ) and I'm having some
problems. Basically, I started with a maximal model (m0) and then a
zero-correlation-parameter model (m1), then I reduced the model by taking
out near-zero variance components (m2). Once I extended the reduced model
with correlation parameters (m3), the model becomes degenerate (some
correlation parameters are 1 or -1). However, the likelihood ratio test
(anova()) suggests that m3 provides a significantly better fit than m2. How
should I proceed? What would be the optimal model in cases like this?
I'm not sure if I did something wrong in the process, so I'll detail what I
did below.
Thanks!
Wing-Yee Chow
==================================
My experiment has a 2x2 within-participant design with 24 participants and
48 items. Both factors (Congruity and SentenceType) are categorical and I
specified the contrasts this way:
> contrasts(tempdata$congruity) <- contr.sum(2)/2
> contrasts(tempdata$sentencetype) <- contr.sum(2)/2
I started with a maximal model, and rePCA() suggests that it's
over-parameterised:
> summary(m0 <- lmer(value ~ sentencetype * congruity + (sentencetype *
congruity|subj) + (sentencetype * congruity|item), REML=FALSE,
data=tempdata))
Linear mixed model fit by maximum likelihood ['lmerMod']
Formula: value ~ sentencetype * congruity + (sentencetype * congruity |
subj) + (sentencetype * congruity | item)
Data: tempdata
AIC BIC logLik deviance df.resid
14415.9 14540.2 -7182.9 14365.9 1045
Scaled residuals:
Min 1Q Median 3Q Max
-2.6228 -0.6275 -0.1785 0.4124 6.1246
Random effects:
Groups Name Variance Std.Dev. Corr
item (Intercept) 5412.2 73.568
sentencetype1 3316.3 57.588 0.92
congruity1 866.0 29.429 0.58 0.85
sentencetype1:congruity1 138.1 11.752 0.51 0.15 -0.40
subj (Intercept) 10789.8 103.874
sentencetype1 56.2 7.497 1.00
congruity1 835.1 28.898 -1.00 -1.00
sentencetype1:congruity1 2279.8 47.747 1.00 1.00 -1.00
Residual 34511.5 185.773
Number of obs: 1070, groups: item, 48; subj, 24
Fixed effects:
Estimate Std. Error t value
(Intercept) 388.84 24.39 15.944
sentencetype1 56.27 14.18 3.968
congruity1 -18.19 13.51 -1.346
sentencetype1:congruity1 -20.90 24.82 -0.842
Correlation of Fixed Effects:
(Intr) sntnc1 cngrt1
sentenctyp1 0.328
congruity1 -0.300 0.109
sntnctyp1:1 0.357 0.047 -0.186
> summary(rePCA(m0))
$item
Importance of components:
[,1] [,2] [,3] [,4]
Standard deviation 0.5070 0.15787 1.228e-06 0
Proportion of Variance 0.9116 0.08838 0.000e+00 0
Cumulative Proportion 0.9116 1.00000 1.000e+00 1
$subj
Importance of components:
[,1] [,2] [,3] [,4]
Standard deviation 0.636 1.045e-06 1.895e-07 0
Proportion of Variance 1.000 0.000e+00 0.000e+00 0
Cumulative Proportion 1.000 1.000e+00 1.000e+00 1
Then, I converted the factors to numeric covariates using model.matrix() and
did the zero-correlation-parameter model:
> summary(m1 <- lmer(value ~ sentencetype * congruity + (cSenttype *
cCongruity||subj) + (cSenttype * cCongruity||item), REML=FALSE,
data=tempdata))
Linear mixed model fit by maximum likelihood ['lmerMod']
Formula: value ~ sentencetype * congruity + ((1 | subj) + (0 + cSenttype |
subj) + (0 + cCongruity | subj) + (0 + cSenttype:cCongruity |
subj)) + ((1 | item) + (0 + cSenttype | item) + (0 + cCongruity |
item) + (0 + cSenttype:cCongruity | item))
Data: tempdata
AIC BIC logLik deviance df.resid
14419.9 14484.5 -7196.9 14393.9 1057
Scaled residuals:
Min 1Q Median 3Q Max
-2.4709 -0.6110 -0.1823 0.4011 6.1156
Random effects:
Groups Name Variance Std.Dev.
item cSenttype:cCongruity 0.000e+00 0.000e+00
item.1 cCongruity 0.000e+00 0.000e+00
item.2 cSenttype 2.478e+03 4.978e+01
item.3 (Intercept) 5.268e+03 7.258e+01
subj cSenttype:cCongruity 0.000e+00 0.000e+00
subj.1 cCongruity 3.473e+01 5.893e+00
subj.2 cSenttype 3.819e-11 6.180e-06
subj.3 (Intercept) 1.079e+04 1.039e+02
Residual 3.543e+04 1.882e+02
Number of obs: 1070, groups: item, 48; subj, 24
Fixed effects:
Estimate Std. Error t value
(Intercept) 388.41 24.34 15.957
sentencetype1 57.23 13.59 4.210
congruity1 -17.72 11.60 -1.527
sentencetype1:congruity1 -20.81 23.07 -0.902
Correlation of Fixed Effects:
(Intr) sntnc1 cngrt1
sentenctyp1 -0.002
congruity1 0.000 -0.001
sntnctyp1:1 0.000 -0.002 -0.009
Once again rePCA() suggests that the model is over-parameterised:
> summary(rePCA(m1))
$item
Importance of components:
[,1] [,2] [,3] [,4]
Standard deviation 0.3856 0.2645 0 0
Proportion of Variance 0.6801 0.3200 0 0
Cumulative Proportion 0.6801 1.0000 1 1
$subj
Importance of components:
[,1] [,2] [,3] [,4]
Standard deviation 0.5518 0.03131 3.283e-08 0
Proportion of Variance 0.9968 0.00321 0.000e+00 0
Cumulative Proportion 0.9968 1.00000 1.000e+00 1
So then I reduced the model by taking out 4 variance components that are
zero/near-zero, and rePCA suggests that this reduced model is no longer
over-parameterised:
> summary(m2<-lmer(value ~ sentencetype * congruity + (cCongruity||subj) +
(cSenttype||item),REML=FALSE, data=tempdata))
Linear mixed model fit by maximum likelihood ['lmerMod']
Formula: value ~ sentencetype * congruity + ((1 | subj) + (0 + cCongruity |
subj)) + ((1 | item) + (0 + cSenttype | item))
Data: tempdata
AIC BIC logLik deviance df.resid
14411.9 14456.6 -7196.9 14393.9 1061
Scaled residuals:
Min 1Q Median 3Q Max
-2.4709 -0.6110 -0.1823 0.4011 6.1156
Random effects:
Groups Name Variance Std.Dev.
item cSenttype 2478.42 49.784
item.1 (Intercept) 5267.94 72.581
subj cCongruity 34.73 5.894
subj.1 (Intercept) 10785.48 103.853
Residual 35428.11 188.224
Number of obs: 1070, groups: item, 48; subj, 24
Fixed effects:
Estimate Std. Error t value
(Intercept) 388.41 24.34 15.957
sentencetype1 57.23 13.59 4.210
congruity1 -17.72 11.60 -1.527
sentencetype1:congruity1 -20.81 23.07 -0.902
Correlation of Fixed Effects:
(Intr) sntnc1 cngrt1
sentenctyp1 -0.002
congruity1 0.000 -0.001
sntnctyp1:1 0.000 -0.002 -0.009
> summary(rePCA(m2))
$item
Importance of components:
[,1] [,2]
Standard deviation 0.3856 0.2645
Proportion of Variance 0.6801 0.3200
Cumulative Proportion 0.6801 1.0000
$subj
Importance of components:
[,1] [,2]
Standard deviation 0.5518 0.03131
Proportion of Variance 0.9968 0.00321
Cumulative Proportion 0.9968 1.00000
Then I followed Bates et al's guidelines in extending this reduced model
with correlation parameters:
> summary(m3<-lmer(value ~ sentencetype * congruity + (cCongruity|subj) +
(cSenttype|item),REML=FALSE, data=tempdata))
Linear mixed model fit by maximum likelihood ['lmerMod']
Formula: value ~ sentencetype * congruity + (cCongruity | subj) + (cSenttype
| item)
Data: tempdata
AIC BIC logLik deviance df.resid
14395.2 14450.0 -7186.6 14373.2 1059
Scaled residuals:
Min 1Q Median 3Q Max
-2.6509 -0.6383 -0.1679 0.4151 5.9618
Random effects:
Groups Name Variance Std.Dev. Corr
item (Intercept) 5391.4 73.43
cSenttype 3085.8 55.55 1.00
subj (Intercept) 10803.3 103.94
cCongruity 826.1 28.74 -1.00
Residual 35022.7 187.14
Number of obs: 1070, groups: item, 48; subj, 24
Fixed effects:
Estimate Std. Error t value
(Intercept) 388.76 24.40 15.933
sentencetype1 56.88 13.99 4.066
congruity1 -18.10 12.88 -1.405
sentencetype1:congruity1 -20.94 22.93 -0.913
Correlation of Fixed Effects:
(Intr) sntnc1 cngrt1
sentenctyp1 0.247
congruity1 -0.396 -0.001
sntnctyp1:1 0.000 -0.002 -0.007
The correlation parameters of 1 and -1 suggest that this model is
degenerate. This is consistent with the results of rePCA():
> summary(rePCA(m3))
$item
Importance of components:
[,1] [,2]
Standard deviation 0.492 0
Proportion of Variance 1.000 0
Cumulative Proportion 1.000 1
$subj
Importance of components:
[,1] [,2]
Standard deviation 0.5762 1.678e-07
Proportion of Variance 1.0000 0.000e+00
Cumulative Proportion 1.0000 1.000e+00
However, the likelihood ration test suggests that m3 provides a
significantly better fit than m2:
> anova(m2, m3)
Data: tempdata
Models:
m2: value ~ sentencetype * congruity + ((1 | subj) + (0 + cCongruity |
m2: subj)) + ((1 | item) + (0 + cSenttype | item))
m3: value ~ sentencetype * congruity + (cCongruity | subj) + (cSenttype |
m3: item)
Df AIC BIC logLik deviance Chisq Chi Df Pr(>Chisq)
m2 9 14412 14457 -7196.9 14394
m3 11 14395 14450 -7186.6 14373 20.639 2 3.298e-05 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Questions: Should I have reduced the model further before extending it with
correlation parameters? How should I proceed (or what should I have done
differently)?
[[alternative HTML version deleted]]
More information about the R-sig-mixed-models
mailing list