[R-sig-ME] R: lmer model design
Benedetta Cesqui
b.cesqui at hsantalucia.it
Tue Nov 26 15:52:38 CET 2013
Sorry,
regarding the multcomp issue in my post, the error text I got is:
Error in mcp2matrix(model, linfct = linfct) :
Variable(s) 'T' of class 'integer' is/are not contained as a factor in
'model'.
benedetta
-----Original Message-----
From: r-sig-mixed-models-bounces at r-project.org
[mailto:r-sig-mixed-models-bounces at r-project.org] On behalf of Benedetta
Cesqui
Sent: Tuesday, 26 November 2013 15:48
To: r-sig-mixed-models at r-project.org
Subject: [R-sig-ME] lmer model design
Hi all,
I am trying to determine the most appropriate way to run the following
analysis.
I asked 7 subjects to catch a ball launched with different initial
conditions. The system is calibrated to produce a total of 4 flight-time
and 2 arrival-height conditions.
I tested subjects of different skill levels (that is, different final
score percentages, i.e. number of balls caught / total number of
launches), and measured a response variable (y, say something such as
reaction time, or the ability to track the ball with the eyes). These
variables are expected to be correlated with performance (the ability to
catch the ball or not).
The aims of my analysis are:
1) to test whether the responses differ by score (i.e. whether the y
response variable is lower when the subject was not able to catch the
ball). The score is 1 if the subject caught the ball, and 0 if the
subject touched or missed it.
2) to test whether the response variables differ across the experimental
conditions.
I used two different approaches and got apparently different results.
In the first case, I ran an LMM analysis and used the score as a dummy
variable:
mod0 <- lmer(y ~ T + Z + (1|subject) + factor(score), data = x)
(This model is the result of a mixed-model building process: I compared
several possible models with different random- and fixed-effect
structures. For brevity I only report the one that gave me the best
results according to likelihood ratio tests.)
summary(mod0)
Linear mixed model fit by REML ['lmerMod']
Formula: y ~ T + Z + (1 | subject) + factor(score)
Data: x
REML criterion at convergence: -642.0967
Random effects:
Groups Name Variance Std.Dev.
subject (Intercept) 0.001681 0.04100
Residual 0.008938 0.09454
Number of obs: 364, groups: subject, 7
Fixed effects:
Estimate Std. Error t value
(Intercept) 0.232191 0.025870 8.975
T 0.034977 0.004951 7.064
Z -0.046545 0.010017 -4.647
factor(score)1 0.047830 0.012407 3.855
Correlation of Fixed Effects:
(Intr) T Z
T -0.472
Z -0.580 0.024
factr(scr)1 -0.139 -0.239 0.031
The comparison between mod0 and mod1, defined as
mod1 <- lmer(y ~ T + Z + (1|subject), data = x),
tests the significance of the factor(score) term.
anova(mod0,mod1)
Data: x
Models:
mod1: y ~ T + Z + (1 | subject)
mod0: y ~ T + Z + (1 | subject) + factor(score)
Df AIC BIC logLik deviance Chisq Chi Df Pr(>Chisq)
mod1 5 -647.01 -627.53 328.51 -657.01
mod0 6 -659.74 -636.36 335.87 -671.74 14.733 1 0.0001239 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
It turns out that the score affects the y variable; in particular, y is
higher for score = 1. However, as the subjects were of different skill
levels, I could also wonder whether the random slopes associated with the
score differ across subjects. In other words, I should also be able to
see that skilled subjects (that is, subjects with a higher number of
caught balls) also present a higher y value.
I therefore considered the following model:
mod2<- lmer(y ~ T+ Z + (1+score|subject) +factor(score), data = x)
summary(mod2)
Linear mixed model fit by REML ['lmerMod']
Formula: y ~ T + Z + (1 + score | subject) + factor(score)
Data: x
REML criterion at convergence: -646.0884
Random effects:
Groups Name Variance Std.Dev. Corr
subject (Intercept) 0.001240 0.03521
score 0.001571 0.03963 0.55
Residual 0.008730 0.09344
Number of obs: 364, groups: subject, 7
Fixed effects:
Estimate Std. Error t value
(Intercept) 0.232614 0.024430 9.522
T 0.034213 0.004892 6.993
Z -0.045544 0.009921 -4.591
factor(score)1 0.039768 0.020356 1.954
Correlation of Fixed Effects:
(Intr) T Z
T -0.496
Z -0.606 0.024
factr(scr)1 0.132 -0.141 0.017
and get:
anova(mod0, mod2)
Data: x
Models:
mod0: y ~ T + Z + (1 | subject) + factor(score)
mod2: y ~ T + Z + (1 + score | subject) + factor(score)
Df AIC BIC logLik deviance Chisq Chi Df Pr(>Chisq)
mod0 6 -659.74 -636.36 335.87 -671.74
mod2 8 -659.01 -627.83 337.50 -675.01 3.2628 2 0.1957
If I interpret my results correctly, this means that overall the y
variable is affected by the subjects' ability to catch the ball or not.
However, a subject with a lower score does not necessarily also have a
lower y response variable.
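If it helps, one way I thought of checking this at the subject level is to look at the estimated subject-specific score effects from mod2 (a sketch; if I understand the lme4 accessors correctly, coef() combines the fixed effect with each subject's random deviation):

```r
# Per-subject intercepts and score slopes (fixed effect + BLUP):
coef(mod2)$subject

# Random deviations of each subject from the population-level effects:
ranef(mod2)$subject
```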
Now, I decided to reverse the point of view and test whether the score
(i.e. the performance of the subjects in the task) was affected by the
response variable. I ran a GLMM analysis. To this aim I compared the
following models:
mod3 <- glmer(score ~ y + T + Z + (1|subject), data = x,
family = binomial(link = "logit"))
mod4 <- glmer(score ~ y + T + Z + (1 + y|subject), data = x,
family = binomial(link = "logit"))
anova(mod3,mod4)
Models:
mod3: score ~ y + T + Z + (1 | subject)
mod4: score ~ y + T + Z + (1 + y | subject)
Df AIC BIC logLik deviance Chisq Chi Df Pr(>Chisq)
mod3 5 358.51 377.99 -174.25 348.51
mod4 7 355.27 382.55 -170.64 341.27 7.2348 2 0.02685 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Thus, mod4 better accounts for my data.
summary(mod4)
Generalized linear mixed model fit by maximum likelihood ['glmerMod']
Family: binomial ( logit )
Formula: score ~ y + T + Z + (1 + y | subject)
Data: x
AIC BIC logLik deviance
355.2723 382.5524 -170.6362 341.2723
Random effects:
Groups Name Variance Std.Dev. Corr
subject (Intercept) 0.009295 0.09641
y 42.975699 6.55559 1.00
Number of obs: 364, groups: subject, 7
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -2.31323 0.68571 -3.373 0.000742 ***
y 2.97926 2.92840 1.017 0.308979
T 0.42262 0.14305 2.954 0.003133 **
Z 0.08513 0.28984 0.294 0.768987
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Correlation of Fixed Effects:
(Intr) y T
y -0.251
T -0.322 -0.159
Z -0.739 0.149 -0.084
This analysis returns a p-value for the y response variable that is not
significant (0.3089), and instead returns a significant value for the T
variable.
Hence it suggests that the score depends on the T condition, and not on
the y variable. What does the significance of the (1 + y | subject) term
suggest? What is the best approach to run this analysis?
Finally, I tried to run a Tukey test to study possible differences in
the score level depending on the T and Z conditions:
cont <- glht(mod4, linfct = mcp(T = "Tukey"))
and got this error:
Error in mcp2matrix(model, linfct = linfct) :
Variable(s) 'Tcat' of class 'integer' is/are not contained as a factor in
'model'.
Does anyone know what I did wrong?
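My own guess from the error message is that mcp() requires the variable to be stored as a factor in the fitted model, while my T is an integer, so perhaps I need something like the following (an untested sketch; mod_f is just a placeholder name):

```r
library(lme4)
library(multcomp)

x$T <- factor(x$T)   # encode the 4 flight-time levels as a factor

# Refit the GLMM with T as a factor (mod_f is a placeholder name):
mod_f <- glmer(score ~ y + T + Z + (1 + y | subject),
               data = x, family = binomial(link = "logit"))

# Pairwise (Tukey) comparisons between the T levels:
summary(glht(mod_f, linfct = mcp(T = "Tukey")))
```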
Sorry for the long post, but I am really stuck on this analysis and do
not know how to move on!
Many thanks
Benedetta
---
Benedetta Cesqui, Ph.D.
Laboratory of Neuromotor Physiology
IRCCS Fondazione Santa Lucia
Via Ardeatina 306
00179 Rome, Italy
tel: (+39) 06-51501485
fax:(+39) 06-51501482
E-mail: b.cesqui at hsantalucia.it