[R-sig-ME] R: lmer model design
Benedetta Cesqui
b.cesqui at hsantalucia.it
Tue Nov 26 15:52:38 CET 2013
Sorry,
regarding the multcomp issue in my post, the error text I got is:
Error in mcp2matrix(model, linfct = linfct) :
Variable(s) 'T' of class 'integer' is/are not contained as a factor in
'model'.
benedetta
-----Original Message-----
From: r-sig-mixed-models-bounces at r-project.org
[mailto:r-sig-mixed-models-bounces at r-project.org] On behalf of Benedetta
Cesqui
Sent: Tuesday, 26 November 2013 15:48
To: r-sig-mixed-models at r-project.org
Subject: [R-sig-ME] lmer model design
Hi all,
I am trying to determine the most appropriate way to run the following
analysis.
I asked 7 subjects to catch a ball launched with different initial
conditions. The system is calibrated to produce a total of 4 flight-time
and 2 arrival-height conditions.
I tested subjects of different skill levels (that is, different final
score percentages, i.e. number of balls caught / total number of
launches), and measured a response variable (y, say something such as
reaction time, or the ability to track the ball with the eyes). These
variables are expected to be correlated with performance (the ability to
catch the ball or not).
The aims of my analysis are:
1) to test whether the responses differ by score (i.e. whether the y
response variable is lower when the subject was not able to catch the
ball). The score is 1 if the subject caught the ball, and 0 if the
subject touched or missed it.
2) to test whether the response variables differ across the experimental
conditions.
I used two different approaches and got apparently different results.
In the first case, I ran an LMM analysis and used the score as a dummy
variable:
mod0 <- lmer(y ~ T + Z + (1|subject) + factor(score), data = x)
(This model is the result of a mixed-model building process: I compared
several possible models with different random- and fixed-effect
structures. For brevity I only report the one that gave me the best
results according to likelihood ratio tests.)
summary(mod0)
Linear mixed model fit by REML ['lmerMod']
Formula: y ~ T + Z + (1 | subject) + factor(score)
Data: x
REML criterion at convergence: -642.0967
Random effects:
Groups Name Variance Std.Dev.
subject (Intercept) 0.001681 0.04100
Residual 0.008938 0.09454
Number of obs: 364, groups: subject, 7
Fixed effects:
Estimate Std. Error t value
(Intercept) 0.232191 0.025870 8.975
T 0.034977 0.004951 7.064
Z -0.046545 0.010017 -4.647
factor(score)1 0.047830 0.012407 3.855
Correlation of Fixed Effects:
(Intr) T Z
T -0.472
Z -0.580 0.024
factr(scr)1 -0.139 -0.239 0.031
The comparison between mod0 and mod1, defined as
mod1 <- lmer(y ~ T + Z + (1|subject), data = x),
tests the significance of the factor(score) term.
anova(mod0,mod1)
Data: x
Models:
mod1: y ~ T + Z + (1 | subject)
mod0: y ~ T + Z + (1 | subject) + factor(score)
Df AIC BIC logLik deviance Chisq Chi Df Pr(>Chisq)
mod1 5 -647.01 -627.53 328.51 -657.01
mod0 6 -659.74 -636.36 335.87 -671.74 14.733 1 0.0001239 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
It turns out that the score affects the y variable; in particular, y is
higher for score = 1. However, as the subjects were of different skill
levels, I could also wonder whether the random slopes associated with the
score differ across subjects. In other words, I should also be able to
see that skilled subjects (that is, subjects with a higher number of
caught balls) also present a higher y value.
I therefore considered the following model:
mod2<- lmer(y ~ T+ Z + (1+score|subject) +factor(score), data = x)
summary(mod2)
Linear mixed model fit by REML ['lmerMod']
Formula: y ~ T + Z + (1 + score | subject) + factor(score)
Data: x
REML criterion at convergence: -646.0884
Random effects:
Groups Name Variance Std.Dev. Corr
subject (Intercept) 0.001240 0.03521
score 0.001571 0.03963 0.55
Residual 0.008730 0.09344
Number of obs: 364, groups: subject, 7
Fixed effects:
Estimate Std. Error t value
(Intercept) 0.232614 0.024430 9.522
T 0.034213 0.004892 6.993
Z -0.045544 0.009921 -4.591
factor(score)1 0.039768 0.020356 1.954
Correlation of Fixed Effects:
(Intr) T Z
T -0.496
Z -0.606 0.024
factr(scr)1 0.132 -0.141 0.017
and get:
anova(mod0, mod2)
Data: x
Models:
mod0: y ~ T + Z + (1 | subject) + factor(score)
mod2: y ~ T + Z + (1 + score | subject) + factor(score)
Df AIC BIC logLik deviance Chisq Chi Df Pr(>Chisq)
mod0 6 -659.74 -636.36 335.87 -671.74
mod2 8 -659.01 -627.83 337.50 -675.01 3.2628 2 0.1957
If I interpret my results correctly, this means that overall the y
variable is affected by the subjects' ability to catch the ball or not.
However, a subject with a lower score does not necessarily also have a
lower y response variable.
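If it helps, one way I thought of checking this at the subject level is to look at the estimated subject-specific score effects from mod2 (a sketch; if I understand the lme4 accessors correctly, coef() combines the fixed effect with each subject's random deviation):

```r
# Per-subject intercepts and score slopes (fixed effect + BLUP):
coef(mod2)$subject

# Random deviations of each subject from the population-level effects:
ranef(mod2)$subject
```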
Now, I decided to reverse the point of view and test whether the score
(i.e. the performance of the subjects in the task) was affected by the
response variable. I ran a GLMM analysis. To this aim I compared the
following models:
mod3 <- glmer(score ~ y + T + Z + (1|subject), data = x,
family = binomial(link = "logit"))
mod4 <- glmer(score ~ y + T + Z + (1 + y|subject), data = x,
family = binomial(link = "logit"))
anova(mod3,mod4)
Models:
mod3: score ~ y + T + Z + (1 | subject)
mod4: score ~ y + T + Z + (1 + y | subject)
Df AIC BIC logLik deviance Chisq Chi Df Pr(>Chisq)
mod3 5 358.51 377.99 -174.25 348.51
mod4 7 355.27 382.55 -170.64 341.27 7.2348 2 0.02685 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Thus, mod4 better accounts for my data.
summary(mod4)
Generalized linear mixed model fit by maximum likelihood ['glmerMod']
Family: binomial ( logit )
Formula: score ~ y + T + Z + (1 + y | subject)
Data: x
AIC BIC logLik deviance
355.2723 382.5524 -170.6362 341.2723
Random effects:
Groups Name Variance Std.Dev. Corr
subject (Intercept) 0.009295 0.09641
y 42.975699 6.55559 1.00
Number of obs: 364, groups: subject, 7
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -2.31323 0.68571 -3.373 0.000742 ***
y 2.97926 2.92840 1.017 0.308979
T 0.42262 0.14305 2.954 0.003133 **
Z 0.08513 0.28984 0.294 0.768987
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Correlation of Fixed Effects:
(Intr) y T
y -0.251
T -0.322 -0.159
Z -0.739 0.149 -0.084
This analysis returns a p-value for the y response variable that is not
significant (0.3089), and instead returns a significant value for the T
variable.
Hence it suggests that the score depends on the T condition, and not on
the y variable. What does the significance of the (1 + y | subject) term
suggest? What is the best approach to run this analysis?
Finally, I tried to run a Tukey test to study possible differences in
the score level depending on the T and Z conditions:
cont <- glht(mod4, linfct = mcp(T = "Tukey"))
and got this error:
Error in mcp2matrix(model, linfct = linfct) :
Variable(s) 'Tcat' of class 'integer' is/are not contained as a factor in
'model'.
Does anyone know what I did wrong?
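My own guess from the error message is that mcp() requires the variable to be stored as a factor in the fitted model, while my T is an integer, so perhaps I need something like the following (an untested sketch; mod_f is just a placeholder name):

```r
library(lme4)
library(multcomp)

x$T <- factor(x$T)   # encode the 4 flight-time levels as a factor

# Refit the GLMM with T as a factor (mod_f is a placeholder name):
mod_f <- glmer(score ~ y + T + Z + (1 + y | subject),
               data = x, family = binomial(link = "logit"))

# Pairwise (Tukey) comparisons between the T levels:
summary(glht(mod_f, linfct = mcp(T = "Tukey")))
```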
Sorry for the long post, but I am really stuck on this analysis and do
not know how to move on!
Many thanks
Benedetta
---
Benedetta Cesqui, Ph.D.
Laboratory of Neuromotor Physiology
IRCCS Fondazione Santa Lucia
Via Ardeatina 306
00179 Rome, Italy
tel: (+39) 06-51501485
fax:(+39) 06-51501482
E-mail: b.cesqui at hsantalucia.it