[R-sig-ME] z transform versus random intercept

Thu Dec 3 12:22:09 CET 2009

Dear all,

In my field (phonetics), it is usual to z-transform the response
(i.e. using scale(), by subject for example) before doing a classical
regression analysis. I do not know enough statistics to see and explain
all the fundamental statistical differences between this approach and
mixed models.

For example, with the "sleepstudy" data from package lme4, is there
something wrong or dubious with the following model?

(a)    lm(zreaction ~ Days, data = sleepstudy)

where zreaction is Reaction scaled by Subject,

to be compared with:

(b)    lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy)

Model (a) still treats all the z-scores as independent, and I think it
remains "dubious" even though, after the scaling, the zreaction values of
every subject have mean 0 and sd 1. Am I right?
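
For concreteness, here is a minimal sketch of how the per-subject scaling and model (a) could be set up. It assumes the scaling was done within each Subject using that subject's own mean and standard deviation. The names zsleep, zreac and fm0z.lm are taken from the output further down, while the use of ave() is only one possible way to do it and not necessarily what was actually run.

```r
library(lme4)  # provides the sleepstudy data

zsleep <- sleepstudy

# z-transform Reaction within each Subject: subtract the subject's mean
# and divide by the subject's standard deviation
zsleep$zreac <- ave(zsleep$Reaction, zsleep$Subject,
                    FUN = function(x) as.numeric(scale(x)))

# per-subject means and standard deviations after scaling
subj_mean <- tapply(zsleep$zreac, zsleep$Subject, mean)
subj_sd   <- tapply(zsleep$zreac, zsleep$Subject, sd)
round(subj_mean, 10)
subj_sd

# model (a): ordinary regression on the z-scores
fm0z.lm <- lm(zreac ~ Days, data = zsleep)
summary(fm0z.lm)
```

Since every subject is observed on the same Days, the overall mean of zreac is 0 and the fitted intercept is simply minus the slope times the mean of Days.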

Apparently, there are no differences between (a) and the following mixed
model:

(c) lmer(zreac ~ Days + (1 | Subject) ...)

(Of course, this last model finds a null between-subject variance.)

Am I wrong to expect some (hidden) differences between (a) and (c)?
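
One way to look for such differences is to fit (a) and (c) on the same scaled data and compare the pieces directly. The sketch below assumes the same per-subject scaling as described above (the ave() call is my assumption) and uses isSingular(), which is only available in reasonably recent versions of lme4.

```r
library(lme4)

zsleep <- sleepstudy
zsleep$zreac <- ave(zsleep$Reaction, zsleep$Subject,
                    FUN = function(x) as.numeric(scale(x)))

fm0z.lm   <- lm(zreac ~ Days, data = zsleep)                   # model (a)
fm0z.lmer <- lmer(zreac ~ Days + (1 | Subject), data = zsleep) # model (c)

# between-subject variance estimated by (c); a boundary (zero) estimate
# is flagged as a singular fit
VarCorr(fm0z.lmer)
isSingular(fm0z.lmer)

# fixed effects and residual standard deviation, side by side
cbind(lm = coef(fm0z.lm), lmer = fixef(fm0z.lmer))
c(lm = sigma(fm0z.lm), lmer = sigma(fm0z.lmer))
all.equal(unname(coef(fm0z.lm)), unname(fixef(fm0z.lmer)), tolerance = 1e-6)
```

The scaling sets every subject mean to exactly 0, so no between-subject variation in level is left for the random intercept to pick up. With the variance estimated at zero, (c) collapses to the same fit as (a).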

Would differences appear if I ran simulations on (a), (b) and (c) to test
the effect of Days?
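
Such a simulation might be sketched as follows. Everything here is an illustrative assumption rather than something taken from the sleepstudy fit: the data are generated from a random-intercept model with no true effect of Days, the parameter values (250, 40, 30) are arbitrary, and the test is a rough |t| > 1.96 rule, since lmer does not report p-values. The helper names sim_one and t_days are hypothetical.

```r
library(lme4)

set.seed(1)
n_subj <- 18
days   <- 0:9

# Simulate one data set; all parameter values are illustrative assumptions
sim_one <- function(beta_days = 0, sd_int = 40, sd_res = 30) {
  d  <- expand.grid(Days = days, Subject = factor(seq_len(n_subj)))
  b0 <- rnorm(n_subj, 0, sd_int)[as.integer(d$Subject)]  # random intercepts
  d$Reaction <- 250 + b0 + beta_days * d$Days + rnorm(nrow(d), 0, sd_res)
  # per-subject z-transform of the simulated response
  d$zreac <- ave(d$Reaction, d$Subject,
                 FUN = function(x) as.numeric(scale(x)))
  d
}

# t statistic of the Days coefficient under models (a), (b) and (c)
t_days <- function(d) {
  fa <- lm(zreac ~ Days, data = d)
  fb <- lmer(Reaction ~ Days + (1 | Subject), data = d)
  fc <- suppressMessages(lmer(zreac ~ Days + (1 | Subject), data = d))
  c(a = coef(summary(fa))["Days", "t value"],
    b = coef(summary(fb))["Days", "t value"],
    c = coef(summary(fc))["Days", "t value"])
}

n_sim <- 200
tvals <- t(replicate(n_sim, t_days(sim_one())))

# share of simulated data sets in which each model "detects" a Days effect
# (approximate normal cutoff at the 5% level)
reject <- colMeans(abs(tvals) > qnorm(0.975))
reject
```

With beta_days = 0 the proportions estimate the type I error rate of each approach. Setting beta_days to a non-zero value, or adding a random slope to sim_one, would show how power and error rates change.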

I'm looking for "good" arguments to convince my colleagues that, even for
such a simple model, a mixed model is a better approach than the
z-transform, and not merely an easier or more elegant way of doing the
same thing.

(I know that the "good model" is the mixed model with a random slope, and
that in that case the "z-model" and the mixed model can no longer be
compared.)

Thank you for your help.

######  output from classical lm on the z-scaled data, model (a)

 > summary( fm0z.lm)

Call:
lm(formula = zreac ~ Days, data = zsleep)

Residuals:
     Min       1Q   Median       3Q      Max
-1.98864 -0.36035  0.01233  0.35292  2.55175

Coefficients:
            Estimate Std. Error t value Pr(>|t|)   
(Intercept) -1.06168    0.09249  -11.48   <2e-16 ***
Days         0.23593    0.01733   13.62   <2e-16 ***

Residual standard error: 0.6676 on 178 degrees of freedom

######  output from lmer on the z-scaled data, model (c)

 > summary( fm0z.lmer)
Linear mixed model fit by REML
Formula: zreac ~ Days + (1 | Subject)
   Data: zsleep
   AIC   BIC logLik deviance REMLdev
 381.8 394.6 -186.9    363.4   373.8
Random effects:
 Groups   Name        Variance Std.Dev.
 Subject  (Intercept) 0.00000  0.00000
 Residual             0.44574  0.66764
Number of obs: 180, groups: Subject, 18

Fixed effects:
            Estimate Std. Error t value
(Intercept) -1.06168    0.09249  -11.48
Days         0.23593    0.01733   13.62

Correlation of Fixed Effects:
     (Intr)
Days -0.843