[R-sig-ME] repeated measures OR block with time covariate?

Thu May 21 17:41:58 CEST 2009

Dear Douglas,  thank you very much for the reply, it is the kind of advice I was looking for.  Regards, Paul

Paul Prew  |  Statistician
651-795-5942   |   fax 651-204-7504 
Ecolab Research Center  | Mail Stop ESC-F4412-A 
655 Lone Oak Drive  |  Eagan, MN 55121-1560 

-----Original Message-----
From: dmbates at gmail.com [mailto:dmbates at gmail.com] On Behalf Of Douglas Bates
Sent: Wednesday, May 20, 2009 10:23 AM
To: Prew, Paul
Cc: r-sig-mixed-models at r-project.org
Subject: Re: [R-sig-ME] repeated measures OR block with time covariate?

On Tue, May 19, 2009 at 8:51 PM, Prew, Paul <Paul.Prew at ecolab.com> wrote:
> Hello,

> I've received helpful advice in the past from the readers of this list, and would like some more.  I think I understand that a blocking / nuisance factor in ANOVA will not affect the significance tests for the fixed factors / 'factors of interest', regardless of whether the blocking factor is designated as random or fixed: the same sums of squares are taken off the top in ANOVA.  Other quantities may be affected, such as prediction intervals for specified levels of the fixed factor.  Fixed block effects are welcome, because virtually all the literature in experimental design treats Randomized Complete Block Designs as having fixed effects for blocks.

> In a discussion with a colleague today, the following question was posed about repeated measures ~

> A cleaning product is being tested at a sample of hospitals.  Micro-organism counts are taken over a period of weeks, for a control formulation and an experimental formulation.  The hospitals are not of interest, and can be considered blocking factors.  So my interpretation is that there's no harm in designating the hospitals as a fixed effect.

> Within the hospital, we can see week-over-week reductions in micro-organisms.    There's a slope related to time.

> *****   Could a fixed effect for the blocking factor Hospital and a fixed covariate Time take the place of what seems to be a good candidate for a repeated measures analysis (assuming that the repeated measures implies Hospital = random effect)?  *****

Off the top of my head I would say that using fixed effects for the
blocking factor is a conservative approach and probably the best
approach in terms of simplicity.  I'm assuming that the design is
balanced in that each hospital is observed for the same number of
weeks, in which case the hospital and time effects would be
orthogonal.
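
The orthogonality point can be checked with a minimal sketch on simulated data. Everything here is hypothetical (the column names `hospital`, `week`, and `count`, the six hospitals, the eight weeks, and the simulated effect sizes); none of it comes from the study being discussed. It assumes a balanced layout in which every hospital is observed in the same weeks.

```r
# Hypothetical balanced layout: 6 hospitals, each observed in weeks 1..8.
set.seed(1)
d <- expand.grid(week = 1:8, hospital = factor(paste0("H", 1:6)))

# Made-up hospital offsets and a made-up downward trend over time.
hosp_shift <- rnorm(nlevels(d$hospital), sd = 10)
d$count <- 100 + hosp_shift[as.integer(d$hospital)] - 3 * d$week +
  rnorm(nrow(d), sd = 4)

fit_block   <- lm(count ~ hospital + week, data = d)  # hospital as a fixed block
fit_noblock <- lm(count ~ week, data = d)             # block ignored

# With a balanced design the estimated week slope is the same either way ...
slope_block   <- coef(fit_block)[["week"]]
slope_noblock <- coef(fit_noblock)[["week"]]

# ... but the block takes hospital-to-hospital variation out of the residual SS.
rss_block   <- deviance(fit_block)
rss_noblock <- deviance(fit_noblock)
```

If the design were unbalanced (hospitals observed for different weeks), the two slope estimates would generally differ.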

With regard to the sums of squares, a model with random effects will
remove a smaller part of the sum of squares than will a model with
fixed effects because the random effects are shrunk relative to the
fixed effects.  Thus the residual sum of squares in a random effects
model will be at least as large as that in a model with fixed effects
for the hospitals.  Consider the enclosed model fits for the
sleepstudy data.

library(lme4)
(fm1 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy))
summary(fm2 <- lm(Reaction ~ Subject * Days, sleepstudy))
deviance(fm2)  # residual sum of squares for fixed-effects model
fm1@deviance["wrss"]  # residual sum of squares for mixed model

When run, it produces

> deviance(fm2)  # residual sum of squares for fixed-effects model
[1] 94311.5
> fm1@deviance["wrss"]  # residual sum of squares for mixed model
    wrss
98880.24
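
A note for readers on a newer lme4: releases from 1.0 onward appear to no longer expose a `deviance` slot on the fitted object, so the slot access above may fail there. A minimal sketch of the same comparison that relies only on `resid()` and `deviance()` follows. The names `rss_fixed` and `rss_mixed` are illustrative, not from the original post, and the exact values may differ slightly between lme4 versions.

```r
library(lme4)

fm1 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)
fm2 <- lm(Reaction ~ Subject * Days, sleepstudy)

# Residual SS when every subject gets an unconstrained line of its own.
rss_fixed <- deviance(fm2)

# Residual SS around the per-subject lines from the mixed model, which
# are shrunk toward the population line.
rss_mixed <- sum(resid(fm1)^2)

c(fixed = rss_fixed, mixed = rss_mixed)
```

Because the least-squares fit minimizes the residual sum of squares over all per-subject lines, `rss_mixed` cannot be smaller than `rss_fixed`.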

There are differences in interpretation, of course.  It is a little
more difficult to decide how you would test for a significant
"typical" slope in the fixed-effects model than in the mixed model.
The parameter labeled "Days" in the fixed-effects model is the
estimate of the slope for the first Subject, not a typical subject.
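
One way to recover a "typical" slope from the fixed-effects fit is to average the per-subject slopes. This is a sketch only: the names `subject_slopes` and `typical_slope` are illustrative, and it assumes R's default treatment contrasts, under which each `Subject...:Days` coefficient is an offset from the first subject's slope. It gives a point estimate, not a test.

```r
library(lme4)  # loaded only for the sleepstudy data

fm2 <- lm(Reaction ~ Subject * Days, sleepstudy)
cf  <- coef(fm2)

# "Days" is the slope for the first subject (the reference level).
first_slope <- cf[["Days"]]

# Interaction terms such as "Subject309:Days" are offsets from that slope.
offsets <- cf[grep(":Days$", names(cf))]

# One slope per subject, then their average as the typical slope.
subject_slopes <- c(first_slope, first_slope + offsets)
typical_slope  <- mean(subject_slopes)
typical_slope
```

Because every subject is observed on the same days, this average coincides with the slope from the pooled fit `lm(Reaction ~ Days, sleepstudy)`.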

The big advantage of using lm instead of lmer in a situation like this
is that lm gives you p-values and lmer doesn't. :-)

> Any thoughts are greatly appreciated.  There's actually a Standard Operating Procedure being written for our scientists and engineers specifying that they use the repeated measures analysis.  I think it's over their heads, given that few could credibly define degrees of freedom, p-value, or other basic statistical concepts.  An approach that sticks to the "Randomized Complete Block Designs / fixed effects for blocks" is better suited, if it performs as well as the repeated measures design.
>
> Regards, Paul
>
>  Paul Prew  |  Statistician
>  651-795-5942   |   fax 651-204-7504
>  Ecolab Research Center  | Mail Stop ESC-F4412-A
>  655 Lone Oak Drive  |  Eagan, MN 55121-1560
>
>
>
> CONFIDENTIALITY NOTICE:
> This e-mail communication and any attachments may contain proprietary and privileged information for the use of the designated recipients named above.
> Any unauthorized review, use, disclosure or distribution is prohibited.
> If you are not the intended recipient, please contact the sender by reply e-mail and destroy all copies of the original message.
>
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>
>