[R-sig-ME] repeated measures: lme(r) vs manova

Sun Apr 1 15:53:55 CEST 2012

On Mar 19, 2012, at 22:31 , Nathan Lemoine wrote:

> Hi all,
> Sorry in advance for the length of this post, but I've searched around and couldn't find anything that addressed this issue:
> 
> I recently ran into the issue of deciding on the appropriate way to analyze a repeated measures design. We enriched quadrats to measure productivity and we monitored them for three years. Four quadrats were nested within plots. Here are the data:
> 
> "plot" "quad" "nut" "t1" "t3" "t4"
> "1" 1 "A" "nut" 17.69130435 70.4 57.8
> "2" 1 "A" "no nut" 65.4173913 125.8 109.9
> "3" 1 "B" "nut" 19.56521739 103.2 100.8
> "4" 1 "B" "no nut" 89.03636364 131.3 99.1
> "5" 1 "C" "nut" 29.88723404 25.7 29.9
> "6" 1 "C" "no nut" 45.45454545 113.1 110.6
> "7" 1 "D" "nut" 18.28181818 60.9 67.7
> "8" 1 "D" "no nut" 68.88888889 136 95
> "9" 2 "A" "nut" 35.41666667 61.6 16
> "10" 2 "A" "no nut" 40.90909091 59.4 64.7
> "11" 2 "B" "nut" 34.14255319 26.7 23.1
> "12" 2 "B" "no nut" 36.27021277 71.6 47.2
> "13" 2 "C" "nut" 13.33333333 20.9 26.4
> "14" 2 "C" "no nut" 7.118181818 31.2 19.1
> "15" 2 "D" "nut" 20 30.9 27.8
> "16" 2 "D" "no nut" 19.34893617 31.3 16.7
> "17" 3 "A" "nut" 22.22222222 130.7 163.6
> "18" 3 "A" "no nut" 32.90869565 83.8 86.2
> "19" 3 "B" "nut" 38.29787234 99 110.1
> "20" 3 "B" "no nut" 38.83636364 127.1 115.2
> "21" 3 "C" "nut" 38.88888889 81.7 193.7
> "22" 3 "C" "no nut" 28.98888889 72.1 103.8
> "23" 3 "D" "nut" 50 111.3 117.7
> "24" 3 "D" "no nut" 26.86666667 94.2 113
> "25" 4 "A" "nut" 63.63636364 128.4 114.8
> "26" 4 "A" "no nut" 108.8956522 121 80.7
> "27" 4 "B" "nut" 104.4444444 146.5 102.2
> "28" 4 "B" "no nut" 84.74444444 111.5 109.9
> "29" 4 "C" "nut" 71.31111111 86.2 118.4
> "30" 4 "C" "no nut" 115.9555556 131.4 141.9
> "31" 4 "D" "nut" 75.65555556 141.5 92.5
> "32" 4 "D" "no nut" 108.9888889 146.6 122.2
> "33" 5 "A" "nut" 20.2 57.4 14.6
> "34" 5 "A" "no nut" 12.34489796 55.4 13.4
> "35" 5 "B" "nut" 48.98888889 56.3 28.7
> "36" 5 "B" "no nut" 35.65555556 55.8 17.6
> "37" 5 "C" "nut" 22.22222222 45.9 7.3
> "38" 5 "C" "no nut" 9.088888889 55.6 20.5
> "39" 5 "D" "nut" 64.44444444 86.1 61.7
> "40" 5 "D" "no nut" 15.65555556 75.7 41.8
> "41" 6 "A" "nut" 22.22222222 101.1 69.8
> "42" 6 "A" "no nut" 53.33333333 171.2 113.5
> "43" 6 "B" "nut" 37.87777778 111.1 66.8
> "44" 6 "B" "no nut" 46.96666667 120.8 83.8
> "45" 6 "C" "nut" 17.87777778 120.7 84
> "46" 6 "C" "no nut" 21.21212121 116.3 76.8
> "47" 6 "D" "nut" 24.01304348 86.1 64.6
> "48" 6 "D" "no nut" 29.51034483 112.5 51.9
> 
> The basic question is: When is it appropriate to use a MANOVA-based repeated measures design over a mixed effects model? 
> 
> For example, the MANOVA approach:
> library(car)
> repeated.manova <- lm(cbind(t1,t3,t4)~nut+plot+quad, data=manova.data)
> Manova(repeated.manova)
> 
> nut is not significant and there are 40 denominator df. 
> 
> If I set up the data and run lme:
> 
> mixed.dat <- melt(manova.data, id=c("plot","quad","nut"))
> colnames(mixed.dat)[4:5] <- c("time","prod")
> mixed.dat$time <- as.numeric(mixed.dat$time))
> 
> library(nlme)
> lme.repeated <- lme(prod~nut, random=~nut|time, data=mixed.dat)
> anova(lme.repeated)
> 
> Gives 140 denominator df. I'm also not sure this is the appropriate set up for a repeated measures design. Running the following code seems more in line with what I've read to take into account the correlation in observations within the same plot:
> 
> lme.repeated2 <- lme(prod~nut*time, random=~time|plot, data=mixed.dat)
> anova(lme.repeated2)
> 
> This model seems much more appropriate, as observations within plots are now allowed to be correlated, but there is still a huge difference between the MANOVA-based approach and the mixed-effects-based approach, as the mixed-effects model gives me a significant result. The MANOVA assumes that I have three (correlated) observations on 48 independent units, whereas the lme approach assumes that I have 144 observations on correlated units. Also not sure if that interpretation is correct.
> 
> Alternatively, I used lmer() for non-nested, multilevel models allowing observations to be correlated in space and time:
> 
> repeated.mixed3 <- lmer(prod~nut + (1|plot) + (1|time), data=mixed.dat)
> repeated.mixed4 <- lmer(prod~ (1|plot) + (1|time), data=mixed.dat)
> anova(repeated.mixed3, repeated.mixed4)
> 
> This approach also gives me a significant result. Which of these is the most appropriate? The differences between lme and lmer are trivial (in this case), but the difference between the MANOVA approach and mixed-effects is substantial. I figure the MANOVA approach is probably in correct on account of the nested design, but my question extends to situations when the design is not nested. 
> 
> Thanks in advance for your help,
> 

You need to be careful to compare similar models. Your MANOVA model is effectively treating plot:quad:nut (i.e. anything not involving time) as the observational unit. In particular, this ignores the pairing of the two nut levels at the same level of plot:quad. Also notice that a MANOVA model tests the effect of a trivariate outcome by comparing a model with three parameters for the effect (one for each variate) with a model with no effect at all, i.e. the test has three d.f. So your MANOVA test is effectively that of time:nut in the following model:

> mixed.dat$ia <- with(mixed.dat, interaction(plot,quad,nut))
> lme.repeated2 <- lme(prod~time/nut, random=~time|ia, data=mixed.dat)
> anova(lme.repeated2)
            numDF denDF   F-value p-value
(Intercept)     1    91 155.47120  <.0001
time            2    91  60.29474  <.0001
time:nut        3    91   1.08992  0.3575

which is fairly similar to 

> anova(repeated.manova)
Analysis of Variance Table

            Df  Pillai approx F num Df den Df Pr(>F)    
(Intercept)  1 0.85717   88.019      3     44 <2e-16 ***
nut          1 0.06636    1.043      3     44 0.3832    
Residuals   46                                          

At this point, a rather long thread could be spun about the differences between the two analyses, but I think you'll agree that they are broadly in agreement. Both are pretty clearly wrong though. At the very least, you'd want to include the design properly, especially the paired nature of the comparisons of levels of nut.

-åd

> Nathan
> 
> 
> 
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com