[R-sig-ME] lme/anova vs paired t test query

Mon May 23 22:10:16 CEST 2011

  Great, very clear!  Thanks.

  Ben

On 05/23/2011 03:40 PM, Gang Chen wrote:
> I believe I know the answer for this one because I used to struggle
> with this for quite a while. Eventually I figured it out.
> 
> To match up with the paired t-test, you need
> 
> (a2 <- anova(lme(prevalence~sex,random=list(tripsite=pdCompSymm(~sex-1)),data=dat,method="REML")))
> (fstat2 <- a2[["F-value"]][2]) # you get 0.789627
> 
> The catch here is that the compound-symmetry structure for the
> variance-covariance structure of the random components in the lme
> model is equivalent to a paired t-test, a special of the traditional
> repeated-meaures one-way ANOVA.
> 
> Gang
> 
> 
> On Mon, May 23, 2011 at 2:55 PM, Ben Bolker <bbolker at gmail.com> wrote:
>>  It was my understanding that testing a two-level fixed effect in a
>> one-way randomized block ANOVA using REML should be identical to a
>> paired t test ... but, at least with the attempt below, using
>> lme(...,method="REML") and t.test(...,paired=TRUE,var.equal=TRUE) seems
>> to give slightly different results.  They're not different in any
>> important way, but the difference bothers me.  Can anyone see what I'm
>> missing?
>>
>>  Ben Bolker
>>
>> dat <- structure(list(sex = structure(c(1L, 2L, 1L, 2L, 1L, 2L, 1L,
>> 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L,
>> 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L), .Label = c("f",
>> "m"), class = "factor"), prevalence = c(0, 0.375, 0.133333333333333,
>> 0.176470588235294, 0.1875, 0, 0, 1, 1, 0.5, 0.6, 0.333333333333333,
>> 0.5, 0, 0.333333333333333, 0, 0.5, 0, 0.625, 0.333333333333333,
>> 0.5, 0, 0.333333333333333, 0.153846153846154, 0.222222222222222,
>> 0.5, 1, 0.5, 0, 0.277777777777778, 0.125, 0, 0, 0.428571428571429,
>> 0.451612903225806, 0.362068965517241), tripsite = structure(c(1L,
>> 1L, 4L, 4L, 14L, 14L, 5L, 5L, 8L, 8L, 15L, 15L, 6L, 6L, 9L, 9L,
>> 11L, 11L, 16L, 16L, 2L, 2L, 7L, 7L, 10L, 10L, 13L, 13L, 17L,
>> 17L, 3L, 3L, 12L, 12L, 18L, 18L), .Label = c("1.2", "4.2", "5.2",
>> "1.3", "2.3", "3.3", "4.3", "2.4", "3.4", "4.4", "3.5", "5.5",
>> "4.6", "1.9", "2.9", "3.9", "4.9", "5.9"), class = "factor")), .Names =
>> c("sex",
>> "prevalence", "tripsite"), row.names = c(1L, 2L, 3L, 4L, 9L,
>> 10L, 11L, 12L, 13L, 14L, 17L, 18L, 19L, 20L, 21L, 22L, 23L, 24L,
>> 27L, 28L, 29L, 30L, 31L, 32L, 33L, 34L, 35L, 36L, 38L, 39L, 40L,
>> 41L, 42L, 43L, 45L, 46L), class = "data.frame")
>>
>>
>> t0 <-
>> with(dat,t.test(prevalence[sex=="f"],prevalence[sex=="m"],paired=TRUE,var.equal=TRUE))
>>
>> library(nlme)
>>
>> (a1 <- anova(lme(prevalence~sex,random=~1|tripsite,data=dat,method="REML")))
>>
>> (fstat0 <- t0$statistic^2)     ## 0.789627
>> (fstat1 <- a1[["F-value"]][2]) ## 0.8056624
>>
>> tmpf <- function(dat) {
>>  t0 <-
>> with(dat,t.test(prevalence[sex=="f"],prevalence[sex=="m"],paired=TRUE,var.equal=TRUE))
>>  (a1 <-
>> anova(lme(prevalence~sex,random=~1|tripsite,data=dat,method="REML")))
>>  (fstat0 <- t0$statistic^2)     ## 0.789627
>>  (fstat1 <- a1[["F-value"]][2]) ## 0.8056624
>>  fstat0-fstat1
>> }
>>
>>  If I then redo this with random data, the results are within 1e-5
>> set.seed(1001)
>> rr <- replicate(500,tmpf(transform(dat,prevalence=rnorm(nrow(dat)))))
>> ##
>> ##  summary(rr)
>> ##      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
>> ## -1.7770000 -0.0568400 -0.0000015 -0.0722900  0.0000000  0.0000139
>>
>> mean(abs(rr)<1e-5)
>> ## 0.502
>>
>>  hist(log10(abs(rr)),breaks=50,col="gray")
>>
>> ## this is sort of interesting, suggesting a bimodal distribution with
>> one component (the 'good ones' centered at 10^{-8} and another centered
>> at 0.1 ...
>>
>> _______________________________________________
>> R-sig-mixed-models at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>