[R] Problem with "by": does not work with ttest (but with lme)

Tue Aug 14 15:43:35 CEST 2007

Daniel Stahl wrote:
> Hello,
> 
> I would like to run a large number (e.g. 1000) of paired t-tests using the by function. But instead of using only the data within each of the 1000 groups, R calculates the t-test 1000 times on the full data set (the same happens with the Wilcoxon test). However, the by function works fine with the lme function.
> Did I just miss something or is it really not working? If not, is there any other possibility to avoid loops? 
> Thanks 
> Daniel
> 
> Here is the R help example for "by" 
>  require(stats)
>  attach(warpbreaks)
>  by(warpbreaks, tension, function(x) lm(breaks ~ wool, data = x))
> *->works great
> by(warpbreaks,tension,function(x)t.test(breaks ~ wool,data=warpbreaks,paired = TRUE))
> *Same output for each level of tension:
> 
> tension: L
> 
> 	Paired t-test
> 
> data:  breaks by wool
> t = 1.9956, df = 26, p-value = 0.05656
> alternative hypothesis: true difference in means is not equal to 0
> 95 percent confidence interval:
> -0.1735803 11.7291358
> sample estimates:
> mean of the differences
>                 5.777778
> 
> ------------------------------------------------------------------------
> 
> tension: M
> 
> 	Paired t-test
> 
> data:  breaks by wool
> t = 1.9956, df = 26, p-value = 0.05656
> alternative hypothesis: true difference in means is not equal to 0
> 95 percent confidence interval:
> -0.1735803 11.7291358
> sample estimates:
> mean of the differences
>                 5.777778
> 
> ------------------------------------------------------------------------
> 
> tension: H
> 
> 	Paired t-test
> 
> data:  breaks by wool
> t = 1.9956, df = 26, p-value = 0.05656
> alternative hypothesis: true difference in means is not equal to 0
> 95 percent confidence interval:
> -0.1735803 11.7291358
> sample estimates:
> mean of the differences
>                 5.777778

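  In your t.test call the anonymous function passes data = warpbreaks rather than data = x, so every level of tension is tested on the full data set. Try something like this: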

library(MASS)
df <- mvrnorm(30, mu=c(-1,1), Sigma = diag(2))
df <- as.data.frame(df)
df$GROUP <- rep(1:3, each=10)

df.uni <- reshape(df, varying = list(c("V1","V2")), v.names="Y",
direction="long")

by(df.uni, df.uni$GROUP, function(x)t.test(Y ~ time,
                                           data = x, paired = TRUE))

df.uni$GROUP: 1

        Paired t-test

data:  Y by time
t = -4.3719, df = 9, p-value = 0.001792
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -3.249894 -1.033530
sample estimates:
mean of the differences
              -2.141712

---------------------------------------------------------------------------------------------------------------------------

df.uni$GROUP: 2

        Paired t-test

data:  Y by time
t = -6.4125, df = 9, p-value = 0.0001234
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -3.277425 -1.568074
sample estimates:
mean of the differences
              -2.422749

---------------------------------------------------------------------------------------------------------------------------

df.uni$GROUP: 3

        Paired t-test

data:  Y by time
t = -4.4918, df = 9, p-value = 0.001507
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -3.581428 -1.182313
sample estimates:
mean of the differences
              -2.381871
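
If you need only a few numbers from each of many tests, a table is easier to work with than the printed by() output. Here is a minimal sketch, assuming wide data with one row per pair; the simulated data and the names V1, V2, GROUP, tests and res are only illustrative. It calls t.test with two vectors instead of a formula, since, as far as I know, recent R releases reject paired = TRUE in the formula method.

```r
# Simulated wide data: one row per pair, 3 groups of 10 pairs each.
# The column names V1, V2 and GROUP are illustrative assumptions.
set.seed(1)
df <- data.frame(V1 = rnorm(30, mean = -1),
                 V2 = rnorm(30, mean = 1),
                 GROUP = rep(1:3, each = 10))

# Split into one data frame per group, then run the paired test
# on that piece only (x, not the full data frame).
tests <- lapply(split(df, df$GROUP),
                function(x) t.test(x$V1, x$V2, paired = TRUE))

# Collect t, degrees of freedom and p-value: one row per group.
res <- t(sapply(tests, function(tt)
  c(t = unname(tt$statistic),
    df = unname(tt$parameter),
    p = tt$p.value)))
res
```

The same pattern should carry over to wilcox.test, and with 1000 groups res is simply a 1000-row matrix that can be sorted or filtered by p-value.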


-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894