# [R] Problem with "by": does not work with ttest (but with lme)

Chuck Cleland ccleland at optonline.net
Tue Aug 14 15:43:35 CEST 2007

```
Daniel Stahl wrote:
> Hello,
>
> I would like to run a large number (e.g. 1000) of paired t-tests using the by function. But instead of using only the data within each of the 1000 groups, R calculates the t-test 1000 times on the full data set (the same happens with the Wilcoxon test). However, the by function works fine with the lme function.
> Did I just miss something or is it really not working? If not, is there any other possibility to avoid loops?
> Thanks
> Daniel
>
> Here is the R help example for "by"
>  require(stats)
>  attach(warpbreaks)
>  by(warpbreaks, tension, function(x) lm(breaks ~ wool, data = x))
> *->works great
> by(warpbreaks,tension,function(x)t.test(breaks ~ wool,data=warpbreaks,paired = TRUE))
> *Same output for each level of tension:
>
> tension: L
>
> 	Paired t-test
>
> data:  breaks by wool
> t = 1.9956, df = 26, p-value = 0.05656
> alternative hypothesis: true difference in means is not equal to 0
> 95 percent confidence interval:
> -0.1735803 11.7291358
> sample estimates:
> mean of the differences
>                 5.777778
>
> ------------------------------------------------------------------------
>
> tension: M
>
> 	Paired t-test
>
> data:  breaks by wool
> t = 1.9956, df = 26, p-value = 0.05656
> alternative hypothesis: true difference in means is not equal to 0
> 95 percent confidence interval:
> -0.1735803 11.7291358
> sample estimates:
> mean of the differences
>                 5.777778
>
> ------------------------------------------------------------------------
>
> tension: H
>
> 	Paired t-test
>
> data:  breaks by wool
> t = 1.9956, df = 26, p-value = 0.05656
> alternative hypothesis: true difference in means is not equal to 0
> 95 percent confidence interval:
> -0.1735803 11.7291358
> sample estimates:
> mean of the differences
>                 5.777778

Try something like this:

library(MASS)
df <- mvrnorm(30, mu=c(-1,1), Sigma = diag(2))
df <- as.data.frame(df)
df$GROUP <- rep(1:3, each=10)

df.uni <- reshape(df, varying = list(c("V1","V2")), v.names="Y",
direction="long")

by(df.uni, df.uni$GROUP, function(x) t.test(Y ~ time,
data = x, paired = TRUE))

df.uni$GROUP: 1

Paired t-test

data:  Y by time
t = -4.3719, df = 9, p-value = 0.001792
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-3.249894 -1.033530
sample estimates:
mean of the differences
-2.141712

---------------------------------------------------------------------------------------------------------------------------

df.uni$GROUP: 2

Paired t-test

data:  Y by time
t = -6.4125, df = 9, p-value = 0.0001234
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-3.277425 -1.568074
sample estimates:
mean of the differences
-2.422749

---------------------------------------------------------------------------------------------------------------------------

df.uni$GROUP: 3

Paired t-test

data:  Y by time
t = -4.4918, df = 9, p-value = 0.001507
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-3.581428 -1.182313
sample estimates:
mean of the differences
-2.381871

--
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894

```
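A note on why the original call returned the same result for every level of tension: the function passed to by() receives each subset as x, but the t.test() call inside it used data=warpbreaks, so every group was tested on the full data set. The working lm() example and the reply above both use data = x. Below is a minimal sketch of the per-group version on warpbreaks, not a definitive implementation. It assumes split() and lapply() in place of by() (the grouping is the same) so that the p-values can be collected into a vector, and it uses the two-vector form of t.test() because some recent R versions may reject paired = TRUE in the formula interface. The names groups, tests and pvals are made up for illustration, and pairing rows by their order within each wool type is artificial for this data set, as in the original example.

```r
# Split the data by tension; each element is the subset for one group.
groups <- split(warpbreaks, warpbreaks$tension)

# Run a paired t-test within each subset, using x (the subset),
# not the full warpbreaks data frame.
tests <- lapply(groups, function(x) {
  a <- x$breaks[x$wool == "A"]  # 9 observations per tension level
  b <- x$breaks[x$wool == "B"]  # paired with a by row order (illustrative only)
  t.test(a, b, paired = TRUE)
})

# Collect one p-value per group into a named numeric vector.
pvals <- vapply(tests, function(tt) tt$p.value, numeric(1))
pvals
```

With 1000 groups the same pattern applies unchanged, and the vector of p-values is easier to work with than 1000 printed test summaries.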