[R] Problem with "by": does not work with ttest (but with lme)
Chuck Cleland
ccleland at optonline.net
Tue Aug 14 15:43:35 CEST 2007
Daniel Stahl wrote:
> Hello,
>
> I would like to run a large number (e.g. 1000) of paired t-tests using the by function. But instead of using only the data within each of the 1000 groups, R calculates the t-test 1000 times on the full data set (the same happens with the Wilcoxon test). However, the by function works fine with the lme function.
> Did I just miss something, or is it really not working? If it is not, is there another way to avoid loops?
> Thanks
> Daniel
>
> Here is the R help example for "by"
> require(stats)
> attach(warpbreaks)
> by(warpbreaks, tension, function(x) lm(breaks ~ wool, data = x))
> *->works great
> by(warpbreaks, tension, function(x) t.test(breaks ~ wool, data = warpbreaks, paired = TRUE))
> *Same output for each level of tension:
>
> tension: L
>
> Paired t-test
>
> data: breaks by wool
> t = 1.9956, df = 26, p-value = 0.05656
> alternative hypothesis: true difference in means is not equal to 0
> 95 percent confidence interval:
> -0.1735803 11.7291358
> sample estimates:
> mean of the differences
> 5.777778
>
> ------------------------------------------------------------------------
>
> tension: M
>
> Paired t-test
>
> data: breaks by wool
> t = 1.9956, df = 26, p-value = 0.05656
> alternative hypothesis: true difference in means is not equal to 0
> 95 percent confidence interval:
> -0.1735803 11.7291358
> sample estimates:
> mean of the differences
> 5.777778
>
> ------------------------------------------------------------------------
>
> tension: H
>
> Paired t-test
>
> data: breaks by wool
> t = 1.9956, df = 26, p-value = 0.05656
> alternative hypothesis: true difference in means is not equal to 0
> 95 percent confidence interval:
> -0.1735803 11.7291358
> sample estimates:
> mean of the differences
> 5.777778
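The three results are identical because the anonymous function never uses its argument: it passes data = warpbreaks rather than data = x, so every call tests all 54 rows (hence df = 26 each time). The lm example works because it passes data = x. A minimal sketch of a corrected call is below. It assumes that wool A and wool B observations within a tension level can be paired by row order, which is only for illustration, and it uses the two-vector form of t.test because, as far as I know, recent R releases (4.4.0 onward) no longer accept paired = TRUE in the formula method. The name res is just a placeholder.

```r
# Use the per-group subset `x` inside the function, not the full data set.
# Pairing wool A with wool B by row order is an assumption for illustration.
res <- by(warpbreaks, warpbreaks$tension, function(x) {
  a <- x$breaks[x$wool == "A"]   # breaks for wool A in this tension level
  b <- x$breaks[x$wool == "B"]   # breaks for wool B in this tension level
  t.test(a, b, paired = TRUE)
})
res
```

Each tension level now has 9 pairs, so each test reports df = 8. The same pattern should carry over to wilcox.test(a, b, paired = TRUE).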
Try something like this:
library(MASS)
df <- mvrnorm(30, mu=c(-1,1), Sigma = diag(2))
df <- as.data.frame(df)
df$GROUP <- rep(1:3, each=10)
df.uni <- reshape(df, varying = list(c("V1","V2")), v.names="Y",
                  direction="long")
by(df.uni, df.uni$GROUP,
   function(x) t.test(Y ~ time, data = x, paired = TRUE))
df.uni$GROUP: 1
Paired t-test
data: Y by time
t = -4.3719, df = 9, p-value = 0.001792
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-3.249894 -1.033530
sample estimates:
mean of the differences
-2.141712
---------------------------------------------------------------------------------------------------------------------------
df.uni$GROUP: 2
Paired t-test
data: Y by time
t = -6.4125, df = 9, p-value = 0.0001234
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-3.277425 -1.568074
sample estimates:
mean of the differences
-2.422749
---------------------------------------------------------------------------------------------------------------------------
df.uni$GROUP: 3
Paired t-test
data: Y by time
t = -4.4918, df = 9, p-value = 0.001507
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-3.581428 -1.182313
sample estimates:
mean of the differences
-2.381871
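With 1000 groups the printed by output gets unwieldy, so it may be more practical to keep only a few numbers per test. The sketch below is one way to do that without an explicit loop, using split and sapply on simulated data in wide format. The names wide, one_test and results, the seed, and the group sizes are all made up for illustration, and the two-vector form of t.test is used so the example does not depend on the formula method accepting paired = TRUE.

```r
set.seed(1)                      # arbitrary seed, only for reproducibility
n_groups <- 1000                 # hypothetical number of groups
n_per    <- 10                   # hypothetical number of pairs per group

# Simulated wide-format data: one row per pair, V1 and V2 are the two measurements.
wide <- data.frame(GROUP = rep(seq_len(n_groups), each = n_per),
                   V1 = rnorm(n_groups * n_per, mean = -1),
                   V2 = rnorm(n_groups * n_per, mean = 1))

# Run one paired t-test on a group's rows and return the key numbers.
one_test <- function(x) {
  tt <- t.test(x$V1, x$V2, paired = TRUE)
  c(t         = unname(tt$statistic),
    df        = unname(tt$parameter),
    p.value   = tt$p.value,
    mean.diff = unname(tt$estimate))
}

# split() yields one data frame per group; sapply() collects the results
# as columns, so transpose to get one row per group.
results <- t(sapply(split(wide, wide$GROUP), one_test))
head(results)
```

The result is a numeric matrix with one row per group, which can be passed to as.data.frame or filtered on the p.value column.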
--
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894