[R] multiple t-tests across similar variable names
Rui Barradas
ruipbarradas at sapo.pt
Thu Oct 11 15:25:43 CEST 2012
Hello,
I have a problem: with your data example my results are different. I
have changed the names of two of the variables so that 'pre' and
'post' can come first in the names.
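# auxiliary functions

# put the "pre"/"post" token second, e.g. c("pre", "banana") -> c("banana", "pre")
ifswap <- function(x)
    if(x[1] %in% c("pre", "post")) x[2:1] else x

# index of the 'post' column matching 'pre' column i
# (uses the matrix 'vmat' from the calling environment)
getpair <- function(i, post)
    post[ which(vmat[post, 1] == vmat[i, 1]) ]

# one row of the results table from an 'htest' object
makeLine <- function(h)
    c(MeanDiff = unname(h$estimate),
      CIlower = h$conf.int[1],
      CIupper = h$conf.int[2],
      p.value = h$p.value)

# paired t-test for each row (pre, post) of 'Pairs'
doTests <- function(DF, Pairs){
    t.list <- lapply( seq_len(nrow(Pairs)), function(i)
        t.test(DF[, Pairs[i, 1]], DF[, Pairs[i, 2]], paired = TRUE) )
    do.call(rbind, lapply(t.list, makeLine))
}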
# dataset
set.seed(432)
dat2 <- data.frame(apple_pre = sample(10:20,5,replace=TRUE),
                   orange_post = sample(18:28,5,replace=TRUE),
                   pre_banana = sample(25:35,5,replace=TRUE),   # here
                   apple_post = sample(20:30,5,replace=TRUE),
                   post_banana = sample(40:50,5,replace=TRUE),  # and here
                   orange_pre = sample(5:10,5,replace=TRUE))
#--------------------------------
# start processing the data.frame
# Make pairs of pre/post columns
vars <- names(dat2)
vmat <- do.call(rbind, strsplit(vars, "_"))
vmat <- t(apply(vmat, 1, ifswap))
pre <- which(vmat[, 2] == "pre")
post <- which(vmat[, 2] == "post")
post <- sapply(pre, getpair, post)
pairs <- matrix(c(pre, post), ncol = 2)
# now the tests
result <- doTests(dat2, pairs)
rownames(result) <- vmat[pre, 1]
result
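
The pairing step can also be sketched without splitting the names into a matrix. This is only a minimal sketch, assuming every column name carries exactly one 'pre' or 'post' token, either first or last, separated by an underscore; the helper name pair_columns is hypothetical and is not part of the code above.

```r
# Hypothetical helper: pair pre/post column names by their shared stem.
pair_columns <- function(vars) {
  # drop a leading "pre_"/"post_" or a trailing "_pre"/"_post"
  key <- sub("^(pre|post)_|_(pre|post)$", "", vars)
  is_pre  <- grepl("(^|_)pre(_|$)", vars)
  is_post <- grepl("(^|_)post(_|$)", vars)
  pre  <- which(is_pre)
  # for each 'pre' column, find the 'post' column with the same stem
  post <- which(is_post)[match(key[is_pre], key[is_post])]
  data.frame(key = key[pre], pre = vars[pre], post = vars[post],
             stringsAsFactors = FALSE)
}

vars <- c("apple_pre", "orange_post", "pre_banana",
          "apple_post", "post_banana", "orange_pre")
pairs_df <- pair_columns(vars)
pairs_df
```

Each row of pairs_df names one pre/post pair, so the two name columns can be used to index the data.frame directly in the t.test() calls.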
In your results I believe that the values in the meandifference column
are the means of x[, 1]; at least that's what I get.
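
One way to check what a paired t.test() reports is to compare its estimate with the mean of the differences and with the mean of the first argument. The numbers below are made up for illustration and are not the dat2 values.

```r
# Made-up data, for illustration only
pre  <- c(10, 12, 14, 16, 18)
post <- c(21, 25, 24, 30, 29)

h   <- t.test(pre, post, paired = TRUE)
est <- unname(h$estimate)

# the paired estimate is the mean of (first - second)
is_mean_diff  <- isTRUE(all.equal(est, mean(pre - post)))   # TRUE
# it is not the mean of the first argument
is_mean_first <- isTRUE(all.equal(est, mean(pre)))          # FALSE

# swapping the arguments only flips the sign of the estimate
est_swapped <- unname(t.test(post, pre, paired = TRUE)$estimate)
c(est = est, est_swapped = est_swapped)
```

So a table built from t.test(pre, post, ...) and one built from t.test(post, pre, ...) should differ only in the sign of the mean difference and of the confidence limits.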
Anyway, I'll go through both versions of the code again to try to see what's going on.
Hope this helps,
Rui Barradas
On 11-10-2012 05:31, arun wrote:
> Hi,
>
> If you have a lot of variables in no particular order, then it would be better to order the data by column names.
> For example:
> set.seed(432)
> dat2<-data.frame(apple_pre=sample(10:20,5,replace=TRUE),
>                  orange_post=sample(18:28,5,replace=TRUE),
>                  banana_pre=sample(25:35,5,replace=TRUE),
>                  apple_post=sample(20:30,5,replace=TRUE),
>                  banana_post=sample(40:50,5,replace=TRUE),
>                  orange_pre=sample(5:10,5,replace=TRUE))
> dat3<-dat2[order(colnames(dat2))] #order the columns
> list3<-list(dat3[,1:2],dat3[,3:4],dat3[,5:6])
> res3<-do.call(rbind,lapply(lapply(list3,function(x) t.test(x[,1],x[,2],paired=TRUE)),
>         function(x) data.frame(meandifference=x$estimate,CIlow=unlist(x$conf.int)[1],
>                                CIhigh=unlist(x$conf.int)[2],p.value=x$p.value)))
> row.names(res3)<-unlist(unique(lapply(strsplit(colnames(dat3),"_"),`[`,1)))
> res3
> #        meandifference     CIlow   CIhigh      p.value
> #apple             12.6  8.519476 16.68052 0.0010166626
> #banana            15.0 12.088040 17.91196 0.0001388506
> #orange            18.2 13.604166 22.79583 0.0003888560
>
> A.K.
>
>
>
> ----- Original Message -----
> From: "Nundy, Shantanu" <snundy at chicagobooth.edu>
> To: "r-help at r-project.org" <r-help at r-project.org>
> Cc:
> Sent: Wednesday, October 10, 2012 7:09 PM
> Subject: Re: [R] multiple t-tests across similar variable names
>
> Hi everyone-
>
> I have a dataset with multiple "pre" and "post" variables I want to compare. The variables are named "apple_pre" or "pre_banana" with the corresponding post variables named "apple_post" or "post_banana". The variables are in no particular order.
>
>           apple_pre orange_pre orange_post pre_banana apple_post post_banana
> person_1
> person_2
> person_3
> ...
> person_x
>
>
> How do I:
> 1. Run a series of paired t-tests for the apple_pre variables and pre_banana variables? Would be great to do something like ttest(*.*pre*.*,*.*post*.*).
> 2. Print the results from these t-tests in a table with col 1=mean difference, col 2= 95% conf interval, col 3=p-value.
>
> Thank you kindly,
> -Shantanu
>
> Shantanu Nundy, M.D.
> University of Chicago
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.