[R] t.tests on a data.frame using an apply-type function

Alison Macalady ali at kmhome.org
Sat Aug 21 16:15:37 CEST 2010


I have a data.frame with ~250 observations (rows) in each of ~50
categories (columns). I would like to perform t.tests on subsets of
observations within each column, where the subsets are defined by
grouping factors held in other columns of the data.frame.

My data.frame looks something like this:

x<-data.frame(matrix(rnorm(200,mean=5,sd=.5),nrow=20))
colnames(x)<-c("site", "status", "X1", "X2", "X3", "X4", "X5", "X6",  
"X7", "X8")
x$site<-as.factor(rep(c("A", "A", "B", "B", "C"), 4))
x$status<-as.factor(rep(c("D", "L"), 10))

I want to do t.tests on the numeric observations within the data.frame  
by "site" and by "status":

t.test(x[x$site == "A" & x$status == "D", ]$X1,
       x[x$site == "A" & x$status == "L", ]$X1)
t.test(x[x$site == "B" & x$status == "D", ]$X1,
       x[x$site == "B" & x$status == "L", ]$X1)
t.test(x[x$site == "C" & x$status == "D", ]$X1,
       x[x$site == "C" & x$status == "L", ]$X1)

t.test(x[x$site == "A" & x$status == "D", ]$X2,
       x[x$site == "A" & x$status == "L", ]$X2)
t.test(x[x$site == "B" & x$status == "D", ]$X2,
       x[x$site == "B" & x$status == "L", ]$X2)
t.test(x[x$site == "C" & x$status == "D", ]$X2,
       x[x$site == "C" & x$status == "L", ]$X2)

etc...

I know I must be able to do this more efficiently using a loop and one
of the apply functions, e.g. something like this:

k <- length(levels(x$site))
for (i in 1:k)
{
  site <- levels(x$site)[i]
  x1 <- x[x$site == site, ]
  results[i] <- apply(x1, 2, function(x1) {
    t.test(x1[x1$status == "D", ], x1[x1$status == "L", ])
  })
  results
}

But I can't figure out how to do the apply function correctly...

I also wonder whether there's a way to use an apply-type function and
avoid the loop altogether.
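
For reference, here is a rough sketch of one loop-free approach. It is not
an answer from the list. It assumes the columns are named as in the example
above and that every site has at least two "D" and two "L" rows, since
t.test() needs that. The names num_cols, results and pvals, and the
set.seed() call, are made up for illustration. The idea is to split() the
data.frame by site and then lapply() over the numeric columns within each
piece:

```r
# Rebuild the example data (set.seed is added here only for reproducibility)
set.seed(1)
x <- data.frame(matrix(rnorm(200, mean = 5, sd = 0.5), nrow = 20))
colnames(x) <- c("site", "status", "X1", "X2", "X3", "X4", "X5", "X6",
                 "X7", "X8")
x$site <- as.factor(rep(c("A", "A", "B", "B", "C"), 4))
x$status <- as.factor(rep(c("D", "L"), 10))

# Every column except the two grouping factors is treated as numeric data
num_cols <- setdiff(names(x), c("site", "status"))

# Outer lapply: one data.frame per site. Inner lapply: one t.test per column.
results <- lapply(split(x, x$site), function(d) {
  lapply(d[num_cols], function(v) {
    t.test(v[d$status == "D"], v[d$status == "L"])
  })
})

# Collect the p-values: rows are X1..X8, columns are the sites
pvals <- sapply(results, function(site_res) {
  sapply(site_res, function(tt) tt$p.value)
})
```

With this layout, results$A$X1 should hold the same test object as the
first hand-written t.test() call above, and pvals gives a quick overview
of all site/column combinations. With ~50 columns and several sites this
is a lot of tests, so some multiple-comparison adjustment (e.g. p.adjust)
may be worth considering.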

Thanks in advance!

Ali


