[R] multiple csv files for T-test
arun
smartpink111 at yahoo.com
Fri Jun 28 14:57:01 CEST 2013
Hi,
According to the ?t.test documentation:
If ‘paired’ is ‘TRUE’ then both ‘x’ and ‘y’ must be specified and
they must be the same length. Missing values are silently removed
(in pairs if ‘paired’ is ‘TRUE’)
#Example with missing values
set.seed(24)
dat1<- as.data.frame(matrix(sample(c(NA,20:40),40,replace=TRUE),ncol=4))
set.seed(285)
dat2<- as.data.frame(matrix(sample(c(NA,35:60),40,replace=TRUE),ncol=4))
sapply(colnames(dat1),function(i) t.test(dat1[,i],dat2[,i],paired=TRUE)$p.value)
# V1 V2 V3 V4
#7.004488e-05 1.374986e-03 6.666004e-04 3.749257e-04
#Removing the incomplete pairs first and then doing the test gives the same result
sapply(colnames(dat1),function(i) {x1<-na.omit(cbind(dat1[,i],dat2[,i]));t.test(x1[,1],x1[,2],paired=TRUE)$p.value})
# V1 V2 V3 V4
#7.004488e-05 1.374986e-03 6.666004e-04 3.749257e-04
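
The two results agree because t.test() with paired=TRUE already drops any pair in which either value is missing. A minimal sketch of how to check this, assuming two made-up vectors (x and y are hypothetical, not from your data): count the complete pairs with complete.cases() and compare that with the degrees of freedom the test reports.

```r
# Hypothetical paired measurements with NAs in different positions
x <- c(21, NA, 30, 28, 35, 27, NA, 33)
y <- c(40, 38, NA, 45, 50, 41, 39, 47)

# TRUE only where both members of the pair are present
ok <- complete.cases(x, y)
n_pairs <- sum(ok)

# paired=TRUE removes incomplete pairs on its own
res_auto <- t.test(x, y, paired = TRUE)
# Same test after dropping the incomplete pairs by hand
res_manual <- t.test(x[ok], y[ok], paired = TRUE)

n_pairs                  # number of pairs actually used
res_auto$parameter       # degrees of freedom, i.e. n_pairs - 1
all.equal(res_auto$p.value, res_manual$p.value)
```

If the degrees of freedom are much smaller than the number of rows in the file, many pairs are being discarded, and it is worth looking at which columns carry the missing values before trusting the p-values.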
A.K.
Thanks very much, your help is much appreciated.
Just another small question: what's the best way to deal with missing data if I want to do a paired t-test?
----- Original Message -----
From: arun <smartpink111 at yahoo.com>
To: R help <r-help at r-project.org>
Cc:
Sent: Thursday, June 27, 2013 1:47 PM
Subject: Re: multiple csv files for T-test
Hi,
I used as.data.frame(matrix(...)) just to create an example dataset. In your case, you don't need to do that. Using the same example:
set.seed(24)
dat1<- as.data.frame(matrix(sample(20:40,40,replace=TRUE),ncol=4))
set.seed(285)
dat2<- as.data.frame(matrix(sample(35:60,40,replace=TRUE),ncol=4))
write.csv(dat1,"file1.csv",row.names=FALSE)
write.csv(dat2,"file2.csv",row.names=FALSE)
data1<- read.csv("file1.csv")
data2<- read.csv("file2.csv")
###Your code:
dat1New<- as.data.frame(matrix(data1))
dat2New<- as.data.frame(matrix(data2))
###It is always useful to check the structure with str()
str(dat1New)
#'data.frame': 4 obs. of 1 variable:
# $ V1:List of 4
#  ..$ : int 26 24 34 30 33 39 25 36 36 25
#  ..$ : int 32 27 34 34 26 38 24 20 30 22
#  ..$ : int 21 31 35 22 24 34 21 32 33 20
#  ..$ : int 26 25 27 23 39 24 35 33 34 40
dat1New
# V1
#1 26, 24, 34, 30, 33, 39, 25, 36, 36, 25
#2 32, 27, 34, 34, 26, 38, 24, 20, 30, 22
#3 21, 31, 35, 22, 24, 34, 21, 32, 33, 20
#4 26, 25, 27, 23, 39, 24, 35, 33, 34, 40
dat2New
# V1
#1 53, 40, 47, 57, 57, 53, 35, 42, 53, 41
#2 54, 37, 43, 40, 57, 42, 37, 53, 60, 39
#3 54, 60, 46, 50, 35, 41, 58, 45, 36, 53
#4 52, 56, 44, 40, 38, 53, 47, 46, 60, 50
sapply(colnames(dat1New),function(i) t.test(dat1New[,i],dat2New[,i],paired=TRUE)$p.value)
#Error in x - y : non-numeric argument to binary operator
##Just using data1 and data2
sapply(colnames(data1),function(i) t.test(data1[,i],data2[,i],paired=TRUE)$p.value)
# V1 V2 V3 V4
#3.202629e-05 6.510644e-04 6.215225e-04 3.044760e-04
#or using dat1New and dat2New
sapply(seq_along(dat1New$V1),function(i) t.test(dat1New$V1[[i]],dat2New$V1[[i]],paired=TRUE)$p.value)
#[1] 3.202629e-05 6.510644e-04 6.215225e-04 3.044760e-04
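
The reason for the error is that a data frame is a list of columns, so matrix() applied to it returns a one-column list matrix in which each cell holds a whole column rather than a single number. A small sketch of how to confirm this, using a made-up data frame (df is hypothetical):

```r
# Hypothetical data frame standing in for the result of read.csv()
df <- data.frame(V1 = c(26, 24, 34), V2 = c(32, 27, 34))

# read.csv() output is already a data frame with numeric columns
col_is_numeric <- sapply(df, is.numeric)
col_is_numeric

# Wrapping it in matrix() produces a list matrix, one cell per column
m <- matrix(df)
is.list(m)     # the cells are list elements, not numbers
dim(m)         # one row per original column, a single column
m[[1]]         # the first cell holds the whole V1 vector
```

So the values are numeric, but after the matrix() step they sit inside list cells, which is why x - y fails inside t.test().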
A.K.
Thanks for the reply. I am getting the following error:
Error in x - y : non-numeric argument to binary operator
This is what I entered:
> data1 <-read.csv("file1.csv")
> data2 <-read.csv("file2.csv")
> dat1<- as.data.frame(matrix(data1))
> dat2<- as.data.frame(matrix(data2))
> sapply(colnames(dat1),function(i) t.test(dat1[,i],dat2[,i],paired=TRUE)$p.value)
As far as I can see all my values are numeric...?
----- Original Message -----
From: arun <smartpink111 at yahoo.com>
To: R help <r-help at r-project.org>
Cc:
Sent: Thursday, June 27, 2013 10:17 AM
Subject: Re: multiple csv files for T-test
Hi,
Maybe this helps:
#You can use ?read.csv() to read the two files.
set.seed(24)
dat1<- as.data.frame(matrix(sample(20:40,40,replace=TRUE),ncol=4))
set.seed(285)
dat2<- as.data.frame(matrix(sample(35:60,40,replace=TRUE),ncol=4))
sapply(colnames(dat1),function(i) t.test(dat1[,i],dat2[,i],paired=TRUE)$p.value)
# V1 V2 V3 V4
#3.202629e-05 6.510644e-04 6.215225e-04 3.044760e-04
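
With real files the same idea could look roughly like the sketch below. The file names before.csv and after.csv, the column names p1 and p2, and the simulated values are all assumptions for illustration; the files are written to tempdir() only so that the example runs on its own.

```r
# Create two small stand-in files (replace with your own CSVs)
set.seed(1)
before <- data.frame(p1 = rnorm(10, 30, 5), p2 = rnorm(10, 50, 8))
after  <- data.frame(p1 = before$p1 + rnorm(10, 5, 2),
                     p2 = before$p2 + rnorm(10, 0, 2))
f1 <- file.path(tempdir(), "before.csv")
f2 <- file.path(tempdir(), "after.csv")
write.csv(before, f1, row.names = FALSE)
write.csv(after, f2, row.names = FALSE)

data1 <- read.csv(f1)
data2 <- read.csv(f2)

# A paired test needs the rows of both files to line up
stopifnot(nrow(data1) == nrow(data2))

# Keep only columns present in both files and numeric in both
common <- intersect(names(data1), names(data2))
keep <- sapply(common, function(i) is.numeric(data1[[i]]) && is.numeric(data2[[i]]))
common <- common[keep]

# One paired t-test per column, collected as one row per parameter
results <- t(sapply(common, function(i) {
  tt <- t.test(data1[[i]], data2[[i]], paired = TRUE)
  c(mean_diff = unname(tt$estimate),
    t = unname(tt$statistic),
    p.value = tt$p.value)
}))
results
```

Matching columns by name rather than by position guards against the two files listing the parameters in a different order.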
A.K.
Hi
I am fairly new to R so if this is a stupid question please forgive me.
I have a CSV file with multiple parameters (50). I have another
CSV file with the same parameters after treatment. Is there a way I
can read these two files into R and run multiple paired t-tests, given that
the parameters are in the same columns in each file?
Thanks in advance