[R] spearman correlation and p-value as a matrix

arun smartpink111 at yahoo.com
Thu Feb 14 06:16:40 CET 2013


Hi,

# bg is the data from your post; ag was not supplied, so it is simulated below
bg<- as.matrix(read.table(text=" 
     Otu00022 Otu00029 Otu00039 Otu00042 Otu00101 Otu00105 Otu00125 Otu00131 Otu00137 Otu00155 Otu00158 Otu00172 Otu00181 Otu00185 Otu00190 Otu00209 Otu00218 
Gi20Jun11  0.001217        0 0.001217        0 0.000000        0        0        0 0.001217        0        0        0        0        0 0.001217        0 0.001217 
Gi40Jun11  0.000000        0 0.000000        0 0.000000        0        0        0 0.000000        0        0        0        0        0 0.000000        0 0.000000 
Gi425Jun11 0.000000        0 0.000000        0 0.000000        0        0        0 0.000000        0        0        0        0        0 0.000000        0 0.000000 
Gi45Jun11  0.000000        0 0.000000        0 0.001513        0        0        0 0.000000        0        0        0        0        0 0.000000        0 0.000000 
Gi475Jun11 0.000000        0 0.000000        0 0.000000        0        0        0 0.000000        0        0        0        0        0 0.000000        0 0.000000 
Gi50Jun11  0.000000        0 0.000000        0 0.000000        0        0        0 0.000000        0        0        0        0        0 0.000000        0 0.000000 
",sep="",header=TRUE,stringsAsFactors=F))
set.seed(128)
ag<- matrix(rnorm(30),nrow=6)
colnames(ag)<- paste("ag",1:5,sep="")
bg_ag <- expand.grid(colnames(bg), colnames(ag), stringsAsFactors=FALSE)
attr(bg_ag, "out.attrs") <- NULL
library(Hmisc)
#correlation
resr <- do.call(rbind, lapply(split(bg_ag, 1:nrow(bg_ag)), function(x) {
  # 2 x 2 Spearman correlation matrix for one (bg column, ag column) pair
  res <- rcorr(cbind(bg[, x[, 1]], ag[, x[, 2]]), type="spearman")$r
  row.names(res) <- rep(paste(x[1], x[2], sep="_"), 2)
  res
}))
head(resr)
#                 [,1]      [,2]
#Otu00022_ag1 1.0000000 0.1309307
#Otu00022_ag1 0.1309307 1.0000000
#Otu00029_ag1 1.0000000       NaN
#Otu00029_ag1       NaN 1.0000000
#Otu00039_ag1 1.0000000 0.1309307
#Otu00039_ag1 0.1309307 1.0000000
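
A note on the output above: each pair contributes a 2 x 2 matrix, with 1 on the diagonal and the same correlation in both off-diagonal cells, which is why every label appears twice. A minimal base R sketch of that layout, using cor() instead of rcorr() and two made-up vectors (x and y are illustrative only, not the bg/ag data):

```r
# Two made-up numeric vectors standing in for one bg column and one ag column.
x <- c(0.3, 1.2, -0.5, 2.1, 0.9, -1.4)
y <- c(1.0, 0.8, -0.2, 1.9, 0.1, -0.7)

# Correlating the two-column matrix gives a symmetric 2 x 2 result.
m <- cor(cbind(x, y), method = "spearman")
m

# The value of interest is either off-diagonal cell.
rho <- m[1, 2]
rho
```

Only one off-diagonal entry per pair carries information, so half of the stacked rows can be dropped later.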

#p-values
resP <- do.call(rbind, lapply(split(bg_ag, 1:nrow(bg_ag)), function(x) {
  # 2 x 2 matrix of p-values for the same pair (diagonal is NA)
  res <- rcorr(cbind(bg[, x[, 1]], ag[, x[, 2]]), type="spearman")$P
  row.names(res) <- rep(paste(x[1], x[2], sep="_"), 2)
  res
}))
head(resP)
#                  [,1]      [,2]
#Otu00022_ag1        NA 0.8047262
#Otu00022_ag1 0.8047262        NA
#Otu00029_ag1        NA       NaN
#Otu00029_ag1       NaN        NA
#Otu00039_ag1        NA 0.8047262
#Otu00039_ag1 0.8047262        NA
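
The NaN entries belong to OTU columns that are constant (all zero in the rows shown), for which a correlation is undefined. If you would rather drop such columns before correlating, something along these lines might work. This is only a sketch on a toy matrix; bg_toy, is_constant and bg_var are names I made up for illustration:

```r
# Toy stand-in for the OTU table: one varying column and one all-zero column.
bg_toy <- cbind(Otu_a = c(0.001217, 0, 0, 0, 0, 0),
                Otu_b = rep(0, 6))

# A column is constant when its standard deviation is zero.
is_constant <- apply(bg_toy, 2, sd) == 0
is_constant

# Keep only the columns for which a correlation can be computed.
bg_var <- bg_toy[, !is_constant, drop = FALSE]
colnames(bg_var)
```

Whether to drop these columns or keep them as NaN in the result depends on what you need downstream.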

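# If you need only one value per pair: keep every second row, first column
# (the off-diagonal cell of each 2 x 2 block)
indx <- row(resr) %% 2 != 1
resPnew <- as.matrix(resP[indx[, 1], 1])
resrnew <- as.matrix(resr[indx[, 1], 1])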

head(resPnew)
#                  [,1]
#Otu00022_ag1 0.8047262
#Otu00029_ag1       NaN
#Otu00039_ag1 0.8047262
#Otu00042_ag1       NaN
#Otu00101_ag1 0.1583024
#Otu00105_ag1       NaN

head(resrnew)
#                   [,1]
#Otu00022_ag1  0.1309307
#Otu00029_ag1        NaN
#Otu00039_ag1  0.1309307
#Otu00042_ag1        NaN
#Otu00101_ag1 -0.6546537
#Otu00105_ag1        NaN
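
If you want the results laid out as a matrix (bg columns in rows, ag columns in columns) and saved to a file, an alternative is a plain double loop over cor.test() in base R. The sketch below is not the rcorr() approach above: spearman_matrices() is a hypothetical helper, the toy x and y are made up, and it assumes there are no missing values.

```r
# Hypothetical helper, not part of Hmisc: Spearman rho and p-value for every
# column of x against every column of y, returned as two matrices.
# Assumes x and y have the same number of rows and no missing values.
spearman_matrices <- function(x, y) {
  x <- as.matrix(x)
  y <- as.matrix(y)
  rho <- matrix(NA_real_, nrow = ncol(x), ncol = ncol(y),
                dimnames = list(colnames(x), colnames(y)))
  p <- rho
  for (i in seq_len(ncol(x))) {
    for (j in seq_len(ncol(y))) {
      # A constant column has no defined rank correlation; leave it as NA.
      if (sd(x[, i]) == 0 || sd(y[, j]) == 0) next
      # exact = FALSE requests the asymptotic p-value, which also works with ties.
      ct <- cor.test(x[, i], y[, j], method = "spearman", exact = FALSE)
      rho[i, j] <- ct$estimate
      p[i, j] <- ct$p.value
    }
  }
  list(rho = rho, p = p)
}

# Toy data standing in for bg and ag (values are made up for illustration).
x <- cbind(a = c(1, 2, 3, 4, 5, 6), b = rep(0, 6))
y <- cbind(u = c(2, 1, 4, 3, 6, 5), v = c(5, 6, 3, 4, 1, 2))
res <- spearman_matrices(x, y)
res$rho
res$p

# One row per pair, written to a temporary CSV file.
long <- data.frame(
  pair = as.vector(outer(rownames(res$rho), colnames(res$rho), paste, sep = "_")),
  rho = as.vector(res$rho),
  p = as.vector(res$p)
)
out_file <- tempfile(fileext = ".csv")
write.csv(long, out_file, row.names = FALSE)
```

With the real data the call would be spearman_matrices(bg, ag), and out_file would be replaced by a path of your choice.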

A.K.




----- Original Message -----
From: Ozgul Inceoglu <Ozgul.Inceoglu at ulb.ac.be>
To: r-help at r-project.org
Cc: 
Sent: Wednesday, February 13, 2013 4:48 AM
Subject: [R] spearman correlation and p-value as a matrix

I have two data matrices, and I want to compute the correlation between each column of data1 and each column of data2, along with the p-values. The matrices do not have the same size. I tried the following script.
> bg <- read.table (file.choose(), header=T, row.names) 
> bg 
> Otu00022 Otu00029 Otu00039 Otu00042 Otu00101 Otu00105 Otu00125 Otu00131 Otu00137 Otu00155 Otu00158 Otu00172 Otu00181 Otu00185 Otu00190 Otu00209 Otu00218 
> Gi20Jun11 0.001217 0 0.001217 0 0.000000 0 0 0 0.001217 0 0 0 0 0 0.001217 0 0.001217 
> Gi40Jun11 0.000000 0 0.000000 0 0.000000 0 0 0 0.000000 0 0 0 0 0 0.000000 0 0.000000 
> Gi425Jun11 0.000000 0 0.000000 0 0.000000 0 0 0 0.000000 0 0 0 0 0 0.000000 0 0.000000 
> Gi45Jun11 0.000000 0 0.000000 0 0.001513 0 0 0 0.000000 0 0 0 0 0 0.000000 0 0.000000 
> Gi475Jun11 0.000000 0 0.000000 0 0.000000 0 0 0 0.000000 0 0 0 0 0 0.000000 0 0.000000 
> Gi50Jun11 0.000000 0 0.000000 0 0.000000 0 0 0 0.000000 0 0 0 0 0 0.000000 0 0.000000 
ag <- read.table (file.choose(), header=T, row.names) 

for (i in 1:(ncol(bg))) 
for (j in 1:(ncol(ag))) 
print(c(i,j))
final_matrix <- matrix(rep("0",ncol(bg)*ncol(ag)),ncol=ncol(bg),nrow=ncol(ag))

cor <- cor.test(as.vector(as.matrix(bg[,i])),as.vector(as.matrix(ag[,j])), method="spearman")

# But the output is not a matrix with all the values, only a single correlation value:

data:  bg[, i] and ag[, j] 
t = 2.2992, df = 26, p-value = 0.02978
alternative hypothesis: true correlation is not equal to 0 
95 percent confidence interval:
0.04485289 0.67986803 
sample estimates:
      cor 
0.4110515 

# How can I create an output file with all the correlations and p-values?

Thank you very much!
Özgül




