[R] Pearson corelation and p-value for matrix

John Fox jfox at mcmaster.ca
Sat Apr 16 03:36:05 CEST 2005


Dear Mark,

I think that the reflex of trying to avoid loops in R is often mistaken, and
so I decided to try to time the two approaches (on a 3GHz Windows XP
system).

I discovered, first, that there is a bug in your function -- you appear to
have indexed rows instead of columns; fixing that:

cor.pvals <- function(mat)
{
  cols <- expand.grid(1:ncol(mat), 1:ncol(mat))
  matrix(apply(cols, 1,
               function(x) cor.test(mat[, x[1]], mat[, x[2]])$p.value),
         ncol = ncol(mat))
}


My function is cor.pvalues and yours cor.pvals. This is for a data matrix
with 1000 observations on 100 variables:

> R <- diag(100)
> R[upper.tri(R)] <- R[lower.tri(R)] <- .5
> library(mvtnorm)
> X <- rmvnorm(1000, sigma=R)
> dim(X)
[1] 1000  100
> 
> system.time(cor.pvalues(X))
[1] 5.53 0.00 5.53   NA   NA
> 
> system.time(cor.pvals(X))
[1] 12.66  0.00 12.66    NA    NA
> 

I frankly didn't expect the advantage of my approach to be this large, but
there it is.

Regards,
 John

--------------------------------
John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
-------------------------------- 

> -----Original Message-----
> From: Marc Schwartz [mailto:MSchwartz at MedAnalytics.com] 
> Sent: Friday, April 15, 2005 7:08 PM
> To: John Fox
> Cc: 'Dren Scott'; R-Help
> Subject: RE: [R] Pearson corelation and p-value for matrix
> 
> Here is what might be a slightly more efficient way to get to John's
> question:
> 
> cor.pvals <- function(mat)
> {
>   rows <- expand.grid(1:nrow(mat), 1:nrow(mat))
>   matrix(apply(rows, 1,
>                function(x) cor.test(mat[x[1], ], mat[x[2], 
> ])$p.value),
>          ncol = nrow(mat))
> }
> 
> HTH,
> 
> Marc Schwartz
> 
> On Fri, 2005-04-15 at 18:26 -0400, John Fox wrote:
> > Dear Dren,
> > 
> > How about the following?
> > 
> >  cor.pvalues <- function(X){
> >     nc <- ncol(X)
> >     res <- matrix(0, nc, nc)
> >     for (i in 2:nc){
> >         for (j in 1:(i - 1)){
> >             res[i, j] <- res[j, i] <- cor.test(X[,i], X[,j])$p.value
> >             }
> >         }
> >     res
> >     }
> > 
> > What one then does with all of those non-independent test 
> is another 
> > question, I guess.
> > 
> > I hope this helps,
> >  John
> 
> > > -----Original Message-----
> > > From: r-help-bounces at stat.math.ethz.ch 
> > > [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Dren Scott
> > > Sent: Friday, April 15, 2005 4:33 PM
> > > To: r-help at stat.math.ethz.ch
> > > Subject: [R] Pearson corelation and p-value for matrix
> > > 
> > > Hi,
> > >  
> > > I was trying to evaluate the pearson correlation and the p-values 
> > > for an nxm matrix, where each row represents a vector. 
> One way to do 
> > > it would be to iterate through each row, and find its correlation 
> > > value( and the p-value) with respect to the other rows. Is there 
> > > some function by which I can use the matrix as input? 
> Ideally, the 
> > > output would be an nxn matrix, containing the p-values 
> between the 
> > > respective vectors.
> > >  
> > > I have tried cor.test for the iterations, but couldn't find a 
> > > function that would take the matrix as input.
> > >  
> > > Thanks for the help.
> > >  
> > > Dren
> 
>




More information about the R-help mailing list