[BioC] ExpressionSet Time-series correlation stuff

Forst, Christian christian.forst at mssm.edu
Tue Jan 22 22:42:28 CET 2013


Thanks, but that doesn't really do what I want. First, I still need to use
sp <- cor(exprs(es[,ts]))

instead of
sp <- cor(es[,ts])

Otherwise I get
Error in cor(es[,ts]) : 
  supply both 'x' and 'y' or a matrix-like 'x'

Second, sp is then a square matrix over the time points in ts, not over the genes, and it is the gene-by-gene correlation across time that I need.
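This happens because cor() correlates the columns of its input, and the columns of an expression matrix are samples. A minimal sketch follows, assuming a small random matrix x as a hypothetical stand-in for exprs(es), with genes in rows and time points in columns; transposing it yields the gene-by-gene matrix:

```r
# Hypothetical stand-in for exprs(es): 10 genes (rows) x 5 time points (columns).
set.seed(1)
ts <- c("t1", "t2", "t3", "t4", "t5")
x <- matrix(rnorm(50), nrow = 10, dimnames = list(paste0("g", 1:10), ts))

# cor() works column-wise, so this is time point vs. time point (5 x 5).
sp_time <- cor(x[, ts], method = "spearman")

# Transposing puts genes in columns, giving gene vs. gene (10 x 10).
sp_gene <- cor(t(x[, ts]), method = "spearman")

dim(sp_time)
dim(sp_gene)
```

With a real ExpressionSet the equivalent call would presumably be cor(t(exprs(es[, ts])), method = "spearman").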

--------------------------------------
What I really want is to do time-forward/backward correlation. Can this be done elegantly with cor(), or would I have to go back to my for-loops?
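One possible reading of "time-forward/backward" is a lagged correlation: gene i at times 1..(T - lag) against gene j at times (1 + lag)..T. That interpretation is an assumption, not something stated in the thread. Under it, cor() can take two matrices and avoid the loops. In the sketch below, x is a hypothetical stand-in for exprs(es[, ts]) with columns in time order, and lagged_cor is a made-up helper name:

```r
# Hypothetical stand-in for exprs(es[, ts]): genes in rows, ordered time points in columns.
set.seed(2)
x <- matrix(rnorm(60), nrow = 12,
            dimnames = list(paste0("g", 1:12), paste0("t", 1:5)))

# Correlate gene i at times 1..(T - lag) with gene j at times (1 + lag)..T.
# lagged_cor is a hypothetical helper, not part of any package.
lagged_cor <- function(x, lag = 1, method = "spearman") {
  n_t <- ncol(x)
  stopifnot(lag >= 0, lag < n_t - 1)        # keep at least two paired time points
  early <- x[, seq_len(n_t - lag), drop = FALSE]
  late  <- x[, (1 + lag):n_t, drop = FALSE]
  cor(t(early), t(late), method = method)   # rows: early gene, columns: late gene
}

fwd <- lagged_cor(x, lag = 1)  # fwd[i, j]: gene i leads gene j by one step
bwd <- t(fwd)                  # bwd[i, j]: gene i trails gene j by one step
```

Unlike the ordinary correlation matrix, fwd is generally not symmetric. With only five time points each entry rests on four pairs of values, so the estimates will be noisy.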

Chris


________________________________________
From: James W. MacDonald [jmacdon at uw.edu]
Sent: Tuesday, January 22, 2013 16:21
To: Forst, Christian
Cc: bioconductor at r-project.org
Subject: Re: [BioC] ExpressionSet Time-series correlation stuff

Hi Christian,

On 1/22/2013 4:01 PM, Forst, Christian wrote:
> Is there an easier way to do time-series correlation between genes of an ExpressionSet other than using for-loops and cor()? Especially if I want to play with the particular time series?
> And I am not really happy with the packages I found so far: bioDist, qpgraph, qvalue
>
> I have:
>
> es...ExpressionSet
> ts<- c("t1", "t2", "t3", "t4", "t5")  # some time series from es (out of many)
>
> sp<- matrix(nrow=10,ncol=10)
> for(i in 1:10) {
>    sp[i,i]<- 1.
>    for(j in i:10) {
>      sp[i,j]<- cor(as.vector(exprs(es[i,ts])), as.vector(exprs(es[j,ts])), method="spearman")
>      sp[j,i]<- sp[i,j]
>    }
> }
>
> And I actually want to do this for all the 40000 genes in es and not 10 as given in the example.

If you are just trying to compute the correlation matrix then you are
doing things the hard way. Note from ?cor

  cor(x, y = NULL, use = "everything",
           method = c("pearson", "kendall", "spearman"))



Arguments:

        x: a numeric vector, matrix or data frame.

So you can just use

sp <- cor(es[,ts])

However, this may be slow and may well require more RAM than you have if
you are doing all 40K genes (which might be sort of silly - you will
have high correlations between genes that never change at any time
point; is that interesting?).
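To put a number on the RAM concern: a dense 40000 x 40000 matrix of doubles holds 1.6e9 entries at 8 bytes each, roughly 12.8 GB. One way to act on the point about unchanging genes is to drop low-variance genes before correlating. The sketch below is only an illustration: x is a hypothetical stand-in for exprs(es[, ts]), and the cutoff of 100 genes is arbitrary.

```r
# Hypothetical stand-in for exprs(es[, ts]): 200 genes x 5 time points.
set.seed(3)
x <- matrix(rnorm(200 * 5), nrow = 200,
            dimnames = list(paste0("g", 1:200), paste0("t", 1:5)))
x[1:50, ] <- x[1:50, ] * 0.01   # make the first 50 genes nearly flat over time

# Per-gene variance across time points; keep the most variable genes.
# The cutoff (top 100) is arbitrary and for illustration only.
gene_var <- apply(x, 1, var)
keep <- names(sort(gene_var, decreasing = TRUE))[1:100]

# Gene-by-gene Spearman correlation on the filtered set only.
sp <- cor(t(x[keep, ]), method = "spearman")
dim(sp)
```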

There is a faster version of cor() implemented in the WGCNA package that
is designed for these larger scale computations.
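A hedged sketch of how that might be used: it assumes WGCNA::cor accepts the same x and method arguments as stats::cor, and it falls back to stats::cor when WGCNA is not installed. As far as I know the speed-up applies to Pearson correlation, so the sketch ranks each gene first and then computes Pearson on the ranks, which equals Spearman. The matrix x is again a hypothetical stand-in for exprs(es[, ts]).

```r
# Hypothetical stand-in for exprs(es[, ts]): 30 genes x 5 time points.
set.seed(4)
x <- matrix(rnorm(30 * 5), nrow = 30,
            dimnames = list(paste0("g", 1:30), paste0("t", 1:5)))

# Use WGCNA's cor() if the package is installed, otherwise fall back to stats::cor().
cor_fun <- if (requireNamespace("WGCNA", quietly = TRUE)) WGCNA::cor else stats::cor

# Rank each gene across time; apply() returns time points in rows, genes in columns.
r <- apply(x, 1, rank)

# Pearson correlation of the ranks equals Spearman correlation of the raw values.
sp <- cor_fun(r, method = "pearson")
```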

Best,

Jim



>
> Thanks - Chris
>

--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099



