[BioC] Which package for gene expression correlation analysis
Steve Lianoglou
mailinglist.honeypot at gmail.com
Mon Jun 28 18:02:14 CEST 2010
Hi,
On Mon, Jun 28, 2010 at 10:53 AM, Yuan Hao <yuan.hao at cantab.net> wrote:
> Dear List,
>
> I would like to ask if there is such a bioconductor package available that
> can help to achieve the following purpose. Thank you very much in advance!
>
> I got 16 Affy chips corresponding to 4 samples: wild-type treated,
> wild-type untreated, knocked-down treated, and knocked-down untreated,
> i.e. 4 replicates for each sample.
>
> I want to look at the expression correlations between genes. Say, my gene
> of interest is gene X. I would like to find out other genes on the chip
> which have the similar expression profiles with gene X across samples. In
> other words, if expression levels of gene X increased from wild-type
> treated to knocked-out treated, I would like to find all the other genes
> have the same trend.
Given the size of the bioconductor universe, it's hard to say with any
certainty that a certain function does NOT exist, but I'd be somewhat
surprised if this function is actually there, since it's relatively
easy for you to implement yourself.
You are essentially repeatedly performing a test against each row of
your expression matrix, so think "loops" or some incantation of *apply
methods.
Here's an easy one. Let's assume:
* `exprs` is a (gene x experiment) matrix with your expression values.
* the value `x` holds the row index of the gene you are interested in.
R> set.seed(123)
R> exprs <- matrix(rnorm(100), 5)
R> x <- 1
Now you want to test the correlation of row `x` against each of the other rows.
R> cors <- apply(exprs[-x,], 1, cor.test, exprs[x,])
This gives you a list of correlation tests that you can (i) pull the
correlation estimate out of; and (ii) order by that estimate:
R> cors.estimate <- sapply(cors, '[[', 'estimate') ## (i)
R> alike <- order(cors.estimate, decreasing=TRUE) ## (ii)
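`alike` now has the indices of genes ordered from "most positively
correlated" to "most negatively correlated" with gene `x`. Note that
these index the rows of `exprs[-x,]`, not the rows of `exprs` itself.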
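Since `x` was dropped before the tests were run, the positions in `alike`
are shifted relative to `exprs`. Here's a rough sketch of mapping them back
to the original row numbers, and of pulling the p-values out of the same
test objects while you're at it. It reuses the toy data from above; the
names `others`, `ranked`, `pvals` and `res` are just made up for
illustration, and the BH adjustment is my assumption about what you'd want,
not something your data dictates.

```r
set.seed(123)
exprs <- matrix(rnorm(100), 5)   ## toy data: 5 genes x 20 arrays
x <- 1

cors <- apply(exprs[-x, ], 1, cor.test, exprs[x, ])
cors.estimate <- sapply(cors, '[[', 'estimate')
alike <- order(cors.estimate, decreasing=TRUE)

## rows of exprs that were actually tested (everything except x)
others <- seq_len(nrow(exprs))[-x]

## original row numbers, most positively correlated first
ranked <- others[alike]

## p-values from the same tests, adjusted for multiple testing (BH)
pvals <- p.adjust(sapply(cors, '[[', 'p.value'), method="BH")

res <- data.frame(row=ranked,
                  cor=unname(cors.estimate[alike]),
                  adj.p=pvals[alike])
res
```

With real data you would probably use rownames (probeset ids) instead of
row numbers, but the idea is the same.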
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
If you're a bit more familiar with R functions, you might already know
that there is a function named "cor" that creates a correlation matrix
out of a matrix. This function works column-wise, so you first have to
transpose your matrix:
R> all.cors <- cor(t(exprs))
R> cors.estimate
        cor         cor         cor         cor
-0.01971735 -0.26353249  0.03361119 -0.11578081
R> all.cors[1,]
[1]  1.00000000 -0.01971735 -0.26353249  0.03361119 -0.11578081
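If you only care about one gene, you don't need the full gene x gene
matrix (on a real chip with tens of thousands of probesets it can run into
gigabytes). `cor` also accepts a second argument, so something like the
following sketch should give you just the row for gene `x`. Same toy data
as above; `x.cors`, `others` and `ranked` are names I picked for the
example.

```r
set.seed(123)
exprs <- matrix(rnorm(100), 5)   ## toy data: 5 genes x 20 arrays
x <- 1

## correlation of row x against every row at once (including itself);
## cor(matrix, vector) returns a one-column matrix, hence the [, 1]
x.cors <- cor(t(exprs), exprs[x, ])[, 1]

## drop x itself and order what is left, keeping original row numbers
others <- setdiff(seq_len(nrow(exprs)), x)
ranked <- others[order(x.cors[others], decreasing=TRUE)]
ranked
```

Note that this only gives you the correlation estimates; if you want
p-values you still need `cor.test` as in the first approach.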
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
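The various cluster/heatmap functions will group your genes row-wise
(and column-wise) for you. Note that `heatmap` clusters on Euclidean
distance by default (its `distfun` argument defaults to `dist`), so if
you want correlation-based clustering you have to pass your own
distance function.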
Look at ?heatmap and check what that function returns to you in the
"Value" section.
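If you want to be sure the clustering is correlation-based rather than
relying on whatever the default is, you can hand `heatmap` a distance
function explicitly. A minimal sketch, again on the toy data; `cor.dist` is
a helper name I made up, and using 1 - Pearson correlation as the distance
is one common choice, not the only one.

```r
set.seed(123)
exprs <- matrix(rnorm(100), 5)   ## toy data: 5 genes x 20 arrays

## distance between rows = 1 - Pearson correlation of those rows
cor.dist <- function(m) as.dist(1 - cor(t(m)))

## heatmap applies distfun to the matrix for the row clustering and to
## its transpose for the column clustering
hm <- heatmap(exprs, distfun=cor.dist)

hm$rowInd   ## row order after clustering
hm$colInd   ## column order after clustering
```

Genes that end up next to each other in `hm$rowInd` are the ones the
clustering considers most alike under that distance.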
-steve
--
Steve Lianoglou
Graduate Student: Computational Systems Biology
| Memorial Sloan-Kettering Cancer Center
| Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact