[BioC] How to search for coexpression?

Björn Usadel usadel at mpimp-golm.mpg.de
Wed Mar 8 17:16:23 CET 2006


Hi Kurt,

as Michael Watson already noted, you can use cor(). But unless you have
filtered the data substantially, you usually cannot compute an
all-versus-all correlation matrix, because the matrix gets too large.
So you might want to use

tgenes <- t(expressionvalues)

as noted, and then

cor(tgenes[, yourgenelist], tgenes)

which will "only" give you the correlation coefficients of your genes
against all others.
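
A minimal sketch of that subset-versus-all call, assuming the expression
values sit in a matrix with probesets in rows and arrays in columns (with
the affy workflow this would presumably be exprs(eset)). The data here are
simulated, and the probeset names are made up for illustration only:

```r
# Simulated stand-in for a real expression matrix:
# 200 probesets (rows) measured on 12 arrays (columns).
set.seed(1)
expressionvalues <- matrix(rnorm(200 * 12), nrow = 200,
                           dimnames = list(paste0("probe", 1:200),
                                           paste0("array", 1:12)))

tgenes <- t(expressionvalues)            # arrays in rows, probesets in columns
yourgenelist <- c("probe3", "probe17")   # hypothetical probesets of interest

# One row per selected probeset, one column per probeset on the array.
cors <- cor(tgenes[, yourgenelist, drop = FALSE], tgenes)
dim(cors)

# Ten most strongly correlated partners of probe3, excluding probe3 itself.
r <- cors["probe3", colnames(cors) != "probe3"]
head(sort(r, decreasing = TRUE), 10)
```

The drop = FALSE keeps the subset a matrix even if yourgenelist has only
one entry, so the result always has one named row per gene of interest.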

However, setting a good threshold for "coexpression" might be difficult.
You might want to experiment with p-values here (use Fisher's
z-transformation).
But if you have a very large number of arrays, you will get "significant"
p-values even when the co-expression is minimal.
On the other hand, if you have very few arrays, you will hardly ever reach
significance.
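
To illustrate the sample-size effect, here is a rough sketch of a p-value
based on Fisher's z-transformation. The helper fisher_p is my own name, not
a function from any package, and the normal approximation it uses assumes
roughly bivariate normal data and a Pearson correlation:

```r
# Two-sided p-value for H0: rho = 0, from a correlation r observed on n arrays.
# fisher_p is a hypothetical helper written for this example.
fisher_p <- function(r, n) {
  z <- atanh(r)                      # Fisher's z-transformation
  2 * pnorm(-abs(z) * sqrt(n - 3))   # z has standard error 1 / sqrt(n - 3)
}

fisher_p(0.2, n = 10)     # weak correlation, few arrays: far from significant
fisher_p(0.2, n = 1000)   # same weak correlation, many arrays: tiny p-value
fisher_p(0.8, n = 6)      # strong correlation, very few arrays: still borderline
```

The same r = 0.2 is "significant" or not depending only on n, which is why
a p-value cutoff alone is a poor definition of coexpression.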

Also try cor(as above, method="spearman"), which uses Spearman's rank
correlation. The default, Pearson, is very sensitive to outliers.
However, Spearman correlation needs more arrays. Otherwise you will
run into trouble, since all your numerical values are transformed into
ranks, and with few arrays only few ranks are possible.
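
Both points can be seen in a small simulated example (made-up numbers, not
real array data): one extreme array inflates the Pearson correlation of two
otherwise unrelated genes, and with only four arrays the Spearman
correlation can take just a handful of distinct values.

```r
set.seed(2)
x <- rnorm(20)
y <- rnorm(20)              # unrelated to x by construction
x[20] <- 10; y[20] <- 10    # one extreme array affecting both genes

pearson  <- cor(x, y)                        # dominated by the single outlier
spearman <- cor(x, y, method = "spearman")   # the outlier only counts as rank 20
c(pearson = pearson, spearman = spearman)

# All orderings of 4 arrays, and the Spearman correlations they can produce.
g <- as.matrix(expand.grid(1:4, 1:4, 1:4, 1:4))
perms <- g[apply(g, 1, function(p) length(unique(p)) == 4), ]
rho <- apply(perms, 1, function(p) cor(1:4, p, method = "spearman"))
sort(unique(round(rho, 10)))   # the only values Spearman can take for n = 4
```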


Cheers,
Björn

>Dear colleagues,
>
>After reading data from Affymetrix CEL files I would like to
>get information about coexpressed and non-coexpressed genes for
>a chosen set of probesets.
>
>I start with:
>
>> library(affy)
>> Data <- ReadAffy()
>> eset <- rma(Data)
>
>This gives me a big array of all intensities for
>all probesets and experiments.
>But what to do then?
>
>Any hints appreciated.
>
>Kurt Stueber
>


