[BioC] cluster genes based on expression pattern
Moshe Olshansky
olshansky at wehi.EDU.AU
Sun Jun 19 14:50:06 CEST 2011
Hi Stijn,
Thank you for your note.
Are you computing the Pearson correlation on the data itself or on its logarithm?
What is the title of the book you mentioned?
Best regards,
Moshe.
>
> On Fri, Jun 17, 2011 at 10:49:08AM +1000, Moshe Olshansky wrote:
>> Hi Rabe,
>>
>> You can check the timecourse package (Bioconductor).
>> Sean's suggestion to filter genes is always a good idea.
>> My naive approach would be to define a sensible distance between two genes
>> and use this distance for clustering (one possibility is hclust).
>> To define a distance, suppose that you have two genes, A and B and n+1
>> time points: 0,1,...,n. Let Ai and Bi be expression levels of genes A and
>> B at time i (i=0,1,...,n). One possibility is just the Lp distance (for a
>> suitable p). Another possibility is to say that we do not care about the
>> absolute abundance but only about how it evolves in time and then we can
>> look at AAi = Ai/A1 and BBi = Bi/B1, i=1,2,...,n and take some Lp (or
>> other) distance between AA and BB.
>> These are just some suggestions. You may think of another reasonable
>> distance.
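The two distances described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names (lp_distance, relative_profile, distance_matrix) and the toy profiles are hypothetical, Python stands in for R, and the reference time point is left as a parameter (ref=1 follows the Ai/A1 formula literally; ref=0 would normalise by the first time point instead).

```python
def lp_distance(a, b, p=2.0):
    """Lp (Minkowski) distance between two equally long expression profiles."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same number of time points")
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def relative_profile(profile, ref=1):
    """Ratios profile[i] / profile[ref] for i = ref, ..., n (assumes profile[ref] != 0)."""
    return [value / profile[ref] for value in profile[ref:]]


def distance_matrix(genes, p=2.0, relative=False, ref=1):
    """Pairwise distances between genes, given as a dict: name -> list of levels."""
    profiles = {
        name: relative_profile(levels, ref) if relative else list(levels)
        for name, levels in genes.items()
    }
    return {
        g: {h: lp_distance(profiles[g], profiles[h], p) for h in profiles}
        for g in profiles
    }


# Toy data (made up): A and B have the same shape at different abundance,
# C moves in the opposite direction.
genes = {
    "A": [1.0, 2.0, 4.0, 8.0],
    "B": [10.0, 20.0, 40.0, 80.0],
    "C": [8.0, 4.0, 2.0, 1.0],
}
raw = distance_matrix(genes, p=2.0)
rel = distance_matrix(genes, p=2.0, relative=True)
print(raw["A"]["B"], rel["A"]["B"], rel["A"]["C"])
```

On the raw levels A and B are far apart, while on the relative profiles their distance is zero. Either matrix could then be handed to a hierarchical clustering routine such as hclust.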
>
>
> Another good choice is Pearson correlation or the absolute value
> of Pearson correlation (in that case, anti-correlated genes
> will cluster with correlated genes).
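A minimal sketch of how a Pearson correlation could be turned into a dissimilarity for clustering. The conversion 1 - r (or 1 - |r| for the absolute variant) is an assumption made here for illustration; the message only names the correlation itself. The function names and toy profiles are hypothetical.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation coefficient of two equally long profiles."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / sqrt(var_a * var_b)


def correlation_distance(a, b, absolute=False):
    """1 - r, or 1 - |r| when anti-correlated genes should count as close."""
    r = pearson(a, b)
    return 1.0 - (abs(r) if absolute else r)


up = [1.0, 2.0, 3.0, 4.0]
down = [4.0, 3.0, 2.0, 1.0]
print(correlation_distance(up, down))                 # signed: far apart
print(correlation_distance(up, down, absolute=True))  # absolute: close together
```

With the signed version the two anti-correlated profiles end up at the maximum distance; with the absolute version they are treated as near neighbours, which is the behaviour described above.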
>
> In our lab we have had good experiences with a network-based approach.
> In this case one chooses a certain threshold, and only retains node-pairs
> for which the (absolute) Pearson correlation falls above that threshold.
> It is possible/advisable to vary such a threshold and look at graph
> statistics such as average node degree and number of singletons to
> get an idea for an appropriate threshold.
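The threshold scan described above might look roughly like this. It is a sketch, not the lab's actual procedure: the function names, the toy profiles and the list of thresholds tried are all hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(a, b):
    """Pearson correlation coefficient of two equally long profiles."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / sqrt(var_a * var_b)


def threshold_graph(profiles, threshold, absolute=True):
    """Keep only gene pairs whose (absolute) correlation exceeds the threshold."""
    neighbours = {gene: set() for gene in profiles}
    for g, h in combinations(profiles, 2):
        r = pearson(profiles[g], profiles[h])
        if (abs(r) if absolute else r) > threshold:
            neighbours[g].add(h)
            neighbours[h].add(g)
    return neighbours


def graph_stats(neighbours):
    """Average node degree and number of nodes without any edge."""
    degrees = [len(adjacent) for adjacent in neighbours.values()]
    return {
        "mean_degree": sum(degrees) / len(degrees),
        "singletons": degrees.count(0),
    }


# Toy profiles (made up) over four time points.
profiles = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.0],
    "g3": [4.0, 3.0, 2.0, 1.0],
    "g4": [1.0, 3.0, 1.0, 3.0],
}
for threshold in (0.4, 0.6, 0.9):
    print(threshold, graph_stats(threshold_graph(profiles, threshold)))
```

Raising the threshold thins the graph: the mean degree drops and weakly correlated genes become singletons. Watching how quickly that happens is one way to pick a working threshold.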
>
> From there on, any graph clustering can be used. We use MCL (developed
> in our lab, so naturally). With MCL it pays to further transform the data,
> but I will not elaborate here. Cei Abreu-Goodger and I have written
> a book chapter on this subject, available for anyone interested.
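For readers who have not met MCL, the iteration it is built on (alternating expansion and inflation of a column-stochastic matrix) can be sketched as follows. This is a bare-bones illustration and not the mcl program: it leaves out pruning and the data transformations alluded to above, and the function names, default parameters and example graph are assumptions.

```python
def normalise_columns(m):
    """Scale every column so that it sums to 1."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Plain matrix product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mcl(adjacency, inflation=2.0, max_iter=100, tol=1e-9):
    """Cluster an undirected weighted graph given as a symmetric matrix."""
    n = len(adjacency)
    # Add self-loops, then turn edge weights into transition probabilities.
    m = normalise_columns(
        [[float(adjacency[i][j]) + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    )
    for _ in range(max_iter):
        expanded = matmul(m, m)  # expansion: two-step random walks
        # Inflation: elementwise power, then renormalise the columns.
        inflated = normalise_columns(
            [[x ** inflation for x in row] for row in expanded]
        )
        change = max(abs(inflated[i][j] - m[i][j])
                     for i in range(n) for j in range(n))
        m = inflated
        if change < tol:
            break
    # Rows with mass on the diagonal are attractors; the non-zero entries
    # of such a row are the members of its cluster.
    clusters = set()
    for i in range(n):
        if m[i][i] > tol:
            clusters.add(frozenset(j for j in range(n) if m[i][j] > tol))
    return sorted(sorted(cluster) for cluster in clusters)


# Two triangles (nodes 0-2 and 3-5) joined by one weak edge of weight 0.1.
graph = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0.1, 0, 0],
    [0, 0, 0.1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
print(mcl(graph, inflation=2.0))
```

The inflation value controls granularity: larger values tend to give more, smaller clusters. The edge weights fed in here would be the correlations that survived the threshold.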
>
> regards,
> Stijn
>
>
> --
> Stijn van Dongen (forename pronunciation: [Stan])
> EMBL-EBI, Tel: +44-(0)1223-492675
> Hinxton, Cambridge, CB10 1SD, UK, http://micans.org/stijn
>