[BioC] Finding similarities between multiple sequenced genomic libraries

Gordon [guest] guest at bioconductor.org
Sun Nov 24 13:59:38 CET 2013


I have three sequenced (Illumina) genomic libraries (L1, L2, L3), each with three biological replicates (A, B, C).
I'd like to compare these libraries and find out which are most similar to each other (e.g. is "L1" more similar to "L2" or to "L3", based on the profile of mapped short reads in each library).

I've divided the genome into 2 kb windows and counted how many reads from each library/replicate fall within each window.
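
To make the counting step concrete, here is a minimal sketch of how reads might be assigned to 2 kb windows. It is an illustration only, not the pipeline I actually ran: the function names, the example read positions, and the choice to assign each read by its 1-based start position are all assumptions.

```python
from collections import Counter

WINDOW_SIZE = 2000  # 2 kb windows


def window_id(chrom, pos, window_size=WINDOW_SIZE):
    """Label the window holding a 1-based position, e.g. 'chr1_1'."""
    return f"{chrom}_{(pos - 1) // window_size + 1}"


def count_reads(read_starts, window_size=WINDOW_SIZE):
    """Count reads per window from (chrom, start) pairs.

    Assumption: a read belongs to the window containing its start
    position; reads spanning a window boundary are not split.
    """
    counts = Counter()
    for chrom, pos in read_starts:
        counts[window_id(chrom, pos, window_size)] += 1
    return counts


# Hypothetical read start positions for one library/replicate.
reads_L1_RepA = [
    ("chr1", 1), ("chr1", 1500), ("chr1", 2000),
    ("chr1", 2001), ("chr1", 3999), ("chr2", 10),
]
counts = count_reads(reads_L1_RepA)
for window in sorted(counts):
    print(window, counts[window])
```

Running the same counting for every library/replicate and joining the results on the window label gives one column per sample.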

The result is similar to this:
Window  L1_RepA  L1_RepB  L1_RepC  L2_RepA  L2_RepB  L2_RepC  L3_RepA  L3_RepB  L3_RepC
chr1_1  10       15       0        1000     1100     800      90       0        300
chr1_2  77       80       99       3        4        12       100      200      193
(and so forth for every window over the chromosomes).

What would be a reliable way to compare these three libraries as a whole (that is, not just testing each window independently for differential counts, as one would do for genes in a differential expression analysis)?

Thank you,

 -- output of sessionInfo(): 


Sent via the guest posting facility at bioconductor.org.
