[Bioc-sig-seq] ShortRead sequences

Droit Arnaud Arnaud.Droit at ircm.qc.ca
Thu Feb 11 19:15:17 CET 2010


Hello Everyone,

We are developing a new package for ChIP-Seq analysis. We use the ShortRead package to import the short read data, e.g.

data <- readAligned("s_1_sequence.maq.map", type="MAQMap")

The resulting AlignedRead object holds everything about each read (sequence, start, stop, strand, etc.), but for ChIP-Seq we do not really need the read sequences, only their positions and strand information. Is there a more direct or efficient way to get just these? We know that we can coerce the AlignedRead object into a RangedData or similar object, but this is not very efficient. In our experience, readAligned can be very demanding in both memory and time when reading large data files, even with 32 GB of RAM (on a 64-bit build of R), and we think this is mostly due to the sequences being read.
We are wondering whether there are any filters that keep only the start, end, and strand, or perhaps a different R object or function that we should use.
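
To make the request concrete, here is a minimal sketch of the reduced table we are after. It is written in base R so it runs without any data file; the helper name reads_to_ranges and the toy input are hypothetical, and the commented-out call at the end assumes the usual ShortRead accessors (chromosome, position, width, strand) on the object read above. Note that this only trims the object after it has been loaded, so it does not address the memory cost of reading the sequences in the first place.

```r
# Hypothetical helper (not part of ShortRead): keep only the fields
# needed for ChIP-Seq analysis and drop sequences and qualities.
reads_to_ranges <- function(chrom, pos, width, strand) {
  data.frame(
    chrom  = as.character(chrom),
    start  = as.integer(pos),
    end    = as.integer(pos + width - 1L),  # inclusive end coordinate
    strand = as.character(strand),
    stringsAsFactors = FALSE
  )
}

# Toy input standing in for the accessor results of an AlignedRead object.
ranges <- reads_to_ranges(
  chrom  = c("chr1", "chr1", "chr2"),
  pos    = c(100L, 250L, 40L),
  width  = c(36L, 36L, 36L),
  strand = c("+", "-", "+")
)
print(ranges)

# With ShortRead loaded and `data` read as above, the call would be roughly
# (assumption: standard accessors on the AlignedRead object):
# ranges <- reads_to_ranges(chromosome(data), position(data),
#                           width(data), strand(data))
```
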

Thanks

Arnaud.


