[Bioc-sig-seq] ShortRead sequences

Wolfgang Huber whuber at embl.de
Thu Feb 11 19:59:23 CET 2010


Hi Arnaud

Rsamtools might be your friend. You'll need to work straight off the SVN 
repository though. A useful starting point is the scanBam manual page. 
And: "Rsamtools is experimental; expect frequent changes in data types 
and functionality."
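
As a rough sketch of what this could look like (an illustration only, not tested against your data): scanBam takes a ScanBamParam whose "what" argument selects the fields to read, so sequences and qualities are never loaded. The sketch assumes the alignments are in BAM format (a MAQ map file would have to be converted first), uses the example file shipped with Rsamtools as a stand-in for your own file, and computes end positions on the assumption of ungapped alignments. Since the package is still changing, argument names may differ in your version.

```r
library(Rsamtools)

# Stand-in input: the example BAM file shipped with Rsamtools.
# Replace with the path to your own BAM file.
bam_file <- system.file("extdata", "ex1.bam", package = "Rsamtools")

# Request only the fields needed for ChIP-Seq; sequences ("seq") and
# qualities ("qual") are not requested and so are never read.
param <- ScanBamParam(
    flag = scanBamFlag(isUnmappedQuery = FALSE),  # skip unmapped reads
    what = c("rname", "strand", "pos", "qwidth")
)

# scanBam returns a list with one element per requested range;
# with no range restriction there is a single element.
aln <- scanBam(bam_file, param = param)[[1]]

# Assumption: ungapped alignments, so end = start + read width - 1.
reads <- data.frame(
    chrom  = aln$rname,
    strand = aln$strand,
    start  = aln$pos,
    end    = aln$pos + aln$qwidth - 1L
)
```

The resulting data frame holds only chromosome, strand, start and end, which should be much lighter than a full AlignedRead object.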

Best wishes
      Wolfgang


--
Wolfgang Huber
EMBL
http://www.embl.de/research/units/genome_biology/huber/contact




Droit Arnaud wrote on 02/11/2010 07:15 PM:
> Hello Everyone,
> 
> We are developing a new package for ChIP-Seq analysis. We use the ShortRead package to import the short read data, e.g.
> 
> data <- readAligned("s_1_sequence.maq.map", type="MAQMap")
> 
> However, the resulting AlignedRead object holds all of the data (sequences, start, stop, strand, etc.), whereas for ChIP-Seq we do not really need the short read sequences, only their positions and strand information. Is there a more direct and efficient way to get these? We know that we can coerce the AlignedRead object into a RangedData or similar object, but this is not very efficient. In our experience, the readAligned function can be very demanding in both memory and time when reading large data files, even with 32 GB of RAM (on a 64-bit R version), and we think that this is mostly due to the sequences being read.
> We are wondering whether there are any filters that would import only the start, the end and the strand, or whether there is a different R object or function that we should use.
> 
> Thanks
> 
> Arnaud.
> 
> _______________________________________________
> Bioc-sig-sequencing mailing list
> Bioc-sig-sequencing at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing


