[Bioc-sig-seq] Rsamtools: Select reads on a chromosome

Steve Lianoglou mailinglist.honeypot at gmail.com
Thu Jan 7 22:55:13 CET 2010


Hi Martin,

On Thu, Jan 7, 2010 at 1:11 AM, Martin Morgan <mtmorgan at fhcrc.org> wrote:
<snip>

>> Is there a "better" way to do it, e.g. w/o making the IRanges object
>> that stretches over the chromosome?
>
> I don't think so, though 'end' doesn't have to be a literal end, e.g.,
> .Machine$integer.max, and 'stretches' doesn't really involve any cost --
> just two numbers.

Yeah, I know re: no real cost -- I was just curious, is all. Good point
on simply using .Machine$integer.max, though.

> The use case I was thinking of was a well-defined collection of regions
> of interest, probably coming from some genome annotation, but I guess
> you're interested in chromosome-at-a-time processing?

Yup -- I've been creating some libraries to help deal with *-seq
experiments + data and didn't really have a good way to store and load
the reads quickly until I tried BAM files. Although Rsamtools + BAM
files are really fast, I still like to pull all of my reads per
chromosome into an IntervalTree and do whatever batch processing I
need to against that.
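
A minimal sketch of the chromosome-at-a-time pull described above, under a
few assumptions: it uses the small example BAM file that ships with
Rsamtools, the helper name reads_on_chromosome is hypothetical, and the
chromosome end is read from the BAM header instead of using a sentinel such
as .Machine$integer.max (I believe newer Rsamtools releases document an
upper limit on range ends, so the sentinel may not be accepted everywhere).
The result is a plain IRanges, which can then be fed to whatever
overlap structure you prefer.

```r
library(Rsamtools)
library(GenomicRanges)

# Example BAM shipped with Rsamtools; swap in your own indexed BAM file.
bam_file <- system.file("extdata", "ex1.bam", package = "Rsamtools")

# Hypothetical helper: pull every read on one chromosome into an IRanges.
reads_on_chromosome <- function(bam_file, chr) {
  # Reference sequence lengths are stored in the BAM header.
  chr_len <- scanBamHeader(bam_file)[[1]]$targets[[chr]]

  # One range that stretches over the whole chromosome: just two numbers.
  which <- GRanges(chr, IRanges(1, chr_len))
  param <- ScanBamParam(which = which, what = c("pos", "qwidth"))

  # scanBam() returns one list element per requested range.
  aln <- scanBam(bam_file, param = param)[[1]]

  # qwidth is the read length, so ends are only approximate for
  # alignments with gaps or clipping.
  IRanges(start = aln$pos, width = aln$qwidth)
}

reads <- reads_on_chromosome(bam_file, "seq1")
length(reads)
```

The same call can be repeated per chromosome by looping over
names(scanBamHeader(bam_file)[[1]]$targets).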

Thanks,
-steve

-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact


