[Bioc-sig-seq] Rsamtools: Select reads on a chromosome
mailinglist.honeypot at gmail.com
Wed Jan 20 23:04:43 CET 2010
About selecting all reads on a chromosome:
On Thu, Jan 7, 2010 at 1:11 AM, Martin Morgan <mtmorgan at fhcrc.org> wrote:
>> which <- RangesList(chr1=IRanges(start=1, end=247249719))
>> params <- ScanBamParams(which=which)
>> reads <- scanBam(my.bam.file, param=params)[]
>> Is there are "better" way to do it, eg. w/o making the IRanges object
>> that's stretches over the chromosome?
> I don't think so, though 'end' doesn't have to be a literal end, e.g,.
> .Machine$integer.max and 'stretches' doesn't really involve any cost --
> just two numbers.
I just tried to do this in another context, but this actually send R
into a tailspin, eg:
R> which <- RangesList(chr1=IRanges(start=1, end=.Machine$integer.max-1))
R> r <- scanBam('scratch-sorted.bam', param=ScanBamParam(what='pos',
*** caught segfault ***
address 0x0, cause 'unknown'
1: .Call(func, file, index, "rb", list(space(which),
.uunlist(start(which)), .uunlist(end(which))), flag, simpleCigar,
2: .io_bam(.scan_bam, file, index, tmpl, param = param)
3: .local(file, index, ...)
4: scanBam("scratch-sorted.bam", param = ScanBamParam(what = "pos",
which = which))
5: scanBam("scratch-sorted.bam", param = ScanBamParam(what = "pos",
which = which))
I have access to chromosome length information, so it's not really a
problem for me, but it seems as if something is happening which you
didn't expect, so I thought you'd like to know.
ps: I'm using IRanges(.., end=.Machine$integer.max-1) because using
.Machine$integer.max causes an integer overflow
Graduate Student: Computational Systems Biology
| Memorial Sloan-Kettering Cancer Center
| Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
More information about the Bioc-sig-sequencing