[Bioc-sig-seq] strandedness bug in Rsamtools?

Steve Lianoglou mailinglist.honeypot at gmail.com
Sat Feb 6 16:29:52 CET 2010


Hi,

On Sat, Feb 6, 2010 at 4:17 AM, Steve Lianoglou
<mailinglist.honeypot at gmail.com> wrote:

> I think there might be a bug in Rsamtools. I have reason to believe
> that the strand information that's being returned from
> scanBam(...)$strand is flipped, ie. a read that should be on the '+'
> strand is being reported as aligning to the '-' strand.

I just wanted to include an example of what I'm seeing (sorry I didn't
before, the need for shut-eye was competing with the desire to write a
better message :-)

Anyway, here's what I get from my bamfile:

$ samtools view my-sorted.bam chr2:2355-2360
1805160 16      chr2    2355    255     20M     *       0       0
 GTCTTACTTTGATTATGATC    IIIIIIIIIIIIIIIIIIII    XA:i:0  MD:Z:20
NM:i:0
4681745 16      chr2    2355    255     20M     *       0       0
 GTCTTACTTTGATTATGATC    IIIIIIIIIIIIIIIIIIII    XA:i:0  MD:Z:20
NM:i:0

I believe a value of 16 in the FLAG field indicates the negative strand ...

Now in R:

R> bf <- 'my-sorted.bam
R> r <- scanBam(bf,
param=ScanBamParam(which=RangesList(chr2=IRanges(start=2355,
end=2360))))[[1]]
R> r$pos
[1] 2355 2355
R> r$strand
[1] + +
Levels: + - *

If I'm still thinking about this straight, and it's a bug, maybe for
now I'll just swap the levels of the r$strand to be c('-', '+', '*')
when I get my reads back from scanBam.

Thanks,
-steve

-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact



More information about the Bioc-sig-sequencing mailing list