[BioC] How to access custom SAM tags (Rsamtools?)

Nicolas Delhomme delhomme at embl.de
Thu Nov 29 15:35:13 CET 2012


Hi!

You want to use the tag argument of the ScanBamParam object. Here is how I retrieve the BWA XA tags:

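library(Rsamtools)

# Read mapped reads only; request the BWA "XA" tag via the tag argument
bam <- scanBam("my.bam",
               index="my.bam",
               param=ScanBamParam(
                 flag=scanBamFlag(isUnmappedQuery=FALSE),
                 what=c("rname","pos","qwidth"),
                 tag="XA"))[[1]]
# The tag values are returned in bam$tag$XA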
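The same approach should carry over to the Bismark tags in the question: pass tag=c("XM","XR","XG") and read the values from the tag element of the result. As a rough sketch (not part of the original reply), the base-R helper below tabulates the methylation calls in XM strings. The name count_meth_calls is hypothetical, and the call letters follow the Bismark convention (z/Z = CpG, x/X = CHG, h/H = CHH, u/U = unknown context; upper case = methylated).

```r
# Hypothetical helper: count Bismark methylation call letters per read.
# Input:  character vector of XM strings (one per read).
# Output: integer matrix, one row per read, one column per call letter.
count_meth_calls <- function(xm) {
  calls <- c("z", "Z", "x", "X", "h", "H", "u", "U")
  counts <- vapply(
    strsplit(xm, "", fixed = TRUE),
    # match() yields NA for "." (non-cytosine); tabulate() ignores NA
    function(ch) tabulate(match(ch, calls), nbins = length(calls)),
    integer(length(calls))
  )
  counts <- matrix(counts, nrow = length(calls),
                   dimnames = list(calls, NULL))
  t(counts)
}

# The two XM strings from the example SAM data in the question
xm <- c("...............hh......h..........Zx",
        "......................h........x............x....h........hx....h.hhx..h.hh")
meth <- count_meth_calls(xm)
print(meth)

# With Rsamtools, assuming "my.bam" is a Bismark alignment file:
# bam  <- scanBam("my.bam",
#                 param = ScanBamParam(tag = c("XM", "XR", "XG")))[[1]]
# meth <- count_meth_calls(bam$tag$XM)
```

Working on the plain character vector keeps the helper independent of how the tags were read, so it can be checked without a BAM file.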

Cheers,

Nico

---------------------------------------------------------------
Nicolas Delhomme

Genome Biology Computational Support

European Molecular Biology Laboratory

Tel: +49 6221 387 8310
Email: nicolas.delhomme at embl.de
Meyerhofstrasse 1 - Postfach 10.2209
69102 Heidelberg, Germany
---------------------------------------------------------------





On Nov 29, 2012, at 3:23 PM, Kemal Akman wrote:

> Hello,
> 
> I'm interested in accessing extended SAM tags in aligned short read
> sequence files using an R/Bioconductor library. Rsamtools seems
> potentially suited for this purpose, but I couldn't find the arguments
> to return custom SAM tags, such as "XX:Z"; there doesn't seem to
> be a corresponding "what" value in ScanBamParam().
> 
> I'd be especially interested in such a feature to parse methylation
> strings from Bismark, as well as custom SAM tags from other tools.
> 
> Any suggestions on Rsamtools or alternative methods to achieve this in
> Bioconductor would be much appreciated.
> 
> Example SAM data:
> $ head -2 sample.sam
> SRR306424.2547_PRESLEY:4:4:62:558_length=76     16      chr22
> 30675421        255     36M     *       0       0
> CACACACATCCACATAACACCATAACCAACCCCCGA
> ;,?A9:>B@B?@BBAC@C/BBC@CBCCC8BCBACBB    NM:i:4  XX:Z:15GG6G11G
> XM:Z:...............hh......h..........Zx       XR:Z:CT XG:Z:GA
> SRR306424.5227_PRESLEY:4:4:113:1768_length=76   0       chr5
> 123101409       255     75M     *       0       0
> TGATTTTTATATAAGGTGTAAGTAAGAGGTTTAGTTTTGATTTTTTGTATATGGATAATTAGTTTTTTTAGTATT
>    B>?@BBABBCBCCBCB>B@B@B?B@B@BB>BB@B?BBBB@BCBBABBAABABB@@B??ABAA>BABBBA=><@A=
>    NM:i:13 XX:Z:22C8C12C4C8CC4C1CCC2C1CC
> XM:Z:......................h........x............x....h........hx....h.hhx..h.hh
>       XR:Z:CT XG:Z:CT
> 
> 
> Best regards,
> 
> Kemal Akman
> 
