[Bioc-devel] qname modification in BamFile() function of Rsamtools

Oghabian, Ali ali.oghabian at helsinki.fi
Mon Jan 7 12:04:36 CET 2019


I have a question about how to set qnameSuffixStart and qnamePrefixEnd so that paired reads can be read from BAM files:

Many of the .bam files I have worked with have qnames in the format SAMPLENAME.READID.1 and SAMPLENAME.READID.2.

I was wondering how I can set the qnameSuffixStart and qnamePrefixEnd parameters of the BamFile() function of Rsamtools so that functions such as readGAlignmentPairs() of GenomicAlignments properly pair reads based on the READID.

Would setting qnameSuffixStart="." and qnamePrefixEnd="." do the trick?

And what happens if I only set qnamePrefixEnd="."? Is it greedy? Does it drop everything after the first "." in the qname (i.e. keep only SAMPLENAME), or does it drop everything after the last "." (i.e. keep SAMPLENAME.READID)?
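
To make the two readings concrete, here is a minimal base-R sketch of what each one would yield for the qnames above. It uses only sub() on example strings and does not call Rsamtools; the variable names are made up for illustration, and it makes no claim about which behaviour (if either) BamFile() actually implements.

```r
# Example qnames in the format described above
qnames <- c("SAMPLENAME.READID.1", "SAMPLENAME.READID.2")

# Reading 1: cut at the FIRST "." and drop everything from there on
cut_at_first_dot <- sub("\\..*$", "", qnames)

# Reading 2: cut at the LAST "." and drop only the trailing mate suffix
cut_at_last_dot <- sub("\\.[^.]*$", "", qnames)

print(cut_at_first_dot)
print(cut_at_last_dot)
```

Under reading 1 both mates collapse to SAMPLENAME, which would not distinguish read pairs within a sample; under reading 2 both mates collapse to SAMPLENAME.READID, which is the grouping I am after.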




Ali Oghabian

RNA-splicing laboratory (Room 4024B),

Institute of Biotechnology,
P.O.Box 56 (Viikinkaari 5),
00014 University of Helsinki

Email: ali.oghabian at helsinki.fi
Phone: +358 50 4484569

