[Bioc-devel] GenomicAlignments: using asMates=TRUE and yieldSize with paired-end BAM files

Valerie Obenchain vobencha at fhcrc.org
Wed Mar 19 18:30:09 CET 2014


On 03/19/14 07:39, Michael Love wrote:
> hi,
>
>>From last year, in order to use yieldSize with paired-end BAMs, I
> should sort the BAMs by qname and then use the following call to
> BamFile:
>
> library(pasillaBamSubset)
> fl <- sortBam(untreated3_chr4(), tempfile(), byQname=TRUE)
> bf <- BamFile(fl, index=character(0), yieldSize=3, obeyQname=TRUE)
>
> https://stat.ethz.ch/pipermail/bioconductor/2013-March/051490.html
>
> If I want to use GenomicAlignments::readGAlignmentsList with
> asMates=TRUE and respecting the yieldSize, what is the proper
> construction? (in the end, I want to use summarizeOverlaps on
> paired-end BAMs while respecting the yieldSize)
>
> library(pasillaBamSubset)
> fl <- sortBam(untreated3_chr4(), tempfile(), byQname=TRUE)
> bf <- BamFile(fl, index=character(0), yieldSize=3, obeyQname=TRUE, asMates=TRUE)
> x <- readGAlignmentsList(bf)
> Warning message:
>   In scanBam(bamfile, ..., param = param) :
>     'obeyQname=TRUE' ignored when 'asMates=TRUE'
>   Calls: readGAlignmentsList ... .matesFromBam ->
> .load_bamcols_from_bamfile -> scanBam -> scanBam
>
> I see in the man pages for summarizeOverlaps it has:
>
> "In Bioconductor > 2.12 it is not
> necessary to sort paired-end BAM files by ‘qname’. When
> counting with ‘summarizeOverlaps’, setting ‘singleEnd=FALSE’
> will trigger paired-end reading and counting."
>
> but I don't see how this can respect the specified yieldSize, because
> readGAlignmentsList has to read in as many reads as necessary to find
> the mate.
>

I didn't specifically answer this in my last email. The reason yieldSize 
is no longer a problem is that we've rewritten the mate pairing 
algorithm in C so pairing is no longer done in R. Both 
readGAlignmentsList() and readGAlignmentPairs() use the same code. The 
*List container is more flexible in what it can hold (singletons, reads 
with unmapped mates etc.) and the *Pairs container holds mated-pairs in 
a left-right organization.

Valerie


> Sorry in advance if I am missing something in the documentation!
>
> Mike
>
> _______________________________________________
> Bioc-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>



More information about the Bioc-devel mailing list