[BioC] [devteam-bioc] readGAlignmentPairs performance issue

Phil East philip.east at cancer.org.uk
Tue May 20 15:31:40 CEST 2014


Hi Valerie,

Thank you for getting back to me. Here are the times for
readGAlignmentPairs, readGAlignmentsList, and scanBam using the code you
sent.

$readGAlignmentsList
    user   system  elapsed 
2529.510   57.487 2589.144 

$scanBam
    user   system  elapsed 
2465.353   49.404 2516.275 

$readGAlignmentPairs
    user   system  elapsed 
2560.754   56.612 2619.769
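
For reference, timings in this shape can be collected by wrapping each call in system.time() and printing the named list. The following is only a sketch, not the exact code that was sent: it assumes the small example BAM file shipped with Rsamtools (ex1.bam, reference "seq1") in place of the real data, and the object name `timings` is made up for illustration.

```r
library(GenomicAlignments)
library(Rsamtools)

# Assumption: the example BAM bundled with Rsamtools stands in for the real
# file; in the thread, `fl` and `param0` refer to the actual chr19 data.
fl <- system.file("extdata", "ex1.bam", package="Rsamtools")
param0 <- ScanBamParam(which=GRanges("seq1", IRanges(start=1, end=1575)))

# asMates=TRUE asks the reader to pair mates, as in the code quoted below.
bf <- BamFile(fl, asMates=TRUE)

# Time each reader on the same file and region.
timings <- list(
    readGAlignmentsList = system.time(readGAlignmentsList(bf, param=param0)),
    scanBam             = system.time(scanBam(bf, param=param0)),
    readGAlignmentPairs = system.time(readGAlignmentPairs(bf, param=param0))
)
print(timings)
```

Running all three calls against the same BamFile and ScanBamParam keeps the comparison like for like, so any difference in elapsed time comes from the reader function rather than the input.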

Best wishes
Phil

On Fri, 2014-05-16 at 12:55 -0700, Valerie Obenchain wrote:
> Hi Phil,
> 
> We have several functions that call the same C code in the background. 
> To help isolate the problem can you please run your code with scanBam() 
> and readGAlignmentsList()?
> 
> bf <- BamFile(fl, asMates=TRUE)
> readGAlignmentsList(bf, param=param0)
> scanBam(bf, param=param0)
> 
> readGAlignmentsList() and readGAlignmentPairs() should be very close in 
> time. scanBam() will be faster but not by a huge amount.
> 
> Thanks.
> Valerie
> 
> 
> On 05/13/2014 07:23 AM, Maintainer wrote:
> > Hi Guys,
> >
> > I'm experiencing some performance issues with readGAlignmentPairs from the latest version of Bioconductor (GenomicAlignments_1.0.1, BioC 2.14, R 3.1.0).
> >
> > Reading RNASeq paired reads aligned to chr19 (mm9) from a BAM file containing 108,592,829 paired reads takes 3118s. The same code run in R-3.0.2, BioC 2.13, Rsamtools_1.14.3 takes 208s. The results are identical across the two versions.
> >
> > Here's the code:
> >
> > library(GenomicAlignments)
> > library(Rsamtools)
> >
> > # bamFile and chr19Length are defined elsewhere (not shown)
> > param0 <- ScanBamParam(which=GRanges(seqnames="chr19",
> >     ranges=IRanges(start=1, end=chr19Length)))
> > rd <- readGAlignmentPairs(bamFile, param=param0)
> >
> > Any ideas as to why this might be?
> >
> > Thanks in advance
> >
> > Phil East
> >
> >
> >
> >   -- output of sessionInfo():
> >
> > R version 3.1.0 (2014-04-10)
> > Platform: x86_64-unknown-linux-gnu (64-bit)
> >
> > locale:
> >   [1] LC_CTYPE=en_GB       LC_NUMERIC=C         LC_TIME=en_GB
> >   [4] LC_COLLATE=en_GB     LC_MONETARY=en_GB    LC_MESSAGES=en_GB
> >   [7] LC_PAPER=en_GB       LC_NAME=C            LC_ADDRESS=C
> > [10] LC_TELEPHONE=C       LC_MEASUREMENT=en_GB LC_IDENTIFICATION=C
> >
> > attached base packages:
> > [1] grDevices datasets  parallel  stats     graphics  utils     methods
> > [8] base
> >
> > other attached packages:
> >   [1] GenomicAlignments_1.0.1 BSgenome_1.32.0         Rsamtools_1.16.0
> >   [4] Biostrings_2.32.0       XVector_0.4.0           GenomicRanges_1.16.3
> >   [7] GenomeInfoDb_1.0.2      IRanges_1.22.6          Biobase_2.24.0
> > [10] BiocGenerics_0.10.0
> >
> > loaded via a namespace (and not attached):
> >   [1] BatchJobs_1.2      BBmisc_1.6         BiocParallel_0.6.0 bitops_1.0-6
> >   [5] brew_1.0-6         codetools_0.2-8    DBI_0.2-7          digest_0.6.4
> >   [9] fail_1.2           foreach_1.4.2      iterators_1.0.7    plyr_1.8.1
> > [13] Rcpp_0.11.1        RSQLite_0.11.4     sendmailR_1.1-2    stats4_3.1.0
> > [17] stringr_0.6.2      tools_3.1.0        zlibbioc_1.10.0
> >
> > --
> > Sent via the guest posting facility at bioconductor.org.
> >
> 
> 