[BioC] [devteam-bioc] readGAlignmentPairs performance issue
Phil East
philip.east at cancer.org.uk
Tue May 20 15:31:40 CEST 2014
Hi Valerie,
Thank you for getting back to me. Here are the times for
readGAlignmentPairs, readGAlignmentsList, and scanBam using the code you
sent.
$readGAlignmentsList
    user   system  elapsed
2529.510   57.487 2589.144

$scanBam
    user   system  elapsed
2465.353   49.404 2516.275

$readGAlignmentPairs
    user   system  elapsed
2560.754   56.612 2619.769
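
To put these numbers side by side, here is a minimal base R sketch. The elapsed values are copied from the output above and the 208 s baseline is the R-3.0.2 / BioC 2.13 figure from the original report quoted below; the variable names are only illustrative. It suggests the three functions are within about 4% of each other, and that each is roughly 12 times slower than the old baseline.

```r
# Elapsed times in seconds, copied from the system.time() output above.
elapsed <- c(readGAlignmentsList = 2589.144,
             scanBam             = 2516.275,
             readGAlignmentPairs = 2619.769)

# Elapsed time of each function relative to scanBam().
ratio_to_scanBam <- elapsed / elapsed[["scanBam"]]
print(round(ratio_to_scanBam, 3))

# Slowdown relative to the 208 s reported for readGAlignmentPairs()
# under R-3.0.2 / BioC 2.13 / Rsamtools_1.14.3.
baseline_seconds <- 208
slowdown_vs_old <- elapsed / baseline_seconds
print(round(slowdown_vs_old, 1))
```

Since scanBam() is nearly as slow as the two higher-level readers, the ratios are consistent with the extra time being spent in code that all three functions share rather than in pairing logic specific to readGAlignmentPairs().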
Best wishes
Phil
On Fri, 2014-05-16 at 12:55 -0700, Valerie Obenchain wrote:
> Hi Phil,
>
> We have several functions that call the same C code in the background.
> To help isolate the problem can you please run your code with scanBam()
> and readGAlignmentsList()?
>
> bf <- BamFile(fl, asMates=TRUE)
> readGAlignmentsList(bf, param=param0)
> scanBam(bf, param=param0)
>
> readGAlignmentsList() and readGAlignmentPairs() should be very close in
> time. scanBam() will be faster but not by a huge amount.
>
> Thanks.
> Valerie
>
>
> On 05/13/2014 07:23 AM, Maintainer wrote:
> > Hi Guys,
> >
> > I'm experiencing some performance issues with readGAlignmentPairs from the latest version of Bioconductor (GenomicAlignments_1.0.1, BioC 2.14, R 3.1.0).
> >
> > Reading RNASeq paired reads aligned to chr19 (mm9) from a BAM file containing 108,592,829 paired reads takes 3118s. The same code run in R-3.0.2, BioC 2.13, Rsamtools_1.14.3 takes 208s. The results are identical across the two versions.
> >
> > Here's the code:
> >
> > library(GenomicAlignments)
> > library(Rsamtools)
> >
> > param0 <- ScanBamParam(which=GRanges(seqnames="chr19",
> >                                      ranges=IRanges(start=1, end=chr19Length)))
> > rd <- readGAlignmentPairs(bamFile, param=param0)
> >
> > Any ideas as to why this might be?
> >
> > Thanks in advance
> >
> > Phil East
> >
> >
> >
> > -- output of sessionInfo():
> >
> > R version 3.1.0 (2014-04-10)
> > Platform: x86_64-unknown-linux-gnu (64-bit)
> >
> > locale:
> > [1] LC_CTYPE=en_GB LC_NUMERIC=C LC_TIME=en_GB
> > [4] LC_COLLATE=en_GB LC_MONETARY=en_GB LC_MESSAGES=en_GB
> > [7] LC_PAPER=en_GB LC_NAME=C LC_ADDRESS=C
> > [10] LC_TELEPHONE=C LC_MEASUREMENT=en_GB LC_IDENTIFICATION=C
> >
> > attached base packages:
> > [1] grDevices datasets parallel stats graphics utils methods
> > [8] base
> >
> > other attached packages:
> > [1] GenomicAlignments_1.0.1 BSgenome_1.32.0 Rsamtools_1.16.0
> > [4] Biostrings_2.32.0 XVector_0.4.0 GenomicRanges_1.16.3
> > [7] GenomeInfoDb_1.0.2 IRanges_1.22.6 Biobase_2.24.0
> > [10] BiocGenerics_0.10.0
> >
> > loaded via a namespace (and not attached):
> > [1] BatchJobs_1.2 BBmisc_1.6 BiocParallel_0.6.0 bitops_1.0-6
> > [5] brew_1.0-6 codetools_0.2-8 DBI_0.2-7 digest_0.6.4
> > [9] fail_1.2 foreach_1.4.2 iterators_1.0.7 plyr_1.8.1
> > [13] Rcpp_0.11.1 RSQLite_0.11.4 sendmailR_1.1-2 stats4_3.1.0
> > [17] stringr_0.6.2 tools_3.1.0 zlibbioc_1.10.0
> >
> > --
> > Sent via the guest posting facility at bioconductor.org.