[BioC] [devteam-bioc] readGAlignmentPairs performance issue

Valerie Obenchain vobencha at fhcrc.org
Fri May 16 21:55:01 CEST 2014


Hi Phil,

We have several functions that call the same C code in the background.
To help isolate the problem, can you please run your code with scanBam()
and readGAlignmentsList() as well?

bf <- BamFile(fl, asMates=TRUE)
readGAlignmentsList(bf, param=param0)
scanBam(bf, param=param0)

readGAlignmentsList() and readGAlignmentPairs() should be very close in
time. scanBam() will be faster, but not by a huge amount.
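
A minimal sketch of how the three timings could be collected side by side
with system.time(). The example BAM file shipped with Rsamtools and the
"seq1" range are placeholders (assumptions, not the data from the original
post); substitute your own file path and the param0 from your code.

```r
library(GenomicAlignments)
library(Rsamtools)

## Assumption: placeholder input. Replace with your own BAM path and region.
fl <- system.file("extdata", "ex1.bam", package="Rsamtools")
param0 <- ScanBamParam(which=GRanges(seqnames="seq1",
                                     ranges=IRanges(start=1, end=1575)))

bf <- BamFile(fl, asMates=TRUE)

## Time each reader on the same file and region.
timings <- rbind(
    readGAlignmentPairs=system.time(readGAlignmentPairs(bf, param=param0)),
    readGAlignmentsList=system.time(readGAlignmentsList(bf, param=param0)),
    scanBam=system.time(scanBam(bf, param=param0))
)

## Elapsed wall-clock seconds per function.
timings[, "elapsed"]
```

On a file this small the numbers will be near zero; the comparison only
becomes informative on a realistically sized BAM file.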

Thanks.
Valerie


On 05/13/2014 07:23 AM, Maintainer wrote:
> Hi Guys,
>
> I'm experiencing some performance issues with readGAlignmentPairs from the latest version of Bioconductor (GenomicAlignments_1.0.1, BioC 2.14, R 3.1.0)
>
> Reading RNASeq paired reads aligned to chr19 (mm9) from a BAM file containing 108,592,829 paired reads takes 3118s. The same code run in R-3.0.2, BioC 2.13, Rsamtools_1.14.3 takes 208s. The results are identical across the two versions.
>
> Here's the code:
>
> library(GenomicAlignments)
> library(Rsamtools)
>
> param0 <- ScanBamParam(which=GRanges(seqnames="chr19",
>     ranges=IRanges(start=1, end=chr19Length)))
> rd <- readGAlignmentPairs(bamFile, param=param0)
>
> Any ideas as to why this might be?
>
> Thanks in advance
>
> Phil East
>
>
>
>   -- output of sessionInfo():
>
> R version 3.1.0 (2014-04-10)
> Platform: x86_64-unknown-linux-gnu (64-bit)
>
> locale:
>   [1] LC_CTYPE=en_GB       LC_NUMERIC=C         LC_TIME=en_GB
>   [4] LC_COLLATE=en_GB     LC_MONETARY=en_GB    LC_MESSAGES=en_GB
>   [7] LC_PAPER=en_GB       LC_NAME=C            LC_ADDRESS=C
> [10] LC_TELEPHONE=C       LC_MEASUREMENT=en_GB LC_IDENTIFICATION=C
>
> attached base packages:
> [1] grDevices datasets  parallel  stats     graphics  utils     methods
> [8] base
>
> other attached packages:
>   [1] GenomicAlignments_1.0.1 BSgenome_1.32.0         Rsamtools_1.16.0
>   [4] Biostrings_2.32.0       XVector_0.4.0           GenomicRanges_1.16.3
>   [7] GenomeInfoDb_1.0.2      IRanges_1.22.6          Biobase_2.24.0
> [10] BiocGenerics_0.10.0
>
> loaded via a namespace (and not attached):
>   [1] BatchJobs_1.2      BBmisc_1.6         BiocParallel_0.6.0 bitops_1.0-6
>   [5] brew_1.0-6         codetools_0.2-8    DBI_0.2-7          digest_0.6.4
>   [9] fail_1.2           foreach_1.4.2      iterators_1.0.7    plyr_1.8.1
> [13] Rcpp_0.11.1        RSQLite_0.11.4     sendmailR_1.1-2    stats4_3.1.0
> [17] stringr_0.6.2      tools_3.1.0        zlibbioc_1.10.0
>
> --
> Sent via the guest posting facility at bioconductor.org.
>


-- 
Valerie Obenchain
Program in Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, Seattle, WA 98109

Email: vobencha at fhcrc.org
Phone: (206) 667-3158


