[Bioc-devel] GPos slower than GRanges ?
Charles Plessy
charles-listes+bioc-devel at plessy.org
Fri Feb 9 05:03:47 CET 2018
Hello,
I have just discovered the GPos class, and I would like to use it in
my "CAGEr" package, where for the moment I store single-nucleotide
positions of transcription start sites in GRanges of width 1.
But a simple microbenchmark suggests that, although GPos objects are
more memory-efficient, they may also be more CPU-hungry, at least
with the "range" function.
Is there a way to optimise this, or is it better to stay with
GRanges of width 1 when memory is not an issue?
> gpos1 <- GPos(c("chr1:44-53", "chr1:5-10", "chr2:2-5"))
> granges1 <- GRanges(gpos1)
> microbenchmark::microbenchmark(range(granges1), range(gpos1))
Unit: milliseconds
            expr      min       lq    mean   median       uq      max neval cld
 range(granges1) 21.42761 21.97009 24.1627 22.24532 22.92655 179.9715   100  a
    range(gpos1) 30.11515 30.84472 32.8824 31.36639 32.19281 104.3027   100   b
> sessionInfo()
R version 3.4.3 (2017-11-30)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Debian GNU/Linux 9 (stretch)
Matrix products: default
BLAS: /usr/lib/libblas/libblas.so.3.7.0
LAPACK: /usr/lib/lapack/liblapack.so.3.7.0
locale:
[1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C LC_TIME=en_GB.UTF-8
[4] LC_COLLATE=en_GB.UTF-8 LC_MONETARY=en_GB.UTF-8 LC_MESSAGES=en_GB.UTF-8
[7] LC_PAPER=en_GB.UTF-8 LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets methods base
other attached packages:
[1] GenomicRanges_1.31.16 GenomeInfoDb_1.15.5 IRanges_2.13.22 S4Vectors_0.17.30
[5] BiocGenerics_0.25.2
loaded via a namespace (and not attached):
[1] Rcpp_0.12.14 XVector_0.19.8 MASS_7.3-47 splines_3.4.3
[5] zlibbioc_1.24.0 munsell_0.4.3 lattice_0.20-35 colorspace_1.3-2
[9] rlang_0.1.4 multcomp_1.4-8 plyr_1.8.4 tools_3.4.3
[13] grid_3.4.3 gtable_0.2.0 TH.data_1.0-8 survival_2.41-3
[17] yaml_2.1.15 lazyeval_0.2.1 tibble_1.3.4 Matrix_1.2-12
[21] GenomeInfoDbData_0.99.1 ggplot2_2.2.1 codetools_0.2-15 microbenchmark_1.4-2.1
[25] bitops_1.0-6 RCurl_1.95-4.10 sandwich_2.4-0 compiler_3.4.3
[29] scales_0.5.0 mvtnorm_1.0-6 zoo_1.8-0
(I have also run a benchmark on "real" data, which confirmed the result above.)
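For readers who want to repeat the comparison at a larger scale, here is a minimal sketch. It is not the "real" data benchmark mentioned above: the object names (gr_runs, gpos_big, granges_big) and the synthetic runs of 50,000 positions per chromosome are assumptions chosen for illustration, and the results may differ between GenomicRanges versions.

```r
library(GenomicRanges)   # provides GRanges() and GPos()
library(microbenchmark)

# Hypothetical synthetic data: one run of consecutive positions per chromosome.
gr_runs <- GRanges(c("chr1:1-50000", "chr2:1-50000"))

gpos_big    <- GPos(gr_runs)      # 100,000 single-nucleotide positions
granges_big <- GRanges(gpos_big)  # the same positions as ranges of width 1

# Memory footprint of each representation.
print(object.size(gpos_big), units = "Kb")
print(object.size(granges_big), units = "Kb")

# CPU time of range() on each representation.
microbenchmark(range(granges_big), range(gpos_big), times = 20)
```

Comparing the two object.size() values with the two timing rows shows the memory and CPU trade-off side by side.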
Have a nice day,
Charles
--
Charles Plessy, Ph.D. – RIKEN Center for Life Science Technologies
Division of Genomic Technologies – Genomics Miniaturization Technology Unit
1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045 Japan
■■□―――――――――― http://population-transcriptomics.org ――――――――――□■■