[Bioc-devel] BiocParallel: fine-grained progress bar

Martin Morgan martin.morgan at roswellpark.org
Sun Dec 31 19:41:07 CET 2017


On 12/30/2017 04:08 PM, Ludwig Geistlinger wrote:
> Hi,
> 
> 
> I'm currently playing around with progress bars in BiocParallel - which is a great package! ;-)
> 
> 
> For demonstration, I'm using the example code from DESeq2::DESeq.
> 
> 
> library(DESeq2)
> library(BiocParallel)
> 
> f <- function(mu)
> {
>      cnts <- matrix(rnbinom(n=1000, mu=mu, size=1/0.5), ncol=10)
>      cond <- factor(rep(1:2, each=5))
> 
>      # object construction
>      suppressMessages({
>          dds <- DESeqDataSetFromMatrix(cnts, DataFrame(cond), ~ cond)
>          dds <- DESeq(dds)
>      })
>      res <- results(dds)
> 
>      return(res)
> }
> 
> 
> and apply 'f' to a range of 'mu' values using 'bplapply'.
> 
> mu.grid <- 90:120
> x <- bplapply(mu.grid, f)
> 
> 
> Now, switching to serial execution and enabling the progress bar
> 
> bp <- registered()$SerialParam
> bpprogressbar(bp) <- TRUE
> register(bp)
> 
> x <- bplapply(mu.grid, f)
> 
> gives me no progress bar at all.

Probably a limitation (aka bug).

> 
> Furthermore, switching to multi-core execution (2 cores) and enabling the progress bar
> 
> bp <- registered()$MulticoreParam
> bpprogressbar(bp) <- TRUE
> register(bp)
> 
> x <- bplapply(mu.grid, f)
>   |===================================                                   |  50%
>   |======================================================================| 100%
> 
> gives me only a very coarse-grained progress bar (it updates once when 50% of the job is done and once when the job is complete).
> 
> What I actually want to have is a fine-grained progress bar that updates whenever f finishes execution on an element of the vector I am applying over.

With four workers and a job with X = 1:100, bplapply by default divides 
the job into four equally sized tasks (1:25, 26:50, ...) and sends them 
to the workers. Progress is reported as each task (e.g., 1:25) 
completes, so there are at most four ticks; with your two cores that 
gives the two updates you see. If fine-grained progress trumps all other 
concerns, setting the number of tasks equal to length(X) (the 'tasks' 
argument of the param) makes the bar advance after each element, at the 
cost of more communication with the workers.
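
A minimal sketch of that workaround, assuming a Unix-like machine where
MulticoreParam can fork two workers. The toy function (a short sleep
followed by sqrt) is only a stand-in for a real workload such as 'f'
above and is not part of the original example:

```r
library(BiocParallel)

X <- 1:20

# One task per element of X, so the progress bar can tick after every
# element instead of once per worker-sized chunk.
bp <- MulticoreParam(workers = 2, tasks = length(X), progressbar = TRUE)

# Stand-in workload (hypothetical): sleep briefly, then return sqrt(i).
res <- bplapply(X, function(i) {
    Sys.sleep(0.05)
    sqrt(i)
}, BPPARAM = bp)

# Confirm how many tasks the param is configured to use.
bptasks(bp)
```

The same setting can be applied to an existing param with
bptasks(bp) <- length(X) before calling register(bp). More tasks means
more round trips to the workers, so this is probably only worthwhile
when each element takes long enough for the overhead not to matter.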

It's not impossible to arrange for more fine-grained progress in all 
cases, and it's a reasonable feature request.

Martin

> 
> 
> In "normal" serial R execution, the desired behavior can be illustrated via
> 
> pb <- txtProgressBar(0, length(mu.grid), style=3, width=length(mu.grid))
> r <- vector(mode="list", length=length(mu.grid))
> for(i in seq_along(mu.grid))
> {
>      r[[i]] <- f(mu.grid[i])
>      # advance the bar only after f has finished on this element
>      setTxtProgressBar(pb, i)
> }
> close(pb)
> 
> 
> Is there a way to obtain something similar using BiocParallel?
> 
> Thanks,
> Ludwig
> 
> 
> --
> Dr. Ludwig Geistlinger
> CUNY School of Public Health
> 
>> sessionInfo()
> R version 3.4.2 (2017-09-28)
> Platform: x86_64-apple-darwin15.6.0 (64-bit)
> Running under: macOS High Sierra 10.13.1
> 
> Matrix products: default
> BLAS: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRblas.0.dylib
> LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib
> 
> locale:
> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
> 
> attached base packages:
> [1] parallel  stats4    stats     graphics  grDevices utils     datasets
> [8] methods   base
> 
> other attached packages:
>   [1] BiocParallel_1.12.0        DESeq2_1.18.1
>   [3] SummarizedExperiment_1.8.0 DelayedArray_0.4.1
>   [5] matrixStats_0.52.2         Biobase_2.38.0
>   [7] GenomicRanges_1.30.0       GenomeInfoDb_1.14.0
>   [9] IRanges_2.12.0             S4Vectors_0.16.0
> [11] BiocGenerics_0.24.0
> 
> loaded via a namespace (and not attached):
>   [1] genefilter_1.60.0       locfit_1.5-9.1          splines_3.4.2
>   [4] lattice_0.20-35         colorspace_1.3-2        htmltools_0.3.6
>   [7] base64enc_0.1-3         blob_1.1.0              survival_2.41-3
> [10] XML_3.98-1.9            rlang_0.1.4             DBI_0.7
> [13] foreign_0.8-69          bit64_0.9-7             RColorBrewer_1.1-2
> [16] GenomeInfoDbData_0.99.1 plyr_1.8.4              stringr_1.2.0
> [19] zlibbioc_1.24.0         munsell_0.4.3           gtable_0.2.0
> [22] htmlwidgets_0.9         memoise_1.1.0           latticeExtra_0.6-28
> [25] knitr_1.17              geneplotter_1.56.0      AnnotationDbi_1.40.0
> [28] htmlTable_1.9           Rcpp_0.12.14            acepack_1.4.1
> [31] xtable_1.8-2            scales_0.5.0            backports_1.1.1
> [34] checkmate_1.8.5         Hmisc_4.0-3             annotate_1.56.1
> [37] XVector_0.18.0          bit_1.1-12              gridExtra_2.3
> [40] ggplot2_2.2.1           digest_0.6.12           stringi_1.1.6
> [43] grid_3.4.2              tools_3.4.2             bitops_1.0-6
> [46] magrittr_1.5            RSQLite_2.0             lazyeval_0.2.1
> [49] RCurl_1.95-4.8          tibble_1.3.4            Formula_1.2-2
> [52] cluster_2.0.6           Matrix_1.2-12           data.table_1.10.4-3
> [55] rpart_4.1-11            nnet_7.3-12             compiler_3.4.2
> 