[Bioc-devel] bplapply - memory issue

Paul Benton hpbenton at scripps.edu
Sat Apr 28 01:14:13 CEST 2018


Dear all,

I’m trying to understand how bplapply handles memory, and what could be causing the following error:

Error: 'bplapply' receive data failed: error reading from connection

When I get this error, the system (Ubuntu Linux 16.04) seems to have filled up its memory and then severs the connection to the slaves, but the slaves stay alive as zombies. Unfortunately, because of the way the algorithm works, it is hard to estimate the amount of memory each core will need before running. I have a small example below that seems to show the same behavior, except that the zombies are later killed, which frees up the system in case the command was wrapped in a try/except.
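
For reference, the try/except wrapping mentioned above could look roughly like the following in R. This is only a sketch: `run_guarded` is a hypothetical helper name, not part of BiocParallel, and it only catches the R-level error in the master process. It does not limit memory use or clean up worker processes by itself.

```r
# Hypothetical helper (not part of BiocParallel): call a function and turn
# an R error into a return value, so the calling script can continue and
# clean up instead of aborting.
run_guarded <- function(fun) {
  tryCatch(
    list(ok = TRUE, value = fun(), message = NULL),
    error = function(e) {
      list(ok = FALSE, value = NULL, message = conditionMessage(e))
    }
  )
}

# Intended use with the example below (needs BiocParallel, so left commented):
# res <- run_guarded(function() bplapply(lmat, cor, BPPARAM = MulticoreParam(4)))
# if (!res$ok) message("bplapply failed: ", res$message)
```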

So my question is: is there a memory-safe way to run bplapply, or at least a way to stop it from zombifying its slaves?

Best regards,
Paul

PS: A quick Google search also shows the issue here, without a fix: https://github.com/YosefLab/scone/issues/69

Attempt to recreate the issue on the system with a small example ----
[varangian ~] :) ulimit 20
[varangian ~] :) R
R version 3.4.4 (2018-03-15) -- "Someone to Lean On"
….
> library(BiocParallel)
>  options(MulticoreParam=quote(MulticoreParam(workers=4)))
> lmat<-lapply(sample(1:100, 50), function(z){matrix(rnorm(100000, sd=z), ncol=10000)})
> foo<-bplapply(lmat, cor, BPPARAM=SerialParam()) #### run in serial
Error: cannot allocate vector of size 762.9 Mb
> gc()
….
> foo<-bplapply(lmat, cor, BPPARAM=MulticoreParam(4)) #### run in parallel on reduced 4 cores
Error: 'bplapply' receive data failed:
  cannot allocate vector of size 762.9 Mb
> Error in serialize(data, node$con, xdr = FALSE) : ignoring SIGPIPE signal
Error in serialize(data, node$con, xdr = FALSE) : ignoring SIGPIPE signal
Error in serialize(data, node$con, xdr = FALSE) : ignoring SIGPIPE signal
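
As a rough check on the 762.9 Mb figure in the errors above, the size of a single cor() result can be worked out from the matrix dimensions in the example. This is a minimal sketch assuming the result is stored as a dense matrix of 8-byte doubles. It ignores any temporary copies, so the real peak per worker is likely higher.

```r
# Dimensions taken from the example: 100000 values arranged in 10000 columns.
n_values <- 100000
n_cols   <- 10000
n_rows   <- n_values / n_cols            # 10 rows per matrix

# cor() on a matrix returns an n_cols x n_cols matrix of doubles (8 bytes each).
result_bytes <- n_cols * n_cols * 8
result_mb    <- result_bytes / 1024^2    # R's "Mb" here means 1024^2 bytes

# With four workers, up to four such results can exist at the same time.
workers  <- 4L
total_mb <- workers * result_mb

cat(sprintf("one result: %.1f Mb; %d concurrent results: %.1f Mb\n",
            result_mb, workers, total_mb))
# prints: one result: 762.9 Mb; 4 concurrent results: 3051.8 Mb
```

So each element of lmat is small (10 x 10000), but every task produces a 10000 x 10000 result, which matches the allocation that fails in both the serial and the parallel run.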


Error from the actual XCMS code ----
> xdata <- findChromPeaks(raw_data, param = cwp, BPPARAM=MulticoreParam(4))
Error: 'bplapply' receive data failed:
  error reading from connection
> sessionInfo()
Error in system(paste(which, shQuote(names[i])), intern = TRUE, ignore.stderr = TRUE) :
  cannot popen '/usr/bin/which 'uname' 2>/dev/null', probable reason 'Cannot allocate memory'
> system('free -m')
Warning message:
In system("free -m") : system call failed: Cannot allocate memory
> q()
…..
[varangian ~] :) free -m
              total        used        free      shared  buff/cache   available
Mem:           7983        7774         125           1          82          17
Swap:         16384       16186         198
[varangian ~] :) kill -9 $(pgrep R)
[varangian ~] :) R
> sessionInfo()
...
R version 3.4.4 (2018-03-15)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.4 LTS

Matrix products: default
BLAS: /usr/lib/libblas/libblas.so.3.6.0
LAPACK: /usr/lib/lapack/liblapack.so.3.6.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
[1] xcms_3.1.1          MSnbase_2.4.2       ProtGenerics_1.10.0
[4] mzR_2.13.5          Rcpp_0.12.16        BiocParallel_1.12.0
[7] Biobase_2.38.0      BiocGenerics_0.24.0

loaded via a namespace (and not attached):
 [1] pillar_1.2.1           compiler_3.4.4         BiocInstaller_1.28.0
 [4] RColorBrewer_1.1-2     plyr_1.8.4             tools_3.4.4
 [7] iterators_1.0.9        zlibbioc_1.24.0        MALDIquant_1.17
[10] digest_0.6.15          preprocessCore_1.40.0  tibble_1.4.2
[13] gtable_0.2.0           lattice_0.20-35        rlang_0.2.0
[16] Matrix_1.2-14          foreach_1.4.4          S4Vectors_0.16.0
[19] IRanges_2.12.0         stats4_3.4.4           multtest_2.34.0
[22] grid_3.4.4             impute_1.52.0          survival_2.41-3
[25] XML_3.98-1.10          RANN_2.5.1             limma_3.34.9
[28] ggplot2_2.2.1          MASS_7.3-49            splines_3.4.4
[31] scales_0.5.0           pcaMethods_1.70.0      codetools_0.2-15
[34] MassSpecWavelet_1.44.0 mzID_1.16.0            colorspace_1.3-2
[37] affy_1.56.0            lazyeval_0.2.1         munsell_0.4.3
[40] doParallel_1.0.11      vsn_3.46.0             affyio_1.48.0
