[R-sig-hpc] mclapply processes return an error for every nth job, where n is the number of cores, using flowCore and ggcyto
MEC, sending from stowers.org
Mon Sep 16 20:33:21 CEST 2019
I am experiencing a strange issue in my use of mclapply.
mclapply is returning an error for every nth job, where n is the number of cores.
As part of debugging, I'm reporting the indices of the jobs that return an error. They are the same from run to run until I change the value of mc.cores. I don't yet see what governs the value of the first error index, but subsequent errors occur every additional mc.cores slots.
Example: running 383 jobs across 70 cores, I report
* FailCount: 5
* FailIndex: 61 131 201 271 341
Notice that diff(FailIndex) is consistently 70.
Example: running the same 383 jobs (in the same order) across 80 cores, I report
* FailCount: 5
* FailIndex: 51 131 211 291 371
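The arithmetic behind the two examples can be checked with a short sketch. It assumes that, with mc.preschedule=TRUE, job i is handed to worker ((i - 1) mod mc.cores) + 1; that round-robin split is an assumption about mclapply's scheduling, not something established in this report, and `worker_of` is a hypothetical helper. Under that assumption, all failures within a run map to a single worker.

```r
# Failing indices copied from the two runs reported above.
fail_70 <- c(61, 131, 201, 271, 341)
fail_80 <- c(51, 131, 211, 291, 371)

# Assumption: prescheduling assigns job i to worker ((i - 1) %% cores) + 1.
worker_of <- function(index, cores) ((index - 1) %% cores) + 1

workers_70 <- unique(worker_of(fail_70, 70))  # one worker id if the assumption holds
workers_80 <- unique(worker_of(fail_80, 80))

# Jobs that fail in both runs.
common <- intersect(fail_70, fail_80)

print(list(workers_70 = workers_70, workers_80 = workers_80, common = common))
```

If the assumption is right, the pattern points at one worker per run rather than at five independent failures, and the index shared by both runs is a natural first candidate to inspect.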
The 5 specific errors are all the same and occur during the creation of a plot using ggcyto::ggcyto. The error only occurs when a plot is attempted with an empty dataset (ggcyto has since been fixed to handle empty data gracefully, but I'm not running the latest version). The dataset is read from disk within the forked process using flowCore::read.FCS, and there is no reason that the dataset should be empty. It is as if running under mclapply were somehow affecting the I/O of read.FCS.
So far I have tried:
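To test whether reads really come back short inside forked workers, one option is to record what each worker sees before any plotting happens. The following is only a sketch using base R and temporary stand-in files; `paths` and `probe` are hypothetical names, and in the real pipeline the probe would call flowCore::read.FCS and record the number of events it returns instead of using readBin.

```r
library(parallel)

# Hypothetical stand-in inputs: three small temporary binary files.
paths <- vapply(1:3, function(i) {
  p <- tempfile(fileext = ".bin")
  writeBin(as.raw(seq_len(10 * i)), p)
  p
}, character(1))

# Record what the worker actually reads, together with its process id.
probe <- function(path) {
  bytes <- readBin(path, what = "raw", n = file.size(path))
  data.frame(pid = Sys.getpid(),
             path = path,
             size_on_disk = file.size(path),
             bytes_read = length(bytes),
             stringsAsFactors = FALSE)
}

report <- do.call(rbind, mclapply(paths, probe, mc.cores = 2))

# Rows where a worker read less than what is on disk.
short_reads <- subset(report, bytes_read < size_on_disk)
print(report)
```

Comparing such a report between lapply and mclapply runs would show whether the empty dataset originates in the read itself or later in the job.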
* Running job using lapply instead of mclapply. Result: all jobs complete; FailCount: 0.
* Using simplified version of parallelization, directly using mccollect/mcparallel. Result: same pattern of error.
* Changing mc.cores option (the number of cores used). Result: the jobs which produce the error change as described, every mc.cores-th job fails.
* As suggested<https://community.rstudio.com/t/bug-ggsave-does-not-work-when-called-in-mclapply-in-rstudio-ide-same-code-works-perfect-at-cli/7991/2>, calling `suppressGraphics` (from R.devices) at top level of each forked process. Result: same pattern.
* Removing dependency on data.table (rbindlist). Result: same pattern.
* Calling mclapply with mc.preschedule=FALSE. Result: only one job fails, whose index is the second one reported when mc.preschedule=TRUE for the same mc.cores value.
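The difference between the prescheduled and non-prescheduled results may be related to a point in ?mclapply: with prescheduling, each forked job typically covers several values of X, and a "try-error" is returned for all values in that job even if not all of them failed. A minimal sketch of one way to get per-element error reporting is to catch errors inside the function itself; `job`, `safe_job`, and the "job_error" class are hypothetical stand-ins, not the code used in this report.

```r
library(parallel)

# Hypothetical stand-in for the real per-file job: only index 5 fails.
job <- function(i) {
  if (i == 5) stop("empty dataset for job ", i)
  i^2
}

# Catch the error per element so that one failure cannot be reported
# for the other elements handled by the same worker.
safe_job <- function(i) {
  tryCatch(job(i),
           error = function(e) structure(conditionMessage(e), class = "job_error"))
}

res <- mclapply(1:10, safe_job, mc.cores = 2)

# Indices of the elements whose own call raised an error.
fail_index <- which(vapply(res, inherits, logical(1), what = "job_error"))
print(fail_index)
```

With this wrapper, FailIndex lists only the elements whose own call failed, which makes it easier to compare runs with mc.preschedule=TRUE and mc.preschedule=FALSE.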
I thought to additionally try:
* Passing an affinity.list to mclapply to ensure that forked jobs would not run on the same processor as the caller of mclapply (given a "hunch" that the error-producing process may be running on the "head" node). Result: same as mc.preschedule=FALSE alone. Not sure my implementation is correct:
I have found the following reports of similar issues, none with a resolution:
* mclapply encounters errors depending on core id?<https://stackoverflow.com/questions/52745779/mclapply-encounters-errors-depending-on-core-id>
* mclapply fails with data.table<https://stackoverflow.com/questions/54959207/mclapply-fails-with-data-table>
* Bug: ggsave() does not work when called in mclapply() in RStudio IDE (same code works perfect at CLI)<https://community.rstudio.com/t/bug-ggsave-does-not-work-when-called-in-mclapply-in-rstudio-ide-same-code-works-perfect-at-cli/7991>
I appreciate any insight or guidance in how to better approach sleuthing the root cause or fixing this matter, or other reports of similar problems.
malcolm_cook at stowers.org
R version 3.5.2 (2018-12-20)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)
Matrix products: default
attached base packages:
 parallel stats graphics grDevices utils datasets methods
other attached packages:
 R.devices_2.16.0 gtools_3.8.1
 ggcyto_1.10.2 flowWorkspace_3.30.2
 ncdfFlow_2.28.1 BH_1.69.0-1
 RcppArmadillo_0.9.400.3.0 flowCore_1.48.1
 ggplot2_3.1.1 ash_1.0-15
loaded via a namespace (and not attached):
 tidyselect_0.2.5 purrr_0.3.2 lattice_0.20-38
 pcaPP_1.9-73 colorspace_1.4-1 stats4_3.5.2
 base64enc_0.1-3 XML_3.98-1.19 rlang_0.3.4
 R.oo_1.22.0 hexbin_1.27.3 pillar_1.4.0
 R.utils_2.8.0 glue_1.3.1 withr_2.1.2
 Rgraphviz_2.26.0 BiocGenerics_0.28.0 RColorBrewer_1.1-2
 matrixStats_0.54.0 plyr_1.8.4 robustbase_0.93-5
 stringr_1.4.0 zlibbioc_1.28.0 munsell_0.5.0
 gtable_0.3.0 R.methodsS3_1.7.1 mvtnorm_1.0-10
 latticeExtra_0.6-28 Biobase_2.42.0 Cairo_1.5-10
 DEoptimR_1.0-8 Rcpp_1.0.1 KernSmooth_2.23-15
 corpcor_1.6.9 scales_1.0.0 graph_1.60.0
 IDPmisc_1.1.19 gridExtra_2.3 stringi_1.4.3
 dplyr_0.8.1 grid_3.5.2 tools_3.5.2
 magrittr_1.5 lazyeval_0.2.2 tibble_2.1.1
 cluster_2.0.9 crayon_1.3.4 rrcov_1.4-7
 pkgconfig_2.0.2 MASS_7.3-51.4 flowViz_1.46.1
 data.table_1.12.2 assertthat_0.2.1 R6_2.4.0