[Bioc-devel] bplapply Processes Sometimes Stall
Morgan, Martin
Martin.Morgan at roswellpark.org
Mon Jan 4 03:07:57 CET 2016
Hi Dario -- the most likely explanation, without a reproducible example, is that the code used on workers sometimes puts R into a state that it cannot recover from.
The first step in debugging this is to run the code serially, e.g., using SerialParam, and perhaps register(SerialParam()) (which makes serial evaluation the default when bplapply() is invoked without a BPPARAM argument).
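As a minimal sketch of the serial approach, assuming a placeholder function work() standing in for whatever is actually evaluated on the workers (the name and its body are hypothetical, not from the original code):

```r
library(BiocParallel)

# Hypothetical stand-in for the real per-task computation
work <- function(i) {
    sqrt(i)
}

# Option 1: request serial evaluation explicitly for this call.
# Everything runs in the current R process, so errors, warnings and
# debugging tools such as traceback() or browser() behave as usual.
res1 <- bplapply(1:4, work, BPPARAM = SerialParam())

# Option 2: make serial evaluation the default, so that a bplapply()
# call without a BPPARAM argument runs serially.
register(SerialParam())
res2 <- bplapply(1:4, work)
```

If the stall or an error shows up in the serial run, the problem is likely in the code being evaluated rather than in the parallel back-end.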
BiocParallel 1.5.12 is from the 'devel' branch of Bioconductor, which is supposed to be used (currently) with R-devel; please always use the appropriate version of R, with packages installed using biocLite(), when reporting problems.
Probably this belongs on support.bioconductor.org, where others may more easily benefit from your experience.
A couple of things have come up while looking into your problem, in particular how R can get into the situation where several processes share a socket connection in the CLOSE_WAIT state. I'm still exploring solutions, but it is not obvious that these would address whatever your underlying issue might be; R might be more helpful in saying that something has gone wrong, without being able to say exactly what.
Martin
________________________________________
From: Bioc-devel [bioc-devel-bounces at r-project.org] on behalf of Dario Strbenac [dstr7320 at uni.sydney.edu.au]
Sent: Friday, January 01, 2016 9:00 PM
To: bioc-devel at r-project.org
Subject: Re: [Bioc-devel] bplapply Processes Sometimes Stall
Good day,
I haven't been able to make a small and reproducible example, but I am using bpstart and bpstop to run a loop with 25 workers multiple times on a large bioinformatics dataset. After the loop has run successfully a few times, a small number of the R workers use 100% CPU endlessly:
  PID USER  PR NI    VIRT    RES   SHR S  %CPU %MEM   TIME+ COMMAND
 3300 dario 20  0 1190832 837212 17988 R 100.0  0.2 3848:00 R
 5014 dario 20  0 1194528 829084  8224 R  99.8  0.2 3843:44 R
 5015 dario 20  0 1194532 829088  8224 R  99.8  0.2 3843:44 R
There are also three connections belonging to the R processes waiting to close:
~$ lsof -i | grep CLOSE
R 3300 dario 1025u IPv4 160778259 0t0 TCP localhost:11881->localhost:49379 (CLOSE_WAIT)
R 5014 dario 1025u IPv4 160778259 0t0 TCP localhost:11881->localhost:49379 (CLOSE_WAIT)
R 5015 dario 1025u IPv4 160778259 0t0 TCP localhost:11881->localhost:49379 (CLOSE_WAIT)
~$ lsof -i | grep -c R
256
I use:
R version 3.2.3 (2015-12-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Debian GNU/Linux 8 (jessie)
with BiocParallel 1.5.12
--------------------------------------
Dario Strbenac
PhD Student
University of Sydney
Camperdown NSW 2050
Australia
_______________________________________________
Bioc-devel at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel