[Bioc-devel] lpsymphony - BiocParallel crash

Martin Morgan martin.morgan at roswellpark.org
Thu Sep 22 01:32:36 CEST 2016


On 09/21/2016 10:42 AM, Christian Arnold wrote:
> Dear Bioconductor developers,
>
>
> I am having a somewhat mysterious and challenging problem which we
> believe is a bug in either (1) BiocParallel or (2) the lpsymphony
> library from either Bioconductor or the SYMPHONY backend.

Thank you for the clear report.

I think the problem is that lpsymphony is compiled by default to use 
openmp, and this parallelization environment interferes with the fork 
processes that BiocParallel::MulticoreParam() uses.

I was led to this conclusion by using the 'gdb' debugger and
interrupting R when it hung: a number of openmp threads were still
present, as in the abbreviated session below.

$ Rdev -d gdb
GNU gdb (Ubuntu 7.11.1-0ubuntu1~16.04) 7.11.1
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
...
(gdb) r
...
 > source("lp.R", echo=T, max=Inf)    ## your script
 > library(BiocParallel)
 > library(lpsymphony)
 > lpsymphonyTest <- function(x) {
+     ## Example from the R help of lpsymphony_solve_LP
+     obj <- c(2, 4, 3)
+     mat <- matrix(c(3, 2, 1, 4, 1, 3, 2, 2, 2), nrow = 3)
+     dir <- c("<=", "<=", "<=")
+     rhs <- c(60, 40, 80)
+     max <- TRUE
+     test = lpsymphony_solve_LP(obj, mat, dir, rhs, max = max)
+     Sys.sleep(1)
+     return(1)
+ }

 > nIterationsAndCores = 2
 > register(MulticoreParam(workers = nIterationsAndCores))
 > ## register(bpstart(MulticoreParam(workers = nIterationsAndCores)))
 >
 > # First run: Two iterations, should run in parallel
 > commonGenes <- bplapply(X = seq(2), FUN = lpsymphonyTest)
 > # Second run: One iteration, should run on only one core
 > commonGenes <- bplapply(X = seq(1), FUN = lpsymphonyTest)
[New Thread 0x7fffeb483700 (LWP 29894)]
[New Thread 0x7fffeac82700 (LWP 29895)]
[New Thread 0x7fffea481700 (LWP 29896)]
[New Thread 0x7fffe9c80700 (LWP 29897)]
[New Thread 0x7fffe947f700 (LWP 29898)]
[New Thread 0x7fffe8c7e700 (LWP 29899)]
[New Thread 0x7fffe847d700 (LWP 29900)]

 > # Third run: Two iterations again, should run in parallel but crashed and never finishes
 > commonGenes <- bplapply(X = seq(2), FUN = lpsymphonyTest)
^C
...
(gdb) info threads
   Id   Target Id         Frame
* 1    Thread 0x7ffff7fc97c0 (LWP 29865) "R" 0x00007ffff5a91d13 in select () at ../sysdeps/unix/syscall-template.S:84
   2    Thread 0x7fffeb483700 (LWP 29894) "R" 0x00007ffff5f8cb4f in ?? () from /usr/lib/x86_64-linux-gnu/libgomp.so.1
   3    Thread 0x7fffeac82700 (LWP 29895) "R" 0x00007ffff5f8cb4f in ?? () from /usr/lib/x86_64-linux-gnu/libgomp.so.1
   4    Thread 0x7fffea481700 (LWP 29896) "R" 0x00007ffff5f8cb4f in ?? () from /usr/lib/x86_64-linux-gnu/libgomp.so.1
   5    Thread 0x7fffe9c80700 (LWP 29897) "R" 0x00007ffff5f8cb4f in ?? () from /usr/lib/x86_64-linux-gnu/libgomp.so.1
   6    Thread 0x7fffe947f700 (LWP 29898) "R" 0x00007ffff5f8cb4f in ?? () from /usr/lib/x86_64-linux-gnu/libgomp.so.1
   7    Thread 0x7fffe8c7e700 (LWP 29899) "R" 0x00007ffff5f8cb4f in ?? () from /usr/lib/x86_64-linux-gnu/libgomp.so.1
   8    Thread 0x7fffe847d700 (LWP 29900) "R" 0x00007ffff5f8cb4f in ?? () from /usr/lib/x86_64-linux-gnu/libgomp.so.1


I compiled lpsymphony without openmp by changing the top-level
lpsymphony/configure file to explicitly exclude openmp:

$ svn diff lpsymphony/configure
Index: lpsymphony/configure
===================================================================
--- lpsymphony/configure	(revision 121222)
+++ lpsymphony/configure	(working copy)
@@ -123,6 +123,7 @@
          else
  	    (cd src/SYMPHONY && \
  		./configure \
+		--enable-openmp=no \
  		--enable-static --disable-shared --with-pic \
  		--with-application=no --disable-dependency-tracking \
  		--disable-zlib --disable-bzlib \

and then installed it with R CMD INSTALL lpsymphony. Your test script
then succeeded, providing weak evidence that openmp was the problem.

A few points follow from this.

1. Since SYMPHONY already uses openmp, and it is a sophisticated
piece of software, there is probably little value in using BiocParallel
on top of it.

2. It seems that one can use SnowParam() effectively; this requires
modifying lpsymphonyTest() to require(lpsymphony), because the package
is not otherwise loaded in the independent worker processes:

lpsymphonyTest <- function(x) {
     ## Example from the R help of lpsymphony_solve_LP
     require(lpsymphony)
     ...

The cost of starting the independent processes can be amortized across a
session by starting them manually at the beginning (and stopping them at
the end):

param = bpstart(SnowParam(workers = nIterationsAndCores))
register(param)
...
bpstop(param)

3. It would be good if the maintainer of lpsymphony exposed the 
enable-openmp (and other?) flags so that one could R CMD INSTALL 
--configure-args="--enable-openmp=no" lpsymphony

4. I'll work to come up with a simpler example and see if I can make 
BiocParallel robust to use of openmp; this might be challenging for me.
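
For point 2, here is a minimal sketch of the complete SnowParam()
variant, assembled from the fragments above and the original test
script. It assumes BiocParallel and lpsymphony are both installed; the
Sys.sleep(1) call is left out, and the names run1 to run3 are only for
illustration.

```r
library(BiocParallel)

lpsymphonyTest <- function(x) {
    ## SnowParam() workers are independent R processes, so the
    ## package has to be loaded inside the function
    require(lpsymphony)
    ## Example from the R help of lpsymphony_solve_LP
    obj <- c(2, 4, 3)
    mat <- matrix(c(3, 2, 1, 4, 1, 3, 2, 2, 2), nrow = 3)
    dir <- c("<=", "<=", "<=")
    rhs <- c(60, 40, 80)
    test <- lpsymphony_solve_LP(obj, mat, dir, rhs, max = TRUE)
    1
}

nIterationsAndCores <- 2

## start the workers once and reuse them for every bplapply() call
param <- bpstart(SnowParam(workers = nIterationsAndCores))
register(param)

run1 <- bplapply(X = seq(2), FUN = lpsymphonyTest)  # two iterations
run2 <- bplapply(X = seq(1), FUN = lpsymphonyTest)  # one iteration
run3 <- bplapply(X = seq(2), FUN = lpsymphonyTest)  # two iterations again

## shut the workers down at the end of the session
bpstop(param)
```

With SnowParam() no process is forked, so the openmp threads in the
master R session should not be an issue for the workers.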

Hope that helps with, if it does not solve, the problem, and thank you
for the report.

Martin

>
> The problem is, in a nutshell, that R silently crashes or, to be more
> precise, never finishes the calculation when using the only function
> from lpsymphony in combination with BiocParallel.
>
> We updated to the latest version of lpsymphony last week, but that did
> not help.
> This can be reproduced with two different configurations, using
> different library versions of both BiocParallel and lpsymphony,
> respectively. We also tested this on two different machines (one of
> which runs R in the devel version, one does not; see the two
> sessionInfo() at the end of this message)
>
>
> The following code can reproduce the problem:
>
>
> library(BiocParallel)
> library(lpsymphony)
>
> lpsymphonyTest <- function(x){
>     # Example from the R help of lpsymphony_solve_LP
>     obj <- c(2, 4, 3)
>     mat <- matrix(c(3, 2, 1, 4, 1, 3, 2, 2, 2), nrow = 3)
>     dir <- c("<=", "<=", "<=")
>     rhs <- c(60, 40, 80)
>     max <- TRUE
>     test = lpsymphony_solve_LP(obj, mat, dir, rhs, max = max)
>     Sys.sleep(1)
>     return(1)
> }
>
> # First run: Two iterations, should run in parallel
> nIterationsAndCores = 2
> register(MulticoreParam(workers = nIterationsAndCores))
> commonGenes <- bplapply(X = seq(2), FUN = lpsymphonyTest)
>
> # Second run: One iteration, should run on only one core
> commonGenes <- bplapply(X = seq(1), FUN = lpsymphonyTest)
>
> # Third run: Two iterations again, should run in parallel but crashed
> # and never finishes
> commonGenes <- bplapply(X = seq(2), FUN = lpsymphonyTest)
>
>
>
> What happens in our case is that the third run never finishes. The two R
> processes are in state “SLEEP” rather than running. Once you execute the
> bplapply loop for only one iteration, every later call with more than
> one iteration hangs. It works fine for any number of iterations > 1,
> regardless of the order, but executing only a single iteration causes
> the problems afterwards. Note that registering again before the second
> and third run does not help and yields the same behavior.
>
>
>
> In fact, we have another issue, namely that the parallelization with
> BiocParallel in combination with the IHW package (which calls lpsymphony
> in the background) does not yield running times as expected but instead
> much, much longer ones. However, let's focus on this one first, because
> we think they might be related.
>
>
> Thanks for your help,
>
>
> Best
>
> Christian
>
>
>
>
> These are the two configurations where this problem can be reproduced:
>
>
> (1)
>
> R Under development (unstable) (2016-06-30 r70858)
> Platform: x86_64-pc-linux-gnu (64-bit)
> Running under: Ubuntu 16.04.1 LTS
>
> locale:
> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=de_DE.UTF-8 LC_COLLATE=en_US.UTF-8
> [5] LC_MONETARY=de_DE.UTF-8 LC_MESSAGES=en_US.UTF-8 LC_PAPER=de_DE.UTF-8 LC_NAME=C
> [9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=de_DE.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] lpsymphony_1.1.2 BiocParallel_1.7.8
>
> loaded via a namespace (and not attached):
> [1] parallel_3.4.0 tools_3.4.0
>
>
> (2)
>
> R version 3.3.1 (2016-06-21)
> Platform: x86_64-pc-linux-gnu (64-bit)
> Running under: CentOS release 6.5 (Final)
>
> locale:
> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8
> [4] LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C
> [10] LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] lpsymphony_1.0.2 BiocParallel_1.6.6
>
> loaded via a namespace (and not attached):
> [1] parallel_3.3.1 tools_3.3.1