[Rd] nested parallel workers

Simon Urbanek simon.urbanek at r-project.org
Mon Mar 30 23:51:17 CEST 2015


On Mar 30, 2015, at 4:40 PM, Valerie Obenchain <vobencha at fredhutch.org> wrote:

> On 03/25/2015 07:48 PM, Simon Urbanek wrote:
>> On Mar 25, 2015, at 3:46 PM, Valerie Obenchain <vobencha at fredhutch.org> wrote:
>> 
>>> Hi Simon,
>>> 
>>> I'm having trouble with nested parallel workers, specifically, forking inside socket connections.
>>> 
>> 
>> You simply can't, by definition: when you fork, *all* the workers share the same connection inherited from the parent, so a worker cannot safely use any I/O that it did not start itself, since a read in one worker affects all the workers.
>> 
> 
> Sorry if I'm missing the obvious here -
> I thought that since the fork workers were shut down by the time the SOCK worker returned to its master, conflicting I/O wouldn't be a problem.
> 

If the workers are done and don't use I/O, then all is well. However, it's not easy to guarantee that they don't use I/O, since they all start out with active sockets inherited from the parent; e.g. on exit they may flush the socket buffers, which would confuse the recipient. Interestingly, your example works fine on OS X but fails on Linux. I'll try to dig deeper in a quiet minute. In principle it should be sufficient to close all FDs right away in the child, which you can do when using mcparallel() but not when using mclapply().
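
A minimal sketch of the mcparallel() route, assuming a Unix-alike (forking is not available on Windows). The name forked_lapply is a hypothetical helper, not part of the parallel package. It only shows where child-side cleanup could go; the call that would actually close the inherited descriptors is not given in this thread, so it is left as a comment.

```r
library(parallel)

## Stand-in for mclapply(X, FUN), built on mcparallel()/mccollect().
## The expression handed to mcparallel() is evaluated in the forked child,
## so cleanup code could be placed ahead of the real work.
forked_lapply <- function(X, FUN) {
    jobs <- lapply(X, function(x) mcparallel({
        ## child process: inherited descriptors would be closed here
        ## (the exact call is an open question in this thread)
        FUN(x)
    }))
    ## results come back in the order of 'jobs', named by child pid
    unname(mccollect(jobs))
}

res <- forked_lapply(1:2, sqrt)
```

Inside a socket worker, forked_lapply(1:2, sqrt) would take the place of mclapply(1:2, sqrt).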

Cheers,
Simon



> There are quite a few examples floating around where SOCK workers are spawned on a cluster and multicore workers are called within them. If I understand correctly, this should not be done (or at least not encouraged). Instead, nested parallelism should only be done with distributed-memory workers: SOCK, MPI, etc.
> 
> Thanks.
> Valerie
> 
> 
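
The socket-only nesting described above can be sketched as follows. This is an illustration under assumptions, not a recommendation tested in this thread: it uses the PSOCK clusters from the parallel package rather than snow's SOCK type, and the inner cluster size of 2 is arbitrary.

```r
library(parallel)

## Inner level: each outer worker starts its own short-lived socket
## cluster instead of forking, so no connection is shared between processes.
fun <- function(i) {
    library(parallel)
    inner <- makeCluster(2)            # type defaults to "PSOCK"
    on.exit(stopCluster(inner))        # always shut the inner workers down
    parLapply(inner, 1:2, sqrt)
}

## Outer level: one socket worker, as in the original report.
cl <- makeCluster(1)
res <- clusterApply(cl, 1, fun)
stopCluster(cl)
```

Starting a fresh inner cluster on every call is slow compared with forking; the sketch trades speed for isolation of the connections.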
>> Cheers,
>> Simon
>> 
>> 
>>> When mclapply is called inside a SOCK, PSOCK or FORK worker I get an
>>> error in unserialize().
>>> 
>>> library(parallel)
>>> cl <- makeCluster(1, "SOCK")
>>> 
>>> fun = function(i) {
>>>  library(parallel)
>>>  mclapply(1:2, sqrt)
>>> }
>>> 
>>> Failure occurs after multiple calls to clusterApply:
>>> 
>>>> clusterApply(cl, 1, fun)
>>> [[1]]
>>> [[1]][[1]]
>>> [1] 1
>>> 
>>> [[1]][[2]]
>>> [1] 1.414214
>>> 
>>>> clusterApply(cl, 1, fun)
>>> [[1]]
>>> [[1]][[1]]
>>> [1] 1
>>> 
>>> [[1]][[2]]
>>> [1] 1.414214
>>> 
>>>> clusterApply(cl, 1, fun)
>>> Error in unserialize(node$con) : error reading from connection
>>> 
>>> 
>>> This example is from Martin and may be a different problem.
>>> 
>>> ~/tmp >cat test1.R
>>> ## like mclapply
>>> ## should run 'forever' but terminates semi-randomly
>>> library(parallel)
>>> children <- parallel:::children
>>> 
>>> while (TRUE) {
>>>    n <- 8            ## n == detectCores()
>>>    jobs <- lapply(seq_len(n), function(i) mcparallel(Sys.sleep(20)))
>>>    mccollect(children(jobs), FALSE)
>>>    parallel:::mckill(children(jobs), tools::SIGTERM)
>>>    leni <- length(mccollect(children(jobs)))
>>>    message("leni: ", leni)
>>> }
>>> 
>>> ~/tmp >R-dev --vanilla --slave -f test1.R
>>> leni: 6
>>> leni: 7
>>> leni: 7
>>> leni: 7
>>> leni: 7
>>> leni: 7
>>> leni: 7
>>> leni: 7
>>> leni: 8
>>> leni: 7
>>> leni: 7
>>> leni: 7
>>> ~/tmp >
>>> 
>>> 
>>> Thanks.
>>> Valerie
>>> 
>>> 
>>>> sessionInfo()
>>> R Under development (unstable) (2015-03-18 r68009)
>>> Platform: x86_64-unknown-linux-gnu (64-bit)
>>> Running under: Fedora 21 (Twenty One)
>>> 
>>> locale:
>>> [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>>> [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>>> [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
>>> [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
>>> [9] LC_ADDRESS=C               LC_TELEPHONE=C
>>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>>> 
>>> attached base packages:
>>> [1] parallel  stats     graphics  grDevices utils     datasets  methods
>>> [8] base
>>> 
>>> loaded via a namespace (and not attached):
>>> [1] snow_0.3-13
>>> 
>>> 
>>> --
>>> Computational Biology / Fred Hutchinson Cancer Research Center
>>> 1100 Fairview Ave. N, Seattle, WA 98109
>>> 
>>> Email: vobencha at fredhutch.org
>>> Phone: (206) 667-3158
>>> 
>>> ______________________________________________
>>> R-devel at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>> 
>> 
> 


