[Rd] nested parallel workers
Simon Urbanek
simon.urbanek at r-project.org
Mon Mar 30 23:51:17 CEST 2015
On Mar 30, 2015, at 4:40 PM, Valerie Obenchain <vobencha at fredhutch.org> wrote:
> On 03/25/2015 07:48 PM, Simon Urbanek wrote:
>> On Mar 25, 2015, at 3:46 PM, Valerie Obenchain <vobencha at fredhutch.org> wrote:
>>
>>> Hi Simon,
>>>
>>> I'm having trouble with nested parallel workers, specifically, forking inside socket connections.
>>>
>>
>> You simply can't, by definition: when you fork, *all* the workers share the same connection inherited from the parent, so a worker cannot use any I/O operation that it didn't start itself, since reading in one worker affects all the workers.
>>
>
> Sorry if I'm missing the obvious here -
> I thought that since the fork workers were shut down by the time the SOCK worker returned to its master, conflicting I/O wouldn't be a problem.
>
If the workers are done and don't use I/O then all is well. However, it's not easy to guarantee that they don't use I/O, since they all already come with active sockets, so e.g. on exit they may flush the socket buffers, which would confuse the recipient. Interestingly, your example works fine on OS X but fails on Linux. I'll try to dig deeper in a quiet minute. In principle it should be sufficient to close all FDs right away, which you can do when using mcparallel() but not when using mclapply().
Cheers,
Simon
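
To make the mcparallel() route mentioned above concrete, here is a minimal sketch of the computation from the quoted example, written with mcparallel()/mccollect() instead of mclapply(). The helper name fork_apply is hypothetical and not part of the parallel package. The sketch only shows where child-side code would run; it does not itself close any inherited file descriptors, because the thread does not name a function for doing so.

```r
library(parallel)

# Hypothetical helper (an assumption for illustration, not from the thread):
# apply f to each element of x in forked children, using
# mcparallel()/mccollect() rather than mclapply().
fork_apply <- function(x, f) {
  # Each mcparallel() call forks one child that evaluates the expression.
  # Any child-side cleanup would have to go inside this expression,
  # before f(xi) is evaluated.
  jobs <- lapply(x, function(xi) mcparallel(f(xi)))
  # Block until every child has finished and gather the results,
  # which come back in the same order as the jobs.
  res <- mccollect(jobs)
  # mccollect() names the results after the jobs; drop the names.
  unname(res)
}

res <- fork_apply(1:2, sqrt)
print(res)
```

This relies on forking, so it is expected to run only on Unix-alikes (Linux, OS X), not on Windows.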
> There are quite a few examples floating around where SOCK workers are spawned on a cluster and multicore workers are called within them. If I understand correctly, this should not be done (or at least not encouraged). Instead, nested parallelism should only be done with distributed memory workers (SOCK, MPI, etc.).
>
> Thanks.
> Valerie
>
>
>> Cheers,
>> Simon
>>
>>
>>> When mclapply is called inside a SOCK, PSOCK or FORK worker I get an
>>> error in unserialize().
>>>
>>> cl <- makeCluster(1, "SOCK")
>>>
>>> fun = function(i) {
>>>     library(parallel)
>>>     mclapply(1:2, sqrt)
>>> }
>>>
>>> Failure occurs after multiple calls to clusterApply:
>>>
>>>> clusterApply(cl, 1, fun)
>>> [[1]]
>>> [[1]][[1]]
>>> [1] 1
>>>
>>> [[1]][[2]]
>>> [1] 1.414214
>>>
>>>> clusterApply(cl, 1, fun)
>>> [[1]]
>>> [[1]][[1]]
>>> [1] 1
>>>
>>> [[1]][[2]]
>>> [1] 1.414214
>>>
>>>> clusterApply(cl, 1, fun)
>>> Error in unserialize(node$con) : error reading from connection
>>>
>>>
>>> This example is from Martin and may be a different problem.
>>>
>>> ~/tmp >cat test1.R
>>> ## like mclapply
>>> ## should run 'forever' but terminates semi-randomly
>>> library(parallel)
>>> children <- parallel:::children
>>>
>>> while (TRUE) {
>>>     n <- 8 ## n == detectCores()
>>>     jobs <- lapply(seq_len(n), function(i) mcparallel(Sys.sleep(20)))
>>>     mccollect(children(jobs), FALSE)
>>>     parallel:::mckill(children(jobs), tools::SIGTERM)
>>>     leni <- length(mccollect(children(jobs)))
>>>     message("leni: ", leni)
>>> }
>>>
>>> ~/tmp >R-dev --vanilla --slave -f test1.R
>>> leni: 6
>>> leni: 7
>>> leni: 7
>>> leni: 7
>>> leni: 7
>>> leni: 7
>>> leni: 7
>>> leni: 7
>>> leni: 8
>>> leni: 7
>>> leni: 7
>>> leni: 7
>>> ~/tmp >
>>>
>>>
>>> Thanks.
>>> Valerie
>>>
>>>
>>>> sessionInfo()
>>> R Under development (unstable) (2015-03-18 r68009)
>>> Platform: x86_64-unknown-linux-gnu (64-bit)
>>> Running under: Fedora 21 (Twenty One)
>>>
>>> locale:
>>> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
>>> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
>>> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
>>> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
>>> [9] LC_ADDRESS=C LC_TELEPHONE=C
>>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>>>
>>> attached base packages:
>>> [1] parallel stats graphics grDevices utils datasets methods
>>> [8] base
>>>
>>> loaded via a namespace (and not attached):
>>> [1] snow_0.3-13
>>>
>>>
>>> --
>>> Computational Biology / Fred Hutchinson Cancer Research Center
>>> 1100 Fairview Ave. N, Seattle, WA 98109
>>>
>>> Email: vobencha at fredhutch.org
>>> Phone: (206) 667-3158
>>>
>>> ______________________________________________
>>> R-devel at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>
>>
>