[Rd] nested parallel workers

Valerie Obenchain vobencha at fredhutch.org
Tue Mar 31 00:39:37 CEST 2015


On 03/30/2015 02:51 PM, Simon Urbanek wrote:
>
> On Mar 30, 2015, at 4:40 PM, Valerie Obenchain <vobencha at fredhutch.org> wrote:
>
>> On 03/25/2015 07:48 PM, Simon Urbanek wrote:
>>> On Mar 25, 2015, at 3:46 PM, Valerie Obenchain <vobencha at fredhutch.org> wrote:
>>>
>>>> Hi Simon,
>>>>
>>>> I'm having trouble with nested parallel workers, specifically, forking inside socket connections.
>>>>
>>>
>>> You simply can't, by definition: when you fork, *all* the workers share the same connection inherited from the parent, so you cannot use any I/O operations that you didn't start in the worker, since reading in one worker affects all the workers.
>>>
>>
>> Sorry if I'm missing the obvious here -
>> I thought that since the fork workers were shut down by the time the SOCK worker returned to its master, conflicting I/O wouldn't be a problem.
>>
>
> If the workers are done and don't use I/O then all is well. However, it's not easy to guarantee that they don't use I/O, since they all already come with active sockets; e.g. on exit they may flush the socket buffers, which would confuse the recipient. Interestingly, your example works fine on OS X but fails on Linux. I'll try to dig deeper in a quiet minute. In principle it should be sufficient to close all FDs right away, which you can do when using mcparallel() but not when using mclapply().
>

I see. Thanks for the explanation.

Valerie
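
For readers following the thread, here is a minimal sketch of the alternative raised in the quoted discussion below: nesting socket workers inside a socket worker instead of forking with mclapply(). It is an illustration under assumptions, not code posted by either author. The name nested_fun and the choice of two inner workers are hypothetical, and the sketch assumes Rscript is available on the worker so that an inner PSOCK cluster can be started there.

```r
library(parallel)

## Outer cluster: a single PSOCK worker, mirroring the original report.
cl <- makeCluster(1, type = "PSOCK")

## Hypothetical replacement for the reported 'fun': rather than forking,
## the worker starts its own short-lived PSOCK cluster, so no child
## process inherits the socket that connects the worker to the master.
nested_fun <- function(i) {
  inner <- parallel::makeCluster(2, type = "PSOCK")
  on.exit(parallel::stopCluster(inner))
  parallel::parLapply(inner, 1:2, sqrt)
}

## Call it several times, since the reported failure only appeared
## after repeated calls to clusterApply().
for (k in 1:3) res <- clusterApply(cl, 1, nested_fun)

stopCluster(cl)
str(res)
```

Starting fresh R processes costs more than forking, so this trades startup time for workers that share no file descriptors with one another.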


> Cheers,
> Simon
>
>
>
>> There are quite a few examples floating around where SOCK workers are spawned on a cluster and multicore workers are called within them. If I understand correctly, this should not be done (or at least not encouraged). Instead, nested parallelism should only be done with distributed memory workers such as SOCK or MPI.
>>
>> Thanks.
>> Valerie
>>
>>
>>> Cheers,
>>> Simon
>>>
>>>
>>>> When mclapply is called inside a SOCK, PSOCK or FORK worker I get an
>>>> error in unserialize().
>>>>
>>>> cl <- makeCluster(1, "SOCK")
>>>>
>>>> fun = function(i) {
>>>>   library(parallel)
>>>>   mclapply(1:2, sqrt)
>>>> }
>>>>
>>>> Failure occurs after multiple calls to clusterApply:
>>>>
>>>>> clusterApply(cl, 1, fun)
>>>> [[1]]
>>>> [[1]][[1]]
>>>> [1] 1
>>>>
>>>> [[1]][[2]]
>>>> [1] 1.414214
>>>>
>>>>> clusterApply(cl, 1, fun)
>>>> [[1]]
>>>> [[1]][[1]]
>>>> [1] 1
>>>>
>>>> [[1]][[2]]
>>>> [1] 1.414214
>>>>
>>>>> clusterApply(cl, 1, fun)
>>>> Error in unserialize(node$con) : error reading from connection
>>>>
>>>>
>>>> This example is from Martin and may be a different problem.
>>>>
>>>> ~/tmp >cat test1.R
>>>> ## like mclapply
>>>> ## should run 'forever' but terminates semi-randomly
>>>> library(parallel)
>>>> children <- parallel:::children
>>>>
>>>> while (TRUE) {
>>>>     n <- 8            ## n == detectCores()
>>>>     jobs <- lapply(seq_len(n), function(i) mcparallel(Sys.sleep(20)))
>>>>     mccollect(children(jobs), FALSE)
>>>>     parallel:::mckill(children(jobs), tools::SIGTERM)
>>>>     leni <- length(mccollect(children(jobs)))
>>>>     message("leni: ", leni)
>>>> }
>>>>
>>>> ~/tmp >R-dev --vanilla --slave -f test1.R
>>>> leni: 6
>>>> leni: 7
>>>> leni: 7
>>>> leni: 7
>>>> leni: 7
>>>> leni: 7
>>>> leni: 7
>>>> leni: 7
>>>> leni: 8
>>>> leni: 7
>>>> leni: 7
>>>> leni: 7
>>>> ~/tmp >
>>>>
>>>>
>>>> Thanks.
>>>> Valerie
>>>>
>>>>
>>>>> sessionInfo()
>>>> R Under development (unstable) (2015-03-18 r68009)
>>>> Platform: x86_64-unknown-linux-gnu (64-bit)
>>>> Running under: Fedora 21 (Twenty One)
>>>>
>>>> locale:
>>>> [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>>>> [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>>>> [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
>>>> [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
>>>> [9] LC_ADDRESS=C               LC_TELEPHONE=C
>>>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>>>>
>>>> attached base packages:
>>>> [1] parallel  stats     graphics  grDevices utils     datasets  methods
>>>> [8] base
>>>>
>>>> loaded via a namespace (and not attached):
>>>> [1] snow_0.3-13
>>>>
>>>>
>>>> --
>>>> Computational Biology / Fred Hutchinson Cancer Research Center
>>>> 1100 Fairview Ave. N, Seattle, WA 98109
>>>>
>>>> Email: vobencha at fredhutch.org
>>>> Phone: (206) 667-3158
>>>>
>>>> ______________________________________________
>>>> R-devel at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>>
>>>
>>
>


