[R-sig-hpc] Rmpi: mpi.close.Rslaves() 'hangs'
Marius Hofert
marius.hofert at uwaterloo.ca
Thu Sep 28 09:51:34 CEST 2017
Hi,
Thanks. I see the loop.
What are you suggesting? That this is a bug introduced in Rmpi ... or
that we should use an older version of Open MPI for now?
Cheers,
M
On Thu, Sep 28, 2017 at 9:31 AM, Ei-ji Nakama <nakama at ki.rim.or.jp> wrote:
> Hi,
>
> 2017-09-28 15:48 GMT+09:00 Marius Hofert <marius.hofert at uwaterloo.ca>:
>> If I execute the minimal working example with this new
>> mpi.close.Rslave2() at the end, something strange happens: *while*
>> the computation is running, 'htop' does not show the two slave
>> processes separately, but *after* it finishes they do show up and I
>> have to 'kill -9 <PID>' them manually (a sketch of such a minimal
>> working example is appended below).
>>
>> Any ideas?
>
> Look at the modification made to Rmpi/inst/slavedaemon.R below:
> MPI_Comm_disconnect loops even on the slave side ... (a sketch of the
> matching master-side change is appended at the end of this message).
>
>> On Thu, Sep 28, 2017 at 6:55 AM, Ei-ji Nakama <nakama at ki.rim.or.jp> wrote:
>>> diff -ruN Rmpi.orig/inst/slavedaemon.R Rmpi/inst/slavedaemon.R
>>> --- Rmpi.orig/inst/slavedaemon.R 2013-02-23 13:07:54.000000000 +0900
>>> +++ Rmpi/inst/slavedaemon.R 2017-09-28 11:45:19.598288064 +0900
>>> @@ -16,6 +16,9 @@
>>>  repeat
>>>      try(eval(mpi.bcast.cmd(rank=0,comm=.comm, nonblock=.nonblock, sleep=.sleep),envir=.GlobalEnv),TRUE)
>>>  print("Done")
>>> -invisible(mpi.comm.disconnect(.comm))
>>> +if(Sys.getenv("PMIX_NAMESPACE")=="")
>>> +    invisible(mpi.comm.disconnect(.comm))
>>> +else
>>> +    invisible(mpi.comm.free(.comm))
>>>  invisible(mpi.comm.set.errhandler(0))
>>>  mpi.quit()
>
> --
> Best Regards,
> --
> Eiji NAKAMA <nakama (a) ki.rim.or.jp>
> "\u4e2d\u9593\u6804\u6cbb" <nakama (a) ki.rim.or.jp>