[R-sig-hpc] Rmpi: mpi.close.Rslaves() 'hangs'

Ei-ji Nakama nakama at ki.rim.or.jp
Thu Sep 28 10:10:55 CEST 2017


Hi,

MPI has several implementations, and Rmpi already works around various
defects in them; I do not know them all.
So I cannot judge whether this is the better fix.
It may well be a bug (or specified behavior?!) in MPI_Comm_disconnect in
Open MPI 2.x...
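
For reference, the check in the patch quoted below can be written as a
small helper. This is only a sketch: choose_comm_release() and
release_comm() are hypothetical names, not part of Rmpi, and it assumes
that a non-empty PMIX_NAMESPACE means the process was started by a
PMIx-based launcher such as the one in Open MPI 2.x.

```r
# Hypothetical helper (not part of Rmpi): pick the Rmpi call a slave uses
# to release its communicator, mirroring the patched inst/slavedaemon.R.
# Assumption: PMIX_NAMESPACE is non-empty only under a PMIx-based launcher.
choose_comm_release <- function(pmix_namespace = Sys.getenv("PMIX_NAMESPACE")) {
    if (identical(pmix_namespace, "")) {
        "mpi.comm.disconnect"   # original behaviour
    } else {
        "mpi.comm.free"         # workaround when disconnect loops
    }
}

# Hypothetical wrapper: look up the chosen function in Rmpi and call it.
# Rmpi is only needed when this is actually called on a slave.
release_comm <- function(comm) {
    fn <- getExportedValue("Rmpi", choose_comm_release())
    invisible(fn(comm))
}
```

In slavedaemon.R this would stand in for the if/else in the patch, i.e.
release_comm(.comm) after the repeat loop ends.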

2017-09-28 16:51 GMT+09:00 Marius Hofert <marius.hofert at uwaterloo.ca>:
> Hi,
>
> Thanks. I see the loop.
>
> What are you suggesting? That this is a bug introduced to Rmpi? ... or
> to use an older version of openmpi for now (?).
>
> Cheers,
> M
>
>
>
> On Thu, Sep 28, 2017 at 9:31 AM, Ei-ji Nakama <nakama at ki.rim.or.jp> wrote:
>> Hi,
>>
>> 2017-09-28 15:48 GMT+09:00 Marius Hofert <marius.hofert at uwaterloo.ca>:
>>> If I execute the minimal working example with this new
>>> mpi.close.Rslave2() at the end, something strange happens: *While*
>>> doing the computation, 'htop' doesn't show the two cores separately,
>>> but *after* executing it, the two cores show up and I need to manually
>>> 'kill -9 <PID>' them.
>>>
>>> Any ideas?
>>
>> Look at the modifications made to Rmpi/inst/slavedaemon.R
>> MPI_Comm_disconnect is looping even on slave ...
>>
>>> On Thu, Sep 28, 2017 at 6:55 AM, Ei-ji Nakama <nakama at ki.rim.or.jp> wrote:
>>>> diff -ruN Rmpi.orig/inst/slavedaemon.R Rmpi/inst/slavedaemon.R
>>>> --- Rmpi.orig/inst/slavedaemon.R    2013-02-23 13:07:54.000000000 +0900
>>>> +++ Rmpi/inst/slavedaemon.R    2017-09-28 11:45:19.598288064 +0900
>>>> @@ -16,6 +16,9 @@
>>>>  repeat
>>>>      try(eval(mpi.bcast.cmd(rank=0,comm=.comm, nonblock=.nonblock, sleep=.sleep),envir=.GlobalEnv),TRUE)
>>>>  print("Done")
>>>> -invisible(mpi.comm.disconnect(.comm))
>>>> +if(Sys.getenv("PMIX_NAMESPACE")=="")
>>>> +    invisible(mpi.comm.disconnect(.comm))
>>>> +else
>>>> +    invisible(mpi.comm.free(.comm))
>>>>  invisible(mpi.comm.set.errhandler(0))
>>>>  mpi.quit()
>>
>> --
>> Best Regards,
>> --
>> Eiji NAKAMA <nakama (a) ki.rim.or.jp>
>> "中間栄治"  <nakama (a) ki.rim.or.jp>



-- 
Best Regards,
--
Eiji NAKAMA <nakama (a) ki.rim.or.jp>
"中間栄治"  <nakama (a) ki.rim.or.jp>


