[R-sig-hpc] Rmpi loads 2 versions of the same library [possible cause]
Ross Boylan
ross at biostat.ucsf.edu
Fri Mar 14 04:50:10 CET 2014
On Thu, 2014-03-13 at 12:57 -0700, Ross Boylan wrote:
> I'm not so happy to report that the original problem that motivated the
> whole exercise remains; in fact it's gotten slightly worse.
> mpi.isend.Robj does not seem to be working properly. I am sending to a
> fake receiver (at rank 1) that does nothing but print a message when it
> gets a message. r is a list with
Switching to mpi.send.Robj allowed everything to work. I speculate that
R was garbage collecting the bytes to be sent before the non-blocking
send (MPI_Isend) had finished transmitting them.
1. Messages from mpi.isend were arriving at the MPI level. The problem
was that they were corrupt, and when the receiver (in Rmpi's R code)
tried to unserialize them, it threw an error and stopped the process.
2. I tried to compare the bytes sent to the bytes received, and they
don't entirely fit the garbage collection theory since the first
difference was at the 10th byte, though most differences were later. I
would expect at least the first part of the buffer to go out correctly.
However, I'm not sure I had the correct "before" bytes: when I tried to
save the object being sent, everything worked, so I had to compare bytes
from different runs.
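
For anyone who wants to repeat the comparison in item 2, here is a
minimal sketch of that kind of check. It is written in Python rather
than R only so that it runs without an MPI setup. The function name
diff_summary and the two buffers are made up for illustration; they are
not the data from my runs. Offsets are 0-based, so a first difference at
the "10th byte" would show up as offset 9.

```python
def diff_summary(sent: bytes, received: bytes):
    """Return (first differing offset or None, number of differing positions)."""
    # Compare position by position over the length both buffers share.
    common = min(len(sent), len(received))
    mismatches = [i for i in range(common) if sent[i] != received[i]]
    # Bytes present in only one buffer count as differing positions too.
    extra = abs(len(sent) - len(received))
    if mismatches:
        first = mismatches[0]
    elif extra:
        first = common
    else:
        first = None
    return first, len(mismatches) + extra


# Hypothetical buffers: the tail of the received copy has been overwritten.
sent = bytes(range(32))
received = sent[:20] + bytes(12)
first, count = diff_summary(sent, received)
print(first, count)
```

If the buffer were being reclaimed partway through transmission, one
would expect an intact prefix followed by a damaged tail, as in the
made-up example. A first difference very near the start of the buffer is
what does not fit that picture.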
It is also possible that fuller use will disprove the theory that
mpi.send.Robj solves the problem.
Ross