[R-sig-hpc] Rmpi long vector support

Simon Urbanek simon.urbanek at r-project.org
Thu Aug 8 02:03:08 CEST 2013


Jim,

On Aug 7, 2013, at 11:59 AM, Jim Gattiker <j.gattiker at gmail.com> wrote:

> To bcast.Robj, Rmpi uses serialize() to pack the object into "raw", which
> operates as a vector of bytes. R supports vectors up to 2^31 elements.

That is not true. Since version 3.0.0, R supports long vectors of up to 2^52 elements, which is far beyond current RAM sizes and certainly more than 2^31:

> n=6e9
> x=integer(n)
> a=serialize(x,NULL)
> length(a)
[1] 2.4e+10
> log2(length(a))
[1] 34.48232

However, Rmpi does not. Its C code would have to use XLENGTH and R_xlen_t (rather than LENGTH and int) to go beyond 2^31 elements.
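
Until that happens, one workaround is to keep each piece you send below the 2^31 - 1 limit. Here is a minimal base R sketch of the splitting and reassembly only. The transport step is left out, `split_raw` is a hypothetical helper name (not an Rmpi function), and the tiny matrix and the 64-byte chunk size are just stand-ins so the example runs quickly:

```r
# Largest length a non-long vector can have (2^31 - 1).
max_len <- .Machine$integer.max

# Hypothetical helper: split a raw vector into pieces of at most
# `chunk_len` bytes. Arithmetic is done in doubles so that indices
# beyond 2^31 - 1 do not overflow.
split_raw <- function(bytes, chunk_len = max_len) {
  chunk_len <- as.numeric(chunk_len)
  n <- as.numeric(length(bytes))
  starts <- seq(1, n, by = chunk_len)
  lapply(starts, function(s) bytes[s:min(s + chunk_len - 1, n)])
}

obj <- matrix(rnorm(20), 4, 5)        # small stand-in for the real data
bytes <- serialize(obj, NULL)         # raw vector, as in the session above

# TRUE would mean the payload is a long vector and must be split.
needs_split <- length(bytes) > max_len

# Use a tiny chunk size here purely to exercise the splitting.
pieces <- split_raw(bytes, chunk_len = 64)

# On the receiving side: concatenate the pieces and unserialize.
restored <- unserialize(do.call(c, pieces))
stopifnot(identical(restored, obj))
```

In real use each element of `pieces` would be sent separately and the receiver would do the `do.call(c, ...)` and `unserialize()` step once all pieces have arrived.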

Cheers,
Simon


> Your
> object, serialized, is close to that. I'm a little puzzled though,
> a 350x350000 matrix works for me; perhaps there's more to your actual call.
> 
> Setting up the slave environment explicitly seems to me to be better
> practice, i.e. using mpi.bcast.Robj2Slave and related calls, then the
> applyLB() doesn't contain the data. As I'm reading it, I think the way this
> is set up currently will send the data to a slave for each application of
> the apply.
> 
> If you're facing larger data sizes, a solution is to explicitly cut up the
> object, mpi.bcast.Robj2Slave the pieces in turn, and then use an
> mpi.bcast.cmd to collect them together. Another approach would be to write
> the data to a file, and direct the slaves to read it into their environments
> with an mpi.bcast.cmd.
> 
> As a comment: I can't tell your application from the code, but if the
> slaves are to be working each on only part of the data, it's better to send
> the slave just the part of the data it needs.
> 
>    cheers,
>       jim
> 
> 
> 
> 
> On Tue, Aug 6, 2013 at 6:37 PM, Ei-ji Nakama <nakama at ki.rim.or.jp> wrote:
> 
>> Hi,
>> 
>> <WARNING>
>> It is not yet completed...
>> </WARNING>
>> http://prs.ism.ac.jp/~nakama/Rhpc/
>> 
>> 2013/8/7, Xiaochun Sun <xiaoch.sun at gmail.com>:
>>> The same code worked fine for a smaller dataset, such as 350x25000. Does that
>>> mean the current Rmpi doesn't allow long vectors to be passed to slaves? Any
>>> idea on that or any alternatives to Rmpi?
>> 
>> It can handle considerably large data.
>> 
>> --
>> EI-JI Nakama  <nakama (a) ki.rim.or.jp>
>> "中間栄治"  <nakama (a) ki.rim.or.jp>
>> 
>> _______________________________________________
>> R-sig-hpc mailing list
>> R-sig-hpc at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/r-sig-hpc
>> 