[R-sig-hpc] Rmpi long vector support

Hao Yu hyu at stats.uwo.ca
Thu Aug 8 03:42:54 CEST 2013


Hi Simon,

Thanks for pointing out XLENGTH and R_xlen_t. I will add them in the
next release of Rmpi.

However, I do run into the following issue. On a Debian system with R 3.0.1
(8GB RAM), I can run
> n=6e8
> x=integer(n)
> a=serialize(x,NULL)
> length(a)
[1] 2.4e+09

However, on 64-bit Windows 7 with R 3.0.1 (16GB RAM), I got
> n=6e8
> x=integer(n)
> a=serialize(x,NULL)
Error: long vectors not supported yet: ../include/Rinlinefuns.h:100

Is this a bug, or is XLENGTH not implemented in the Windows version of R?

Thanks,

Hao

Simon Urbanek wrote:
> Jim,
>
> On Aug 7, 2013, at 11:59 AM, Jim Gattiker <j.gattiker at gmail.com> wrote:
>
>> To bcast.Robj, Rmpi uses serialize() to pack the object into "raw", which
>> operates as a vector of bytes. R supports vectors up to 2^31 elements.
>
> That is not true. R supports vectors up to 2^52 elements. That is way
> beyond current RAM sizes and certainly more than 2^31:
>
>> n=6e9
>> x=integer(n)
>> a=serialize(x,NULL)
>> length(a)
> [1] 2.4e+10
>> log2(length(a))
> [1] 34.48232
>
> However Rmpi does not. You have to use XLENGTH and R_xlen_t in the C code
> if you want to go beyond 2^31.
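
To make the distinction concrete, here is a minimal C sketch of why a length
held in a 32-bit int cannot describe the serialized vectors above. It is not
Rmpi's actual code and uses no R headers: xlen_t is a hypothetical 64-bit
stand-in for R_xlen_t, and payload_bytes and length_fits_int are illustrative
helper names. In real package code the length would come from XLENGTH(x)
rather than LENGTH(x).

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical stand-in for R_xlen_t (assumption: a 64-bit signed type).
   In real package code the type comes from the R headers. */
typedef int64_t xlen_t;

/* Approximate payload size in bytes of n 4-byte integers.
   The small serialization header is ignored in this sketch. */
xlen_t payload_bytes(xlen_t n)
{
    assert(n >= 0);
    return n * 4;
}

/* Returns 1 if the length can be stored in a C int without overflow,
   0 otherwise. */
int length_fits_int(xlen_t len)
{
    return len >= 0 && len <= (xlen_t)INT_MAX;
}
```

With n = 6e8 the payload is about 2.4e9 bytes, which is above INT_MAX on
platforms with a 32-bit int, so only the wider type can hold the length.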
>
> Cheers,
> Simon
>
>
>> Your object, serialized, is close to that. I'm a little puzzled though: a
>> 350x350000 matrix works for me; perhaps there's more to your actual call.
>>
>> Setting up the slave environment explicitly seems to me to be better
>> practice, i.e. using mpi.bcast.Robj2Slave and related calls, then the
>> applyLB() doesn't contain the data. As I'm reading it, I think the way this
>> is set up currently will send the data to a slave for each application of
>> the apply.
>>
>> If you're facing larger data sizes, a solution is to explicitly cut up the
>> object, mpi.bcast.Robj2Slave the pieces in turn, and then use an
>> mpi.bcast.cmd to collect them together. Another approach would be to write
>> the data to a file, and direct the slaves to read it into their
>> environments with an mpi.bcast.cmd.
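
The cutting-up step can be sketched in plain C as follows. This covers only
the offset arithmetic, with no MPI or Rmpi calls; chunk_count and
chunk_bounds are hypothetical helper names, and the per-piece limit is an
assumed parameter chosen so that each piece's length fits in an int.

```c
#include <assert.h>
#include <stdint.h>

/* Number of pieces needed to cover `total` bytes when no piece may
   exceed `max_chunk` bytes. Hypothetical helper, not part of Rmpi. */
int64_t chunk_count(int64_t total, int64_t max_chunk)
{
    assert(total >= 0 && max_chunk > 0);
    return (total + max_chunk - 1) / max_chunk;  /* ceiling division */
}

/* Computes the start offset and length of piece `i` (0-based).
   The last piece may be shorter than max_chunk. */
void chunk_bounds(int64_t total, int64_t max_chunk, int64_t i,
                  int64_t *start, int64_t *len)
{
    assert(i >= 0 && i < chunk_count(total, max_chunk));
    *start = i * max_chunk;
    int64_t remaining = total - *start;
    *len = remaining < max_chunk ? remaining : max_chunk;
}
```

For a 2.4e9-byte buffer and a limit of 1e9 bytes per piece, this gives three
pieces, the last one 4e8 bytes long.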
>>
>> As a comment: I can't tell your application from the code, but if the
>> slaves are to be working each on only part of the data, it's better to send
>> the slave just the part of the data it needs.
>>
>>    cheers,
>>       jim
>>
>>
>>
>>
>> On Tue, Aug 6, 2013 at 6:37 PM, Ei-ji Nakama <nakama at ki.rim.or.jp> wrote:
>>
>>> Hi,
>>>
>>> <WARNING>
>>> It is not yet completed...
>>> </WARNING>
>>> http://prs.ism.ac.jp/~nakama/Rhpc/
>>>
>>> 2013/8/7, Xiaochun Sun <xiaoch.sun at gmail.com>:
>>>> The same code worked fine for a smaller dataset, such as 350x25000. Does
>>>> that mean the current Rmpi doesn't allow long vectors to be passed to
>>>> slaves? Any idea on that, or any alternatives to Rmpi?
>>>
>>> It can handle considerably large data.
>>>
>>> --
>>> EI-JI Nakama  <nakama (a) ki.rim.or.jp>
>>> "中間栄治"  <nakama (a) ki.rim.or.jp>
>>>
>>> _______________________________________________
>>> R-sig-hpc mailing list
>>> R-sig-hpc at r-project.org
>>> https://stat.ethz.ch/mailman/listinfo/r-sig-hpc
>>>


-- 
Department of Statistics & Actuarial Sciences
Office Phone#:(519)-661-3622
Fax Phone#:(519)-661-3813
The University of Western Ontario
London, Ontario N6A 5B7
http://www.stats.uwo.ca/yu


