[Rd] Arrays Partial unserialization

Prof Brian Ripley ripley at stats.ox.ac.uk
Fri Aug 31 16:51:51 CEST 2012


On 31/08/2012 15:41, Duncan Murdoch wrote:
> On 31/08/2012 9:47 AM, Damien Georges wrote:
>> Hi all,
>>
>> I'm working with some huge arrays in R and I need to load several of them
>> to apply some functions that require all my arrays' values for each
>> cell...
>>
>> To make it possible, I would like to load only a part (for example 100
>> cells) of all my arrays, apply my function, delete all cells loaded,
>> load the following cells and so on.
>>
>> Is it possible to unserialize (or load) only a defined part of an R
>> array ?
>> Do you know some tools that might help me?
>
> I don't know of any tools to do that, but there are tools to maintain
> large objects in files, and load only parts of them at a time, e.g. the
> ff package.  Or you could simply use readBin and writeBin to do the same
> yourself.

Serialization is essentially serial, so you can only read the serialized 
format from the beginning.  So too are the compression algorithms used 
by default.
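
As a rough illustration of the readBin/writeBin route mentioned above, 
here is a minimal sketch that stores an array as raw 8-byte doubles and 
reads back one block of cells by seeking to its offset.  The file name, 
the array dimensions and the helper read_chunk() are made up for the 
example, and the dimensions are not stored in the file, so the caller 
has to keep track of them.  Note also that ?seek carries warnings about 
file positioning on Windows.

```r
# Hypothetical data and file, for illustration only.
path <- tempfile(fileext = ".bin")
x <- array(as.double(1:24), dim = c(2, 3, 4))

# Write the values as raw 8-byte doubles, in R's column-major order.
con <- file(path, "wb")
writeBin(as.vector(x), con, size = 8)
close(con)

# Hypothetical helper: read 'n' doubles starting at the 0-based
# element offset 'start', without reading what comes before.
read_chunk <- function(path, start, n) {
  con <- file(path, "rb")
  on.exit(close(con))
  seek(con, where = start * 8, origin = "start")  # 8 bytes per double
  readBin(con, what = "double", n = n, size = 8)
}

# Elements 7..12 in column-major order, i.e. the slice x[, , 2].
chunk <- read_chunk(path, start = 6, n = 6)
```

This only works because the file is uncompressed and every cell has a 
fixed size, which is exactly what a serialized (and compressed) object 
does not give you.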

>> Finally, I did a lot of research to find how arrays (and all other R
>> objects) are serialized into binary objects, but I found nothing
>> that really explains the algorithms involved. If someone has some
>> information on this topic, I'm interested.
>
> You can read the source for this; it is in src/main/serialize.c.

And there is an extensive commentary in the 'R Internals' manual.


-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


