[Rd] Arrays Partial unserialization
Prof Brian Ripley
ripley at stats.ox.ac.uk
Fri Aug 31 16:51:51 CEST 2012
On 31/08/2012 15:41, Duncan Murdoch wrote:
> On 31/08/2012 9:47 AM, Damien Georges wrote:
>> Hi all,
>>
>> I'm working with some huge arrays in R and I need to load several of them to
>> apply some functions that require all my arrays' values for each
>> cell...
>>
>> To make this possible, I would like to load only a part (for example 100
>> cells) of all my arrays, apply my function, delete all the cells loaded,
>> load the following cells and so on.
>>
>> Is it possible to unserialize (or load) only a defined part of an R
>> array ?
>> Do you know some tools that might help me?
>
> I don't know of any tools to do that, but there are tools to maintain
> large objects in files, and load only parts of them at a time, e.g. the
> ff package. Or you could simply use readBin and writeBin to do the same
> yourself.
Serialization is essentially serial, so you can only read the serialized
format from the beginning. So too are the compression algorithms used
by default.
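
For illustration only, here is a minimal sketch of the readBin/writeBin
route mentioned above. It assumes the array is stored as plain 8-byte
doubles in R's column-major order, with no header and no compression, so
that a block of cells can be reached with seek(). The names arr, path and
read_cells are hypothetical, not part of any package.

```r
# Hypothetical example data: a 4 x 5 x 3 array of doubles.
arr <- array(as.numeric(1:60), dim = c(4, 5, 3))
path <- tempfile(fileext = ".bin")

# Write the bare values (column-major order, 8 bytes each), no header.
writeBin(as.vector(arr), path, size = 8L)

# Read `n` consecutive cells starting at 1-based linear index `start`.
read_cells <- function(path, start, n, size = 8L) {
  con <- file(path, open = "rb")
  on.exit(close(con))
  # Jump directly to the byte offset of the first requested cell.
  seek(con, where = (start - 1) * size, origin = "start")
  readBin(con, what = "double", n = n, size = size)
}

# Load cells 21..30 only, without reading the rest of the file.
chunk <- read_cells(path, start = 21, n = 10)
stopifnot(identical(chunk, as.vector(arr)[21:30]))
```

The dimensions are not stored in such a file, so they have to be kept
elsewhere; random access of this kind is not possible with the output of
save() or serialize(), for the reason given above.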
>> Finally, I did a lot of research to find out how arrays (and all other R
>> objects) are serialized into binary objects, but I found nothing
>> that really explains the algorithms involved. If someone has some
>> information on this topic, I'm interested.
>
> You can read the source for this; it is in src/main/serialize.c.
And there is an extensive commentary in the 'R Internals' manual.
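
As a small, hedged illustration of what that commentary describes, the
start of a serialized stream can be inspected directly. The sketch below
assumes the default XDR binary format of serialize(); the variable names
are hypothetical.

```r
# Serialize a small integer vector to a raw vector (default: XDR binary).
s <- serialize(1:3, connection = NULL)

# The stream starts with a two-byte format marker: "X\n" for XDR.
marker <- rawToChar(s[1:2])

# The next four bytes hold the serialization format version, big-endian.
fmt_version <- readBin(s[3:6], what = "integer", size = 4L, endian = "big")

# The object can only be rebuilt by reading the stream from its beginning.
stopifnot(identical(unserialize(s), 1:3))
```

The version number is 2 or 3 depending on the R release that wrote the
stream.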
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595