[Rd] RData File Specification?

Simon Urbanek simon.urbanek at r-project.org
Sat Aug 25 04:01:12 CEST 2007


Ian,

On Aug 23, 2007, at 4:21 PM, Cook, Ian wrote:

> I am developing a tool for converting a large data frame stored in  
> an uncompressed binary (XDR) RData file to a delimited text file.   
> The data frame is too large to load() and extract rows from on a  
> typical PC.  I'm looking to parse through the file and extract  
> individual entries without loading the whole thing into memory.
>
> In terms of some C source functions, instead of doing RestoreToEnv 
> (R_Unserialize(connection)) which is essentially what load() does,  
> I'm looking to get the documentation I would need to build a  
> function "SaveToCSV()" so that I could do SaveToCSV(R_Unserialize 
> (connection)).
>
> Where can I get documentation on the RData file format?  Does a  
> spec document exist?
>

I don't think so - basically the sources are all the documentation  
I'm aware of. It's a bit messy, because R supports so many old formats.

However, if you want a stand-alone program that handles  
(uncompressed) XDR2 only, then I may have saved you a bit of work. I  
have a utility (based on the R sources) that allows you to scan  
through XDR2 files and to extract individual objects into a separate  
XDR2 file (this happens to be quite useful when you have a workspace  
that doesn't load into R and yet you want to save some pieces of it).  
Have a look at
http://urbanek.info/rdcopy.c

You can run it as "./rdcopy foo" to list the objects, as
"./rdcopy foo -v" to show the full structure (all SEXPs with their
offsets), or as "./rdcopy foo bar 19" to copy the SEXP at offset 19
from foo into a separate XDR2 file bar (use an offset from the first
call to copy an entire object).
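
To illustrate where an offset like 19 can come from: as far as the R
sources (saveload.c and serialize.c) suggest, an uncompressed XDR2 file
starts with the 5-byte magic "RDX2\n", the 2-byte format marker "X\n",
and three big-endian 32-bit integers (format version, writing R
version, minimum reading R version). That would put the first SEXP at
byte 5 + 2 + 12 = 19, consistent with the offset in the example. Below
is a minimal sketch of checking that header. The struct and function
names are made up for illustration and are not taken from rdcopy.c.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Header fields of an uncompressed XDR2 file (hypothetical struct). */
struct rdata_header {
    int32_t format_version;     /* serialization format version, 2 here */
    int32_t writer_version;     /* packed as major*65536 + minor*256 + patch */
    int32_t min_reader_version; /* packed the same way */
};

/* Read one big-endian (XDR) 32-bit integer starting at p. */
int32_t read_xdr_int(const unsigned char *p)
{
    uint32_t v = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
                 ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
    return (int32_t)v;
}

/*
 * Parse the header from the first bytes of a file.
 * Returns the offset of the first SEXP (19) on success,
 * or -1 if the buffer is too short or is not plain XDR2.
 */
long parse_rdata_header(const unsigned char *buf, size_t len,
                        struct rdata_header *out)
{
    if (len < 19)
        return -1;
    if (memcmp(buf, "RDX2\n", 5) != 0)   /* file magic */
        return -1;
    if (memcmp(buf + 5, "X\n", 2) != 0)  /* XDR format marker */
        return -1;
    out->format_version     = read_xdr_int(buf + 7);
    out->writer_version     = read_xdr_int(buf + 11);
    out->min_reader_version = read_xdr_int(buf + 15);
    return 19;
}
```

A caller would read the first 19 bytes of the file into a buffer, call
parse_rdata_header(), and give up early if it returns -1 (for example
on a gzip-compressed file).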

It's not perfect, but it serves its purpose (it resolves references by  
copying them instead of re-indexing, but it doesn't detect loops).  
Maybe it helps, even though the task you describe is still far from  
trivial.
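
On the references point, a rough sketch of what a scanner has to
decode for each item, based on a reading of serialize.c rather than on
rdcopy.c itself: every item starts with a 32-bit flags word whose low
8 bits hold the type, with bits 8, 9 and 10 marking "is object", "has
attributes" and "has tag", and the remaining upper bits holding the
levels field. A reference to an already-seen object uses pseudo-type
255 with the index packed above the low 8 bits (an index of 0 there
means the real index follows as a separate integer). The struct and
function names below are hypothetical.

```c
#include <stdint.h>

enum { REF_PSEUDO_TYPE = 255 };  /* reference to a previously seen object */

/* Decoded view of one flags word (hypothetical struct). */
struct sexp_flags {
    int type;       /* SEXP type code, e.g. 19 for a generic vector */
    int is_object;  /* object bit set (e.g. has a class) */
    int has_attr;   /* an attribute pairlist follows the data */
    int has_tag;    /* a tag follows (pairlist-like types) */
    int levels;     /* remaining "levels" bits */
    int ref_index;  /* only meaningful when type == REF_PSEUDO_TYPE */
};

struct sexp_flags decode_sexp_flags(uint32_t flags)
{
    struct sexp_flags f = {0};

    f.type = (int)(flags & 0xFF);
    if (f.type == REF_PSEUDO_TYPE) {
        /* For references the upper bits carry the index, not flag bits. */
        f.ref_index = (int)(flags >> 8);
        return f;
    }
    f.is_object = (flags & (1u << 8))  != 0;
    f.has_attr  = (flags & (1u << 9))  != 0;
    f.has_tag   = (flags & (1u << 10)) != 0;
    f.levels    = (int)(flags >> 12);
    return f;
}
```

For a data frame one would expect a flags word along the lines of
0x313: type 19 (a generic vector) with the object and attribute bits
set.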

Cheers,
Simon


