[Rd] sysdata.rda vs. rda files in data directory

Ulrike Grömping groemping at beuth-hochschule.de
Sat Mar 23 18:12:54 CET 2013


Dear developeRs,

my package FrF2.catlg128 holds large catalogues and is supposed to gain 
additional ones. All the catalogues are intended for the user.
So far, the catalogues were stored in the data directory, and LazyData 
was "no". I understand that this is no longer considered wise (if it 
ever was), so I want to change to LazyData "yes" with the next 
release (which will also get some additional catalogues).

I have tried out using separate data files in the data directory (like 
before) and one sysdata.rda file in the R directory (exporting all 
catalogues from the namespace); there is a large difference in the 
installed sizes between those two ways: the approach with sysdata.rda 
uses only about half the size of the separate data files approach 
(5.6 Mb vs. 11.7 Mb).
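
For concreteness, the two layouts can be sketched with toy objects as 
below. This is only an illustration under assumptions: the object names 
catlg.a and catlg.b, the temporary "toypkg" directory and the choice of 
xz compression are made up, and the code inspects the source .rda files 
only, not the installed sizes quoted above.

```r
# Toy stand-ins for the catalogues (hypothetical names and contents).
set.seed(1)
catlg.a <- matrix(sample(0:1, 128000, replace = TRUE), ncol = 128)
catlg.b <- matrix(sample(0:1, 128000, replace = TRUE), ncol = 128)

# A throwaway package skeleton in the session's temporary directory.
pkg <- file.path(tempdir(), "toypkg")
dir.create(file.path(pkg, "data"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(pkg, "R"), showWarnings = FALSE)

# Layout 1: one .rda file per catalogue in the data directory.
save(catlg.a, file = file.path(pkg, "data", "catlg.a.rda"), compress = "xz")
save(catlg.b, file = file.path(pkg, "data", "catlg.b.rda"), compress = "xz")

# Layout 2: all catalogues in a single R/sysdata.rda; they then live in
# the namespace and would need export(catlg.a, catlg.b) in NAMESPACE.
save(catlg.a, catlg.b,
     file = file.path(pkg, "R", "sysdata.rda"), compress = "xz")

# Report size and compression type of each source file.
rda <- list.files(pkg, pattern = "\\.rda$", recursive = TRUE,
                  full.names = TRUE)
info <- tools::checkRdaFiles(rda)
print(info)
```

The installed size can still differ from what this reports, since it 
also depends on how the data are stored at installation time.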

As I would like to be able to query the available data in the package 
via data(package="FrF2.catlg128") even before the package is loaded, I 
want to have a data directory with a datalist file in it. This 
appears to be compatible with using a sysdata.rda file in the R 
directory. (From a tidiness point of view, I would prefer the data file 
to sit in the data directory as well; however, that about doubles the 
installed size again (11.4 Mb vs. 5.6 Mb), even if I use just the one 
sysdata.rda file.)
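
The kind of query meant here, and the shape of a simple datalist file, 
can be sketched as follows. The "datasets" package is used only because 
it is always installed; the names catlg.a and catlg.b and the 
"toypkg2" directory are hypothetical, and the snippet does not show 
that such a datalist is honoured when the objects actually come from 
sysdata.rda.

```r
# Query the data index of an installed package.
iqr <- data(package = "datasets")
items <- iqr$results[, c("Item", "Title")]
print(head(items))

# A simple datalist file is plain text with one data set name per line.
# catlg.a and catlg.b are hypothetical names used for illustration.
datadir <- file.path(tempdir(), "toypkg2", "data")
dir.create(datadir, recursive = TRUE, showWarnings = FALSE)
writeLines(c("catlg.a", "catlg.b"), file.path(datadir, "datalist"))
listed <- readLines(file.path(datadir, "datalist"))
print(listed)
```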

Regarding the installed package size, the best option is obviously one 
sysdata.rda file in the R directory, but I want the datalist file for 
the reason given above. A data directory without data files throws a 
warning, so I have to include a dummy data file (and documentation 
for it) in order to be able to have a datalist file.

Finally, my questions: Is there a better way to achieve what I am looking 
for? And if not: is there any reason against combining a sysdata.rda 
file in the R directory with a datalist file (that lists the data from 
the sysdata.rda file) in the data directory, be it policy-wise or 
perhaps in terms of memory usage within an R session?

Best regards,
Ulrike


