[Rd] large sysdata.rda file --- strategies?

Dr. Peter Ruckdeschel peter.ruckdeschel at itwm.fraunhofer.de
Mon Feb 11 00:07:58 CET 2013

Dear Uwe,
> On 07.02.2013 15:41, Dr. Peter Ruckdeschel wrote:
>> Hi,
>> to speed up computations in our RobASt family of packages, we use
>> interpolation on a grid of precomputed values which we save together
>> with the interpolating functions (results of splinefun essentially)
>> in sysdata.rda in the R folder of our pkg.
>> After adding grids for some more models, this file has grown
>> considerably, even after application of tools::resaveRdaFiles.
>> At the moment we are at about 2MB (compressed) and 8.8 MB
>> (uncompressed) and hence R CMD check --as-cran issues a NOTE.
>> We want to comply with CRAN policies,
>>        http://cran.r-project.org/web/packages/policies.html
>> in particular with
>>> Where a large amount of data is required (even after compression),
>>> consideration should be given to a separate data-only package which
>>> can be updated only rarely (since older versions of packages are
>>> archived in perpetuity).
>> Q1: Are packages consisting only of a sysdata.rda file conceivable for
>> submission to CRAN? Are such pkgs the way to go w.r.t. the
>> cited policy?
> Yes, given this package needs fewer updates than the main package, one
> should consider such a data-only package that needs rare updates and
> does not flood the space with archived versions.
Fine. We are going to do this then (albeit possibly not yet with
the next release). 

BTW: Of course the code to generate the grids would be accessible in
the main package to be compliant with open source ideas.
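
For concreteness, a minimal sketch of how such a grid and its
interpolating function could be generated and written to sysdata.rda.
This is not the actual RobASt code: the tabulated function (qnorm as a
cheap stand-in for an expensive computation), the object names
.gridExample and .interpExample, and the temporary output directory are
all assumptions made for illustration.

```r
# Stand-in for an expensive computation that one would rather tabulate once.
expensive <- function(x) qnorm(x)

# Evaluate on a grid and build the interpolating function.
grid_x <- seq(0.01, 0.99, length.out = 200)
grid_y <- vapply(grid_x, expensive, numeric(1))
.gridExample   <- cbind(x = grid_x, y = grid_y)
.interpExample <- splinefun(grid_x, grid_y, method = "fmm")

# Write both objects to R/sysdata.rda (here below a temporary directory).
out_dir <- file.path(tempdir(), "R")
dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)
rda <- file.path(out_dir, "sysdata.rda")
save(.gridExample, .interpExample, file = rda, compress = "xz")

# Re-save, letting R choose among the available compression types.
tools::resaveRdaFiles(rda, compress = "auto")

# Size on disk in kB, to compare against the limit that triggers the NOTE.
size_kb <- file.info(rda)$size / 1024
```

Keeping a script like this in the main package would let anyone
regenerate the grids, while the resulting sysdata.rda lives in the
data-only package.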
>> If this is the case, how would one document such a package, in
>> particular if we do not export any objects in the NAMESPACE file?
>> In addition, with a sysdata.rda-only pkg, R CMD check issues a warning
>> "Found directory 'R' with no source files". Of course a workaround is
>> adding a comment-only file comment.R to the R folder.
> If the checks will be changed not to warn in such a case, this can
> only happen for R >= 3.0.0, so your workaround to tell the checks you
> really intended such a package with R folder not containing any code
> sounds plausible for now.
>> Q2: Is there a lazy load / lazy data mechanism available for
>> sysdata.rda ? If so how would one enforce it?
> It is lazy loaded. From WRE:
> "if the ‘R’ subdirectory contains a file ‘sysdata.rda’ [...] this will
> be lazy-loaded into the namespace/package environment"
Ah, I must have missed this.
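
Given that the objects in sysdata.rda are lazy-loaded into the
namespace, the main package could presumably look them up in the
namespace of the data-only package even though nothing is exported.
A rough sketch, assuming a hypothetical helper get_internal() and the
placeholder names "myDataPkg" and ".gridExample"; whether this kind of
cross-namespace access passes R CMD check cleanly would still need to
be verified.

```r
# Hypothetical helper (not part of any package): fetch an unexported object
# from the namespace of another package, e.g. one holding only sysdata.rda.
get_internal <- function(name, pkg) {
  # Return NULL instead of failing if the data package is not installed.
  if (!requireNamespace(pkg, quietly = TRUE)) return(NULL)
  ns <- asNamespace(pkg)
  # Look only in the namespace itself, not in its enclosing environments.
  if (!exists(name, envir = ns, inherits = FALSE)) return(NULL)
  get(name, envir = ns, inherits = FALSE)
}

# "myDataPkg" and ".gridExample" are placeholder names.
grid <- get_internal(".gridExample", "myDataPkg")
if (is.null(grid)) message("data package or object not available")
```

Returning NULL rather than raising an error leaves the main package
free to fall back to direct computation when the grids are missing.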

Many thanks for your comments.

Best, Peter
> Best,
> Uwe
>> Any suggestions appreciated,
>> Best, Peter

Dr. habil. Peter Ruckdeschel, Abteilung Finanzmathematik, F3.17
Fraunhofer ITWM, Fraunhofer Platz 1, 67663 Kaiserslautern
Phone:  +49 631/31600-4699   Fax:  +49 631/31600-5699
E-Mail :  peter.ruckdeschel at itwm.fraunhofer.de
