[Rd] Compressing data for package builds

Simon Urbanek simon.urbanek at r-project.org
Fri Aug 17 02:48:16 CEST 2012


On Aug 16, 2012, at 5:08 PM, steven mosher wrote:

> Hi,
> 
> I have two .rda files that I need to include in a package. I've placed
> them both in a data directory;
> after save() they are around 150Kb each.
> 
> When I try to check the package I get the following warning:
> 
> Warning: large data file(s) saved inefficiently:
>                size ASCII compress
>  zagoskin.rda 137Kb FALSE     none
> 
>  Note: significantly better compression could be obtained
>        by using R CMD build --resave-data
>               old_size new_size compress
>  modpoll.rda     124Kb     78Kb       xz
>  zagoskin.rda    137Kb      6Kb    bzip2
> 
> Both of these files, modpoll.rda and zagoskin.rda, have already been
> compressed from megabytes down to Kb.
> 
> Also, the instruction "R CMD build --resave-data" doesn't do anything
> that I can see, so I must be using it wrong.

R CMD build is the preferred way to create your package tarball, so you simply add the --resave-data argument to your existing R CMD build call, which creates the tarball from your source directory. Can you elaborate on "doesn't do anything I can see"? In what sense? No output? No compression?
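
A minimal sketch of the call described above, assuming a hypothetical source directory named mypkg (that name is not from the original post). One thing that may be relevant: R CMD build re-saves the data inside the tarball it creates and leaves the files in the source directory as they were, so any gain only shows up when R CMD check is run on the tarball rather than on the source directory. Whether that explains the report above is only a guess.

```shell
# Hypothetical source directory; substitute your package's own path.
PKG_DIR="mypkg"

# --resave-data (same as --resave-data=best) asks build to re-save the
# files under data/ with whichever compression comes out smallest.
RESAVE_FLAG="--resave-data"

if command -v R >/dev/null 2>&1 && [ -d "$PKG_DIR" ]; then
    R CMD build "$RESAVE_FLAG" "$PKG_DIR"
    # The re-saved .rda files live inside the tarball, not in $PKG_DIR.
    ls -l ./*.tar.gz
else
    echo "Would run: R CMD build $RESAVE_FLAG $PKG_DIR"
fi
```

Running R CMD check on the resulting .tar.gz, instead of on the directory, is then the way to see whether the warning has gone away.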

Cheers,
Simon


> Is there a piece of the puzzle I am missing, or are there instructions better than
> these? I tried LazyDataCompression and my
> data.rdb is 90Kb.
> 
> "Package *tools* has a couple of functions to help with data images:
> checkRdaFiles reports on the way the image was saved, and resaveRdaFiles will
> re-save with a different type of compression, including choosing the best
> type for that particular image.
> 
> Some packages using ‘LazyData’ will benefit from using a form of
> compression other than gzip in the installed lazy-loading database. This
> can be selected by the --data-compress option to R CMD INSTALL or by using
> the ‘LazyDataCompression’ field in the DESCRIPTION file. Useful values are
> bzip2, xz and the default, gzip. The only way to discover which is best is
> to try them all and look at the size of the pkgname/data/Rdata.rdb file."
> 
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
