[Rd] [Q] Example Data Files for Package

Achim Zeileis Achim.Zeileis at wu-wien.ac.at
Tue May 17 22:01:48 CEST 2005


Paul:

> If the example data files for a package are extremely large,

What is extremely large here? Mega/giga/terabytes?

> should they just be put in a different package specifically
> for example data? If so, are these data packages normally
> submitted to CRAN/BioC or mentioned in README with directions
> specifying where to download on developer site?

That depends on the value of "extremely large". Up to a certain size
(a few hundred KB or so), I would put them into the package. Otherwise,
include a README or (maybe even better) a man page that has the example
code and explains where to obtain the data.

> If these files are all dot-tsv (.tsv), they would seem
> appropriate for the data directory.

For large data sets, consider using save() to write the data to an .rda
file to reduce disk usage. Also try the `compress' option.
To give you an impression: the spam data set in package kernlab has
dimension 4601 x 58 and is 180K in compressed binary format.
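
A minimal sketch of the save() suggestion, assuming simulated data: the
data frame below only borrows the 4601 x 58 dimensions of the spam data
set, its values are random, and the file names are temporary files made
up for illustration. The sizes you get will differ from the 180K quoted
above.

```r
# Simulated stand-in for a large example data set (values are random;
# only the dimensions mirror the spam data mentioned above).
set.seed(1)
x <- as.data.frame(matrix(round(runif(4601 * 58), 2), nrow = 4601))

# Temporary files, used here only to compare the two formats.
tsv <- tempfile(fileext = ".tsv")
rda <- tempfile(fileext = ".rda")

# Plain tab-separated text versus compressed binary .rda.
write.table(x, file = tsv, sep = "\t", quote = FALSE, row.names = FALSE)
save(x, file = rda, compress = TRUE)

# File sizes in bytes, text first, then .rda.
sizes <- file.info(c(tsv, rda))$size
print(sizes)
```

In a package, the .rda file would go into the data directory, so that
users can load it with data().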

> "Writing R Extensions"
> doesn't specifically mention this as a valid file extension
> for that directory. Glancing around at some other packages,
> I see a number that create an 'extdata' directory. Is there
> a consensus on where alternative data files should be put
> within a package?

In most situations, I would recommend including the data in the data
directory, just as usual.
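
To illustrate the difference from the user's side, here is a rough
sketch. The first call uses the `cars' data set shipped in the data
directory of the datasets package; the package name "mypkg" and the file
name "example.tsv" in the second call are hypothetical and only stand in
for a package that keeps raw files in an 'extdata' directory.

```r
# Data in the data directory is loaded by name with data().
data("cars", package = "datasets")
str(cars)

# Raw files under inst/extdata have to be located by path instead.
# "mypkg" and "example.tsv" are hypothetical names; system.file()
# returns "" when the package or file is not installed.
path <- system.file("extdata", "example.tsv", package = "mypkg")
if (nzchar(path)) {
  raw <- read.delim(path)
} else {
  message("hypothetical file not found, as expected without 'mypkg'")
}
```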

my EUR 0.02
Z


