[R-pkg-devel] Data-generating scripts in R packages

Jack Wasey jack at jackwasey.com
Tue Mar 22 21:03:48 CET 2016


You can leave them in R/ (written as functions) but use .Rbuildignore to
exclude them from the distributed package. People can use the source,
e.g. on GitHub, to regenerate the package and all its data if they wish.
This way, R CMD check and tests can cover them easily while developing.
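
As a rough sketch of what I mean (the file name, function name, and data
set are all made up for illustration, and the rnorm() call is only a
stand-in for the real, slow computation):

```r
# R/make-data.R  (hypothetical file name). To keep it out of the built
# tarball, .Rbuildignore would contain a regular expression such as:
#   ^R/make-data\.R$

# Regenerate the packaged data set and save it as an .rda file.
# 'out_dir' defaults to the package's data/ directory.
make_example_data <- function(n = 100, seed = 1, out_dir = "data") {
  set.seed(seed)  # fixed seed so the regenerated data are reproducible
  example_data <- data.frame(id = seq_len(n), value = rnorm(n))
  dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)
  save(example_data,
       file = file.path(out_dir, "example_data.rda"),
       compress = "xz")
  invisible(example_data)
}
```

One thing to watch: R CMD check normally runs against the built tarball,
so any test that calls an ignored function would probably need to be
ignored as well, or skipped when the function is not present.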

However, if the source data are small, I think including them in the
distributed source package may be a good idea for reproducibility, without
actually rebuilding the data set on install (or during CRAN checks).
Alternatively, you can pull the data from the Internet as needed, leave the
source data out of the package, and include only the pre-processing
function, which you need not export in the namespace.
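
A minimal sketch of that second option, assuming a raw CSV with "id" and
"value" columns (the function names, column names, and URL are all
placeholders of mine, not anything from a real package):

```r
# Internal helper: with no export() entry in NAMESPACE (or no @export tag
# if you use roxygen2), it is only reachable as pkg:::prep_raw_data().
prep_raw_data <- function(raw) {
  stopifnot(is.data.frame(raw), all(c("id", "value") %in% names(raw)))
  out <- raw[!is.na(raw$value), c("id", "value")]  # drop incomplete rows
  out$value <- as.numeric(out$value)
  rownames(out) <- NULL
  out
}

# Fetch the raw data on demand and clean it.
# The default URL is a placeholder, not a real data source.
fetch_example_data <- function(url = "https://example.org/raw.csv") {
  prep_raw_data(utils::read.csv(url, stringsAsFactors = FALSE))
}
```

Keeping the download separate from the pre-processing means the
pre-processing step can be tested on a small in-memory data frame without
network access.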

On Tue, Mar 22, 2016 at 9:15 AM, Brian G. Peterson <brian at braverock.com>
wrote:
> On Tue, 2016-03-22 at 08:52 -0400, Kevin Coombes wrote:
>> I'm currently developing an R package that includes a small data set
>> along with the functions that I want to export. I have an R script that
>> generates the data set; the computation time is long (well, relative to
>> the size of the data set). So, my plan is to run the script and save()
>> the data set as an *.rda file that I can put in the data directory.  (It
>> is possible that some users of the package will _only_ be interested in
>> the data set.)
>>
>> But, I'd like to keep the script with the package, both because it shows
>> how to use some of the functions and because I might want to modify how
>> the data set is generated in the future.  My question: What is the "best
>> practice" for where in the package directory structure to store such a
>> script?
>
> I'd probably put the script in the 'demo' directory so that it would be
> easy for a user to run it.
>
> My team has taken to documenting things like that as RStudio notebooks
> (.R files with rmarkdown comments that may be compiled as an annotated
> document), but that's just a suggestion and certainly not necessary.
>
> The other place things like this could go would be in the inst/
> directory.  We have a package that includes several 'parser' scripts for
> different data providers, and we chose to include these in a directory
> under inst/
>
> Regards,
>
> Brian
>
> ______________________________________________
> R-package-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-package-devel
