[R-pkg-devel] How to store large data to be used in an R package?
Dirk Eddelbuettel
edd at debian.org
Tue Mar 26 15:35:13 CET 2024
On 25 March 2024 at 11:12, Jairo Hidalgo Migueles wrote:
| I'm reaching out to seek some guidance regarding the storage of relatively
| large data, ranging from 10-40 MB, intended for use within an R package.
| Specifically, this data consists of regression and random forest models
| crucial for making predictions within our R package.
|
| Initially, I attempted to save these models as internal data within the
| package. While this approach maintains functionality, it has led to a
| package size exceeding 20 MB. I'm concerned that this would complicate
| submitting the package to CRAN in the future.
|
| I would greatly appreciate any suggestions or insights you may have on
| alternative methods or best practices for efficiently storing and accessing
| this data within our R package.
Brooke and I wrote a paper on one way of addressing it via a 'data' package
accessible via an Additional_repositories: entry supported by a drat repo.
See https://journal.r-project.org/archive/2017/RJ-2017-026/index.html for the
paper which contains a nice slow walkthrough of all the details.
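
As a rough illustration of that setup (a minimal sketch, not the paper's exact
code): the code package lists the data package under Suggests: and points to
the drat repo in Additional_repositories:, then checks for the data package at
run time. The package name 'mypkgmodels', the repository URL, and the helper
names below are all hypothetical placeholders.

```r
# Assumed DESCRIPTION fields of the code package (hypothetical name and URL):
#   Suggests: mypkgmodels
#   Additional_repositories: https://example.github.io/drat

# Hypothetical helper: TRUE if the optional data package is installed.
has_models <- function(pkg = "mypkgmodels") {
  requireNamespace(pkg, quietly = TRUE)
}

# Hypothetical accessor: return an exported object (or lazy-loaded dataset)
# from the data package, or stop with an install hint if it is missing.
get_model <- function(name, pkg = "mypkgmodels",
                      repo = "https://example.github.io/drat") {
  if (!has_models(pkg)) {
    stop(sprintf(
      "Package '%s' is needed; try install.packages('%s', repos = '%s')",
      pkg, pkg, repo
    ), call. = FALSE)
  }
  getExportedValue(pkg, name)
}
```

Because the data package is only suggested, the code package should still load
and pass its checks when the data package is absent, with prediction functions
failing gracefully through a helper such as the one sketched above.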
Dirk
--
dirk.eddelbuettel.com | @eddelbuettel | edd using debian.org