[R-pkg-devel] Writing a data package with large files

Alex Hallam @|exh@||@m6@28 @end|ng |rom gm@||@com
Sat Jul 6 15:27:04 CEST 2019


I have been working on a data package. The goal is to eventually submit it
to CRAN, but R CMD check reports two problems (one WARNING and one NOTE). I
think both are caused by large files (a 453M csv.7z raw file and a 75M .rda
file).

Below is my R CMD check output.

── R CMD check results ───────────────────────────── cfsales 0.0.0.9000 ────
Duration: 2m 0.9s

❯ checking data for ASCII and uncompressed saves ... WARNING
    Warning: package needs dependence on R (>= 2.10)

❯ checking installed package size ... NOTE
    installed size is 133.1Mb
    sub-directories of 1Mb or more:
      data  133.0Mb

0 errors ✔ | 1 warning ✖ | 1 note ✖
Error: R CMD check found WARNINGs
Execution halted

Exited with status 1.

1. WARNING - Warning: package needs dependence on R (>= 2.10). I am not
sure where to start looking to fix this problem.

2. NOTE - I have two big files that I think are triggering this note:
train.rda and train.csv.7z. Is there any guidance on how to deal with
large files?

Additionally, train.rda is just a sample of the full data. The original
data has 52 stores and I only take 13. If I were to include the full set,
R CMD check throws an error instead of the note I am getting now. If there
is a way to use the full data set without getting an error, I would love to
hear it.
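
For reference on item 1: the WARNING text names a dependency on R (>= 2.10),
and a package's dependency on an R version is declared in the Depends field
of its DESCRIPTION file. A minimal sketch of that field follows, assuming
the package declares no other Depends entries. Whether adding it is enough
to clear the WARNING for this package is untested here.

```
Depends: R (>= 2.10)
```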

The link to this package is here

https://github.com/alexhallam/cfsales

The problem files are located here

https://github.com/alexhallam/cfsales/tree/master/data-raw

-- 

Thanks!
-Alex



