[Rd] Issue with data() function

Duncan Murdoch
Sat Oct 24 11:28:58 CEST 2020

On 23/10/2020 9:25 p.m., Therneau, Terry M., Ph.D. via R-devel wrote:
> I found an issue with the data() command this evening when working on the survival package.
> 1. I have a lot of data sets in the package, almost all used in at least one vignette,
> help file, or test.  As a space saving measure, I have bundled many of them together,
> i.e., the file data/cancer.rda contains 19 data sets, many of them small. The resulting
> file (using xz compression) is quite a bit smaller than the individual ones.  (I still get
> a warning note about size from R CMD check, but I'm no longer 2x the limit.)
> 2. Consider the lung data set.  All of these fail:
>      data(lung)
>      data("lung")
>      data(lung, package="survival")
>    a. The lung.Rd file had \usage{data(lung)}; that error was not caught by R CMD check.
> (Several other .Rd files as well.)
>    b. In broader examples for teaching, I sometimes load data from other packages, e.g.,
> data(aidssi, package="mstate").  But this does not work for survival.  (The larger
> survival data sets that are in separate .rda files can be found.)
>    c. What does work is survival::lung.  Might it be useful to add a comment to data.Rd to
> this effect?

You don't describe how this dataset is being included in your package. 
Have you moved it from data/lung.rda to data/cancer.rda?  Currently (in 
survival 3.2-7) each of these works for me:

  library(survival); data(lung)

  library(survival); data("lung")

  # Without library(survival):
  data(lung, package="survival")

I think if the lung dataset is now being included in cancer.rda, you'd need

   data(cancer, package="survival")

or equivalent to load it (and the rest of the datasets there).
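
To illustrate the file-name matching I am describing, here is a minimal
sketch that does not touch survival at all. It assumes a scratch directory
with a data/ subdirectory under tempdir(); the names bundle_demo,
cancer_demo, lung_demo and colon_demo and the values in them are made up
for the demonstration. It relies on the documented behaviour that data()
also searches the data/ subdirectory of the current working directory when
no package is given. An installed package resolves data sets through its
own data directory or index, which this sketch does not exercise.

```r
# Scratch setup: a throwaway directory with a data/ subdirectory (not a real package).
scratch <- file.path(tempdir(), "bundle_demo")
dir.create(file.path(scratch, "data"), recursive = TRUE, showWarnings = FALSE)
old_wd <- setwd(scratch)

# Two toy data frames (made-up values) saved together in one bundled file.
lung_demo  <- data.frame(time = c(5, 8), status = c(1, 0))
colon_demo <- data.frame(time = c(3, 9), status = c(0, 1))
save(lung_demo, colon_demo, file = file.path("data", "cancer_demo.rda"))
rm(lung_demo, colon_demo)

# Asking for an object name: there is no file named lung_demo.*, so data()
# signals a warning, which is captured here as a string.
by_object <- tryCatch(
  {
    data(list = "lung_demo", envir = new.env())
    "loaded"
  },
  warning = function(w) conditionMessage(w)
)

# Asking for the file's base name loads every object stored in that file.
env <- new.env()
data(list = "cancer_demo", envir = env)
loaded <- sort(ls(env))

setwd(old_wd)
print(by_object)  # the warning text from the failed lookup
print(loaded)     # names of the objects that came out of the bundle
```

In this sketch the lookup by object name fails, while the lookup by the
bundle's file name brings in both data frames at once.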

> 3. Creating a separate package 'survivaldata' is of course one route, and is suggested in
> the "Writing R Extensions" guide.  But this is not possible since survival is a
> recommended package: it can't load any non-recommended package for its tests or
> vignettes.  Longer term, perhaps there is a way around this constraint?

Maybe the solution is to put your datasets into the "datasets" package, 
or make "survivaldata" a recommended package, or just leave things as 
they are and ignore the warnings about package size.  I think that's a 
negotiation you should have with R Core.

Duncan Murdoch
