[R] Issue with dataset inclusion in CRAN packages
csrabak
crabak at acm.org
Sun Jun 26 23:18:24 CEST 2011
On 26/6/2011 17:43, Frank Harrell wrote:
> I was glad to see the new rpart.plot package by Stephen Milborrow. I was
> however a bit concerned that Stephen distributed a dataset I created, and
> renamed the dataset (from titanic3 to ptitanic) in the process [with some
> justification, as some variables were omitted]. Fortunately Stephen
> included the script he used to download the dataset from our web site, and
> gave full credit to us. What concerns me is that the rpart.plot package
> does not contain many functions but the package is as large as packages
> containing hundreds of functions. This is due to the inclusion of the
> dataset. I would prefer that authors provide the URL so that users can
> easily install the binary R data frame directly from our web site (we
> even have an automated way to do this: require(Hmisc); getHdata(titanic3)).
> This will allow users to profit from possible future data corrections as
> well as making the package much more compact. Thanks for listening. I'm
> writing to r-help because this may apply to other R packages as well.
>
Frank,
I can understand your concern and at first thought would even second it.
On the other hand, I think there are reasonable explanations why
authors often prefer to include the datasets, especially if the data
will be used in examples:
1) Documentation written around the datasets stays in sync with the
data frames shipped with the package;
2) In many environments access to the web is restricted, so calls such
as getHdata or read.table("<url>") may not be allowed.
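
The two positions could perhaps be reconciled by trying the web copy first and falling back to the copy shipped with the package. The following is only a minimal sketch: load_with_fallback, fetch and bundled are hypothetical names, not part of Hmisc or rpart.plot, and it assumes a failed download signals an R error rather than just a warning.

```r
# Hypothetical helper: try a fresh copy first, fall back to a local one.
# `fetch` and `bundled` are zero-argument functions that each return a
# data frame; both names are assumptions made for this illustration.
load_with_fallback <- function(fetch, bundled) {
  tryCatch(
    fetch(),
    error = function(e) {
      # Reached only when fetch() signals an error (e.g. no web access).
      message("Download failed (", conditionMessage(e),
              "); using the bundled copy.")
      bundled()
    }
  )
}

# Self-contained demonstration: the "download" always fails here,
# so the small stand-in data frame is returned instead.
offline_fetch <- function() stop("no network access")
local_copy    <- function() data.frame(x = 1:3)

d <- load_with_fallback(offline_fetch, local_copy)
```

In a real package, fetch might wrap getHdata or read.table("<url>") and bundled might wrap data(); that wiring is left out because it depends on how those calls report failure.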
my 0.019999...
Regards,
--
Cesar Rabak