[Rd] R CMD build --resave-data

Hervé Pagès hpages at fhcrc.org
Wed Apr 13 02:53:32 CEST 2011


Hi Uwe,

On 11-04-11 08:13 AM, Uwe Ligges wrote:
>
>
> On 11.04.2011 02:47, Hervé Pagès wrote:
>> Hi,
>>
>> More about the new --resave-data option
>>
>> As mentioned previously here
>>
>> https://stat.ethz.ch/pipermail/r-devel/2011-April/060511.html
>>
>> 'R CMD build' and 'R CMD INSTALL' handle this new option
>> inconsistently. The former does --resave-data="gzip" by default.
>> The latter doesn't seem to support the --resave-data= syntax:
>> the --resave-data flag must either be present or not. And by
>> default 'R CMD INSTALL' won't resave the data.
>>
>> Also, because now 'R CMD build' is resaving the data, shouldn't it
>> reinstall the package in order to be able to do this correctly?
>>
>> Here is why. There is this new warning in 'R CMD check' that complains
>> about files not of a type allowed in a 'data' directory:
>>
>>
>> http://bioconductor.org/checkResults/2.8/bioc-LATEST/Icens/lamb1-checksrc.html
>>
>>
>>
>> The Icens package also has .R files under data/ with things like:
>>
>> bet <- matrix(scan("CMVdata", quiet=TRUE),nc=5,byr=TRUE)
>>
>> i.e. the R code needs to access some of the text files located
>> in the data/ folder. So in order to get rid of this warning I
>> tried to move those text files to inst/extdata/ and I modified
>> the code in the .R file so it does:
>>
>> CMVdata_filepath <- system.file("extdata", "CMVdata", package="Icens")
>> bet <- matrix(scan(CMVdata_filepath, quiet=TRUE),nc=5,byr=TRUE)
>>
>> But now 'R CMD build' fails to resave the data because the package
>> was not installed first and the CMVdata file could not be found.
>>
>> Unfortunately, for a lot of people that means that the safe way to
>> build a source tarball now is with
>>
>> R CMD build --keep-empty-dirs --no-resave-data
>
>
> Hervé,
>
> actually it makes some sense to have these defaults from a CRAN
> maintainer's point of view:
>
> --keep-empty-dirs:
> we found many packages containing empty dirs unnecessarily and the idea
> is to exclude them at the build stage rather than at the later
> installation stage. Note that the package maintainer is supposed to run
> build (and knows if the empty dirs are to be included, the user who runs
> INSTALL does not).
>
> --no-resave-data:
> We found many packages with insufficiently compressed data. This should
> be fixed when building the package, not later when installing it, since
> the reduced size is useful in the source tarball already.
>
> So it does make some sense to have different defaults in build as
> opposed to INSTALL from my point of view (although I could live with
> different ones, though).
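
For reference, the compression issue described above can be inspected
from R itself. The following is only a minimal sketch: the data and the
temporary file are made up for illustration, and whether 'R CMD build'
goes through exactly these functions internally is an assumption here.

```r
# Temporary .rda file saved without compression (illustrative data only).
f <- tempfile(fileext = ".rda")
x <- rep(1:10, 1e4)
save(x, file = f, compress = FALSE)

# Report size and compression type of the file as saved.
before <- tools::checkRdaFiles(f)

# Re-save the same file in place with gzip compression.
tools::resaveRdaFiles(f, compress = "gzip")

# Report again so the two states can be compared.
after <- tools::checkRdaFiles(f)
```

Comparing 'before' and 'after' shows the file size and compression type
before and after resaving.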

If you deliberately ignore the fact that 'R CMD INSTALL' is also used
by developers to install from the *package source tree* (as opposed
to end users, who install from a *source tarball*, even though
they don't call it directly), then you have a point. So maybe I should
have been more explicit about the problem it creates for the
*developer* when 'R CMD build' and 'R CMD INSTALL' behave
differently by default.

Of course I'm not suggesting that 'R CMD INSTALL' should behave
differently (by default) depending on whether it's used on a source
tarball (mode 1) or a package source tree (mode 2).

I'm suggesting that, by default, the 3 commands (R CMD build +
R CMD INSTALL in mode 1 and 2) behave consistently.

With the latest changes, and by default, 'R CMD INSTALL' is still doing
the right thing, but not 'R CMD build' anymore.

I perfectly understand the intention behind those new flags, which is
to try to "optimize" the resulting source tarball, but what would you
think if 'gcc' had some optimization flags that could generate broken
executables (under some circumstances) and if these flags were enabled
by default?

Note that I would have no problem with 'R CMD build' trying to resave
the data by default if the current implementation of that feature
was working properly, but unfortunately it's broken (see my previous
email for the details).
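
To make the failure mode concrete: system.file() does not raise an
error when the package is not installed, it just returns an empty
string, and that empty string is then handed to scan(). Below is a
minimal sketch of a more defensive version of the data/ script. The
package name and the read_matrix() helper are hypothetical and are not
part of Icens; they only stand in for the situation at build time.

```r
# Hypothetical package name, chosen so that it is certainly not installed.
pkg <- "NoSuchPackageForIllustration"

# system.file() returns "" (not an error) when the package or file is missing.
path <- system.file("extdata", "CMVdata", package = pkg)

# Hypothetical helper: fail loudly instead of passing "" on to scan().
read_matrix <- function(path, ncol = 5) {
  if (!nzchar(path)) stop("data file not found; is the package installed?")
  matrix(scan(path, quiet = TRUE), ncol = ncol, byrow = TRUE)
}

# Capture the error message instead of aborting the script.
result <- tryCatch(read_matrix(path), error = function(e) conditionMessage(e))
```

A guard like this does not make 'R CMD build' work on an uninstalled
package; it only makes the reason for the failure visible.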

Thanks,
H.

>
> If you need further arguments for the discussion: I also tend to use
> --no-vignettes nowadays if my code does not change considerably. ;-)
>
> Best wishes,
> Uwe
>
>
>
>> I hope the list of options/flags that we need to use to "fix" 'R CMD
>> build' (and make it consistent with R CMD INSTALL) is not going to
>> grow too much ;-)
>>
>> Thanks,
>> H.
>>
>>


-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M2-B876
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319


