[Rd] R CMD check for the R code from vignettes

Yihui Xie xie at yihui.name
Mon Jun 2 21:38:28 CEST 2014


Well, there is still a misunderstanding: there is nothing that really
stops you from tangling the vignettes, and I do not disagree that
tangle can be useful in certain cases. I'm talking about whether
package authors _must_ tangle all their vignettes, or leave some to
users to run Stangle() or knitr::purl() on the vignettes if they
really want the potentially broken R script. The prerequisite of this
question is that the current tangle functions ignore inline
expressions, and it is not totally clear whether this is good or bad.
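
To make the point about inline expressions concrete, here is a minimal
sketch using Stangle() from the utils package. The tiny document and
the temporary file names are made up for illustration; the only thing
it is meant to show is that the chunk code ends up in the tangled
script while the \Sexpr{} expression does not.

```r
# A tiny Sweave document: one code chunk plus one inline \Sexpr{} expression.
# (Hypothetical content, written to a temporary file for this sketch.)
rnw <- tempfile(fileext = ".Rnw")
writeLines(c(
  "\\documentclass{article}",
  "\\begin{document}",
  "<<setup>>=",
  "x <- 1:10",
  "@",
  "The mean is \\Sexpr{mean(x)}.",
  "\\end{document}"
), rnw)

# Tangle the document into a plain R script.
out <- tempfile(fileext = ".R")
utils::Stangle(rnw, output = out, quiet = TRUE)
tangled <- readLines(out)

# Is the chunk code in the script? Is the inline expression?
has_chunk  <- any(grepl("x <- 1:10", tangled, fixed = TRUE))
has_inline <- any(grepl("mean(x)", tangled, fixed = TRUE))
```

If the inline expression had a side effect that later chunks relied
on, the tangled script would silently lose it.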

The harm that you mentioned did not come from disabling tangle, but
from the tangle function itself; or, if you insist that the current
tangle function behaves as we should expect, then the harm came from
the improper use of inline expressions.

I do not think it is trivial to improve the tangle function so that it
can generate an R script that fully reproduces what was done in weave.
Inline expressions are not the only thing that needs improvement. If
you argue for reproducible research, I can open Pandora's box to
make tangle even less desirable and further away from reproducible
research. For example, how about reproducing graphics using the
tangled R script? (How do you make tangle use the same graphical
device(s) as weave? How do you pass the chunk options
width/height/fig/prefix.string to tangle?) How about chunk hook
functions in getOption('SweaveHooks')? (What if they have significant
side effects such as clearing up the workspace as documented in
help(RweaveLatex)? How to reproduce these side effects in the tangled
R script?) These open-ended questions apply to both Sweave and knitr.
As the author of knitr, I feel it is very difficult to answer them.
Instead of patching tangle functions with uncertainty, personally I'd
like to stay primarily in the weave world (I admit I'm lazy and lack
confidence).
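
As a rough illustration of the hook problem, consider the sketch below.
The hook body, the chunk, and the file names are hypothetical; the
mechanism (a function stored in getOption('SweaveHooks') that weave
runs before chunks with the matching option) is the one described in
help(RweaveLatex). The hook lives in the session options rather than
in the document, so the tangled script carries no trace of it.

```r
# Hypothetical figure hook: weave would call it before every fig=TRUE chunk.
# Setting the option does not call the hook; it only registers it.
old <- options(SweaveHooks = list(fig = function() par(mar = c(4, 4, 1, 1))))

# A made-up document with a single figure chunk.
rnw <- tempfile(fileext = ".Rnw")
writeLines(c(
  "\\documentclass{article}",
  "\\begin{document}",
  "<<scatter, fig=TRUE, width=4, height=3>>=",
  "plot(1:10)",
  "@",
  "\\end{document}"
), rnw)

# Tangle it and look at what the script contains.
out <- tempfile(fileext = ".R")
utils::Stangle(rnw, output = out, quiet = TRUE)
tangled <- readLines(out)
options(old)  # restore the previous SweaveHooks setting

has_plot <- any(grepl("plot(1:10)", tangled, fixed = TRUE))
has_hook <- any(grepl("par(", tangled, fixed = TRUE))
```

Someone who runs the tangled script gets the plot() call, but not the
par() settings that shaped the figure in the woven report.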

I work like Kevin Coombes in the sense that the number of times I
invoke weave is orders of magnitude greater than tangle. If you
produced a report using weave, I do not think you should expect other
people to reproduce the computation using the tangled code.

My conclusion: Is tangle useful? Yes. Must we tangle package
vignettes? Perhaps not.

Regards,
Yihui
--
Yihui Xie <xieyihui at gmail.com>
Web: http://yihui.name


On Mon, Jun 2, 2014 at 12:44 PM, Duncan Murdoch
<murdoch.duncan at gmail.com> wrote:
> On 03/06/2014, 12:58 AM, Yihui Xie wrote:
>>
>> Yes, I completely agree the tangle code should run without errors, if
>> the package author has provided such a script. However, I think it is
>> also the package author's right to choose not to provide such a
>> script, for reasons that I stated in the beginning (1. redundancy; 2.
>> tangle functions ignore inline expressions that should not be
>> ignored).
>>
>> It seems that I still need to clarify it: I'm not talking about
>> disabling _running_ the tangled code, but disabling _generating_ the
>> code _optionally_. Unless someone is arguing that the tangled code
>> _must_ be generated from vignettes, I do not think anybody in this
>> discussion really has a conflict with anybody else.
>
>
> I think that it's not a vignette if you can't tangle it.  Including \Sexpr
> expressions in the tangled code is the sort of option I would support much
> more than suppressing the ability to tangle.  (I don't think \Sexpr
> expressions should be included by default, but there's enough flexibility in
> the system that it shouldn't be hard to include them optionally.)
>
>
>>
>> Please also note that I do not expect R core or CRAN maintainers to do
>> any extra work: package authors can easily disable tangle by
>> themselves without any special flags to R CMD build or R CMD
>> check. The vignettes are still built normally (in terms of "weave"). I
>> brought forward the discussion to hear the possible harm that I was
>> potentially not aware of when we disable tangle for R package
>> vignettes (e.g. does it affect the quality of the package?). So far I
>> have not heard real harm (I admit my judgment is subjective).
>
>
> Several of us have told you the real harm:  it means that users can't easily
> extract a script that replicates the computations done in the vignette.
> That's a useful thing to be able to do.
>
> Duncan Murdoch


