[Bioc-devel] Sweave changes (keep.source = TRUE or FALSE?)

Duncan Murdoch murdoch at stats.uwo.ca
Fri Dec 8 01:13:04 CET 2006


Kevin R. Coombes wrote:
> One of the problems I have with using R and BioConductor is that 
> backwards compatibility rarely seems to be considered when new versions 
> are released.  (That statement may be wrong, but it is the impression I 
> have formed from watching things change over time.) Someone gets an idea 
> for a structural change that can potentially break tons of existing 
> code, and because it is theoretically better (and may even really be 
> better; that's not the point), they go ahead and implement the change.
>   

I can't think of any times where this has happened since I've been 
involved with the R project.  The thread you posted to is certainly not 
an example: it is about a change that may be released next April.  If 
5-6 months discussion isn't enough time to consider backwards 
compatibility, then I think it's just impossible to meet your standards.
> And when some poor user raises a question on one of these mailing lists, 
> the most common answer seems to be "upgrade to the latest versions of R 
> and BioConductor, and modify your code accordingly".  And then do it 
> again next quarter.
>   
I don't see that this has anything to do with this thread, but if you 
want support for free from me, then you'll have to use the same version 
of R as I do.  I think that's the motivation behind the message you've 
heard from others.  If you want to hire someone to support an outdated 
version of R I think you'll be able to do it, but don't expect volunteer 
support.
> By contrast, I have documents written in TeX in the early 1980's that 
> still compile and still produce EXACTLY the same output. And I still 
> have a lot of old perl scripts that still do exactly what they are 
> supposed to do.  In those cases, Donald Knuth and Larry Wall have acted 
> as "benevolent dictators" who insist that the people proposing changes 
> have to at least give serious thought to how to ensure that existing 
> code doesn't break.
>   
> I suspect my reaction to the proposed changes in Sweave results 
> precisely because I like Sweave so much, and use it for every analysis
> I perform.  Its primary virtue is for the production of documents that 
> will need to remain available for a long time. That's why I want the 
> documentation and the code in the same file, after all, so I can return 
> to it when it's time to write the methods section to the manuscript that 
> resulted from all those computations. And I can return to it again when 
> someone sends me a question after the paper gets published. And by that 
> time, I will probably have upgraded R and BioConductor, and I want the 
> figures that I generate tomorrow to still be the same as the figures I 
> sent to the publisher yesterday. And if I generated the actual PDF that 
> I sent to the publisher from an Sweave file and if it includes code 
> samples, then I don't want those code samples to change in the PDF file 
> I produce tomorrow.
>   
Then you shouldn't be updating R every quarter.   You should update less 
frequently, and document which versions were used to produce each 
document.  We make an effort to keep old versions available, but it is 
not a priority to maintain perfect backwards compatibility:  for that to 
be sensible would depend on the assumption that all decisions are 
perfect when they are made, and so they should be irrevocable.  That's
just not realistic.
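
As a minimal sketch of documenting the versions used, assuming the
analysis lives in a Sweave (.Rnw) file: a chunk like the one below prints
the R version and the attached packages into the compiled document.  The
chunk label "session-info" is only an illustrative name; sessionInfo()
is the standard R function.

```
<<session-info, echo=TRUE>>=
# Record the R version string and the attached/loaded packages,
# so the document itself states which versions produced it.
R.version.string
sessionInfo()
@
```

Placing such a chunk at the end of the file keeps the version record
next to the results it describes.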

You certainly shouldn't be using R-devel for anything at all.  The 
public statement about what sort of changes are allowed there
(from the development guidelines on developer.r-project.org) is:

For r-devel i.e. what becomes x.y.0 releases only one rule seems
necessary:

DO NOT add code that cannot become reasonably complete by the next release.


> And yes; I do propose not changing the DEFAULT behavior of any existing 
> function. That's what backwards compatibility means.  If you add 
> additional features and you are going to put in an option to let the
> user control the behavior anyway, then the default option should ensure 
> that code that works now will continue to produce the same results in 
> the future. Even in new versions of R. Even in new versions of 
> BioConductor.
>   
Maintaining that degree of backwards compatibility is not the R policy.  
I think you would not be able to convince many people that it should be.

Duncan Murdoch
> Best,
> 	Kevin
>
> Friedrich Leisch wrote:
>   
>>>>>>> On Wed, 06 Dec 2006 12:37:22 -0600,
>>>>>>> Kevin R Coombes (KRC) wrote:
>>>>>>>               
>>   > Hi,
>>   > I don't really think anyone believes that the parse&deparse behavior was 
>>   > exactly a "feature". Instead, I think the primary issue is one of 
>>   > backwards compatibility.
>>
>>   > You are proposing to change the behavior of Sweave in a manner that will 
>>   > cause old code to break. Here "break" has two meanings. Some automatic 
>>   > development tools will stop working on existing valid code.  In 
>>   > addition, existing valid code will produce results that differ from what 
>>   > they produced previously.
>>
>>   > To deal with this, you are going to add an option that will allow users 
>>   > to get the old behavior.  However, you propose to set the default value 
>>   > of the option to require users to go back and modify all their old code 
>>   > in order to prevent things from breaking. It seems obvious to me that 
>>   > the default behavior should be the one that does not break old code or 
>>   > require the editing of old code in order to get the old behavior.
>>
>>   > The reason I use Sweave (for virtually every analysis I do any more) is 
>>   > that I can guarantee that when I go back to the code six months from 
>>   > now, I can regenerate the analysis and I can regenerate the 
>>   > documentation, and I know that I will get the same results. Changing the 
>>   > default behavior of Sweave violates that guarantee, since the 
>>   > documentation will not be identical to what it was before. Personally, I 
>>   > am willing to pay the cost with NEW analyses to invoke the new behavior 
>>   > explicitly (which I do agree is the preferred behavior) because I think 
>>   > the goal of backwards compatibility is more important.
>>
>>   > In other words, I disagree with your characterization of the 
>>   > parse&deparse behavior as a "bug".  It did not cause incorrect results 
>>   > in the documentation or the code, and everyone using Sweave knew about 
>>   > the behavior.
>>
>> Give me a break, that is simply nonsense. Sweave guarantees that you
>> can reproduce your results USING THE VERSION OF R THAT WAS USED FOR
>> THE ORIGINAL ANALYSIS, and that will still be true, because in R 2.4.x
>> there is no option keep.source.  Running a new version of R means
>> dozens of R functions will have changed, plotting functions may yield
>> figures that look different etc. etc. ... what you propose is in
>> essence that we are not allowed to change the default behaviour of ANY
>> R FUNCTION. Is that what you are proposing?
>>
>> And note that the changes I propose will not change any numerical
>> results or figures, nor will I "break code", the only thing that
>> changes is that the formatting of input lines looks different (and in
>> most cases better, that's why we want to do it).
>>
>> It's not like I am changing Sweave behavior every other week, actually
>> it is the first one at all. I have thought a lot whether I want to do
>> it or not, and I really think it is a good idea. What is great about R
>> is that it is allowed to change (if changes are transparent and
>> announced early enough). There is a certain operating system with a
>> market share of about 90% where backwards compatibility is more
>> important than development or security, and I don't think that should
>> be our role model.
>>
>> Best,
>> Fritz
>>


