[Rd] FW: [R] The Quality & Accuracy of R

Muenchen, Robert A (Bob) muenchen at utk.edu
Tue Jan 27 01:30:19 CET 2009

Peter Dalgaard wrote:

Now that I've asked you in, I probably should at least chip in with a 
couple of brief notes on the issue:

- not everything can be validated, and it's not like the commercial 
companies are validating everything. E.g. nonlinear regression code will 
give different results on different architectures, or even different 
compilers on the same architecture, and may converge on one and not on 
the other.

(Muenchen)==> Good point. The test suites I ran when installing mainframe software were quite simple: just one example of each of various statistical methods, and I doubt any of them iterated back then. You would want to choose the things to test carefully to minimize such problems. The process I ran listed the differences between my locally computed results and those the company had computed. If all went well, I would just see the different dates and times scroll by.
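A minimal sketch of that difference-listing step (in Python for brevity; the quantity names, values, and tolerance below are invented for illustration):

```python
import math

# Hypothetical reference results shipped by the vendor, and the same
# quantities recomputed during the local install check.
reference = {"t_test_stat": 2.3457, "lm_slope": 0.81234}
local = {"t_test_stat": 2.3457, "lm_slope": 0.81235}

def diff_report(reference, local, rel_tol=1e-4):
    """Return only the quantities whose locally computed value differs
    from the shipped reference beyond the tolerance; an empty report
    means the install check passed."""
    return [(name, ref, local[name])
            for name, ref in reference.items()
            if not math.isclose(ref, local[name], rel_tol=rel_tol)]

print(diff_report(reference, local))  # an empty list: all results agree
```

A real suite would compare many more shipped reference values, and the tolerance would need to be loose enough to absorb legitimate architecture and compiler differences while still catching genuine errors.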

- end-user validation is in principle a good thing, but please notice 
that what we currently do is part of a build from sources, and requires 
that build tools are installed. (E.g., we don't just run things, we also 
compare them to known outputs.) It's not entirely trivial to push these 
techniques to the end user.

(Muenchen)==> It sounds like this is quite different from what I expected. The test suites I have seen were just standard code and datasets. They ran in the program I was installing, so no extra tools were required. Known outputs did ship with the products for the comparison.

- a good reason to want post-install validation is that validity can 
depend on other part of the system outside developer control (e.g. an 
overzealous BLAS optimization, sacrificing accuracy and/or standards 
compliance for speed, can cause trouble). This is also a reason for not 
making too far-reaching statements about validity.

(Muenchen)==> Yes, and the number of possible combinations of argument settings is practically infinite. We would want to emphasize that although testing is done, it is impossible for ANY organization to test all conditions.
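A tiny illustration (in Python, with contrived values) of why such differences can arise outside the developer's control: floating-point addition is not associative, so a BLAS that reorders a reduction for speed can legitimately change the result.

```python
# Floating-point addition is not associative, so an optimized BLAS that
# reorders a reduction can legitimately return a different answer than a
# naive left-to-right loop.  The values are contrived so the effect is
# visible at double precision.
xs = [1e16, 1.0, -1e16]

left_to_right = (xs[0] + xs[1]) + xs[2]  # the 1.0 is absorbed into 1e16
reordered = (xs[0] + xs[2]) + xs[1]      # the large terms cancel first

print(left_to_right, reordered)  # 0.0 1.0 -- same data, different sums
```

This is exactly why post-install validation has to compare to known outputs with a tolerance rather than bit-for-bit.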

- I'm not too happy about maintaining the same information in multiple 
places. One thing we learned from the FDA document is how easily factual 
errors creep in and how silly we'd look if, say, the location of a key 
server were stated incorrectly, or we claimed to have released one patch 
version when in fact there had been two. This kind of authoritative 
document itself needs a verification process to ensure that it is correct.

(Muenchen)==> Having maintained multiple docs that contained common sections, I can certainly agree it is hard to keep them synchronized. However, if there can only be one document, should it be focused on a small (albeit important) sliver of statistical use? If, as I suspect, the great majority of R users face this question, would it not make sense to address the bigger problem?

Would it be possible to address both audiences in the same document, by putting the information of general interest before the clinical-specific info? Would a more generic title lose the clinical audience?


> -----Original Message-----
> From: Peter Dalgaard [mailto:p.dalgaard at biostat.ku.dk] 
> Sent: Saturday, January 24, 2009 4:53 AM
> To: Muenchen, Robert A (Bob)
> Cc: R-help at r-project.org
> Subject: Re: [R] The Quality & Accuracy of R
> Bob,
> Your point is well taken, but it also raises a number of issues 
> (post-install testing to name one) for which the R-devel list would be 
> more suitable. Could we move the discussion there?
> 	-Peter
> Muenchen, Robert A (Bob) wrote:
>> Hi All,
>> We have all had to face skeptical colleagues asking if software made by
>> volunteers could match the quality and accuracy of commercially written
>> software. Thanks to the prompting of a recent R-help thread, I read, "R:
>> Regulatory Compliance and Validation Issues, A Guidance Document for the
>> Use of R in Regulated Clinical Trial Environments
>> (http://www.r-project.org/doc/R-FDA.pdf). This is an important document,
>> of interest to the general R community. The question of R's accuracy is
>> such a frequent one, it would be beneficial to increase the visibility
>> of the non-clinical  information it contains. A document aimed at a
>> general audience, entitled something like, "R: Controlling Quality and
>> Assuring Accuracy" could be compiled from the these sections:
>> 1.      What is R? (section 4)
>> 2.      The R Foundation for Statistical Computing (section  3)
>> 3.      The Scope of this Guidance Document (section 2)
>> 4.      Software Development Life Cycle (section 6)
>> Marc Schwartz, Frank Harrell, Anthony Rossini, Ian Francis and others
>> did such a great job that very few words would need to change. The only
>> addition I suggest is to mention how well R did in Keeling & Pavur's
>> "A comparative study of the reliability of nine statistical software
>> packages," Computational Statistics & Data Analysis, May 1, 2007,
>> Vol. 51, pp. 3811-3831.
>> Given the importance of this issue, I would like to see such a document
>> added to the PDF manuals in R's Help.
>> The document mentions (Sect. 6.3) that a set of validation tests, data
>> and known results are available. It would be useful to have an option to
>> run that test suite in every R installation, providing clear progress,
>> "Validating accuracy of t-tests...Validating accuracy of linear
>> regression...." Whether or not people choose to run the tests, they would
>> at least know that such tests are available. Back in my mainframe
>> installation days, this step was part of many software installations and
>> it certainly gave the impression that those were the companies that took
>> accuracy seriously. Of course the other companies probably just ran
>> their validation suite before shipping, but seeing it happen had a
>> tremendous impact. I don't know how much this would add to the download
>> size, but if it were too much, perhaps it could be implemented as a
>> separate download.
>> I hope these suggestions can help mitigate the concerns so many non-R
>> users have.

    O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
   c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
  (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)              FAX: (+45) 35327907
