[R] Re: Validation of R

Paul Gilbert pgilbert at bank-banque-canada.ca
Mon Apr 21 20:52:20 CEST 2003


I think there may be some exaggeration of how much effort and co-ordination is
needed in order to "validate" R (at least in a non-official sense). The QA tools
already in R are incredibly good. What is needed is for people to actually use
them. If you make a package with code in the tests directory, and in that code
you compare results with known results and stop() if there is an error, then
the package will fail its checks (R CMD check) and will produce an error message
indicating a problem. Furthermore, the QA tools for checking documentation are
exceptional. If you make the package interesting enough that others may want to
use it, and submit it to CRAN, then the tests are run as part of the development
cycle (I believe), so the feedback to R core is automatic (although debugging may
get bounced back to you, especially if the problem is in your code and not in R
itself).
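
As a minimal sketch of what such a test can look like (the file name, the example
calculation, and the comparison are mine, purely for illustration): any .R file
placed under tests/ is run by R CMD check, and reaching stop() makes the check
fail.

    ## tests/check-mean.R -- hypothetical file name, illustrative test only
    x <- c(1, 2, 3, 4, 5)
    got      <- mean(x)
    expected <- 3
    ## stop() on any discrepancy; the resulting error is what flags the problem
    if (!identical(all.equal(got, expected), TRUE))
        stop(paste("mean() gave", got, "but", expected, "was expected"))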

For tests which may not be of special interest to others, you can set things up
yourself to run automatically and report only when there is a problem. In
addition to the tests in my packages on CRAN, I have tests that I run myself for
days. These do simulations and estimations, compare results with known results,
or at least with previous results, and do calculations multiple ways to test
that the results are the same (for example, that the roots of a state space
model are the same as the roots of an equivalent ARMA model). I run about six
hours of these regularly on Linux and on Solaris with a few R release
candidates, and try to run the whole suite at least once before a new release.
This does not take any "hands-on time"; it just takes computer time. On Linux I
start it before going to work (R 1.7.0beta was being released in the morning, my
time) and the main part is done when I get home. The hands-on time is for
devising meaningful, comprehensive tests (and for debugging when there are
problems).
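
A toy illustration of the "calculate it multiple ways" idea (made up for this
message, not taken from my actual suite; the real tests use state space and
ARMA calculations rather than var()):

    set.seed(42)
    x  <- rnorm(1000)
    v1 <- var(x)                                  # built-in estimator
    v2 <- sum((x - mean(x))^2) / (length(x) - 1)  # same quantity from the definition
    if (!identical(all.equal(v1, v2), TRUE))
        stop(paste("variance computed two ways disagrees:", v1, "vs", v2))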

There may be less work involved in doing (unofficial) validation than in
advertising how much is actually being done. Perhaps the simplest approach is
for individuals to put together packages of tests, with descriptions that
explain the extent of the testing done, and then to submit those packages to
CRAN.

Paul Gilbert
Head Statistician/Statisticien en chef, 
Department of Monetary and Financial Analysis, 
     /Département des Études monétaires et financières, 
Bank of Canada/Banque du Canada

Richard Rowe wrote:
> 
> OK - to hard reality.
> 
> R has become mainstream among practitioners BECAUSE IT IS
> GOOD.  Practitioners have been voting with their feet/time for years, but
> with recent publicity the tide is becoming a flood.
> 
> At some stage we (as in the R community, not the over-worked core) are
> going to have to do something to 'protect' our members in the commercial
> community (and with the push of 'accountability' and its legions of
> analphabet clerks into the academic/research community soon the rest of us).
> 
> I suggest that those interested in 'validation' form a group and set about
> systematically 'validating' R processes.
> Prebuilt 'evil' datasets (like Anscombe) and simulation using a range of
> different pseudorandom generators to generate data is probably the best
> way.  I once attended a lecture by John Tukey where he described the
> 'tests' a measure should be put through wrt input structures (I remember
> one was characterised as a 'rabbit punch', another as 'knee-in-the-groin'),
> such a repertoire of exercises could be put in place.  Testers need to
> recognise that standard functions can be exposed to the weirdest
> distributions when they are called as an intermediate step in another
> calculation.
> Code needs to do exactly what it is documented to do, and to squawk loudly
> when asked to do what it doesn't.
> 
> I am sure those who have built so much code would really appreciate getting
> a note from the validating group confirming it has been tested and hasn't
> broken down, or else getting a note documenting exactly where and how code
> doesn't work as expected ...
> 
> R is open source.  We all have access to the code.  We could also have open
> source published test datasets and outcomes ... which would actually
> present a challenge to the COTS industry to match.
> 
> If it is to happen then someone who feels strongly about this needs to get
> the ball rolling, and there would probably need to be a sympathetic conduit
> into/from the core.
> 
> For QA purposes the 'testers' will need to be independent of the
> 'producers' ...
> 
> Richard Rowe
> Senior Lecturer
> Department of Zoology and Tropical Ecology, James Cook University
> Townsville, Queensland 4811, Australia
> fax (61)7 47 25 1570
> phone (61)7 47 81 4851
> e-mail: Richard.Rowe at jcu.edu.au
> http://www.jcu.edu.au/school/tbiol/zoology/homepage.html
> 
