[R] Re: Validation of R

Richard Rowe Richard.Rowe at jcu.edu.au
Sat Apr 19 03:09:24 CEST 2003


OK - to hard reality.

R has become mainstream among practitioners BECAUSE IT IS 
GOOD.  Practitioners have been voting with their feet/time for years, but 
with recent publicity the tide is becoming a flood.

At some stage we (as in the R community, not the over-worked core) are 
going to have to do something to 'protect' our members in the commercial 
community (and, as 'accountability' and its legions of analphabetic clerks 
push into the academic/research community, soon the rest of us).

I suggest that those interested in 'validation' form a group and set about 
systematically 'validating' R processes.
Prebuilt 'evil' datasets (like Anscombe's quartet) and simulation using a 
range of different pseudorandom generators to generate data are probably 
the best way.  I once attended a lecture by John Tukey where he described 
the 'tests' a measure should be put through with respect to input 
structures (I remember one was characterised as a 'rabbit punch', another 
as 'knee-in-the-groin'); such a repertoire of exercises could be put in 
place.  Testers need to recognise that standard functions can be exposed to 
the weirdest distributions when they are called as an intermediate step in 
another calculation.
Code needs to do exactly what it is documented to do, and to squawk loudly 
when asked to do anything it is not documented to do.
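
To make the idea concrete, here is a minimal sketch of what one such test 
might look like.  It is written in Python purely for illustration (the same 
structure would carry over to R), and every name in it (EVIL_CASES, 
validate, squawks, naive_variance) is hypothetical, not part of any 
existing package.  It checks a variance routine against a few small 
datasets with known answers, including one with a large offset that 
defeats the one-pass 'textbook' formula, and it checks that the routine 
complains when handed a single observation.

```python
import math
import statistics


def naive_variance(xs):
    """One-pass 'textbook' sample variance; loses precision under large offsets."""
    n = len(xs)
    if n < 2:
        raise ValueError("variance requires at least two data points")
    s = sum(xs)
    ss = sum(x * x for x in xs)
    return (ss - s * s / n) / (n - 1)


# Hypothetical 'evil' cases: (name, data, known sample variance).
EVIL_CASES = [
    ("small integers", [1.0, 2.0, 3.0, 4.0, 5.0], 2.5),
    ("large offset", [1e9 + 1, 1e9 + 2, 1e9 + 3, 1e9 + 4, 1e9 + 5], 2.5),
    ("constant", [7.0, 7.0, 7.0, 7.0], 0.0),
]


def validate(func, cases, rel_tol=1e-9, abs_tol=1e-12):
    """Run func on every case; return a list of (name, passed, result_or_error)."""
    report = []
    for name, data, expected in cases:
        try:
            got = func(data)
            ok = math.isclose(got, expected, rel_tol=rel_tol, abs_tol=abs_tol)
            report.append((name, ok, got))
        except Exception as exc:  # an unexpected error counts as a failure
            report.append((name, False, exc))
    return report


def squawks(func, bad_input, exc_type=Exception):
    """Return True if func raises exc_type on input it is not documented to accept."""
    try:
        func(bad_input)
    except exc_type:
        return True
    return False


if __name__ == "__main__":
    for label, func in [("library", statistics.variance), ("naive", naive_variance)]:
        for name, ok, result in validate(func, EVIL_CASES):
            print(f"{label:8s} {name:15s} {'PASS' if ok else 'FAIL'}  {result!r}")
        print(f"{label:8s} squawks on one observation: {squawks(func, [1.0], ValueError)}")
```

The report this produces is the kind of note a validating group could send 
back to an author: which inputs passed, and exactly where and how the code 
did not behave as documented.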

I am sure those who have built so much code would really appreciate getting 
a note from the validating group confirming it has been tested and hasn't 
broken down, or else getting a note documenting exactly where and how code 
doesn't work as expected ...

R is open source.  We all have access to the code.  We could also have open 
source published test datasets and outcomes ... which would actually 
present a challenge to the COTS industry to match.

If it is to happen then someone who feels strongly about this needs to get 
the ball rolling, and there would probably need to be a sympathetic conduit 
into/from the core.

For QA purposes the 'testers' will need to be independent of the 
'producers' ...

Richard Rowe
Senior Lecturer
Department of Zoology and Tropical Ecology, James Cook University
Townsville, Queensland 4811, Australia
fax (61)7 47 25 1570
phone (61)7 47 81 4851
e-mail: Richard.Rowe at jcu.edu.au
http://www.jcu.edu.au/school/tbiol/zoology/homepage.html


