[R] Unit Testing Frameworks: summary and brief discussion
Paul Murrell
p.murrell at auckland.ac.nz
Thu May 10 03:35:15 CEST 2007
Hi
Paul Gilbert wrote:
> Tony
>
> Thanks for the summary.
>
> My ad hoc system is pretty good for catching flagged errors, and
> numerical errors when I have a check. Could you (or someone else)
> comment on how easy it would be with one of these more formal frameworks
> to do three things I have not been able to accomplish easily:
>
> - My code gives error and warning messages in some situations. I want to
> test that the errors and warnings work, but these flags are the correct
> response to the test. In fact, it is an error if I don't get the flag.
> How easy is it to set up automatic tests to check warning and error
> messages work?
>
> - For some things it is the printed format that matters. How easy is it
> to set up a test of the printed output? (Something like the Rout files
> used in R CMD check.) I think this is what Tony Plate is calling
> transcript file tests, and I guess it is not automatically available. I
> am not really interested in something I would have to change with each
> new release of R, and I need it to work cross-platform. I want to know
> when something has changed, in R or my own code, without having to
> examine the output carefully.
>
> - (And now the hard one.) For some things it is the plotted output that
> matters. Is it possible to set up automatic tests of plotting? I can
> already test that plots run. I want to know if they "look very
> different". And no, I don't have a clue where to start on this one.
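On the first point, base R's tryCatch() is enough for a rough check that a call really does signal the expected error or warning. What follows is only a minimal sketch, not tied to RUnit or butler; check_positive(), gives_error() and gives_warning() are names made up for illustration.

```r
# Hypothetical function under test: errors on non-numeric input,
# warns on negative values.
check_positive <- function(x) {
  if (!is.numeric(x)) stop("x must be numeric")
  if (any(x < 0)) warning("negative values found")
  sqrt(abs(x))
}

# TRUE if evaluating `expr` raises an error whose message matches `pattern`.
# `expr` is a promise, so it is only evaluated inside tryCatch().
gives_error <- function(expr, pattern = "") {
  tryCatch({
    expr
    FALSE
  }, error = function(e) grepl(pattern, conditionMessage(e)))
}

# TRUE if evaluating `expr` raises a warning whose message matches `pattern`.
gives_warning <- function(expr, pattern = "") {
  tryCatch({
    expr
    FALSE
  }, warning = function(w) grepl(pattern, conditionMessage(w)))
}

# The test fails if the expected flag does NOT appear.
stopifnot(gives_error(check_positive("a"), "must be numeric"))
stopifnot(gives_warning(check_positive(-4), "negative values"))
stopifnot(!gives_warning(check_positive(4)))
```

Matching on part of the message rather than the whole string keeps the test from breaking when the wording is adjusted slightly.
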
For text-based graphics formats, you can just use diff; for raster
formats, you can do per pixel comparisons. These days there is
ImageMagick to do a compare and it will even produce an image of the
difference. I have an old package called graphicsQC (not on CRAN) that
implemented some of these ideas (there was a talk at DSC 2003, see
http://www.stat.auckland.ac.nz/~paul/index.html). A student worked on a
much better approach more recently, but I haven't put that up on the web
yet. Let me know if you'd like to take a look at the newer package (it
would help to have somebody nagging me to get it finished off).
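
As a rough illustration of the diff idea for a text-based format (this is not how graphicsQC does it, just a sketch using the base postscript() device), one can render two plots to temporary files and compare the lines. The names render_ps() and same_plot() are invented for this example, and dropping the "%%" header comments is an assumption to keep the comparison to the drawing commands.

```r
# Render a plot to a temporary PostScript file and return its lines,
# without the "%%" structuring comments.
render_ps <- function(plot_fun) {
  path <- tempfile(fileext = ".ps")
  on.exit(unlink(path))
  postscript(path)
  # Always close the device, even if the plotting code fails.
  tryCatch(plot_fun(), finally = dev.off())
  lines <- readLines(path)
  lines[!grepl("^%%", lines)]
}

# TRUE if the two plotting functions produce the same PostScript.
same_plot <- function(f, g) identical(render_ps(f), render_ps(g))

same_plot(function() plot(1:10), function() plot(1:10))
same_plot(function() plot(1:10), function() plot(10:1))
```

This only says whether anything changed at all; judging whether two plots "look very different" still needs a pixel-level or visual comparison.
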
Paul
> Paul Gilbert
>
> anthony.rossini at novartis.com wrote:
>> Greetings -
>>
>> I've finally finished the review; here's what I heard:
>>
>> ============ from Tobias Verbeke:
>>
>> anthony.rossini at novartis.com wrote:
>>> Greetings!
>>>
>>> After a quick look at current programming tools, especially with regard
>>> to unit-testing frameworks, I've started looking at both "butler" and
>>> "RUnit". I would be grateful to receive real-world development
>>> experience and opinions with either/both. Please send to me directly
>>> (yes, this IS my work email), I will summarize (named or anonymous, as
>>> contributors desire) to the list.
>>>
>> I'm a founding member of an R Competence Center at an international
>> consulting company delivering R services
>> mainly to the financial and pharmaceutical industries. Unit testing is
>> central to our development methodology
>> and we've been systematically using RUnit with great satisfaction,
>> mainly because of its simplicity. The
>> presentation of test reports is basic, though. Experiences concerning
>> interaction with the RUnit developers
>> are very positive: gentle and responsive people.
>>
>> We've never used butler. I think it is not actively developed (even if
>> the developer is very active).
>>
>> It should be said that many of our developers (including myself) have
>> backgrounds in statistics (more than in cs
>> or software engineering) and are not always acquainted with the
>> functionality in other unit testing frameworks
>> and the way they integrate in IDEs as is common in these other languages.
>>
>> I'll soon be personally working with a JUnit guru and will take the
>> opportunity to benchmark RUnit/ESS/emacs against
>> his toolkit (Eclipse with JUnit- and other plugins, working `in perfect
>> harmony' (his words)). Even if in my opinion the
>> philosophy of test-driven development is much more important than the
>> tools used, it is useful to question them from
>> time to time and your message reminded me of this... I'll keep you
>> posted if it interests you. Why not work out an
>> evaluation grid / check list for unit testing frameworks?
>>
>> Totally unrelated to the former, it might be interesting to ask oneself
>> how ESS could be extended to ease unit testing:
>> after refactoring a function some M-x ess-unit-test-function
>> automagically launches the unit-test for this particular
>> function (based on the test function naming scheme), opens a *test
>> report* buffer etc.
>>
>> Kind regards,
>> Tobias
>>
>> ============ from Tony Plate:
>>
>> Hi, I've been looking at testing frameworks for R too, so I'm interested
>> to hear of your experiences & perspective.
>>
>> Here are my own experiences & perspective:
>> The requirements are:
>>
>> (1) it should be very easy to construct and maintain tests
>> (2) it should be easy to run tests, both automatically and manually
>> (3) it should be simple to look at test results and know what went wrong
>> where
>>
>> I've been using a homegrown testing framework for S-PLUS that is loosely
>> based on the R transcript style tests (run *.R and compare output with
>> *.Rout.save in 'tests' dir). There are two differences between this
>> test framework and the standard R one:
>> (1) the output to match and the input commands are generated from an
>> annotated transcript (annotations can switch some tests in or out
>> depending on the version used)
>> (2) annotations can include text substitutions (regular expression
>> style) to be made on the output before attempting to match (this helps
>> make it easier to construct tests that will match across different
>> versions that might have minor cosmetic differences in how output is
>> formatted).
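
The substitution idea in (2) can be sketched in base R roughly as follows. This is not Tony Plate's framework, only an illustration of normalising output before comparing it with a stored transcript; normalize() and matches_transcript() are made-up names, and the whitespace rule is just one example of a cosmetic difference.

```r
# Apply each regular-expression substitution (names = patterns,
# values = replacements) to every line.
normalize <- function(lines, subs) {
  for (pattern in names(subs)) {
    lines <- gsub(pattern, subs[[pattern]], lines)
  }
  lines
}

# Compare the printed form of `expr` with saved transcript lines,
# after applying the same substitutions to both sides.
matches_transcript <- function(expr, saved, subs = character()) {
  actual <- capture.output(print(expr))
  identical(normalize(actual, subs), normalize(saved, subs))
}

# Saved output that differs from the current output only in spacing.
saved <- "[1]  1   2   3"
collapse_ws <- c("[[:space:]]+" = " ")

matches_transcript(1:3, saved)               # exact comparison
matches_transcript(1:3, saved, collapse_ws)  # spacing ignored
```
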
>>
>> We use this test framework for both unit-style tests and system testing
>> (where multiple libraries interact and also call the database).
>> One very nice aspect of this framework is that it is easy to construct
>> tests -- just cut and paste from a command window. Many tests can be
>> generated very quickly this way (my impression is that it is much, much
>> faster to build tests by cutting and pasting transcripts from a command
>> window than it is to build tests that use functions like all.equal() to
>> compare data structures.) It is also easy to maintain tests in the face
>> of change (e.g., with a new version of S-PLUS or with bug fixes to
>> functions or with changed database contents) -- I use ediff in emacs to
>> compare test output with the stored annotated transcript and can usually
>> just use ediff commands to update the transcript.
>>
>> This has worked well for us and now we are looking at porting some code
>> to R. I've not seen anything that offers these conveniences in R.
>>
>> It wouldn't be too difficult to add these features to the built-in R
>> testing framework, but I've not had success in getting anyone in R core
>> to even consider changes, so I've not pursued that route after
>> an initial offer of some simple patches to tests.mk and wintests.mk.
>>
>> RUnit doesn't have transcript-style tests, but it wasn't very difficult
>> to add support for transcript-style tests to it. I'll probably go ahead
>> and use some version of that for our porting project. (And offer it to
>> the community if the RUnit maintainers want to incorporate it.) I also
>> like the idea that RUnit has some code analysis tools -- that might
>> support some future project that allowed one to catalogue the number of
>> times each code path through a function was exercised by the tests.
>>
>> I just looked at 'butler' and it looks very much like RUnit to me -- and
>> I didn't see any overview that explained differences. Do you know of
>> any differences?
>>
>> cheers,
>>
>> Tony Plate
>>
>>
>> ============== from Paul Gilbert:
>>
>> Tony
>>
>> While this is not exactly your question, I have been using my own system
>> based on make and the tools used by R CMD build/check to do something I
>> think of as unit testing. This pre-dates the unit-testing frameworks, in
>> fact, some of it predates R. I actually wrote something on this at one
>> point: Paul Gilbert. R package maintenance. R News, 4(2):21-24,
>> September 2004.
>>
>> I have occasionally thought about trying to use RUnit, but never done
>> much because I am relatively happy with what I have. (Inertia is an
>> issue too.) I would be happy to hear if you do an assessment of the
>> various tools.
>>
>> Best,
>> Paul Gilbert
>>
>>
>> ============= From Seth Falcon:
>>
>> Hi Tony,
>>
>> anthony.rossini at novartis.com writes:
>>> After a quick look at current programming tools, especially with regards
>>> to unit-testing frameworks, I've started looking at both "butler" and
>>> "RUnit". I would be grateful to receieve real world development
>>> experience and opinions with either/both. Please send to me directly
>>> (yes, this IS my work email), I will summarize (named or anonymous, as
>>> contributers desire) to the list.
>> I've been using RUnit and have been quite happy with it. I had not
>> heard of butler until I read your mail (!).
>>
>> RUnit behaves reasonably similarly to other *Unit frameworks and this
>> made it easy to get started with as I have used both JUnit and PyUnit
>> (unittest module).
>>
>> Two things to be wary of:
>>
>> 1. At last check, you cannot create classes in unit test code and
>> this makes it difficult to test some types of functionality. I'm
>> really not sure to what extent this is RUnit's fault as opposed
>> to a limitation of the S4 implementation in R.
>>
>> 2. They have chosen a non-default RNG, but recent versions provide a
>> way to override this. This provided for some difficult bug
>> hunting when unit tests behaved differently than hand-run code
>> even with set.seed().
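
The effect described in point 2 is easy to reproduce without RUnit: the same seed gives different draws under different generators. A small sketch follows, where draw_with() is a made-up helper and "Wichmann-Hill" is chosen only as an example of a non-default generator (it is not claimed to be the one RUnit selects).

```r
# Draw three uniforms with the given generator kind and seed,
# then restore the generator kinds that were active before.
draw_with <- function(kind, seed = 42) {
  old <- RNGkind()
  on.exit(RNGkind(kind = old[1], normal.kind = old[2]))
  RNGkind(kind = kind)
  set.seed(seed)
  runif(3)
}

default_draws <- draw_with("Mersenne-Twister")
other_draws <- draw_with("Wichmann-Hill")

# Same seed, different generator: the draws do not agree.
isTRUE(all.equal(default_draws, other_draws))
```

So when hand-run code and a test harness disagree despite set.seed(), comparing RNGkind() in both settings is a reasonable first check.
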
>>
>> The maintainer has been receptive to feedback and patches. You can
>> look at the not-so-beautiful scripts and such we are using if you look
>> at inst/UnitTest in: Category, GOstats, Biobase, graph
>>
>> Best Wishes,
>>
>> + seth
>>
>>
>> =================== Discussion:
>>
>> After a bit more cursory use, it looks like RUnit is probably the right
>> approach at this time (sorry Hadley!). Both RUnit and butler have a
>> range of testing facilities and programming support tools. I support the
>> above statements about feasibility and problems -- except that I didn't
>> get a chance to check out the S4 issues that Seth raised above. The one
>> piece that I found missing in my version was some form of GUI based
>> tester, i.e. push a button and test, but I think I've not thought through
>> some of the issues with environments and closures yet that might cause
>> problems.
>>
>> Thanks to everyone for responses! It's clear that there is a good start
>> here, but lots of room for improvement exists.
>>
>> Best regards / Mit freundlichen Grüssen,
>> Anthony (Tony) Rossini
>> Novartis Pharma AG
>> MODELING & SIMULATION
>> Group Head a.i., EU Statistical Modeling
>> CHBS, WSJ-027.1.012
>> Novartis Pharma AG
>> Lichtstrasse 35
>> CH-4056 Basel
>> Switzerland
>> Phone: +41 61 324 4186
>> Fax: +41 61 324 3039
>> Cell: +41 79 367 4557
>> Email : anthony.rossini at novartis.com
>>
>> ______________________________________________
>> R-help at stat.math.ethz.ch mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
--
Dr Paul Murrell
Department of Statistics
The University of Auckland
Private Bag 92019
Auckland
New Zealand
64 9 3737599 x85392
paul at stat.auckland.ac.nz
http://www.stat.auckland.ac.nz/~paul/