[R] R and clinical studies
Terry Therneau
therneau at mayo.edu
Mon Mar 19 13:45:14 CET 2007
A strength of R is that there is a wide variety of contributions to the
package, giving it great breadth.
A weakness of R is that there is a wide variety of contributors to the
package, some of whom spend a lot of time on the task of function correctness,
and some of whom spend little; some worry about backward compatibility, some
sneer at the idea; some spend a lot of time on maintenance, and some don't
have the time to do so or move on to other things.
The survival code, for instance, has a set of exact test cases. These are
small data sets where the correct answer has been carefully worked out by
hand. S (S-Plus or R) passes all the tests; SAS passes most of them. (Most of
the tests are documented in an appendix of Therneau and Grambsch, Springer,
2000). These test cases have been a great help in creating and debugging the
code, but overall represent a large amount of work. Most code that does not
have a corporate sponsor will not have the resources to do this. I have them
mostly because the survival library's genesis has been spread out over 20
years, and individual bits were important parts of clinical trials and so
HAD to be right.
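
To make the idea of an exact test case concrete, here is a minimal sketch.
It is not taken from the survival library's own test suite: the data set, the
function name kaplan_meier, and the choice of Python with exact fractions are
all assumptions made for illustration. The point is only that the data are
small enough for the answer to be worked out by hand and then compared
exactly, with no floating point tolerance.

```python
from fractions import Fraction


def kaplan_meier(times, events):
    """Kaplan-Meier estimate as exact fractions.

    times  -- follow-up times
    events -- 1 if the event was observed at that time, 0 if censored
    Returns a list of (time, S(time)) pairs, one per distinct event time.
    Subjects censored at time t are counted as at risk at t (the usual
    convention for ties between events and censorings).
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = Fraction(1)
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            surv *= Fraction(n_at_risk - deaths, n_at_risk)
            curve.append((t, surv))
        n_at_risk -= removed
    return curve


# Hypothetical toy data set: five subjects, two of them censored.
times = [1, 2, 2, 3, 4]
events = [1, 1, 0, 1, 0]

# Worked out by hand:
#   t=1: 5 at risk, 1 event  -> 4/5
#   t=2: 4 at risk, 1 event  -> 4/5 * 3/4 = 3/5
#   t=3: 2 at risk, 1 event  -> 3/5 * 1/2 = 3/10
expected = [(1, Fraction(4, 5)), (2, Fraction(3, 5)), (3, Fraction(3, 10))]

assert kaplan_meier(times, events) == expected
```

A real test suite would need many such cases (ties, left truncation, weights,
each method for the Cox model, and so on), which is where the large amount of
work comes from.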
(Aside. SAS has a deserved reputation for accuracy. It has an undeserved
one for infallibility: one of my favorite bug reports for the S code
started out "I've found a mistake in the coxph function, it gives a different
answer than SAS". It turned out in that case that the S and SAS data sets
in their example were not quite the same. As an earlier poster said, data
management and manipulation is the root of most errors.)
Our group uses SAS for data manipulation primarily, and a mix of SAS and
S-Plus for the analysis. It would be difficult to become a pure S shop, but
we've had no trouble with the mix.
Terry Therneau
Biostatistics, Mayo Clinic