[Rd] survival package
therneau at mayo.edu
Tue Feb 10 22:32:04 CET 2009
R core team -
It took far, far longer than I anticipated, but I have finally finished the
next release of the survival code. Primary changes:
1. The source has been migrated to Rforge. This will now be the primary
source. I've used SCCS -> rcs -> cvs and now svn. Further changes will be
someone else's problem. I expect that maintenance will now begin to migrate
to a larger group and my role to diminish; I hope to focus more energy on coxme.
I tried to carry forward the entire history of the package. In the current
US software copyright mess we might one day have to use that long history. I
think I started the SCCS tree in around 1987.
I expect to have survival, date, coxme, and bdsmatrix as separate packages
within the package tree.
2. I have merged in a large number of changes/updates that had occurred at
Mayo, along with all the changes I found in the R source (version 2.32 when
I started this task, 2.34 when I finished). Major changes:
Complete rewrite of the survfit function(s); most of the work is now done in
S instead of C. Added Turnbull's method for general censoring and also
competing risk estimates.
There are no longer any dependencies on the 'date' library, and it has been
removed to a separate branch. The expected survival routines now work
with any of Date, posix, timeDate, or date.
Survreg was re-written to use R.h style instead of S.h. Common code now
works with both packages. It's a big improvement in several ways:
easier to code, much easier to read, less clunky.
Aalen's additive regression model has been added.
The survival rate tables have been updated -- the 2000 US rate tables were
finally released in Aug 2008.
There are 20-30 other small upgrades and fixes that had accrued over time.
The file Changelog.09 has more details.
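
As a rough illustration of the survfit changes listed above, here is a minimal
sketch. It assumes a current installation of the survival package; argument
names and defaults in the release announced here may differ. The interval
data are made-up values for illustration only, and the `aml` data set is the
one bundled with the package.

```r
library(survival)  # recommended package, shipped with standard R installs

# Right-censored data: Kaplan-Meier curves by treatment group.
km <- survfit(Surv(time, status) ~ x, data = aml)
print(km)

# Interval-censored data (hypothetical values): each event is known only
# to lie between 'left' and 'right'; NA on the right marks an observation
# that is right-censored at 'left'.
left  <- c(1, 2, 4, 5, 7)
right <- c(3, 4, NA, 8, NA)

# With interval-censored input, survfit computes a Turnbull-type
# nonparametric estimate instead of the ordinary Kaplan-Meier.
tb <- survfit(Surv(left, right, type = "interval2") ~ 1)
```

The fitted objects carry the estimate in components such as `tb$time` and
`tb$surv`, so the usual print, summary, and plot methods apply to both fits.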
3. I've tried very hard to make all the code work in both Splus and R. Our
group still uses Splus more than R so I have to support both. However, I
think that with the recent change in Splus, which will focus them more on
vertical markets, and the uptake of R here due to the genetics libraries,
this dual support will no longer be needed within 2-3 years.
A year ago I would have defended the extra work necessary to make code
work in both dialects much more vigorously. It isn't all that much work if
you think ahead. Splus still has a presence in certain markets, and code that
works in both is good evangelism. But -- said gain in utility keeps shrinking.
In the R directory, files that work with both are named whatever.S;
those tested only in R are named whatever.R. If you think they should all
be named .R, then now is the time to do it.
4. I am very impressed with the package system and R CMD check. I had a
cozy Splus/Unix environment with a custom makefile that made things very easy
and delayed my changeover. This was a significant learning curve, but knowing
what I know now I should have started earlier. Kudos to you all.
5. Known issues:
i. An additional test suite (book5/book6) corresponding to the
not-yet-published extension of my book's appendix of validity tests was
added. It found an issue with "stderr of expected survival/Cox model with
noninteger case weights/Efron approximation". This is not yet addressed; the
discrepancy is of a 1/n vs 1/(n-1) size.
ii. The stderr of terms/survreg/penalized model is different between Splus
and R by 50% or more. I haven't yet figured out which formula is correct.
iii. The CMD check script still complains about some of my .Rd files; almost
all involve imperfect documentation of weights, subset, or ... options. I've
been working on this very hard for the last month, and just can't raise any
excitement about these last few warnings at the moment.
Both i and ii exist in the current R code, so there is no loss in updating
before working them out.