[Rd] R-devel Digest, Vol 98, Issue 19
Terry Therneau
therneau at mayo.edu
Tue Apr 19 14:21:12 CEST 2011
The replies so far have helped me see the issues more clearly.
Further comments:
1. This issue started with a bug report from a user:
    library(survival)
    fform <- as.formula(Surv(time, status) ~ age)
    myfun <- function(dform, ddata) {
      predict(coxph(dform, data=ddata), newdata=ddata)
    }
Gabor's suggestion to change the call is a useful idea but not
completely relevant: I'm trying to make their code work.
If work-arounds are the solution, then adding model=TRUE to the coxph
call is sufficient. (That is why the same code with lm() does work:
lm() keeps the model frame by default.)
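To illustrate why a stored model frame matters, here is a minimal sketch that uses lm() and model.frame() directly rather than coxph() and predict(), so it does not reproduce the survival code path itself. The toy data frame d, its columns and the helper fit_in_function are hypothetical names made up for this example. The point is only that a fit created inside a function can hand back a stored model frame, but may be unable to rebuild one later.

```r
# Hypothetical toy data; the names d, x and y are illustrative only.
d <- data.frame(
  x = 1:10,
  y = c(2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1, 18.0, 20.2)
)
fform <- as.formula(y ~ x)  # the formula's environment is the top level

# Hypothetical helper: fit inside a function, as in the bug report.
fit_in_function <- function(dform, ddata, keep_model) {
  lm(dform, data = ddata, model = keep_model)
}

# With the model frame stored, model.frame() simply returns it.
fit_keep <- fit_in_function(fform, d, keep_model = TRUE)
mf_keep <- model.frame(fit_keep)

# Without it, model.frame() has to re-evaluate data = ddata in the
# formula's environment, where no object called ddata is visible.
fit_drop <- fit_in_function(fform, d, keep_model = FALSE)
mf_drop <- tryCatch(
  model.frame(fit_drop),
  error = function(e) conditionMessage(e)
)
```

Here mf_keep should be a data frame with the ten original rows, while mf_drop should hold an error message instead of a model frame.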
2. Duncan argues that one should not expect the construct to work. I
respectfully disagree. Returning to my simpler lm example: when an
expression appears in a context where all the variables are known, it
is a surprise that it does not work. Maybe not a surprise to the inner
circle of developers, but to most users.
Looking at model.frame.lm, the final result is
    eval(fcall, environment of the formula, parent.frame())
The terms we need are in the parent frame, so why does eval ignore the
third argument? (I haven't looked at the R source to see if it is on
purpose.) Is there a way to persuade it to use that arg?
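For what it is worth, ?eval documents the third argument (enclos) as relevant only when envir is a list, data frame or pairlist. When envir is already an environment, lookup follows that environment's own parents and enclos is not consulted. A small sketch of the difference, in which caller and formula_env are hypothetical stand-ins for the parent frame and the formula's environment:

```r
# Stand-in for the parent frame: the only place where ddata exists.
caller <- new.env(parent = baseenv())
assign("ddata", "found in enclos", envir = caller)

# Stand-in for the formula's environment; it does not contain ddata.
formula_env <- new.env(parent = baseenv())

# Case 1: envir is an environment, so enclos is not consulted
# and the lookup fails.
res_env <- tryCatch(
  eval(quote(ddata), formula_env, caller),
  error = function(e) "lookup failed"
)

# Case 2: envir is a list, so enclos supplies the enclosure
# and ddata is found there.
res_list <- eval(quote(ddata), list(), caller)
```

In this sketch res_env ends up as "lookup failed" and res_list as "found in enclos", which matches the behaviour seen with model.frame.lm when the formula's environment is passed as envir.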
A careful reading of ?model.frame backs up Duncan's argument: it is
documented not to work. I still don't like it -- too much of a reprise
of the old Unix argument "That's not a bug, it's a feature".
(Minor note: The next to last sentence before Value has
"containing the variables used in formula plus those specified ...
Unlike model.frame.default, it returns the "
Is the ... a reminder to finish the sentence? It doesn't quite parse
as is.)
3. Brian R notes that adding model=TRUE is safer. Agreed. The original
S versions of lm etc. did not keep the model frame, in order to keep
objects smaller, and the survival code still carries that legacy; how
time changes our perceptions of "big". Should I take this as a formal
suggestion to change the default in coxph and survreg? (If I further
change the default to y=FALSE it will break at least one package
(survey), and I'd guess several others.)
4. Peter D: thanks for agreeing that there is a problem.
I spent a lot of time and energy fixing model frame evaluation issues
in Splus, only to be told at the end that they didn't dare implement
it "because it might break something". That made me turn my back on the
whole debate, and I haven't participated or kept up with the discussion.
The heart of my fix then -- and it did fix a lot of problems without
breaking anything I could find -- was that the data= argument should be
an additional place to look rather than an alternate one.