[Rd] model.frame and predvars

Terry Therneau therneau at mayo.edu
Tue Jun 5 19:43:03 CEST 2012



On 06/05/2012 11:32 AM, Prof Brian Ripley wrote:
> On 05/06/2012 16:17, Terry Therneau wrote:
>> I was looking at how the model.frame method for lm works and comparing
>> it to my own for coxph.
>> The big difference is that I try to retain xlevels and predvars
>> information for a new model frame, and lm does not.
>> I use a call to model.frame in predict.coxph, which is whyI went that
>> route, but never noted the difference till now (preparing for my course
>> in Nashville).
>>
>> Could someone shed light on the rationale for non-preservation?
>>
>
> T'other way round ... it would have needed a conscious decision to 
> preserve them: these all predate xlevels and predvars.
>

That's true, but lots of other things have been changed, e.g. 
model.frame=T.
I was wondering if there was a positive reason not to do it; this 
answers the question.


> model.matrix.lm does make use of xlevels, and I think that explains 
> the difference as most lm() auxiliaries use the model matrix.
>
But it does not keep predvars, which is interesting.  So it fixes one 
issue but leaves another undone.

> And I don't see predvars used in survival:::model.frame.coxph.

Actually it does, as is proven by the example I showed (ns 
reconstruction of the first two lines was the same).  Having commented 
source code helps to see it of course, I replace formula in the call 
with the terms object from the prior fit before invoking model.frame.  
This causes inheritance of the both "specials" and predvars.

So, should the survival library change to the old behavior?  I prefer my 
current one, in that the only time I would do model.frame(prior.fit, 
data=new) would be if I want the same treatment of special variables.  
If I wanted a new, de novo decision on constrasts or codings I would 
have just fit a new model.   Is there a plausible situation where the 
old behavior is prefereable?  Discounting backwards compatability of 
course, which is always a thorny issue.


>> Terry T.
>>
>>
>> Simple example
>>
>> > library(survival)
>>
>> > lfit <- lm(time ~ factor(ph.ecog) + ns(age, 3), data=lung)
>> > ltemp <- model.frame(lfit, data=lung[1:2,])
>> > ltemp
>> time factor(ph.ecog) ns(age, 3).1 ns(age, 3).2 ns(age, 3).3
>> 1 306 1 -0.1428571 0.4285714 0.7142857
>> 2 455 0 0.0000000 0.0000000 0.0000000
>>
>> > lfit$model[1:2,]
>> time factor(ph.ecog) ns(age, 3).1 ns(age, 3).2 ns(age, 3).3
>> 1 306 1 0.4443546 0.3286161 0.1900511
>> 2 455 0 0.5697239 0.3618440 -0.1297479
>>
>> > levels(ltemp[[2]])
>> [1] "0" "1"
>>
>> > levels(lfit$model[[2]])
>> [1] "0" "1" "2" "3"
>>
>> > cfit <- coxph(Surv(time, status) ~ factor(ph.ecog) + ns(age,3), lung)
>> > model.frame(cfit, data= lung[1:2,])
>> Surv(time, status) factor(ph.ecog) ns(age, 3).1 ns(age, 3).2 ns(age, 
>> 3).3
>> 1 306 1 0.4443546 0.3286161 0.1900511
>> 2 455 0 0.5697239 0.3618440 -0.1297479
>>
>> ______________________________________________
>> R-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>
>



More information about the R-devel mailing list