[R] coxph linear.predictors

Bond, Stephen Stephen.Bond at cibc.com
Wed Oct 27 21:06:27 CEST 2010


What you suggest is exactly what I intended. The big problem is that 

X*beta  != linear.predictor  # it works for glm 

That's a big loose end. The survival documentation on p41 implies that the above should be an equality, but it is not. And I could not produce the surv vector using a variety of formulas, so I guess this is the spirit of SAS disguising itself as an R package.

Stephen Bond 

-----Original Message-----
From: David Winsemius [mailto:dwinsemius at comcast.net] 
Sent: Wednesday, October 27, 2010 2:17 PM
To: Bond, Stephen
Cc: r-help at r-project.org
Subject: Re: [R] coxph linear.predictors


On Oct 27, 2010, at 1:45 PM, Bond, Stephen wrote:

>>>
>>> coef(fit)*model.matrix(fit)[1,1]
>>>    age
>>> 11.69021
>>>
>
>> I don't know what that might be and you are not telling us what you
>> think it is.
>
> I think this the calculation of the linear predictor, multiplying
>
>> (the beta*X)
>
> I expected that coef(fit)*model.matrix(fit)[1,1]= fit 
> $linear.predictors[1]
>
> And since it does not, I am lost.
>
> Ultimately I need unconditional death probability from day 1 to day  
> t in the end.

Now it sounds like you do want cap-lambda, the cumulative hazard.  
Let's call it Lambda

Since S(t) = exp(-Lambda(t)), you can invert to get Lambda(t) = - 
log(S(t))

?survfit

Possibly:
plot(-log( survfit(fit)$surv ))

> Survfit takes too much time as it does a lot of things I do not  
> need. I hope I should be able to get the death prob for every  
> subject from a single call to survfit and then rescale with  
> exp(linear.predictor) for each subject.

I have run into slowness with that approach myself. The rms package  
provides survest and Predict methods, but if you wanted to stay in  
survival, then why not create a survfit()$surv object based on a  
vector of means and then calculate an exp(newdata) by which you would  
multiply the S(mean|t)? There is also a predict.coxph function with  
its type="risk" parameter that you might want to check.

>
> Just a comment: as there is not deltat argument, I think that coxph  
> assumes deltat=1 so the hazard is the conditional probability over  
> the next time interval ( t(i-1), t(i) ). With dense data sampled  
> daily this is a day interval.

That's not the way I understand it. The Lambda() and S() functions  
only change at the particular event times on a continuous time so they  
are not assumed to have integer values. Pretty sure that Therneau  
avoids presenting extraction methods for instantaneous hazards from  
Cox models. In his book "Modeling Survival Data" he mostly sticks to  
discussing cumulative hazard functions rather than estimating an  
interval or instantaneous hazard. And fortunately he visits rhelp so  
it's quite possible that my errors in this area will get expert review  
and correction.


>
> Your comments are appreciated.
>
> Stephen Bond
>
> -----Original Message-----
> From: David Winsemius [mailto:dwinsemius at comcast.net]
> Sent: Wednesday, October 27, 2010 1:15 PM
> To: Bond, Stephen
> Cc: r-help at r-project.org
> Subject: Re: [R] coxph linear.predictors
>
>
> On Oct 27, 2010, at 12:12 PM, Bond, Stephen wrote:
>
>> I would like to be able to construct hazard rates (or unconditional
>> death prob)
>
> Hazards are not probabilities (since probabilities are constrained to
> the range [0,1] and hazards are unbounded upward.)
>
>> for many subjects from a given survfit.
>>
>> This will involve adjusting the ( n.event/n.risk)
>> with (coxph object )$linear.predictors
>> I must be having another silly day as I cannot reproduce the linear
>> predictor:
>>
>> fit <- coxph(Surv(futime, fustat) ~ age, data = ovarian)
>> fit$linear.predictors[1]
>> [1] 2.612756
>
> That's the linear predictor (the beta*X) and that particular number
> only applies to the first case.
>
>>
>> coef(fit)*model.matrix(fit)[1,1]
>>    age
>> 11.69021
>>
>
> I don't know what that might be and you are not telling us what you
> think it is.
>
>
>> The above is based on the help listing for coxph.object
>> coefficients: the coefficients of the linear predictor, which  
>> multiply
>>         the columns of the model matrix.  If the model is
>>         over-determined there will be missing values in the vector
>>         corresponding to the redundant columns in the model matrix.
>>
>> Also, please comment whether n.event/n.risk
>
> The Nelson-Aalen estimator of the cumulative hazard as a function of
> intervals prior to t is sum( n-event(t)/ n.risk(t))
>
>> gives the baseline hazard exp(alpha) ?
>
> No. The "baseline hazard", as you are calling this, would be an
> estimate for persons with all covariates = 0, so in this case is for
> women of age=0. (Not a particularly interpretable result in many
> situations. The baseline hazard following treated ovarian cancer for
> neonates is not medically sensible.)
>
> What is the purpose of this request? Is someone telling you you need
> to provide estimates for instantaneous hazards?
>
>
> -- 
>
> David Winsemius, MD
> West Hartford, CT
>

David Winsemius, MD
West Hartford, CT



More information about the R-help mailing list