[R] results of a survival analysis change when converting the data to counting process format
Göran Broström
gor@n@bro@trom @end|ng |rom umu@@e
Thu Aug 22 21:48:51 CEST 2019
On 2019-08-18 19:10, Ferenci Tamas wrote:
> Dear All,
>
> Consider the following simple example:
>
> library( survival )
> data( veteran )
>
> coef( coxph(Surv(time, status) ~ trt + prior + karno, data = veteran) )
>          trt        prior        karno
>  0.180197194 -0.005550919 -0.033771018
>
> Note that we have neither time-dependent covariates, nor time-varying
> coefficients, so the results should be the same if we change to
> counting process format, no matter where we cut the times.
>
> That's true if we cut at event times:
>
> veteran2 <- survSplit( Surv(time, status) ~ trt + prior + karno,
>                        data = veteran, cut = unique( veteran$time ) )
>
> coef( coxph(Surv(tstart, time, status) ~ trt + prior + karno, data = veteran2 ) )
>          trt        prior        karno
>  0.180197194 -0.005550919 -0.033771018
>
> But, interestingly, it is not true if we cut at every day:
>
> veteran3 <- survSplit( Surv(time, status) ~ trt + prior + karno,
>                        data = veteran, cut = 1:max(veteran$time) )
>
> coef( coxph(Surv(tstart, time, status) ~ trt + prior + karno, data = veteran3 ) )
>          trt        prior        karno
>  0.180197215 -0.005550913 -0.033771016
>
> The difference is not large, but it is definitely more than just
> rounding error.
>
> What's going on? How can the results go wrong, especially when
> more cutpoints are included?
All results are wrong, but they are useful (paraphrasing George E. P. Box).
Göran
>
> Thank you in advance,
> Tamas
>
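For anyone who wants to gauge how large the discrepancy in the quoted example really is, here is a minimal sketch, not a diagnosis. It refits the original and the daily-split models, expresses the coefficient differences relative to the estimates and to their standard errors, and then refits both with a tighter stopping rule via coxph.control(). The object names (fit1, fit3, rel_diff, se_units, ctrl) and the values eps = 1e-12 and iter.max = 50 are my own choices for illustration. That the stopping rule contributes to the difference is an assumption, not something established in this thread.

```r
library(survival)

# One row per subject, as in the original post.
# `veteran` is lazy-loaded with the survival package.
fit1 <- coxph(Surv(time, status) ~ trt + prior + karno, data = veteran)

# Counting process format, cut at every day (as in the original post)
veteran3 <- survSplit(Surv(time, status) ~ trt + prior + karno,
                      data = veteran, cut = 1:max(veteran$time))
fit3 <- coxph(Surv(tstart, time, status) ~ trt + prior + karno,
              data = veteran3)

# Difference relative to the size of each coefficient
rel_diff <- abs(coef(fit3) - coef(fit1)) / abs(coef(fit1))
print(rel_diff)

# Difference in units of the standard error from the original fit
se_units <- abs(coef(fit3) - coef(fit1)) / sqrt(diag(vcov(fit1)))
print(se_units)

# Assumption: a tighter convergence tolerance may bring the fits closer.
# eps and iter.max are arguments of coxph.control(); the values are arbitrary.
ctrl <- coxph.control(eps = 1e-12, iter.max = 50)
fit1_tight <- coxph(Surv(time, status) ~ trt + prior + karno,
                    data = veteran, control = ctrl)
fit3_tight <- coxph(Surv(tstart, time, status) ~ trt + prior + karno,
                    data = veteran3, control = ctrl)
print(coef(fit3_tight) - coef(fit1_tight))
```

If the tighter fits agree to more digits, the stopping rule was part of the story; if they do not, floating-point accumulation over the many extra rows is the more likely explanation. Either way, the gap should be tiny compared with the standard errors.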