[R] Comparing two different 'survival' events for the same subject using survdiff?

Polwart Calum (COUNTY DURHAM AND DARLINGTON NHS FOUNDATION TRUST) calum.polwart at nhs.net
Mon Apr 29 13:56:54 CEST 2013

> It isn't that complex:
> myDataLong <- data.frame(Time=c(A, C), Censored=c(B, D), group=rep(0:1, times=c(length(A), length(C))))
> Fit = survfit(Surv(Time, Censored==0) ~ group, data=myDataLong)
> plot(Fit, col=1:2)
> survdiff(Surv(Time, Censored==0) ~ group, data=myDataLong)

Yes - for the example it's not complex - but once we get down to having more data columns I think it may...  Maybe I ignore those and just build 'myDataLong' for this specific test.
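For what it's worth, here is a rough sketch of how the long data frame could be built from the wide one while carrying other columns along. It uses the example data from the original message below. The 'id' column and the 'keep' vector are my own additions for illustration and are not in the real dataset.

```r
library(survival)

# Example data from the original message (wide: one row per patient)
myData <- data.frame(
  TimeOnTx      = c(9.77, 0.43, 0.03, 3.50, 7.07, 6.57, 8.57, 2.30, 6.17, 3.27, 2.57, 0.77),
  StillOnTx     = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1),   # 1 = censored
  TimeToFailure = c(9.80, 0.43, 5.93, 8.43, 6.80, 2.60, 8.93, 8.37, 12.23, 5.83, 13.17, 0.77),
  NotFailed     = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1)    # 1 = censored
)
myData$id <- seq_len(nrow(myData))  # hypothetical patient identifier

keep <- "id"  # assumption: names of any other columns to carry along

# One block of rows per endpoint, then stack them
onTx   <- data.frame(myData[keep], Time = myData$TimeOnTx,
                     Censored = myData$StillOnTx, group = "OnTx")
failed <- data.frame(myData[keep], Time = myData$TimeToFailure,
                     Censored = myData$NotFailed, group = "Failure")
myDataLong <- rbind(onTx, failed)

# Log-rank comparison of the two endpoints, as suggested in the reply
res <- survdiff(Surv(Time, Censored == 0) ~ group, data = myDataLong)
print(res)
```

Note that the log-rank test treats the two groups as independent samples, which they are not here, since every patient appears once in each group.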

> However, your approach (a 'wide' data frame) suggests that there are equal numbers in the two survival
> studies.  Are they even the same people?  Is it even the same study?  If so, this is a competing risks question
> and would have to be approached differently.

Yes, it's the same patients. The two events are technically independent of each other, but the hope is that the easier outcome measure would predict the other...  I'm not familiar with competing risks and so will have to read up on it, but it isn't a scenario where either A or B happens: A happens and B happens, and you might expect that A happened because B happened...
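Because both times belong to the same patient, one simple thing to look at is the per-patient gap between the two events. This is only a descriptive sketch on the example data from the original message below, not a test, and it ignores the censored patients altogether. The names 'gap' and 'bothObserved' are mine.

```r
# Example data from the original message (1 = censored in both indicator columns)
myData <- data.frame(
  TimeOnTx      = c(9.77, 0.43, 0.03, 3.50, 7.07, 6.57, 8.57, 2.30, 6.17, 3.27, 2.57, 0.77),
  StillOnTx     = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1),
  TimeToFailure = c(9.80, 0.43, 5.93, 8.43, 6.80, 2.60, 8.93, 8.37, 12.23, 5.83, 13.17, 0.77),
  NotFailed     = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1)
)

# Per-patient gap: time to failure minus time on treatment
gap <- myData$TimeToFailure - myData$TimeOnTx

# Only patients with both events observed are directly comparable
bothObserved <- myData$StillOnTx == 0 & myData$NotFailed == 0

# Spread of the gaps among those patients
print(summary(gap[bothObserved]))

# -1: failure recorded before stopping, 0: same time, 1: failure after stopping
print(table(sign(gap[bothObserved])))
```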

> And, of course, absence of evidence is not evidence of absence.  Failing to reject the null hypothesis that the
> distributions are equal is not proof that the distributions are equal.

Yes, absolutely - however, I'm half expecting to detect a difference, and so to then dismiss using A as a surrogate for B...


-----Original Message-----
From: Polwart Calum (COUNTY DURHAM AND DARLINGTON NHS FOUNDATION TRUST) [mailto:calum.polwart at nhs.net]
Sent: Monday, April 29, 2013 4:48 AM
To: r-help at r-project.org
Subject: [R] Comparing two different 'survival' events for the same subject using survdiff?

I have a dataset which, for the sake of simplicity, has two endpoints.  We would like to test whether the two different endpoints have the same eventual meaning.  To take an example that people might understand better:

Let's assume we had a group of subjects who all received a treatment.  They could stop treatment for any reason (side effects, treatment stops working, etc.).  Getting that data is very easy.  Measuring whether treatment stops working is very hard to capture... so we would like to test if duration on treatment (easy) is the same as time to treatment failure (hard).

My data might look like this:

A = c(9.77,  0.43,  0.03,  3.50,  7.07,  6.57,  8.57,  2.30,  6.17,  3.27,  2.57,  0.77)
B = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1)  # 1 = yes (censored)
C = c( 9.80,  0.43,  5.93,  8.43,  6.80,  2.60,  8.93,  8.37, 12.23,  5.83, 13.17,  0.77)
D = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1)  # 1 = yes (censored)
myData = data.frame (TimeOnTx = A, StillOnTx = B, TimeToFailure = C, NotFailed = D)

We can do a survival analysis on those individually:
library(survival)  # provides Surv(), survfit() and survdiff()
OnTxFit = survfit (Surv ( TimeOnTx, StillOnTx==0 ) ~ 1 , data = myData)

FailedFit = survfit (Surv ( TimeToFailure , NotFailed==0 ) ~ 1 , data = myData)
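As a sketch, the two individual fits can at least be drawn on the same axes for a visual comparison without restructuring anything. This gives a picture only, not a test. The data frame is repeated so the snippet runs on its own, and the colours and legend labels are arbitrary choices.

```r
library(survival)

# Example data as above (1 = censored in both indicator columns)
myData <- data.frame(
  TimeOnTx      = c(9.77, 0.43, 0.03, 3.50, 7.07, 6.57, 8.57, 2.30, 6.17, 3.27, 2.57, 0.77),
  StillOnTx     = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1),
  TimeToFailure = c(9.80, 0.43, 5.93, 8.43, 6.80, 2.60, 8.93, 8.37, 12.23, 5.83, 13.17, 0.77),
  NotFailed     = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1)
)

OnTxFit   <- survfit(Surv(TimeOnTx, StillOnTx == 0) ~ 1, data = myData)
FailedFit <- survfit(Surv(TimeToFailure, NotFailed == 0) ~ 1, data = myData)

# Draw the first curve, widening the x axis so the second one fits
plot(OnTxFit, conf.int = FALSE, col = 1,
     xlim = c(0, max(myData$TimeToFailure)),
     xlab = "Time", ylab = "Proportion without event")

# Overlay the second curve on the same axes
lines(FailedFit, conf.int = FALSE, col = 2)

legend("topright", legend = c("On treatment", "Not failed"), col = 1:2, lty = 1)
```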


But how can I do a survdiff type of comparison between the two?  Do I have to restructure the data so that the times are all in one column, the event indicator in another, and then a group column to indicate what type of event it is?  That seems a complex way to do it (especially as the dataset is of course more complex than I've just shown)... so I thought maybe I'm missing something...

