[R] Survival Analysis with two different events
Terry Therneau
therneau at mayo.edu
Mon Jun 30 16:11:11 CEST 2008
sickboyedd <sickboyedd <at> gmail.com> writes:
>
>
> Hello all,
>
> I am hoping to use survival analysis to examine whether parasite attack
> increases nest death in a species of social wasp. I therefore have data for
>
> 1. Whether the nest "died" in the 6 week census period ("Status", where
> 1=died, 0=survived)
> 2. The day number of death/last recorded day it was observed alive.
> 3. Whether the nest was attacked by the parasite (0/1 as with 1.)
> 4. The day number of attack/ last recorded day the nest was observed without
> a parasite.
>
> i.e. example dataset:
>
> status death para paraday
> 0 42 0 42
> 1 32 0 42
> 1 25 1 13
> 0 42 1 25 ...
>
> I've looked over r-help, as well as in Crawley etc., but I have yet to find
> a solution. Can anyone point me in the right direction or literature?
>
The classic solution in biomedical work is a time-dependent covariate. Create
a new data set like this:
time1  time2  status  parasite
    0     42       0         0
    0     32       1         0
    0     13       0         0
   13     25       1         1
  ...
The key is lines 3 and 4, which show the same colony parasite-free from day 0
to 13 and with parasite from day 13 to 25, when it died. Then one fits a Cox
model with
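library(survival)
# newdata: the data frame laid out above (the name is illustrative)
fit <- coxph(Surv(time1, time2, status) ~ parasite, data = newdata)
summary(fit)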
The fit estimates the increase in death rate with parasite versus no parasite;
exp(coef) in the summary is the corresponding hazard ratio. These models were
originally developed for treatment regimens that change over time.
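
As a rough illustration of how to read such a fit, here is a minimal sketch.
The data frame `toy` and its eight nests are entirely made up for the example
(they are not the poster's data), so the numbers it prints mean nothing beyond
showing where the estimate lives in the fitted object.

```r
library(survival)

# Made-up counting-process data for eight hypothetical nests (NOT real data).
# Nests 3, 4, 5 and 7 were attacked, so each contributes two rows.
toy <- data.frame(
  id       = c(1, 2, 3, 3, 4, 4, 5, 5, 6, 7, 7, 8),
  time1    = c(0, 0, 0, 13, 0, 25, 0, 10, 0, 0, 20, 0),
  time2    = c(42, 32, 13, 25, 25, 42, 10, 30, 38, 20, 42, 35),
  status   = c(0, 1, 0, 1, 0, 0, 0, 1, 1, 0, 0, 1),
  parasite = c(0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1, 0)
)

fit <- coxph(Surv(time1, time2, status) ~ parasite, data = toy)

coef(fit)          # log hazard ratio for parasite = 1 versus parasite = 0
exp(coef(fit))     # hazard ratio: multiplicative change in the death rate
exp(confint(fit))  # 95% confidence interval on the hazard-ratio scale
```

A hazard ratio above 1 would point to a higher death rate after attack, and one
below 1 to a lower rate; with so few made-up nests the interval will be very wide.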
A given colony (subject) can have as many lines of data as you want, provided
its time intervals don't overlap (an overlap would correspond to two copies of
the same subject alive at the same time). In a multi-line dataset the status
variable is 1 only if THIS interval ends with an event. For further insight,
see the survival analysis chapter of Venables & Ripley, Modern Applied
Statistics with S, or many other books.
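
The reshaping itself can be sketched in base R along the following lines,
using the four example nests from the question. The `id` column and the names
`nests`, `split_row`, `before`, `after` and `long` are my own additions for
illustration. The sketch assumes an attack is always recorded after day 0, and
it treats an attack recorded on or after the last observed day as no attack
during follow-up; both choices may need adjusting for the real data.

```r
# The four example nests from the original post
nests <- data.frame(
  id      = 1:4,                  # hypothetical nest identifier, not in the post
  status  = c(0, 1, 1, 0),
  death   = c(42, 32, 25, 42),
  para    = c(0, 0, 1, 1),
  paraday = c(42, 42, 13, 25)
)

# A nest needs two rows only if it was attacked before its last observed day
split_row <- nests$para == 1 & nests$paraday < nests$death

# Parasite-free period: ends at the attack (no event) or at death/censoring
before <- data.frame(
  id       = nests$id,
  time1    = 0,
  time2    = ifelse(split_row, nests$paraday, nests$death),
  status   = ifelse(split_row, 0, nests$status),
  parasite = 0
)

# Period after the attack: carries the nest's final status
after <- data.frame(
  id       = nests$id[split_row],
  time1    = nests$paraday[split_row],
  time2    = nests$death[split_row],
  status   = nests$status[split_row],
  parasite = 1
)

long <- rbind(before, after)
long <- long[order(long$id, long$time1), ]
rownames(long) <- NULL
print(long)
```

The first four rows of `long` match the table above; the fourth nest adds an
interval from day 0 to 25 without parasite and one from day 25 to 42 with
parasite, both with status 0.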
Terry Therneau