[R] Survival analysis with truncated data

Terry Therneau therneau at mayo.edu
Thu Nov 14 16:44:46 CET 2013


I think that your data is censored, not truncated.
   For a fault introduced 1/2005 and erased 2/2006, duration = 13 months.
   For a fault introduced 4/2010 and still in existence at the last observation 12/2010,
duration > 8 months.
   For a fault introduced before 2004, erased 3/2005, in a machine installed 2/1998, the
duration is somewhere between 15 months (introduced 12/2003) and 85 months (introduced
at installation).
   For a fault introduced before 2004, machine installed 5/2000, still present 11/2010 at
last check, the duration is > 83 months (counting from 12/2003, the latest possible
introduction; counting from installation it would be 126 months).

For type=interval2 the data would be (13,13), (8,NA), (15,85), (83,NA).
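
As a rough sketch of how such bounds could be passed to Surv, the lines below compute
each bound as a difference in whole months from the dates in the four cases above. The
helper months_between is hypothetical (not part of the survival package), and the code
assumes that "introduced before 2004" means no later than 12/2003 and no earlier than
the machine's installation date.

```r
library(survival)

# Hypothetical helper: whole months elapsed between two (year, month) pairs.
months_between <- function(y1, m1, y2, m2) (y2 - y1) * 12 + (m2 - m1)

# Lower bounds: shortest duration consistent with what was observed.
lower <- c(
  months_between(2005, 1, 2006, 2),    # exact: introduced 1/2005, erased 2/2006
  months_between(2010, 4, 2010, 12),   # still present at last observation
  months_between(2003, 12, 2005, 3),   # latest possible introduction, erased 3/2005
  months_between(2003, 12, 2010, 11)   # latest possible introduction, still present
)

# Upper bounds: NA marks a right-censored observation (fault still present).
upper <- c(
  months_between(2005, 1, 2006, 2),    # exact observation: same as lower bound
  NA,
  months_between(1998, 2, 2005, 3),    # earliest possible introduction: installation
  NA
)

# interval2 coding: (t, t) is exact, (t, NA) right-censored, (a, b) an interval.
faults <- Surv(lower, upper, type = "interval2")
print(faults)

# Nonparametric estimate of the survival curve for interval-censored data.
fit <- survfit(faults ~ 1)
print(fit)
```

With only four observations the fitted curve is purely illustrative; the point is the
(lower, upper) coding of each kind of observation.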

Terry T.


On 11/14/2013 05:00 AM, r-help-request at r-project.org wrote:
> Hi,
>
> I would like to know how to handle truncated data.
> My intent is to estimate the survival curve of a software fault in order
> to get some information about fault lifespan.
>
> I have some observations of a software system between 2004 and 2010.
> The system was first released in 1994.
> The event considered is the disappearance of a software fault. The
> faults may have been introduced at any time between 1994 and 2010. But
> for faults introduced before 2004, there is no means to know their age.
>
> I used the Surv and survfit functions with type interval2.
> For the faults that are first observed in 2004, I set the lower bound
> to the lifespan observed between 2004 and 2010.
>
> How could I set the upper bound? Using 1994 as a starting point does not
> seem to be meaningful. Neither does using only the lower bound.
>
> Should I consider another survival estimator?
>
> Thanks in advance.


