[Rd] survival package: Surv handles invalid intervals (start time > stop) (PR#14221)

buhr at biostat.wisc.edu
Tue Feb 23 22:25:09 CET 2010


In the latest version of the survival package (2.35-8), the Surv
function handles invalid intervals (where start time is greater than
stop time) by issuing a warning that NAs have been created and then
setting the left endpoint of the offending intervals to NA. However, the
code that sets the endpoint to NA subsets incorrectly: the logical index
is computed on a subset of the observations but applied to the full
vector of start times. As a result, in some circumstances the wrong
intervals have an endpoint set to NA.

For example, for the interval/event data:

     interval    event
     (NA, 10]    1
     (1,   5]    1
     (6,   4]    1

the appropriate Surv call **should** result in a warning and the left 
endpoint of the third, invalid interval being set to NA, as here:

> Surv(c(NA,1,6), c(10,5,4), c(1,1,1))
[1] (NA,10 ] ( 1, 5 ] (NA, 4 ]
Warning message:
In Surv(c(NA, 1, 6), c(10, 5, 4), c(1, 1, 1)) :
  Stop time must be > start time, NA created
>

However, the Surv call **actually** results in:

> Surv(c(NA,1,6), c(10,5,4), c(1,1,1))
[1] (NA,10 ] (NA, 5 ] ( 6, 4 ]
Warning message:
In Surv(c(NA, 1, 6), c(10, 5, 4), c(1, 1, 1)) :
  Stop time must be > start time, NA created
>

Note that the endpoint of the valid, second interval has been set to NA 
in place of the invalid, third interval.
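
The mechanism can be reproduced without the survival package. The
following is a minimal sketch in base R, assuming only the indexing
pattern visible in the attached patch; the names "bad" and "buggy" are
introduced here for illustration and do not appear in Surv.S.

```r
# Base R only; mirrors the indexing pattern from the unpatched Surv().
time  <- c(NA, 1, 6)
time2 <- c(10, 5, 4)

# Observations where both endpoints are present: FALSE TRUE TRUE
who3 <- !(is.na(time) | is.na(time2))

# Invalid intervals, indexed relative to the who3 subset: FALSE TRUE
bad <- time[who3] >= time2[who3]

# Unpatched behaviour: the length-2 mask is applied to the length-3
# vector, so R recycles it to FALSE TRUE FALSE and hits position 2.
buggy <- time
buggy[bad] <- NA
print(buggy)   # NA NA 6
```

The mask is correct for the two non-missing observations, but its
positions no longer line up once it is applied to the full vector.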

A similar problem exists for type="interval2" data. The
**expected** behavior is:

> Surv(c(NA,1,6), c(10,5,4), type="interval2")
[1] 10-     [ 1, 5] [NA, 4]
Warning message:
In Surv(c(NA, 1, 6), c(10, 5, 4), type = "interval2") :
  Invalid interval: start > stop, NA created
>

but the **actual** behavior is:

> Surv(c(NA,1,6), c(10,5,4), type="interval2")
[1] 10-     [NA, 5] [ 6, 4]
Warning message:
In Surv(c(NA, 1, 6), c(10, 5, 4), type = "interval2") :
  Invalid interval: start > stop, NA created
>
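
The interval2 case appears to fail the same way, with status==3 playing
the role of the subset. The sketch below is base R only. The "start" and
"status" vectors are assumptions inferred from the printed output above
(first observation left-censored at 10, the other two interval-censored
with status 3); they are not taken from the survival source.

```r
# Base R only. ASSUMED internal state for the interval2 example:
# observation 1 is left-censored at 10 (status 2), observations 2 and 3
# are interval-censored (status 3).
start  <- c(10, 1, 6)
time2  <- c(10, 5, 4)
status <- c(2, 3, 3)

# Invalid intervals, indexed relative to the status == 3 subset: FALSE TRUE
temp <- start[status == 3] > time2[status == 3]

# Unpatched behaviour: the subset-relative mask is applied to the full
# vector and recycled to FALSE TRUE FALSE, so position 2 is set to NA.
buggy2 <- start
buggy2[temp] <- NA
print(buggy2)   # 10 NA 6
```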

The attached patch fixes the problem.
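
As a rough illustration of the idiom the patch uses, here is a base R
sketch on plain vectors. It is not the survival source itself; "fixed"
and "alt" are names made up for this example, and the which() variant is
an equivalent alternative offered for comparison, not part of the patch.

```r
# Base R only; applies the patched indexing idiom to the example data.
time  <- c(NA, 1, 6)
time2 <- c(10, 5, 4)
who3  <- !(is.na(time) | is.na(time2))

# Patched idiom: subset first, then apply the subset-relative mask.
fixed <- time
fixed[who3][fixed[who3] >= time2[who3]] <- NA
print(fixed)   # NA 1 NA

# Equivalent alternative: map the mask back to full-vector positions.
alt <- time
alt[which(who3)[time[who3] >= time2[who3]]] <- NA
stopifnot(identical(fixed, alt))
```

Either form keeps the mask aligned with the observations it was
computed from, so only the third, invalid interval is changed.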

-- 
Kevin Buhr <buhr at biostat.wisc.edu>               Phone: +1 608 265 4587
Assistant Scientist                                Fax: +1 608 263 0415
Statistical Data Analysis Center
Room 211, WARF Office Building, 610 Walnut St., Madison, WI  53726-2397


Attachment: Surv-subset-bug.patch

diff --git a/survival/R/Surv.S b/survival/R/Surv.S
index 9257ea0..50b3b85 100644
--- a/survival/R/Surv.S
+++ b/survival/R/Surv.S
@@ -56,7 +56,7 @@ Surv <- function(time, time2, event,
 	if (!is.numeric(time2)) stop("Stop time is not numeric")
 	who3 <- !(is.na(time) | is.na(time2))
 	if (any (time[who3]>= time2[who3])) {
-	    time[time[who3]>= time2[who3]] <- NA
+	    time[who3][time[who3]>= time2[who3]] <- NA
 	    warning("Stop time must be > start time, NA created")
 	    }
 	if (is.logical(event)) status <- as.numeric(event)
@@ -105,7 +105,7 @@ Surv <- function(time, time2, event,
 
 	temp <- (time[status==3] > time2[status==3])
 	if (any(temp & !is.na(temp))) {
-	    time[temp] <- NA
+	    time[status==3][temp] <- NA
 	    warning("Invalid interval: start > stop, NA created")
 	    }
 
