[R] FW: Survival analysis

Cem Girit girit at comcast.net
Sat Oct 22 13:24:28 CEST 2011


Hello Sarah and David,

	Thank you for your help. I followed your advice and suggestions. But
I still have problems with the R-results: 

	1) Mean Survival Time and the Error of the Mean:

	 The mean survival time calculation is not just taking the average
of the survival times as you have suggested. The mean survival time is
defined as 

				Mean=sum(S(ti-1)*(ti-ti-1)) for i=0 to D

	Where S(Ti) is the survival probability, ti are the event times, D
is the last event.

	And the standard error of the mean as

				Standard error=
sqrt((m/(m-1))*sum(Ai^2*di/(ni*(ni-di))), sum is for i=1 to D-1

	where Ai=sum(S(tj)*(tj+1-tj), for j=i to D-1, d is the number of
death cases, n is the number at risk, m=sum(dj) for j=1 to D.  

	The results by the survfit routine do not agree with the results of
these formulae as obtained by SAS. 

R results:
----------------------------------------------------------------------------
-----
> print(fit,print.rmean=TRUE)
Call: survfit(formula = Surv(dT, vT) ~ gT, conf.type = "log-log")

           records n.max n.start events *rmean *se(rmean) median 0.95LCL
gT=DrugA         9     9       9      	6   	50.0       3.11     	48
32
gT=DrugB         9     9       9      	3   	54.8       2.93    	 NA
42
gT=DrugC         9     9       9     	 4   	53.8       2.74     	NA
42
gT=Vehicle       9     9       9      	8   	42.3       2.38     	40
37

SAS results:
----------------------------------------------------------------------------
------------------------------------------------------------------
Group		Total	Failed	Censored	% Censored	Mean
Survival Time	Standard Error of Survival Time
Drug A		9	6	3		33.33		48.2222
2.6931
Drug B		9	3	6		66.67		42.7778
0.1697
Drug C		9	4	5		55.56		46.0000
0.7258
Vehicle		9	8	1		11.11		40.5556
1.1283
Total		36	21	15		41.67		

	2. Quantiles

	The survfit routines produces the median and the 95% Confidence
intervals but not 25% and 75% quantiles. Or maybe it does, but I do not know
how to calculate and extract them from its output. 

SAS results:
------------------------------------------------------------------------
Quartile Estimates				
Drug A	Point Estimate	                      95% Confidence Interval

			Transform	Lower	Upper
75			LOGLOG	46	
Median	48	LOGLOG	32	
25		45	LOGLOG	32	48

Drug B	Point Estimate	95% Confidence Interval		
			Transform	Lower	Upper
75			LOGLOG		
Median		LOGLOG	42	
25		43	LOGLOG	42

Drug C	Point Estimate	95% Confidence Interval		
		Transform	Lower	Upper
75			LOGLOG	x	
Median		LOGLOG	42	47
25		47	LOGLOG	42	

Vehicle	Point Estimate	95% Confidence Interval		
		Transform	Lower	Upper
75		44	LOGLOG	38	
Median	40	LOGLOG	37	45
25		38	LOGLOG	37	40	

The data and the code I used for these calculations are given below:

> dT
 [1] 37 41 40 38 38 37 44 45 48 43 48 46 54 60 32 45 55 62 42 62 62 62 47 42
59 43
[27] 60 60 51 43 50 51 47 42 47 51

> vT
 [1] 1 1 1 1 1 1 1 1 0 1 1 1 0 0 1 1 1 0 1 0 0 0 0 1 0 1 0 0 0 1 0 0 1 1 1 0

> gT
 [1] Vehicle Vehicle Vehicle Vehicle Vehicle Vehicle Vehicle Vehicle Vehicle
[10] DrugA   DrugA   DrugA   DrugA   DrugA   DrugA   DrugA   DrugA   DrugA  
[19] DrugB   DrugB   DrugB   DrugB   DrugB   DrugB   DrugB   DrugB   DrugB  
[28] DrugC   DrugC   DrugC   DrugC   DrugC   DrugC   DrugC   DrugC   DrugC  
Levels: DrugA DrugB DrugC Vehicle

> fit<-survfit(Surv(dT,vT)~gT,conf.type="log-log",conf.int=0.95)

> print(fit,print.rmean=TRUE)

> print(fit,print.rmean=TRUE)
Call: survfit(formula = Surv(dT, vT) ~ gT, conf.type = "log-log", conf.int =
0.95)

           records n.max n.start events *rmean *se(rmean) median 0.95LCL
0.95UCL
gT=DrugA         9     9       9      6   50.0       3.11     48      32
NA
gT=DrugB         9     9       9      3   54.8       2.93     NA      42
NA
gT=DrugC         9     9       9      4   53.8       2.74     NA      42
NA
gT=Vehicle       9     9       9      8   42.3       2.38     40      37
45
    * restricted mean with upper limit =  61 
>

	Sincerely,

Cem Girit
 

-----Original Message-----
From: David Winsemius [mailto:dwinsemius at comcast.net] 
Sent: Thursday, October 20, 2011 6:25 PM
To: Sarah Goslee
Cc: Cem Girit; r-help
Subject: Re: [R] Survival analysis


On Oct 20, 2011, at 5:05 PM, Sarah Goslee wrote:

> Hi,
>
> Please send your information to the r-help list, not just to me, but 
> do note that the list is plain-text only.
>
> But surely all you are looking for is:
>> dt<-
>> c
>> (37,41,40,38,38,37,44,45,48,43,48,46,54,60,32,45,55,62,42,62,62,62,47
>> ,42,59,43,60,60,51,43,50,51,47,42,47,51
>> )
>> mean(dt)
> [1] 48.16667
>> sd(dt)/sqrt(length(dt))
> [1] 1.404923
>
> I have no idea what bizarre formula SAS uses to calculate standard 
> error, but the means match.
>
> And you'll note that the lengthy R output you pasted in works just 
> fine, and *does* include the standard errors and confidence limits *of 
> the groups you specified* in your formula. Maybe one of the excellent 
> introduction to R guides available online would be of use to you.

I suspect that Cem Girit is attempting to use survival::survfit and
survival::print.survfit, to get group means.  (He also spelled his vT
variable differently in two places.)  He seems confused about how to offer
arguments to the print.rmean and rmean paramters. He had been advised by
Therneau to read:

?print.survfit   # where the details of the rmean calculation are  
discussed

Both parameters are expecting a value of TRUE to be invoked, and he was both
misnaming them and mis-specifying them. Their default values
are:

 > getOption('survfit.rmean')
NULL
 > getOption("survfit.print.rmean")
NULL

After changing the name of "vt" to "vT":

 > print(fit,print.rmean=TRUE)
Call: survfit.formula(formula = Surv(dT, vT) ~ gT)

            records n.max n.start events *rmean *se(rmean) median 0.95LCL
0.95UCL
gT=DrugA         9     9       9      6   50.0       3.11     48       
45      NA
gT=DrugB         9     9       9      3   54.8       2.93     NA       
43      NA
gT=DrugC         9     9       9      4   53.8       2.74     NA       
47      NA
gT=Vehicle       9     9       9      8   42.3       2.38     40       
38      NA
     * restricted mean with upper limit =  61

If an overall rmean were desired it would be obtained thusly:

 > print(fit,print.rmean=TRUE)
Call: survfit.formula(formula = Surv(dT, vT) ~ 1)

    records      n.max    n.start     events     *rmean *se(rmean)      
median
       36.0       36.0       36.0       21.0       50.5         
1.7       47.0
    0.95LCL    0.95UCL
       43.0         NA
     * restricted mean with upper limit =  62


-- 

David Winsemius, MD
West Hartford, CT



More information about the R-help mailing list