[R] FW: Survival analysis
Cem Girit
girit at comcast.net
Sat Oct 22 13:24:28 CEST 2011
Hello Sarah and David,
Thank you for your help. I followed your advice and suggestions. But
I still have problems with the R-results:
1) Mean Survival Time and the Error of the Mean:
The mean survival time calculation is not just taking the average
of the survival times as you have suggested. The mean survival time is
defined as
Mean=sum(S(ti-1)*(ti-ti-1)) for i=0 to D
Where S(Ti) is the survival probability, ti are the event times, D
is the last event.
And the standard error of the mean as
Standard error=
sqrt((m/(m-1))*sum(Ai^2*di/(ni*(ni-di))), sum is for i=1 to D-1
where Ai=sum(S(tj)*(tj+1-tj), for j=i to D-1, d is the number of
death cases, n is the number at risk, m=sum(dj) for j=1 to D.
The results by the survfit routine do not agree with the results of
these formulae as obtained by SAS.
R results:
----------------------------------------------------------------------------
-----
> print(fit,print.rmean=TRUE)
Call: survfit(formula = Surv(dT, vT) ~ gT, conf.type = "log-log")
records n.max n.start events *rmean *se(rmean) median 0.95LCL
gT=DrugA 9 9 9 6 50.0 3.11 48
32
gT=DrugB 9 9 9 3 54.8 2.93 NA
42
gT=DrugC 9 9 9 4 53.8 2.74 NA
42
gT=Vehicle 9 9 9 8 42.3 2.38 40
37
SAS results:
----------------------------------------------------------------------------
------------------------------------------------------------------
Group Total Failed Censored % Censored Mean
Survival Time Standard Error of Survival Time
Drug A 9 6 3 33.33 48.2222
2.6931
Drug B 9 3 6 66.67 42.7778
0.1697
Drug C 9 4 5 55.56 46.0000
0.7258
Vehicle 9 8 1 11.11 40.5556
1.1283
Total 36 21 15 41.67
2. Quantiles
The survfit routines produces the median and the 95% Confidence
intervals but not 25% and 75% quantiles. Or maybe it does, but I do not know
how to calculate and extract them from its output.
SAS results:
------------------------------------------------------------------------
Quartile Estimates
Drug A Point Estimate 95% Confidence Interval
Transform Lower Upper
75 LOGLOG 46
Median 48 LOGLOG 32
25 45 LOGLOG 32 48
Drug B Point Estimate 95% Confidence Interval
Transform Lower Upper
75 LOGLOG
Median LOGLOG 42
25 43 LOGLOG 42
Drug C Point Estimate 95% Confidence Interval
Transform Lower Upper
75 LOGLOG x
Median LOGLOG 42 47
25 47 LOGLOG 42
Vehicle Point Estimate 95% Confidence Interval
Transform Lower Upper
75 44 LOGLOG 38
Median 40 LOGLOG 37 45
25 38 LOGLOG 37 40
The data and the code I used for these calculations are given below:
> dT
[1] 37 41 40 38 38 37 44 45 48 43 48 46 54 60 32 45 55 62 42 62 62 62 47 42
59 43
[27] 60 60 51 43 50 51 47 42 47 51
> vT
[1] 1 1 1 1 1 1 1 1 0 1 1 1 0 0 1 1 1 0 1 0 0 0 0 1 0 1 0 0 0 1 0 0 1 1 1 0
> gT
[1] Vehicle Vehicle Vehicle Vehicle Vehicle Vehicle Vehicle Vehicle Vehicle
[10] DrugA DrugA DrugA DrugA DrugA DrugA DrugA DrugA DrugA
[19] DrugB DrugB DrugB DrugB DrugB DrugB DrugB DrugB DrugB
[28] DrugC DrugC DrugC DrugC DrugC DrugC DrugC DrugC DrugC
Levels: DrugA DrugB DrugC Vehicle
> fit<-survfit(Surv(dT,vT)~gT,conf.type="log-log",conf.int=0.95)
> print(fit,print.rmean=TRUE)
> print(fit,print.rmean=TRUE)
Call: survfit(formula = Surv(dT, vT) ~ gT, conf.type = "log-log", conf.int =
0.95)
records n.max n.start events *rmean *se(rmean) median 0.95LCL
0.95UCL
gT=DrugA 9 9 9 6 50.0 3.11 48 32
NA
gT=DrugB 9 9 9 3 54.8 2.93 NA 42
NA
gT=DrugC 9 9 9 4 53.8 2.74 NA 42
NA
gT=Vehicle 9 9 9 8 42.3 2.38 40 37
45
* restricted mean with upper limit = 61
>
Sincerely,
Cem Girit
-----Original Message-----
From: David Winsemius [mailto:dwinsemius at comcast.net]
Sent: Thursday, October 20, 2011 6:25 PM
To: Sarah Goslee
Cc: Cem Girit; r-help
Subject: Re: [R] Survival analysis
On Oct 20, 2011, at 5:05 PM, Sarah Goslee wrote:
> Hi,
>
> Please send your information to the r-help list, not just to me, but
> do note that the list is plain-text only.
>
> But surely all you are looking for is:
>> dt<-
>> c
>> (37,41,40,38,38,37,44,45,48,43,48,46,54,60,32,45,55,62,42,62,62,62,47
>> ,42,59,43,60,60,51,43,50,51,47,42,47,51
>> )
>> mean(dt)
> [1] 48.16667
>> sd(dt)/sqrt(length(dt))
> [1] 1.404923
>
> I have no idea what bizarre formula SAS uses to calculate standard
> error, but the means match.
>
> And you'll note that the lengthy R output you pasted in works just
> fine, and *does* include the standard errors and confidence limits *of
> the groups you specified* in your formula. Maybe one of the excellent
> introduction to R guides available online would be of use to you.
I suspect that Cem Girit is attempting to use survival::survfit and
survival::print.survfit, to get group means. (He also spelled his vT
variable differently in two places.) He seems confused about how to offer
arguments to the print.rmean and rmean paramters. He had been advised by
Therneau to read:
?print.survfit # where the details of the rmean calculation are
discussed
Both parameters are expecting a value of TRUE to be invoked, and he was both
misnaming them and mis-specifying them. Their default values
are:
> getOption('survfit.rmean')
NULL
> getOption("survfit.print.rmean")
NULL
After changing the name of "vt" to "vT":
> print(fit,print.rmean=TRUE)
Call: survfit.formula(formula = Surv(dT, vT) ~ gT)
records n.max n.start events *rmean *se(rmean) median 0.95LCL
0.95UCL
gT=DrugA 9 9 9 6 50.0 3.11 48
45 NA
gT=DrugB 9 9 9 3 54.8 2.93 NA
43 NA
gT=DrugC 9 9 9 4 53.8 2.74 NA
47 NA
gT=Vehicle 9 9 9 8 42.3 2.38 40
38 NA
* restricted mean with upper limit = 61
If an overall rmean were desired it would be obtained thusly:
> print(fit,print.rmean=TRUE)
Call: survfit.formula(formula = Surv(dT, vT) ~ 1)
records n.max n.start events *rmean *se(rmean)
median
36.0 36.0 36.0 21.0 50.5
1.7 47.0
0.95LCL 0.95UCL
43.0 NA
* restricted mean with upper limit = 62
--
David Winsemius, MD
West Hartford, CT
More information about the R-help
mailing list