[R] smooth non cumulative baseline hazard in Cox model
Mayeul KAUFFMANN
mayeul.kauffmann at tiscali.fr
Sun Jul 4 18:46:01 CEST 2004
Hi everyone.
There's been several threads on baseline hazard in Cox model but I think
they were all on cumulative baseline hazard,
for instance
http://tolstoy.newcastle.edu.au/R/help/01a/0464.html
http://tolstoy.newcastle.edu.au/R/help/01a/0436.html
"basehaz" in package survival seems to do a cumulative hazard.
extract from the basehaz function:
sfit <- survfit(fit)
H <- -log(sfit$surv)
Since sfit$surv is monotonic, H will be monotonic too, which makes me think
it is a cumulative function. I think H(t) it is the "sum" ("integration") of
lambda's from 0 to t (Am I right?)
What I need might be lambda(t) or lambda(t)dt (I do not know for sure),
something involving the instantaneous baseline risk.
But, for sure, I 've seen elsewhere what I need.
Specifically, here:
http://econ.worldbank.org/files/13214_CivilPeace.pdf
that is page 41 of HEGRE, Håvard; ELLINGSEN, Tanja; GATES, Scott; GLEDITSCH,
Nils Petter (2001). "Toward a Democratic Civil Peace? Democracy, Political
Change, and Civil War, 1816-1992". American Political Science Review, vol.
95, no 1.
I'm doing the same job as Hegre et al. (studying civil wars) but with the
counting process formulation of the Cox model. (I use intervals, my formula
looks like Surv(start,stop,status)~ etc.).
Like Hegre and alii (who use the stata software) I would like to have a
curve showing what is the (instantaneous) overall risk of civil war at a
given time, taking away the effect of the covariates.
For those who also use the SAS software (which I'm not, unfortunately), the
job I need to be done seems to be done by the SMOOTH macro described in
"Survival Analysis Using the SAS System: A Practical Guide" by Paul D.
Allison (see below).
There is a graph (output 5.22 "smoothes hazard functions for two financial
aid groups") P. 170 of his book which shows another example (except I only
need it for 1 group, at mean values)
I hope I am clear enough. Thank you a lot for any help.
(sorry for mistakes in English as I'm on non native English speaker)
Mayeul KAUFFMANN
Univ. Pierre Mendes France
Grenoble - France
------------
SMOOTH Macro (SAS)
------------
%macro smooth (data=_last_, time=, width=, survival=survival);
/*********************************************************************
MACRO SMOOTH produces graphs of smoothed hazard functions using output
from either PROC LIFETEST or PROC PHREG. With PROC LIFETEST, it uses the
data set produced by the OUTSURV option in the PROC statement. With PROC
PHREG, it uses the data set produced by the BASELINE statement. SMOOTH
employs a kernel smoothing method described by H. Ramlau-Hansen (1983),
"Smoothing Counting Process Intensities by Means of Kernel Functions,"
The Annals of Statistics 11, 453-466. If there is more than one survival
curve in the input data set, SMOOTH will produce multiple smoothed
hazard curves on the same axes.
There are four parameters:
DATA is the name of the data set containing survivor function
estimates. The default is the most recently created data set.
TIME is name of the variable containing event times.
SURVIVAL is the name of a variable containing survivor function
estimates (the default is SURVIVAL, which is the automatic name in
PROC LIFETEST).
WIDTH is bandwidth of smoothing function. The default is 1/5 of the range
of event times.
Example of usage:
%smooth(data=my.data,time=duration,width=8,survival=s)
Author: Paul D. Allison, University of Pennsylvania
allison at ssc.upenn.edu
*************************************************************************/
data _inset_;
set &data end=final;
retain _grp_ _censor_ 0;
t=&time;
survival=&survival;
if t=0 and survival=1 then _grp_=_grp_+1;
keep _grp_ t survival;
if final and _grp_ > 1 then call symput('nset','yes');
else if final then call symput('nset','no');
if _censor_ = 1 then delete;
if survival in (0,1) then delete;
run;
proc iml;
use _inset_;
read all var {t _grp_};
%if &width ne %then %let w2=&width;
%else %let w2=(max(t)-min(t))/5;
w=&w2;
z=char(w,8,2);
call symput('width',z);
numset=max(_grp_);
create _plt_ var{ lambda s group};
setin _inset_ ;
do m=1 to numset;
read all var {t survival _grp_} where (_grp_=m);
n=nrow(survival);
lo=t[1] + w;
hi=t[n] - w;
npt=50;
inc=(hi-lo)/npt;
s=lo+(1:npt)`*inc;
group=j(npt,1,m);
slag=1//survival[1:n-1];
h=1-survival/slag;
x = (j(npt,1,1)*t` - s*j(1,n,1))/w;
k=.75*(1-x#x)#(abs(x)<=1);
lambda=k*h/w;
append;
end;
quit;
%if &nset = yes %then %let c==group;
%else %let c=;
proc gplot data=_plt_;
plot lambda*s &c / vaxis=axis1 vzero haxis=axis2;
axis1 label=(angle=90 f=titalic 'Hazard Function' ) minor=none ;
axis2 label=(f=titalic "Time (bandwidth=&width)") minor=none;
symbol1 i=join color=black line=1;
symbol2 i=join color=red line=2;
symbol3 i=join color=green line=3;
symbol4 i=join color=blue line=4;
run;
quit;
%mend smooth;
More information about the R-help
mailing list