[R] How to specify "newdata" in a Cox-Modell with a time dependent interaction term?
John Fox
jfox at mcmaster.ca
Sat Jun 16 17:22:33 CEST 2012
Dear Jurgen,
> fit <- survfit(mod.allison.5, newdata.1, individual=TRUE)
> fit
Call: survfit(formula = mod.allison.5, newdata = newdata.1, individual = TRUE)
records n.max n.start events median 0.95LCL 0.95UCL
19809 432 432 114 NA NA NA
plot(fit) # shows one survival curve
See ?survfit.coxph.
Best,
John
On Sat, 16 Jun 2012 16:23:41 +0200
Jürgen Biedermann <juergen.biedermann at googlemail.com> wrote:
> Dear John,
>
> Thank you very much for the quick answer. I will use the mentioned citation of your book in the future.
>
> However:
> The Id argument is necessary for the function to know that the different rows belong to one individual.
>
> Or how the Help-File says:
>
> id:
> optional variable name of subject identifiers. If this is present, then each group
> of rows with the same subject id represents the covariate path through time of
> a single subject, and the result will contain one curve per subject. If the coxph
> fit had strata then that must also be specified in newdata. If missing, then each
> individual row of newdata is presumed to represent a distinct subject and there
> will be nrow(newdata) times the number of strata curves in the result (one for
> each strata/subject combination).
>
> I tried your proposal anyway, but, as expected, fifty different survival curves appeared.
>
> So, there is still no solution.
>
> Best regards
> Jürgen
>
> --
> -----------------------------------
> Jürgen Biedermann
> Bergmannstraße 3
> 10961 Berlin-Kreuzberg
> Mobil: +49 176 247 54 354
> Home: +49 30 940 53 044
> e-mail: juergen.biedermann at gmail.com
>
>
> --------- Korrespondenz ----------
>
> Betreff: Re: How to specify "newdata" in a Cox-Modell with a time dependent interaction term?
> Von: John Fox <jfox at mcmaster.ca>
> An: J?rgen Biedermann <juergen.biedermann at googlemail.com>
> Datum: 16.06.2012 15:55
> > Dear Jürgen,
> >
> > All the values of your Id variable are equal to 1; how can that make sense? Removing the argument Id=id may get you what you want.
> >
> > By the way, the document to which you refer was an appendix to a 2002 book that has been superseded by Fox and Weisberg, An R Companion to Applied Regression, Second Edition (Sage, 2011). Appendices for that book are available at<http://socserv.socsci.mcmaster.ca/jfox/Books/Companion/appendix.html>.
> >
> > I hope this helps,
> > John
> >
> > ------------------------------------------------
> > John Fox
> > Sen. William McMaster Prof. of Social Statistics
> > Department of Sociology
> > McMaster University
> > Hamilton, Ontario, Canada
> > http://socserv.mcmaster.ca/jfox/
> >
> > On Sat, 16 Jun 2012 15:04:48 +0200
> > Jürgen Biedermann<juergen.biedermann at googlemail.com> wrote:
> >> Dear Mr. Therneau, Mr. Fox, or to whoever, who has some time...
> >>
> >> I don't find a solution to use the "survfit" function (package: survival) for a defined pattern of covariates with a Cox-Model including a time dependent interaction term. Somehow the definition of my "newdata" argument seems to be erroneous.
> >> I already googled the problem, found many persons having the same or a similar problem, but still no solution.
> >> I want to stress that my time-dependent covariate does not depend on the failure of an individual (in this case it doesn't seem sensible to predict a survivor function for an individual). Rather one of my effects declines with time (time-dependent coefficient).
> >>
> >> For illustration, I use the example of John Fox's paper "Cox Proportional - Hazards Regression for Survival Data".
> >> http://cran.r-project.org/doc/contrib/Fox-Companion/appendix-cox-regression.pdf
> >>
> >> Do you know any help? See code below
> >>
> >> Thanks very much in advance
> >> Jürgen Biedermann
> >>
> >> #----------------------------------------
> >> #Code
> >>
> >> Rossi<- read.table("http://cran.r-project.org/doc/contrib/Fox-Companion/Rossi.txt", header=T)
> >>
> >> Rossi.2<- fold(Rossi, time='week',
> >> event='arrest', cov=11:62, cov.names='employed')
> >>
> >> # see below for the fold function from John Fox
> >>
> >> # modeling an interaction with time (Page 14)
> >>
> >> mod.allison.5<- coxph(Surv(start, stop, arrest.time) ~
> >> fin + age + age:stop + prio,
> >> data=Rossi.2)
> >> mod.allison.5
> >>
> >> # Attempt to get the survivor function of a person with age=30, fin=0 and prio=5
> >>
> >> newdata.1<- data.frame(unique(Rossi.2[c("start","stop")]),fin=0,age=30,prio=5,Id=1,arrest.time=0)
> >> fit<- survfit(mod.allison.5,newdata.1,id="Id")
> >>
> >> Error message:
> >>
> >> >Fehler in model.frame.default(data = newdata.1, id = "Id", formula =
> >> Surv(start, :
> >> Variablenlängen sind unterschiedlich (gefunden für '(id)')
> >>
> >> --> failure, length of variables are different.
> >>
> >> #-----------------------------------------------------------------
> >> fold<- function(data, time, event, cov,
> >> cov.names=paste('covariate', '.', 1:ncovs, sep=""),
> >> suffix='.time', cov.times=0:ncov, common.times=TRUE, lag=0){
> >> vlag<- function(x, lag) c(rep(NA, lag), x[1:(length(x)-lag)])
> >> xlag<- function(x, lag) apply(as.matrix(x), 2, vlag, lag=lag)
> >> all.cov<- unlist(cov)
> >> if (!is.list(cov)) cov<- list(cov)
> >> ncovs<- length(cov)
> >> nrow<- nrow(data)
> >> ncol<- ncol(data)
> >> ncov<- length(cov[[1]])
> >> nobs<- nrow*ncov
> >> if (length(unique(c(sapply(cov, length), length(cov.times)-1)))> 1)
> >> stop(paste(
> >> "all elements of cov must be of the same length and \n",
> >> "cov.times must have one more entry than each element of
> >> cov."))
> >> var.names<- names(data)
> >> subjects<- rownames(data)
> >> omit.cols<- if (!common.times) c(all.cov, cov.times) else all.cov
> >> keep.cols<- (1:ncol)[-omit.cols]
> >> nkeep<- length(keep.cols)
> >> if (is.numeric(event)) event<- var.names[event]
> >> times<- if (common.times) matrix(cov.times, nrow, ncov+1, byrow=T)
> >> else data[, cov.times]
> >> new.data<- matrix(Inf, nobs, 3 + ncovs + nkeep)
> >> rownames<- rep("", nobs)
> >> colnames(new.data)<- c('start', 'stop', paste(event, suffix, sep=""),
> >> var.names[-omit.cols], cov.names)
> >> end.row<- 0
> >> for (i in 1:nrow){
> >> start.row<- end.row + 1
> >> end.row<- end.row + ncov
> >> start<- times[i, 1:ncov]
> >> stop<- times[i, 2:(ncov+1)]
> >> event.time<- ifelse (stop == data[i, time]& data[i, event] ==
> >> 1, 1, 0)
> >> keep<- matrix(unlist(data[i, -omit.cols]), ncov, nkeep, byrow=T)
> >> select<- apply(matrix(!is.na(data[i, all.cov]), ncol=ncovs),
> >> 1, all)
> >> rows<- start.row:end.row
> >> cov.mat<- xlag(matrix(unlist(data[i, all.cov]),
> >> nrow=length(rows)), lag)
> >> new.data[rows[select], ]<-
> >> cbind(start, stop, event.time, keep, cov.mat)[select,]
> >> rownames[rows]<- paste(subjects[i], '.', seq(along=rows), sep="")
> >> }
> >> row.names(new.data)<- rownames
> >> as.data.frame(new.data[new.data[, 1] != Inf&
> >> apply(as.matrix(!is.na(new.data[, cov.names])), 1, all), ])
> >> }
> >> #-----------------------------------------------------------------
> >>
More information about the R-help
mailing list