[R] How to specify "newdata" in a Cox-Modell with a time dependent interaction term?

Jürgen Biedermann juergen.biedermann at googlemail.com
Sat Jun 16 16:23:41 CEST 2012


Dear John,

Thank you very much for the quick answer. I will use the mentioned 
citation of your book in the future.

However:
The Id argument is necessary for the function to know that the different 
rows belong to one individual.

Or how the Help-File says:

id:
optional variable name of subject identifiers. If this is present, then 
each group
of rows with the same subject id represents the covariate path through 
time of
a single subject, and the result will contain one curve per subject. If 
the coxph
fit had strata then that must also be specified in newdata. If missing, 
then each
individual row of newdata is presumed to represent a distinct subject 
and there
will be nrow(newdata) times the number of strata curves in the result 
(one for
each strata/subject combination).

I tried your proposal anyway, but, as expected, fifty different survival 
curves appeared.

So, there is still no solution.

Best regards
Jürgen

-- 
-----------------------------------
Jürgen Biedermann
Bergmannstraße 3
10961 Berlin-Kreuzberg
Mobil: +49 176 247 54 354
Home: +49 30 940 53 044
e-mail: juergen.biedermann at gmail.com


--------- Korrespondenz ----------

Betreff: Re: How to specify "newdata" in a Cox-Modell with a time 
dependent interaction term?
Von: John Fox <jfox at mcmaster.ca>
An: J�rgen Biedermann <juergen.biedermann at googlemail.com>
Datum: 16.06.2012 15:55
> Dear Jürgen,
>
> All the values of your Id variable are equal to 1; how can that make sense? Removing the argument Id=id may get you what you want.
>
> By the way, the document to which you refer was an appendix to a 2002 book that has been superseded by Fox and Weisberg, An R Companion to Applied Regression, Second Edition (Sage, 2011). Appendices for that book are available at<http://socserv.socsci.mcmaster.ca/jfox/Books/Companion/appendix.html>.
>
> I hope this helps,
>   John
>
> ------------------------------------------------
> John Fox
> Sen. William McMaster Prof. of Social Statistics
> Department of Sociology
> McMaster University
> Hamilton, Ontario, Canada
> http://socserv.mcmaster.ca/jfox/
> 	
> On Sat, 16 Jun 2012 15:04:48 +0200
>   Jürgen Biedermann<juergen.biedermann at googlemail.com>  wrote:
>> Dear Mr. Therneau, Mr. Fox, or to whoever, who has some time...
>>
>> I don't find a solution to use the "survfit" function (package: survival)  for a defined pattern of covariates with a Cox-Model including a time dependent interaction term. Somehow the definition of my "newdata" argument seems to be erroneous.
>> I already googled the problem, found many persons having the same or a similar problem, but still no solution.
>> I want to stress that my time-dependent covariate does not depend on the failure of an individual (in this case it doesn't seem sensible to predict a survivor function for an individual). Rather one of my effects declines with time (time-dependent coefficient).
>>
>> For illustration, I use the example of John Fox's paper "Cox Proportional - Hazards Regression for Survival Data".
>> http://cran.r-project.org/doc/contrib/Fox-Companion/appendix-cox-regression.pdf
>>
>> Do you know any help? See code below
>>
>> Thanks very much in advance
>> Jürgen Biedermann
>>
>> #----------------------------------------
>> #Code
>>
>> Rossi<- read.table("http://cran.r-project.org/doc/contrib/Fox-Companion/Rossi.txt", header=T)
>>
>> Rossi.2<- fold(Rossi, time='week',
>>       event='arrest', cov=11:62, cov.names='employed')
>>
>> # see below for the fold function from John Fox
>>
>> # modeling an interaction with time (Page 14)
>>
>> mod.allison.5<- coxph(Surv(start, stop, arrest.time) ~
>>       fin + age + age:stop + prio,
>>       data=Rossi.2)
>> mod.allison.5
>>
>> # Attempt to get the survivor function of a person with age=30, fin=0 and prio=5
>>
>> newdata.1<- data.frame(unique(Rossi.2[c("start","stop")]),fin=0,age=30,prio=5,Id=1,arrest.time=0)
>> fit<- survfit(mod.allison.5,newdata.1,id="Id")
>>
>> Error message:
>>
>>   >Fehler in model.frame.default(data = newdata.1, id = "Id", formula =
>> Surv(start,  :
>>     Variablenlängen sind unterschiedlich (gefunden für '(id)')
>>
>> -->  failure, length of variables are different.
>>
>> #-----------------------------------------------------------------
>> fold<- function(data, time, event, cov,
>>       cov.names=paste('covariate', '.', 1:ncovs, sep=""),
>>       suffix='.time', cov.times=0:ncov, common.times=TRUE, lag=0){
>>       vlag<- function(x, lag) c(rep(NA, lag), x[1:(length(x)-lag)])
>>       xlag<- function(x, lag) apply(as.matrix(x), 2, vlag, lag=lag)
>>       all.cov<- unlist(cov)
>>       if (!is.list(cov)) cov<- list(cov)
>>       ncovs<- length(cov)
>>       nrow<- nrow(data)
>>       ncol<- ncol(data)
>>       ncov<- length(cov[[1]])
>>       nobs<- nrow*ncov
>>       if (length(unique(c(sapply(cov, length), length(cov.times)-1)))>  1)
>>           stop(paste(
>>               "all elements of cov must be of the same length and \n",
>>               "cov.times must have one more entry than each element of
>> cov."))
>>       var.names<- names(data)
>>       subjects<- rownames(data)
>>       omit.cols<- if (!common.times) c(all.cov, cov.times) else all.cov
>>       keep.cols<- (1:ncol)[-omit.cols]
>>       nkeep<- length(keep.cols)
>>       if (is.numeric(event)) event<- var.names[event]
>>       times<- if (common.times) matrix(cov.times, nrow, ncov+1, byrow=T)
>>           else data[, cov.times]
>>       new.data<- matrix(Inf, nobs, 3 + ncovs + nkeep)
>>       rownames<- rep("", nobs)
>>       colnames(new.data)<- c('start', 'stop', paste(event, suffix, sep=""),
>>           var.names[-omit.cols], cov.names)
>>       end.row<- 0
>>       for (i in 1:nrow){
>>           start.row<- end.row + 1
>>           end.row<- end.row + ncov
>>           start<- times[i, 1:ncov]
>>           stop<- times[i, 2:(ncov+1)]
>>           event.time<- ifelse (stop == data[i, time]&  data[i, event] ==
>> 1, 1, 0)
>>           keep<- matrix(unlist(data[i, -omit.cols]), ncov, nkeep, byrow=T)
>>           select<- apply(matrix(!is.na(data[i, all.cov]), ncol=ncovs),
>> 1, all)
>>           rows<- start.row:end.row
>>           cov.mat<- xlag(matrix(unlist(data[i, all.cov]),
>> nrow=length(rows)), lag)
>>           new.data[rows[select], ]<-
>>               cbind(start, stop, event.time, keep, cov.mat)[select,]
>>           rownames[rows]<- paste(subjects[i], '.', seq(along=rows), sep="")
>>           }
>>       row.names(new.data)<- rownames
>>       as.data.frame(new.data[new.data[, 1] != Inf&
>>           apply(as.matrix(!is.na(new.data[, cov.names])), 1, all), ])
>>       }
>> #-----------------------------------------------------------------
>>



More information about the R-help mailing list