# [R] Problem with predict.coxph

Ramon Diaz-Uriarte rdiaz02 at gmail.com
Thu Aug 20 10:45:47 CEST 2009

```
Dear Terry,

Here are two more simple examples in which no issue with factor
levels, etc., arises. The problem seems to show up consistently when
the number of rows of "newdata" differs from the number of rows in
the original data set AND time and status are not part of the data
frame passed as "data".

****************
library(survival)

rm(list = ls())
time <- c(4,3,1,1,2,2,3)
status <- c(1,1,1,0,1,1,0)
d3 <- data.frame(x=c(0,2,1,1,1,0,0),
                 u=c(0,0,0,0,1,1,1))

m3 <- coxph(Surv(time, status) ~ x + u, data = d3)

predict(m3) # OK
predict(m3, newdata = d3) # OK
predict(m3, newdata = d3[1:5, ]) # Fails
predict(m3, newdata = d3[c(1:3, 1:4), ]) # OK

rm(list = ls())
d1 <- data.frame(time = c(4,3,1,1,2,2,3),
                 status = c(1,1,1,0,1,1,0),
                 x = c(0,2,1,1,1,0,0),
                 u = c(0,0,0,0,1,1,1))

m1 <- coxph(Surv(time, status) ~ x + u, data = d1)

predict(m1) ## OK
predict(m1, newdata = d1) ## OK
predict(m1, newdata = d1[1:5, ]) ## OK
predict(m1, newdata = d1[c(1:3, 1:4), ]) ## OK

********

Best,

R.

On Wed, Aug 19, 2009 at 5:53 PM, Terry Therneau <therneau at mayo.edu> wrote:
> -- begin included message ---
> We occasionally utilize the coxph function in the survival library to
> fit multinomial logit models. (The breslow method produces the same
> likelihood function as the multinomial logit). We then utilize the
> predict function to create summary results for various combinations of
> covariates.  For example:
>
> ...
>
> The problem is that under R 2.8.1 and R 2.9.1 the previous line fails
> with the following error:
>
>> totalut<-predict(mod1,newdata=newdata,type="lp")
> Error in model.frame.default(Terms2, newdata, xlev = object$xlevels) :
>  variable lengths differ (found for 'Price')
> 'newdata' had 25 rows but variable(s) found have 43350 rows
>
> -----------end inclusion --------------
>
>
>  The coxph code was updated to use the "standard" R methods for
> prediction with factors.  I even added test cases -- but obviously I've
> missed something.  I'll look into a fix and get back to you.
>  Can I assume that Price and Product were both factors with the 5
> levels 1:5?
>
>        Terry Therneau
>

--
Ramon Diaz-Uriarte
Structural Biology and Biocomputing Programme
Spanish National Cancer Centre (CNIO)
http://ligarto.org/rdiaz
Phone: +34-91-732-8000 ext. 3019

```
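A minimal sketch of the workaround that the two examples in the message suggest: keep time and status inside the data frame passed as `data`, so that `predict()` with a shorter `newdata` resolves every variable from the data frame. The names `d3_full`, `m3_full`, `nd` and `lp_new` are made up for illustration, and whether the first example still fails at all depends on the installed version of the survival package.

```r
library(survival)

# Outcome vectors living outside the covariate data frame,
# as in the first example of the message.
time   <- c(4, 3, 1, 1, 2, 2, 3)
status <- c(1, 1, 1, 0, 1, 1, 0)
d3 <- data.frame(x = c(0, 2, 1, 1, 1, 0, 0),
                 u = c(0, 0, 0, 0, 1, 1, 1))

# Workaround (illustrative names): put time and status into the data
# frame before fitting, which is the layout of the second example.
d3_full <- data.frame(time = time, status = status, d3)
m3_full <- coxph(Surv(time, status) ~ x + u, data = d3_full)

# newdata with fewer rows than the data used for fitting
nd <- d3_full[1:5, ]
lp_new <- predict(m3_full, newdata = nd, type = "lp")

# One linear predictor per row of newdata
print(lp_new)
```

With this layout the call corresponds to `predict(m1, newdata = d1[1:5, ])` in the message, which is reported there as OK.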