[R] Predicted Cox survival curves - factor coding problems..

Terry Therneau therneau at mayo.edu
Mon May 7 14:53:29 CEST 2007


  The combination of survfit, coxph, and factors is getting confused.  survfit
is not smart enough to match a new data frame that contains a numeric sitenew
to a fit in which that variable was a factor.  (Perhaps it should at least be
smart enough to die gracefully, but it is not.)

   The simple solution is to not use factors: code sitenew as 0/1 indicator
variables yourself.
   
# 0/1 indicators for sites 1 and 2; any other site is the reference group
site1 <- 1*(coxsnps$sitenew==1)
site2 <- 1*(coxsnps$sitenew==2)
test1 <- coxph(Surv(time, censor) ~ snp1 + sex + site1 + site2 + gene +
               eth.self + strata(edu), data= coxsnps)
	  
	 output

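# one row per curve: snp1 = 0 and snp1 = 1, every other covariate at 0;
# the column names must match the variables in the model formula
profile1 <- data.frame(snp1=c(0,1), sex=c(0,0), site1=c(0,0), site2=c(0,0),
                       gene=c(0,0), eth.self=c(0,0))
plot(survfit(test1, newdata=profile1))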
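
 For anyone without the coxsnps data, here is a self-contained sketch of the
same recipe.  It assumes the lung data set that ships with the survival
package, with ph.ecog standing in for sitenew; the names d, ecog1, ecog2, fit,
newd and sf are made up for the illustration and are not part of the original
analysis.

```r
library(survival)

# Stand-in data (an assumption): lung from the survival package, keeping
# the rows where ph.ecog is 0, 1 or 2 so it acts like a three-level site code.
d <- subset(lung, !is.na(ph.ecog) & ph.ecog < 3)

# Hand-made 0/1 indicators instead of factor(ph.ecog); ph.ecog == 0 is the
# reference group.
d$ecog1 <- 1 * (d$ph.ecog == 1)
d$ecog2 <- 1 * (d$ph.ecog == 2)

fit <- coxph(Surv(time, status) ~ age + ecog1 + ecog2, data = d)

# New data uses plain numerics with the same names as the model terms,
# one row per predicted curve.
newd <- data.frame(age = c(50, 70), ecog1 = c(0, 0), ecog2 = c(0, 0))
sf <- survfit(fit, newdata = newd)
plot(sf)
```

 Because every covariate in the fit is numeric, the new data frame only has to
supply numerics under the same names, so there are no factor levels to match.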

 Note that you do not have to explicitly make "edu" a factor.  Since it is
included in a strata statement, the coxph routine must treat it as discrete
groups.
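
 A small sketch of that point, again assuming the lung data from the survival
package rather than the original data: sex is stored there as a plain numeric
(1/2), yet strata(sex) still gives one baseline curve per value.  The names
fit_s and sf_s are illustrative only.

```r
library(survival)

# sex is a numeric column in lung, not a factor
is.numeric(lung$sex)

# strata() treats the distinct values of sex as separate groups
fit_s <- coxph(Surv(time, status) ~ age + strata(sex), data = lung)

# With no newdata, survfit returns one curve per stratum
sf_s <- survfit(fit_s)
names(sf_s$strata)
plot(sf_s)
```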



	Terry Therneau


