[R] Choosing between functional forms using flexible parametric survival models
Bonnett, Laura
L.J.Bonnett at liverpool.ac.uk
Thu Sep 27 17:26:31 CEST 2018
Dear all,
I am using R 3.4.3 on Windows 10 and am writing code for a forthcoming teaching session. As part of the workshop, the students are using the breast cancer data made available by Patrick Royston at http://www.statapress.com/data/fpsaus.html (I didn't pick the dataset, by the way). I would like the students to visualise linear, fractional polynomial and spline transformations of the "nodes" variable using a flexible parametric model with 3 knots for the baseline hazard. I can do this using the "predict" method for stpm2 as follows:
flex_nodes_lin <- stpm2(Surv(rfs/12,rfsi)~nodes, data=Practical_Rott_dev,df=3)
haz_lin <- predict(flex_nodes_lin,type="hazard")
flex_nodes_fp <- stpm2(Surv(rfs/12,rfsi)~log(nodes),data=Practical_Rott_dev,df=3)
haz_fp <- predict(flex_nodes_fp,type="hazard")
spline3 <- stpm2(Surv(rfs/12,rfsi)~1, data=Practical_Rott_dev,df=3)
haz_spline3 <- predict(spline3,type="hazard")
data_part9 <- data.frame(nodes,haz_lin[nodes],haz_spline3[nodes],haz_fp[nodes])
data_part9_m <- melt(data_part9,id.vars='nodes',factorsAsStrings=F)
plot_part9 <- ggplot(data_part9_m,aes(nodes,value,colour=variable))+geom_line()+scale_colour_manual(labels=c("Linear","FP1","Spline 3 knots"),values=c("green","red","blue"))+theme_bw()
plot_part9 + labs(x="Number of positive nodes",y="",color="") + theme(legend.position=c(0.8,0.8))
However, to my mind, using "hazard" (or "survival") leads to plots which do not help in understanding the different functional forms of "nodes". Therefore, I would prefer to do this using the linear predictor for each model instead. I've written the following code to do this:
lp_nodes_lin <- flex_nodes_lin@lm$fitted.values
lp_nodes_spline <- flex_nodes_spline@lm$fitted.values
lp_nodes_fp <- flex_nodes_fp@lm$fitted.values
data_part9 <- data.frame(flex_nodes_lin@lm$model$nodes,lp_nodes_lin,lp_nodes_spline,lp_nodes_fp)
colnames(data_part9)[1] <- "nodes"
data_part9_m <- melt(data_part9,id.vars='nodes')
plot_part9 <- ggplot(data_part9_m,aes(nodes,value,colour=variable))+geom_line()+scale_colour_manual(labels=c("Linear","Spline (3 knots)", "FP1"),values=c("green","red","blue"))+theme_bw()
plot_part9 + labs(x="Number of positive nodes",y="Prediction",color="") + theme(legend.position=c(0.8,0.8))
I have 2 concerns over this:
1. The plots are still not the shape I would expect, i.e. a straight line along the 45-degree line for the linear transformation, and a curve for each of the spline and FP transformations.
2. This code is really complicated - there must be an easier way?!
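For reference, here is a minimal sketch of the shapes described in concern 1. It uses simulated data and a Cox model from the survival package as a stand-in for stpm2. Both choices are assumptions made for illustration only, and the variable names merely mirror those above. The linear predictor is evaluated on a grid of "nodes" values. The linear term gives a straight line whose slope is the fitted coefficient, so it lies along the 45-degree line only if that coefficient happens to be 1. The log (FP1-style) term gives a curve.

```r
# Sketch only: simulated data and a Cox model, not the workshop data or stpm2.
library(survival)

set.seed(2018)
n <- 500
nodes <- rpois(n, lambda = 3) + 1                # at least 1, so log(nodes) is finite
event_time <- rexp(n, rate = 0.05 * sqrt(nodes)) # hazard increases with nodes
cens_time <- runif(n, min = 0, max = 30)         # independent censoring times
sim <- data.frame(
  nodes = nodes,
  rfs   = pmin(event_time, cens_time),           # observed follow-up time
  rfsi  = as.integer(event_time <= cens_time)    # 1 = event, 0 = censored
)

# Same covariate, two functional forms
fit_lin <- coxph(Surv(rfs, rfsi) ~ nodes, data = sim)
fit_log <- coxph(Surv(rfs, rfsi) ~ log(nodes), data = sim)

# Evaluate the linear predictor on a grid of covariate values
grid <- data.frame(nodes = 1:15)
lp_lin <- predict(fit_lin, newdata = grid, type = "lp")
lp_log <- predict(fit_log, newdata = grid, type = "lp")

# Slope of the linear fit between consecutive grid points
slope_lin <- diff(lp_lin) / diff(grid$nodes)

# Plot both functional forms against the covariate
matplot(grid$nodes, cbind(lp_lin, lp_log), type = "l", lty = 1,
        col = c("green", "blue"),
        xlab = "Number of positive nodes", ylab = "Linear predictor")
legend("bottomright", legend = c("Linear", "log(nodes)"),
       col = c("green", "blue"), lty = 1)
```

Predicting on a grid with newdata, rather than at the observed values, keeps the x-axis sorted and free of duplicates, so the plotted lines are clean.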
Any help gratefully received!
Kind regards,
Laura
P.S. If I were doing this in logistic regression, the code would be relatively simple:
age_mod <- glm(DAY30~AGE,family="binomial")
lp_age_lin <- predict(age_mod)
agefp1_mod <- mfp(DAY30~fp(AGE,df=2,alpha=1),family="binomial")
lp_agefp1 <- predict(agefp1_mod)
age3_mod <- glm(DAY30~age3_spline,family="binomial")
lp_age3 <- predict(age3_mod)
data_part8 <- data.frame(AGE,lp_age_lin,lp_agefp1,lp_age3)
data_part8_m <- melt(data_part8,id.vars='AGE')
plot_part8 <- ggplot(data_part8_m,aes(AGE,value,colour=variable))+geom_line()+scale_colour_manual(labels=c("Linear","FP1","Spline 3 knots"),values=c("green","blue","red"))+theme_bw()
plot_part8 + labs(x="Age (years)",y="Linear Predictor (log odds)",color="") + theme(legend.position=c(0.2,0.8))