Dear all,

I am using R 3.4.3 on Windows 10 and am writing code for a forthcoming 
teaching session.  As part of the workshop the students are using breast cancer 
data provided by Patrick Royston, available from 
http://www.statapress.com/data/fpsaus.html (I didn't pick the dataset, by the 
way).  I would like the students to visualise linear, fractional polynomial and 
spline transformations of the "nodes" variable using a flexible parametric model 
with 3 knots for the baseline hazard.  I can do this using the "predict" method 
for stpm2 as follows:

flex_nodes_lin <- stpm2(Surv(rfs/12,rfsi)~nodes, data=Practical_Rott_dev,df=3)
haz_lin <- predict(flex_nodes_lin,type="hazard")

flex_nodes_fp <- stpm2(Surv(rfs/12,rfsi)~log(nodes),
                       data=Practical_Rott_dev,df=3)
haz_fp <- predict(flex_nodes_fp,type="hazard")

spline3 <- stpm2(Surv(rfs/12,rfsi)~1, data=Practical_Rott_dev,df=3)
haz_spline3 <- predict(spline3,type="hazard")

data_part9 <- data.frame(nodes,haz_lin[nodes],haz_spline3[nodes],haz_fp[nodes])
data_part9_m <- melt(data_part9,id.vars='nodes',factorsAsStrings=F)
# Labels follow the column order of data_part9: linear, spline, FP1
plot_part9 <- ggplot(data_part9_m,aes(nodes,value,colour=variable))+
  geom_line()+
  scale_colour_manual(labels=c("Linear","Spline 3 knots","FP1"),
                      values=c("green","red","blue"))+
  theme_bw()
plot_part9 + labs(x="Number of positive nodes",y="",color="") + 
  theme(legend.position=c(0.8,0.8))

However, to my mind using "hazard" (or "survival") leads to a plot that does 
not help in understanding the different functional forms of "nodes".  
Therefore, I would prefer to do this using the linear predictor for each model 
instead.  I've written the following code to do this:
lp_nodes_lin <- flex_nodes_lin@lm$fitted.values
lp_nodes_spline <- spline3@lm$fitted.values
lp_nodes_fp <- flex_nodes_fp@lm$fitted.values

data_part9 <- 
data.frame(flex_nodes_lin@lm$model$nodes,lp_nodes_lin,lp_nodes_spline,lp_nodes_fp)
colnames(data_part9)[1] <- "nodes"

data_part9_m <- melt(data_part9,id.vars='nodes')
plot_part9 <- ggplot(data_part9_m,aes(nodes,value,colour=variable))+
  geom_line()+
  scale_colour_manual(labels=c("Linear","Spline (3 knots)","FP1"),
                      values=c("green","red","blue"))+
  theme_bw()
plot_part9 + labs(x="Number of positive nodes",y="Prediction",color="") + 
  theme(legend.position=c(0.8,0.8))

I have 2 concerns over this:

1. The plots are still not the shape I would expect them to be, i.e. a line 
along the 45 degree line for the linear transformation, and a curve for each 
of the spline and FP transformations.

2. This code is really complicated - there must be an easier way?!
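
To make concern 1 concrete, here is a minimal sketch of the kind of plot I am 
after.  It is not stpm2: it uses coxph from the survival package on simulated 
data as a stand-in, and the names sim, grid, fit_lin and fit_log are made up 
for the illustration.  The linear predictor is evaluated on a sorted grid with 
one row per distinct value of nodes, so each model gives a single line:

```r
library(survival)

set.seed(1)
n <- 500

# Simulated stand-in data (assumption: NOT the breast cancer data above)
sim <- data.frame(nodes = rpois(n, 3) + 1)          # >= 1 so log() is defined
sim$time  <- rexp(n, rate = 0.1 * sim$nodes^0.3)    # hazard rises with nodes
sim$event <- rbinom(n, 1, 0.8)                      # roughly 20% censored

# Same covariate, two functional forms
fit_lin <- coxph(Surv(time, event) ~ nodes,      data = sim)
fit_log <- coxph(Surv(time, event) ~ log(nodes), data = sim)

# Prediction grid: one sorted row per distinct value of nodes
grid <- data.frame(nodes = sort(unique(sim$nodes)))

# type = "lp" gives the (centred) linear predictor, so only the shape matters
grid$lp_lin <- predict(fit_lin, newdata = grid, type = "lp")
grid$lp_log <- predict(fit_log, newdata = grid, type = "lp")

# Base graphics to keep the sketch free of extra packages
matplot(grid$nodes, as.matrix(grid[, c("lp_lin", "lp_log")]),
        type = "l", lty = 1, col = c("green", "blue"),
        xlab = "Number of positive nodes", ylab = "Linear predictor")
legend("topleft", legend = c("Linear", "log(nodes)"),
       lty = 1, col = c("green", "blue"))
```

In this sketch the linear term gives a straight line whose slope is the fitted 
coefficient (so not necessarily the 45 degree line), and the log term gives a 
curve.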

Any help gratefully received!

Kind regards,
Laura

P.S. If I were doing this with logistic regression, the code would be 
relatively simple:
age_mod <- glm(DAY30~AGE,family="binomial")
lp_age_lin <- predict(age_mod)

agefp1_mod <- mfp(DAY30~fp(AGE,df=2,alpha=1),family="binomial")
lp_agefp1 <- predict(agefp1_mod)

age3_mod <- glm(DAY30~age3_spline,family="binomial")
lp_age3 <- predict(age3_mod)

data_part8 <- data.frame(AGE,lp_age_lin,lp_agefp1,lp_age3)
data_part8_m <- melt(data_part8,id.vars='AGE')
plot_part8 <- ggplot(data_part8_m,aes(AGE,value,colour=variable))+
  geom_line()+
  scale_colour_manual(labels=c("Linear","FP1","Spline 3 knots"),
                      values=c("green","blue","red"))+
  theme_bw()
plot_part8 + labs(x="Age (years)",y="Linear Predictor (log odds)",color="") + 
  theme(legend.position=c(0.2,0.8))


______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
