[R] using survreg() in survival package with "long" data

Fox, John jfox at mcmaster.ca
Sat Aug 29 23:56:18 CEST 2015


Dear Göran,

Thank you for responding to my query; please see below:

> -----Original Message-----
> From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of Göran
> Broström
> Sent: August 29, 2015 3:59 PM
> To: r-help at r-project.org
> Subject: Re: [R] using survreg() in survival package with "long" data
> 
> Dear John,
> 
> I think you are missing that 'survreg' does not handle left truncated data. With
> that you should use the 'eha' package, for a PH model the function 'phreg', and
> for an AFT model the function 'aftreg' (you didn't tell which model you want to
> fit).

That's odd, in that it's not true in general, since, e.g., survreg() can be used to fit the left-censored Tobit regression model, as illustrated in this example from ?survreg: 
	tobinfit <- survreg(Surv(durable, durable>0, type='left') ~ age + quant, data=tobin, dist='gaussian')

In fact, in my example, the data are right-censored, but in the second data set are represented in counting-process form as one-week intervals. I imagine that you're right in the sense that survreg() can't handle data like this in counting-process form. Yet, I've reviewed the documentation in the survey package but can't find any reference to this -- which is not to say that it's not there, only that I don't see it.

> 
> Your attempt with the 'survreg' function implies that you are satisfied with a
> Weibull baseline distribution, in which case you could choose either model.

Right, I understand that this is the default. The problem is just a small toy example meant to illustrate the error.

Thanks again,
 John

> 
> Göran
> 
> 
> On 08/29/2015 07:06 PM, Fox, John wrote:
> > Dear list members,
> >
> > I'm unable to fit a parametric survival regression using survreg() in
> > the survival package with data in "counting-process" ("long") form.
> >
> > To illustrate using a scaled-down problem with 10 subjects (with data
> > placed on the web):
> >
> > --------------- snip ------------
> >> library(survival) RW <-
> >> read.table("http://socserv.mcmaster.ca/jfox/.Pickup/RW.txt") RL <-
> >> read.table("http://socserv.mcmaster.ca/jfox/.Pickup/RL.txt")
> >
> >> RW # "wide" data
> > week arrest age 1    20      1  27 2    17      1  18 3    25      1
> > 19 4    52      0  23 5    52      0  19 6    52      0  24 7    23
> > 1  25 8    52      0  21 9    52      0  22 10   52      0  20
> >
> >> head(RL, 20) # "long" data, counting-process form
> > start stop arrest.time age 1.1      0    1           0  27 1.2      1
> > 2           0  27 1.3      2    3           0  27 1.4      3    4
> > 0  27 1.5      4    5           0  27 1.6      5    6           0
> > 27 1.7      6    7           0  27 1.8      7    8           0  27
> > 1.9      8    9           0  27 1.10     9   10           0  27 1.11
> > 10   11           0  27 1.12    11   12           0  27 1.13    12
> > 13           0  27 1.14    13   14           0  27 1.15    14   15
> > 0  27 1.16    15   16           0  27 1.17    16   17           0
> > 27 1.18    17   18           0  27 1.19    18   19           0  27
> > 1.20    19   20           1  27
> >
> > --------------- snip ------------
> >
> > I have no trouble fitting a Cox model to both the wide and long forms
> > of the data, obtaining (as should be the case) identical results:
> >
> > --------------- snip ------------
> >
> >> coxph(Surv(week, arrest) ~ age, data=RW)  # works
> > Call: coxph(formula = Surv(week, arrest) ~ age, data = RW)
> >
> >
> > coef exp(coef) se(coef)    z    p age 0.0963    1.1011   0.2073 0.46
> > 0.64
> >
> > Likelihood ratio test=0.21  on 1 df, p=0.643 n= 10, number of events=
> > 4
> >> coxph(Surv(start, stop, arrest.time) ~ age,  data=RL) # works, same
> > Call: coxph(formula = Surv(start, stop, arrest.time) ~ age, data =
> > RL)
> >
> >
> > coef exp(coef) se(coef)    z    p age 0.0963    1.1011   0.2073 0.46
> > 0.64
> >
> > Likelihood ratio test=0.21  on 1 df, p=0.643 n= 397, number of events=
> > 4
> >
> > --------------- snip ------------
> >
> > But when I try to fit a parametric survival regression with survreg(),
> > I get an error with the long form of the data:
> >
> > --------------- snip ------------
> >
> >> survreg(Surv(week, arrest) ~ age, data=RW) # works
> > Call: survreg(formula = Surv(week, arrest) ~ age, data = RW)
> >
> > Coefficients: (Intercept)         age 6.35386771 -0.08982624
> >
> > Scale= 0.7363196
> >
> > Loglik(model)= -22.1   Loglik(intercept only)= -22.2 Chisq= 0.3 on 1
> > degrees of freedom, p= 0.58 n= 10
> >
> >> survreg(Surv(start, stop, arrest.time) ~ age,  data=RL) # fails
> > Error in survreg(Surv(start, stop, arrest.time) ~ age, data = RL) :
> > Invalid survival type
> >
> > --------------- snip ------------
> >
> > I expect that there's something about survreg() that I'm missing. I
> > first noted this problem quite some time ago but didn't look into it
> > carefully because I didn't really need to use survreg().
> >
> > Any help would be appreciated.
> >
> > Thanks, John ----------------------------- John Fox, Professor
> > McMaster University Hamilton, Ontario Canada L8S 4M4 Web:
> > socserv.mcmaster.ca/jfox
> >
> > ______________________________________________ R-help at r-project.org
> > mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the
> > posting guide http://www.R-project.org/posting-guide.html and provide
> > commented, minimal, self-contained, reproducible code.
> >
> 
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



More information about the R-help mailing list