[R] Fitting weibull and exponential distributions to left censoring data
Christian Ritz
ritz at life.ku.dk
Sun Nov 2 11:22:32 CET 2008
Hi Göran,
the R package NADA is specifically designed for left-censored data:
http://www.statistik.uni-dortmund.de/useR-2008/slides/Helsel+Lee.pdf
The function cenreg() in NADA is a front end to survreg().
Christian
Göran Broström wrote:
> On Fri, Oct 31, 2008 at 2:27 PM, Terry Therneau <therneau at mayo.edu> wrote:
>> Use the survreg function.
>
> The survreg function cannot fit left censored data (correct me if I am
> wrong!), neither can phreg or aftreg (package eha). On the other hand,
> if Borja instead wanted to fit left truncated data (it is a common
> mistake to confuse left truncation with left censoring), it is
> possible to use phreg or aftreg, but still not survreg (again, correct
> me if I am wrong).
>
> If instead Borja really wants to fit left censored data, it is fairly
> simple with the aid of the function optim, for instance by calling
> this function:
>
> left <- function(x, d){
> ## d[i] = FALSE: x[i] is left censored
> ## d[i] = TRUE: x[i] is observed exactly
> loglik <- function(param){# The loglihood function
> lambda <- exp(param[2])
> p <- exp(param[1])
> sum(ifelse(d,
> dweibull(x, p, lambda, log = TRUE),
> pweibull(x, p, lambda, log.p = TRUE)
> )
> )
> }
> par <- c(0, 0)
> res <- optim(par, loglik, control = list(fnscale = -1), hessian = TRUE)
> list(log.shape = res$par[1],
> log.scale = res$par[2],
> shape = exp(res$par[1]),
> scale = exp(res$par[2]),
> var.log = solve(-res$hessian)
> )
> }
>
> Use like this:
>
>> x <- rweibull(500, shape = 2, scale = 1)
>> d <- x > median(x) # 50% left censoring, Type II.
>> y <- ifelse(d, x, median(x))
>> left(y, d)
>
> $log.shape
> [1] 0.707023
>
> $log.scale
> [1] -0.007239744
>
> $shape
> [1] 2.027945
>
> $scale
> [1] 0.9927864
>
> $var.log
> [,1] [,2]
> [1,] 0.0022849526 0.0005949114
> [2,] 0.0005949114 0.0006508635
>
>
>> There are many different ways to parameterize a Weibull. The survreg function
>> imbeds it a general location-scale familiy, which is a different
>> parameterization than the rweibull function.
>>
>>> y <- rweibull(1000, shape=2, scale=5)
>>> survreg(Surv(y)~1, dist="weibull")
>> Coefficients:
>> (Intercept)
>> 1.592543
>>
>> Scale= 0.5096278
>>
>> Loglik(model)= -2201.9 Loglik(intercept only)= -2201.9
>>
>> ----
>>
>> survreg's scale = 1/(rweibull shape)
>> survreg's intercept = log(rweibull scale)
>> For the log-likelihood all parameterizations lead to the same value.
>>
>> There is not "right" or "wrong" parameterization for a Weibull (IMHO),
>
> Correct, but there are two points I would like to add to that:
> (i) It is a good idea to perform optimisation with a parametrization
> that give no range restrictions.
>
> (ii) It is a good idea to transform back the results to the
> parametrization that is standard in R, for comparative reasons.
>
> See for example the function 'left' above.
>
>> but
>> there certainly is a lot of room for confusion. This comes up enough that I
>> have just added it as an example in the survreg help page, which will migrate to
>> the general R distribution in due course.
>>
>> Terry Therneau
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
Original posting: http://tolstoy.newcastle.edu.au/R/e5/help/08/10/5673.html
More information about the R-help
mailing list