[R] Fitting weibull and exponential distributions to left censoring data

Göran Broström goran.brostrom at gmail.com
Sat Nov 1 21:35:09 CET 2008


On Fri, Oct 31, 2008 at 2:27 PM, Terry Therneau <therneau at mayo.edu> wrote:
>  Use the survreg function.

The survreg function cannot fit left censored data (correct me if I am
wrong!), neither can phreg or aftreg (package eha). On the other hand,
if Borja instead wanted to fit left truncated data (it is a common
mistake to confuse left truncation with left censoring), it is
possible to use phreg or aftreg, but still not survreg (again, correct
me if I am wrong).

If instead Borja really wants to fit left censored data, it is fairly
simple with the aid of the function optim, for instance by calling
this function:

left <- function(x, d){
    ## d[i] = FALSE: x[i] is left censored
    ## d[i] = TRUE: x[i] is observed exactly
    loglik <- function(param){# The loglihood function
        lambda <- exp(param[2])
        p <- exp(param[1])
        sum(ifelse(d,
                        dweibull(x, p, lambda, log = TRUE),
                        pweibull(x, p, lambda, log.p = TRUE)
                   )
            )
    }
    par <- c(0, 0)
    res <- optim(par, loglik, control = list(fnscale = -1), hessian = TRUE)
    list(log.shape = res$par[1],
         log.scale = res$par[2],
         shape = exp(res$par[1]),
         scale = exp(res$par[2]),
         var.log = solve(-res$hessian)
         )
}

Use like this:

> x <- rweibull(500, shape = 2, scale = 1)
> d <- x > median(x) # 50% left censoring, Type II.
> y <- ifelse(d, x, median(x))
> left(y, d)

$log.shape
[1] 0.707023

$log.scale
[1] -0.007239744

$shape
[1] 2.027945

$scale
[1] 0.9927864

$var.log
             [,1]         [,2]
[1,] 0.0022849526 0.0005949114
[2,] 0.0005949114 0.0006508635


>
>  There are many different ways to parameterize a Weibull.  The survreg function
> imbeds it a general location-scale familiy, which is a different
> parameterization than the rweibull function.
>
>> y <- rweibull(1000, shape=2, scale=5)
>> survreg(Surv(y)~1, dist="weibull")
>
> Coefficients:
> (Intercept)
>   1.592543
>
> Scale= 0.5096278
>
> Loglik(model)= -2201.9   Loglik(intercept only)= -2201.9
>
> ----
>
>  survreg's scale  =    1/(rweibull shape)
>  survreg's intercept = log(rweibull scale)
>  For the log-likelihood all parameterizations lead to the same value.
>
>  There is not "right" or "wrong" parameterization for a Weibull (IMHO),

Correct, but there are two points I would like to add to that:
(i) It is a good idea to perform optimisation with a parametrization
that give no range restrictions.

(ii) It is a good idea to transform back the results to the
parametrization that is standard in R, for comparative reasons.

See for example the function 'left' above.

> but
> there certainly is a lot of room for confusion.  This comes up enough that I
> have just added it as an example in the survreg help page, which will migrate to
> the general R distribution in due course.
>
>  Terry Therneau
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
-- 
Göran Broström


More information about the R-help mailing list