[R] Fitting weibull, exponential and lognormal distributions to left-truncated data.

Prof Brian Ripley ripley at stats.ox.ac.uk
Wed Oct 8 13:09:04 CEST 2008


On Wed, 8 Oct 2008, Gough Lauren wrote:

> Hi,
>
> Thank you very much for your reply. This seems to be working OK when
> fitting weibull and lognormal distributions.  However, fitdistr now
> requires me to include start values:

As documented.

>> ltwei<-function(x,shape,scale,log=FALSE){
> + dweibull(x,shape,scale,log)/pweibull(1,shape,scale,lower=FALSE)
> + }
>> ltweifit<-fitdistr(x,ltwei) # x is observed data
> Error in fitdistr(x, ltwei) : 'start' must be a named list
>> ltweifit<-fitdistr(x,ltwei,start=list(shape=0.5,scale=0.5))
> There were 34 warnings (use warnings() to see them)
>> ltweifit
>      shape         scale
>   1.11108278   13.00703630
> ( 0.01936651) ( 0.42897340)
>
> Is there any way I can fit to truncated data without having to name start
> values?  Alternatively, is there any recommended technique for choosing
> sensible start values?

Not really; it depends on how heavy the truncation is.
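
One possible approach when the truncation is light, offered as a sketch rather than a recommendation: fit the ordinary (untruncated) distribution first and pass its estimates as the start values for the truncated fit. The data below are simulated, and shape = 1.2 and scale = 5 are invented for illustration. The truncated density is written on the log scale because, as far as I can tell, current versions of fitdistr call densfun with log = TRUE whenever it has a log argument. In that case dividing dweibull(..., log) by the survival probability is only correct for log = FALSE, whereas subtracting the log survival probability is correct in both cases.

```r
library(MASS)

# Weibull density left-truncated at 1, computed on the log scale so that
# log = TRUE returns a genuine log-density.
ltwei <- function(x, shape, scale = 1, log = FALSE) {
    ld <- dweibull(x, shape, scale, log = TRUE) -
        pweibull(1, shape, scale, lower.tail = FALSE, log.p = TRUE)
    if (log) ld else exp(ld)
}

# Simulated stand-in for the observed data (parameter values are made up).
set.seed(1)
y <- rweibull(5000, shape = 1.2, scale = 5)
x <- y[y > 1]    # only values above the truncation point are ever seen

# Start from the untruncated fit; as.list() keeps the names fitdistr needs.
start <- as.list(fitdistr(x, "weibull")$estimate)
ltweifit <- fitdistr(x, ltwei, start = start)
ltweifit
```

With heavy truncation the untruncated estimates can be far from the truncated ones, so it is worth trying a few different start values and comparing the fitted log-likelihoods.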

> Further, when I try to fit an exponential distribution I get an error
> message:

But a truncated exponential is just a shifted exponential and has one 
parameter -- you gave it two!  Just fit an exponential to x-1.
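
To spell out why the shift works: for x >= 1 the truncated density is rate * exp(-rate * x) / exp(-rate) = rate * exp(-rate * (x - 1)), which is the ordinary exponential density evaluated at x - 1. A minimal sketch on simulated data (the rate of 0.3 is invented for illustration):

```r
library(MASS)

# Simulated stand-in for the observed data, left-truncated at 1.
set.seed(1)
y <- rexp(5000, rate = 0.3)
x <- y[y > 1]

# Fit an ordinary exponential to the shifted data; no start values needed.
ltexpfit <- fitdistr(x - 1, "exponential")
ltexpfit

# The closed-form MLE gives the same number.
1 / mean(x - 1)
```

If the general fitdistr(x, ltexp, start = ...) route is still wanted, the start list needs a name that matches the density's argument, e.g. start = list(rate = 0.1). An unnamed list(0.1) leaves the estimate without a name, which seems to be what makes printing fail at dn[[2]].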

>> ltexp<-function(x,rate,log=FALSE){
> + dexp(x,rate,log)/pexp(1,rate,lower=FALSE)
> + }
>> ltexpfit<-fitdistr(x,ltexp)
> Error in fitdistr(x, ltexp) : 'start' must be a named list
>> ltexpfit<-fitdistr(x,ltexp,start=list(0.1))
> Warning message:
> In optim(x = c(2.541609, 1.436143, 4.600524, 6.437174, 2.84974,  :
>  one-diml optimization by Nelder-Mead is unreliable: use optimize
>> ltexpfit
> Error in dn[[2]] : subscript out of bounds
>
> This error message seems to occur regardless of the start value used.
> Do you know why this is?
>
> Sorry to pester you again, and apologies if I am asking silly questions
> - my knowledge of R and probability distributions (except the normal!)
> is rather limited!
>
> Best wishes
>
> Lauren
>
> -----Original Message-----
> From: Prof Brian Ripley [mailto:ripley at stats.ox.ac.uk]
> Sent: 07 October 2008 12:25
> To: Richard.Cotton at hsl.gov.uk
> Cc: Gough Lauren; vito muggeo; r-help at r-project.org
> Subject: Re: [R] Fitting weibull, exponential and lognormal
> distributions to left-truncated data.
>
> On Tue, 7 Oct 2008, Richard.Cotton at hsl.gov.uk wrote:
>
>>>> I have several datasets, all left-truncated at x=1, that I am
>>>> attempting to fit distributions to (lognormal, weibull and
>>>> exponential).  I had been using fitdistr in the MASS package as
>>>> follows:
>>
>>> A possible solution is to use the survreg() in the survival package
>>> without specifying the covariates, i.e.
>>>
>>> library(survival)
>>> survreg(Surv(..)~1, dist="weibull")
>>>
>>> where Surv(..) accepts information about "times" and
>>> censoring/truncation variables, and dist lets you specify
>>> alternative distributions.
>>> See ?Surv and ?survreg
>>
>> The survival package is mostly targeted at right-censored data.  The
>> NADA package provides wrappers for many of the survival routines so
>> they work with left-censored data.
>
> Left-censoring and left-truncation are not the same thing.  With
> left-censoring you see that you had observations < 1, and with
> left-truncation you do not (at least how the terms are usually applied:
> occasionally the meanings are reversed).
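
The difference shows up directly in the likelihood. As a rough sketch for a Weibull with cut-off 1 (the function names loglik_censored and loglik_truncated are hypothetical, chosen only for illustration): under left-censoring the number of observations below 1 is known and each contributes P(X < 1); under left-truncation that count is unknown, so each observed value is conditioned on X > 1.

```r
# Left-censoring at 1: n_below observations are known to lie below 1.
loglik_censored <- function(x_obs, n_below, shape, scale) {
    sum(dweibull(x_obs, shape, scale, log = TRUE)) +
        n_below * pweibull(1, shape, scale, log.p = TRUE)
}

# Left-truncation at 1: values below 1 are never recorded at all,
# so every observed value is conditioned on exceeding 1.
loglik_truncated <- function(x_obs, shape, scale) {
    sum(dweibull(x_obs, shape, scale, log = TRUE)) -
        length(x_obs) *
            pweibull(1, shape, scale, lower.tail = FALSE, log.p = TRUE)
}
```
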
>
> For left-truncation it is relatively easy, e.g.
>
> ltwei <- function(x, shape, scale = 1, log = FALSE)
>     dweibull(x, shape, scale, log) /
>         pweibull(1, shape, scale, lower.tail = FALSE)
>
> and use this in fitdistr.
>
> -- 
> Brian D. Ripley,                  ripley at stats.ox.ac.uk
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford,             Tel:  +44 1865 272861 (self)
> 1 South Parks Road,                     +44 1865 272866 (PA)
> Oxford OX1 3TG, UK                Fax:  +44 1865 272595
>

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


