[R] Fitting weibull, exponential and lognormal distributions to left-truncated data.

vito muggeo vmuggeo at dssm.unipa.it
Tue Oct 7 12:45:49 CEST 2008


Hi Gough,
A possible solution is to use survreg() from the survival package 
without specifying any covariates, i.e.

library(survival)
survreg(Surv(..)~1, dist="weibull")

where Surv(..) takes the "times" and the censoring/truncation 
variables, and dist lets you specify alternative distributions.
See ?Surv and ?survreg.
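
For concreteness, here is a minimal sketch of that call on simulated data. The data, the seed and the variable names (x, status, fit) are made up for illustration. survreg() parameterises the Weibull on a log-linear scale, so the shape and scale used by dweibull()/rweibull() are recovered as 1/fit$scale and exp(intercept). One caveat worth checking against your installed version: as far as I know survreg() does not accept start-stop Surv objects, so this sketch handles censoring but not the truncation itself.

```r
library(survival)

set.seed(1)                                   # illustrative data only
x <- rweibull(2000, shape = 1.5, scale = 10)
status <- rep(1, length(x))                   # 1 = event observed (no censoring)

# Intercept-only Weibull fit, i.e. survreg(Surv(..) ~ 1, dist = "weibull")
fit <- survreg(Surv(x, status) ~ 1, dist = "weibull")

# Convert survreg's parameterisation to that of dweibull()/rweibull()
shape_hat <- 1 / fit$scale
scale_hat <- unname(exp(coef(fit)))
c(shape = shape_hat, scale = scale_hat)

# Alternative distributions only need a different dist argument
fit_exp   <- survreg(Surv(x, status) ~ 1, dist = "exponential")
fit_lnorm <- survreg(Surv(x, status) ~ 1, dist = "lognormal")
```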

Hope this helps,

Gough Lauren wrote:
> Dear All,
> 
> I have two questions regarding distribution fitting.
> 
> I have several datasets, all left-truncated at x=1, that I am attempting
> to fit distributions to (lognormal, weibull and exponential).  I had
> been using fitdistr in the MASS package as follows:
> 
> library(MASS)
> fitdistr(x,"weibull")
> 
> However, this does not take into consideration the truncation at x=1.  I
> read another posting in this forum that suggested using the argument
> "lower" to truncate the distribution fitting.  However, this does not
> seem to be working.  For example, when I attempt to fit a weibull
> distribution truncated at x=1 using "lower", it seems to set the
> best-fit shape parameter at 1:
> 
>> fitdistr(x,"weibull",lower=1)
>      shape        scale   
>   1.00000000   9.87964337 
>  (0.02358731) (0.40649570)
> ## I have tried this on other datasets also truncated at x=1 and get
> ## the same result (i.e. shape=1).
> 
> Does anyone know how to successfully fit the exponential, weibull and
> lognormal distributions to truncated data?
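
On the "lower" argument: in fitdistr() it is passed on to optim() and bounds the parameters, not the data, which would explain a shape estimate sitting exactly at 1. One way to account for the truncation is to maximise the truncated likelihood directly, i.e. use the density f(x) / P(X > 1) for x > 1. The following is only a sketch under my own assumptions: the simulated data, the helper name nll_trunc_weibull and the starting values are illustrative, and the optimisation is done with optim() rather than fitdistr(). The same pattern should carry over to the other two distributions by swapping in dlnorm()/plnorm() or dexp()/pexp().

```r
set.seed(1)                                    # illustrative data only
obs <- rweibull(3000, shape = 1.5, scale = 10)
obs <- obs[obs > 1]                            # keep only values above the truncation point

# Negative log-likelihood of a Weibull left-truncated at `trunc`.
# Parameters are optimised on the log scale so they stay positive.
nll_trunc_weibull <- function(logpar, x, trunc = 1) {
  shape <- exp(logpar[1])
  scale <- exp(logpar[2])
  -sum(dweibull(x, shape, scale, log = TRUE) -
         pweibull(trunc, shape, scale, lower.tail = FALSE, log.p = TRUE))
}

# Start from shape = 1 and scale = mean(obs), both on the log scale
opt <- optim(c(0, log(mean(obs))), nll_trunc_weibull, x = obs, method = "BFGS")
est <- c(shape = exp(opt$par[1]), scale = exp(opt$par[2]))
est
```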
> 
> 
> 
> Secondly, as my datasets are large (>1000 data points), assessing the fit
> of the distributions with Kolmogorov-Smirnov goodness-of-fit tests
> routinely shows statistical significance for all distributions.
> Therefore, I would like to plot the observed data with the theoretical
> best fit distributions (weibull, exponential and lognormal) to visually
> assess which fits the observed data best.  So far I have been doing this
> as follows:
> 
>> fitdistr(x,"weibull")
> shape        scale   
>   a   		b 
> 
>> D1<-density(x) ##density distribution of observed data
>> ## density of a random variable following the theoretical best-fit
>> ## weibull distribution with shape parameter = a, scale parameter = b
>> D2<-density(rweibull(1500,shape=a,scale=b))
> 
>> plot(range(D1$x),range(D1$y,D2$y),type="n",xlab="x",ylab="Density")
>> lines(D1,col="red")
>> lines(D2,col="blue")
> 
> This successfully plots the two density curves on the same graph, but it
> plots data below the x=1 threshold - even for the observed data!  I have
> tried limiting the scale of x-axis using xlim=c(1,150) but the graph
> still plots the origin of the graph as (0,0).  I can only get different
> origins if I limit x more extremely e.g. xlim=c(50,150).  Does anyone
> know how I can successfully change the origin of the graph to (1,0)?
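
On the plotting question, a possible sketch, with some assumptions of mine: the data are simulated, a and b are stand-ins for the fitted shape and scale, and dtrunc is a hypothetical helper. By default R pads each axis by about 4% beyond xlim (xaxs = "r"), which is why 0 still appears; xaxs = "i" and yaxs = "i" make the axes start exactly at the given limits. density(obs, from = 1) restricts the grid to x >= 1, although the kernel estimate still smooths across the boundary and so dips near 1. The fitted curve is drawn from the truncated density itself instead of from a second random sample.

```r
set.seed(1)                                   # illustrative data only
obs <- rweibull(3000, shape = 1.5, scale = 10)
obs <- obs[obs > 1]                           # left-truncated at 1
a <- 1.5; b <- 10                             # stand-ins for the fitted shape and scale

D1 <- density(obs, from = 1)                  # evaluate the estimate on x >= 1 only

# Weibull density renormalised to the region above the truncation point
dtrunc <- function(t) {
  dweibull(t, a, b) / pweibull(1, a, b, lower.tail = FALSE)
}

# xaxs/yaxs = "i" removes the default padding, so the axes meet at (1, 0)
plot(D1, xlim = c(1, max(obs)), ylim = c(0, 1.2 * max(D1$y)),
     xaxs = "i", yaxs = "i", col = "red",
     xlab = "x", ylab = "Density", main = "")
curve(dtrunc(x), from = 1, to = max(obs), add = TRUE, col = "blue")
axis(1, at = 1)                               # label the truncation point explicitly
```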
> 
> 
> Sorry for the long e-mail! Any help would be greatly appreciated.
> 
> Regards,
> 
> Lauren

-- 
====================================
Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 6626240
fax: 091 485726/485612
http://dssm.unipa.it/vmuggeo


