[R] Fitting weibull, exponential and lognormal distributions to left-truncated data.
vito muggeo
vmuggeo at dssm.unipa.it
Tue Oct 7 12:45:49 CEST 2008
Hi Gough,
A possible solution is to use the survreg() in the survival package
without specifying the covariates, i.e.
library(survival)
survreg(Surv(..)~1, dist="weibull")
where Surv(..) accepts information about "times", censoring/truncation
variables and dist allows to specify alternative distributions.
See ?Surv e ?survreg
hope this helps you,
Gough Lauren ha scritto:
> Dear All,
>
> I have two questions regarding distribution fitting.
>
> I have several datasets, all left-truncated at x=1, that I am attempting
> to fit distributions to (lognormal, weibull and exponential). I had
> been using fitdistr in the MASS package as follows:
>
> fitdistr<-(x,"weibull")
>
> However, this does not take into consideration the truncation at x=1. I
> read another posting in this forum that suggested using the argument
> "lower" to truncate the distribution fitting. However, this does not
> seem to be working. For example, when I attempt to fit a weibull
> distribution truncated at x=1 using "lower", it seems to set the
> best-fit shape parameter at 1:
>
>> fitdistr(x,"weibull",lower=1)
> shape scale
> 1.00000000 9.87964337
> (0.02358731) (0.40649570) ##I have tried this on other datasets also
> truncated at x=1 and get the same result (i.e. shape=1).
>
> Does anyone know how to successfully fit the exponential, weibull and
> lognormal distributions to truncated data?
>
>
>
> Secondly, as my datasets are large (>1000 data points) assessing the fit
> of the distribution with kolmogorov smirnov goodness of fit tests is
> routinely showing statistical significance for all distributions.
> Therefore, I would like to plot the observed data with the theoretical
> best fit distributions (weibull, exponential and lognormal) to visually
> assess which fits the observed data best. So far I have been doing this
> as follows:
>
>> fitdistr(x,"weibull")
> shape scale
> a b
>
>> D1<-density(x) ##density distribution of observed data
>> D2<-density(rweibull(1500,shape=a,scale=b)) ##density of a random
> variable following the theoretical best fit weibull distribution with
> shape parameter =a, scale parameter = b.
>
>> plot(range(D1$x),range(D1$y,D2$y),type="n",xlab="x",ylab="Density")
>> lines(D1,col="red")
>> lines(D2,col="blue")
>
> This successfully plots the two density curves on the same graph, but it
> plots data below the x=1 threshold - even for the observed data! I have
> tried limiting the scale of x-axis using xlim=c(1,150) but the graph
> still plots the origin of the graph as (0,0). I can only get different
> origins if I limit x more extremely e.g. xlim=c(50,150). Does anyone
> know how I can successfully change the origin of the graph to (1,0)?
>
>
> Sorry for the long e-mail! Any help would be greatly appreciated.
>
> Regards,
>
> Lauren
>
> This message has been checked for viruses but the contents of an attachment
> may still contain software viruses, which could damage your computer system:
> you are advised to perform your own checks. Email communications with the
> University of Nottingham may be monitored as permitted by UK legislation.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>
--
====================================
Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 6626240
fax: 091 485726/485612
http://dssm.unipa.it/vmuggeo
More information about the R-help
mailing list