[R] Fitting 3 beta distributions
Achim Zeileis
Achim.Zeileis at uibk.ac.at
Sun Oct 2 14:44:41 CEST 2011
On Sat, 1 Oct 2011, Nitin Bhardwaj wrote:
> Hi,
> I want to fit 3 beta distributions to my data which ranges between 0 and 1.
> What are the functions that I can easily call and specify that 3 beta
> distributions should be fitted?
> I have already looked at normalmixEM and fitdistr but they don't seem to be
> applicable (normalmixEM is only for fitting normal dist and fitdistr will
> only fit 1 distribution, not 3). Is that right?
From your description above, I guess that (a) you want to fit a _mixture_
of 3 beta distributions, and (b) have tried to use "mixtools" and "MASS"
so far.
Based on these assumptions: fitdistr() does not fit mixture models.
"mixtools" does fit mixtures and the accompanying paper has an example
where a nonparametric model is applied to mixtures of beta distributions.
Furthermore, the "betareg" package has a function betamix() which can fit
mixtures of beta regression models (including the special case of no
covariates).
Both "mixtools" and "betareg" have been published in JSS, as indicated
when calling citation("mixtools") and citation("betareg"):
http://www.jstatsoft.org/v32/i06/
http://www.jstatsoft.org/v34/i02/
The latter does not yet contain the betamix() function. As an example, one
can use the artificial data generated in Section 5.2:
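library("betareg")  # provides betamix()

## artificial two-component data, drawn as rbeta(n, mu * phi, (1 - mu) * phi)
set.seed(123)
y1 <- c(rbeta(150, 0.3 * 4, 0.7 * 4), rbeta(50, 0.5 * 4, 0.5 * 4))
y2 <- c(rbeta(100, 0.3 * 4, 0.7 * 4), rbeta(100, 0.3 * 8, 0.7 * 8))
d <- data.frame(y1, y2)

## intercept-only mixtures (no covariates) with k = 2 components
bm1 <- betamix(y1 ~ 1 | 1, data = d, k = 2)
bm2 <- betamix(y2 ~ 1 | 1, data = d, k = 2)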
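where one should note that, compared to R's parametrization of the beta
distribution, two transformations are employed: first from shape1/shape2 to
mu/phi (mean mu = shape1/(shape1 + shape2), precision phi = shape1 + shape2),
and then logit/log link functions are applied to mu and phi, respectively.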
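To make the two transformations concrete, here is a minimal base-R sketch.
The helper names (shape_to_muphi, muphi_to_link, link_to_shape) are made up
for illustration and are not part of "betareg"; the sketch only assumes the
mean/precision parametrization with a logit link for mu and a log link for
phi, as described above.

```r
# Step 1: from R's (shape1, shape2) to mean/precision (mu, phi)
shape_to_muphi <- function(shape1, shape2) {
  phi <- shape1 + shape2
  c(mu = shape1 / phi, phi = phi)
}

# Step 2: from (mu, phi) to the link scale: logit for mu, log for phi
muphi_to_link <- function(mu, phi) {
  c(logit_mu = qlogis(mu), log_phi = log(phi))
}

# Inverse: from link-scale values back to shape1/shape2
link_to_shape <- function(logit_mu, log_phi) {
  mu <- plogis(logit_mu)
  phi <- exp(log_phi)
  c(shape1 = mu * phi, shape2 = (1 - mu) * phi)
}

# First component of y1 above: rbeta(150, 0.3 * 4, 0.7 * 4)
p <- shape_to_muphi(0.3 * 4, 0.7 * 4)
l <- muphi_to_link(p[["mu"]], p[["phi"]])
s <- link_to_shape(l[["logit_mu"]], l[["log_phi"]])
```

Coefficients reported on the link scale can be mapped back with something
like link_to_shape() and then passed to dbeta() or rbeta() in the usual way.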
> Also, my data has 26 million data points. What can I do to reduce the
> computation time with the suggested function?
I think all of the functions above will have problems when applied directly
to 26 million observations. Alternatives would be to use a representative
sample or, if the fitting function takes weights, to compute weights on a
possibly coarsened grid.
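A rough sketch of the coarsened-grid idea in base R follows. The simulated y
is only a stand-in for the real data, the grid size n_bins is an arbitrary
choice, and the commented-out betamix() call assumes that the fitting
function accepts a weights argument, which should be checked in its
documentation first.

```r
set.seed(1)
y <- rbeta(1e5, 2, 5)        # stand-in for the real data (assumption)

n_bins <- 1000               # grid resolution (assumption, tune as needed)
breaks <- seq(0, 1, length.out = n_bins + 1)

# Assign each observation to a grid cell and count observations per cell
bin <- findInterval(y, breaks, rightmost.closed = TRUE, all.inside = TRUE)
counts <- tabulate(bin, nbins = n_bins)

# Represent each non-empty cell by its midpoint, weighted by its count;
# midpoints stay strictly inside (0, 1)
mids <- (breaks[-1] + breaks[-(n_bins + 1)]) / 2
keep <- counts > 0
g <- data.frame(y = mids[keep], w = counts[keep])

# Assumption: only valid if the fitting function supports case weights
# bm <- betamix(y ~ 1 | 1, data = g, k = 3, weights = w)
```

The fit then works on at most n_bins rows instead of millions, at the price
of a small discretization error controlled by the grid resolution.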
hth,
Z
> thanks a lot in advance,
> eagerly waiting for any input.
> Best
> Nitin