[R] mixture univariate distributions fit

Thu Dec 30 09:36:07 CET 2021

Thank you Bert

The values are results from particle size measurement by sedimentation and 
they are really available only as these cumulative or density distributions. 
What I thought about was that there is some package which could fit data of 
such curves and deliver parameters of fitted curves.

Something like
https://chrisostrouchov.com/post/peak_fit_xrd_python/

I found package EMpeaksR which results close to values estimated from mixtools 
package.

test <- spect_em_gmm(temp1$velik, temp1$proc, mu=c(170, 220), 
mix_ratio=c(1,1), sigma=c(5,5), maxit=2000, conv.cri=1e-8)
print(cbind(test$mu, test$sigma, test$mix_ratio))
         [,1]      [,2]      [,3]
[1,] 170.7744  7.200109 0.5759867
[2,] 229.1815 10.831626 0.4240133

But it is probably in stage of intensive development as it is limited in data 
visualisation

Any further hint is appreciated.

Regards
Petr

> -----Original Message-----
> From: Bert Gunter <bgunter.4567 using gmail.com>
> Sent: Wednesday, December 29, 2021 5:01 PM
> To: PIKAL Petr <petr.pikal using precheza.cz>
> Cc: r-help mailing list <r-help using r-project.org>
> Subject: Re: [R] mixture univariate distributions fit
>
> No.
>
> However, if the object returned is the "Value" structure of whatever density
> function you use, it probably contains the original data. You need to check
> the docs to see. But this does not appear to be your situation.
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along and
> sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
> On Wed, Dec 29, 2021 at 3:05 AM PIKAL Petr <petr.pikal using precheza.cz>
> wrote:
> >
> > Dear all
> >
> > I have data which are either density distribution estimate or
> > cummulative density distribution estimate (temp1, temp2 below). I
> > would like to get values (mu, sd) for underlaying original data but they 
> > are
> not available.
> >
> > I found mixtools package which calculate what I need but it requires
> > original data (AFAIK). They could be generated from e.g. temp1 by
> >
> > set.seed(111)
> > x<- sample(temp1$velik, size=100000, replace=TRUE, prob=temp1$proc)
> >
> > library(mixtools)
> > fit <- normalmixEM(x)
> > plot(fit, which=2)
> > summary(fit)
> > summary of normalmixEM object:
> >            comp 1     comp 2
> > lambda   0.576346   0.423654
> > mu     170.784520 229.192823
> > sigma    7.203491  10.793461
> > loglik at estimate:  -424062.7
> > >
> >
> > Is there any way how to get such values directly from density or
> > cummulative density estimation without generating fake data by sample?
> >
> > Best regards
> > Petr
> >
> >
> > temp1 <- structure(list(velik = c(155, 156.8, 157.9, 158.8, 159.6,
> > 160.4, 161.2, 161.9, 162.5, 163.1, 163.8, 164.3, 164.7, 165.3, 165.8,
> > 166.2, 166.7, 167.2, 167.7, 168.2, 168.7, 169.1, 169.6, 170.1, 170.6,
> > 171.1, 171.6, 172, 172.5, 173, 173.5, 174, 174.5, 175.1, 175.7, 176.3,
> > 177, 177.6, 178.3, 179.1, 179.9, 180.6, 181.4, 182.4, 183.5, 184.7,
> > 186.1, 187.9, 189.8, 192, 194.4, 197, 200.1, 203.5, 206.7, 209.2,
> > 211.3, 213.1, 214.8, 216.3, 217.4, 218.5, 219.5, 220.4, 221.3, 222.1,
> > 223, 223.7, 224.5, 225.2, 225.9, 226.7, 227.5, 228.2, 228.9, 229.6,
> > 230.4, 231.2, 231.9, 232.6, 233.4, 234.2, 235, 235.9, 236.8, 237.7,
> > 238.6, 239.7, 241, 242.3, 243.6, 245.2, 247.1, 249.3, 251.9, 255.3,
> > 260, 266, 274.9, 323.4 ), proc = c(0.6171, 1.583, 1.371, 2.13, 1.828,
> > 2.095, 1.994, 2.694, 2.824, 2.41, 2.909, 3.768, 3.179, 3.029, 3.798,
> > 3.743, 3.276, 3.213, 3.579, 2.928, 4.634, 3.415, 3.473, 3.135, 3.476,
> > 3.759, 3.726, 3.9, 3.593, 2.89, 3.707, 4.08, 2.846, 2.685, 3.394,
> > 2.737, 2.693, 2.878, 2.248, 2.368, 2.258, 2.662, 1.866, 1.895, 1.457,
> > 1.513, 1.181, 1.008, 0.9641, 0.799, 0.7878, 0.7209, 0.5869, 0.5778,
> > 0.7313, 0.9531, 1.053, 1.317, 1.247, 1.739, 2.064, 1.99, 2.522, 2.401,
> > 2.48, 2.687, 2.797, 2.918, 3.243, 3.055, 3.009, 2.89, 3.037, 3.25,
> > 3.349, 3.141, 2.771, 2.985, 3.203, 3.298, 3.215, 2.637, 2.683, 2.782,
> > 2.632, 2.625, 2.475, 2.014, 1.781, 1.987, 1.627, 1.374, 1.352, 0.9441,
> > 1.01, 0.5737, 0.5265, 0.3794, 0.2513, 0.0351)), row.names = 2:101,
> > class = "data.frame")
> >
> > temp2 <- structure(list(velik = c(153.8, 156.3, 157.3, 158.4, 159.2,
> > 160.1, 160.8, 161.6, 162.2, 162.8, 163.5, 164, 164.5, 165, 165.5, 166,
> > 166.4, 166.9, 167.5, 167.9, 168.5, 168.9, 169.4, 169.8, 170.4, 170.9,
> > 171.3, 171.8, 172.2, 172.7, 173.3, 173.8, 174.2, 174.8, 175.5, 176,
> > 176.6, 177.3, 177.9, 178.7, 179.5, 180.3, 180.9, 181.9, 182.9, 184.1,
> > 185.3, 186.9, 188.8, 190.8, 193.2, 195.6, 198.4, 201.8, 205.3, 208.1,
> > 210.3, 212.3, 213.9, 215.7, 216.9, 218, 219.1, 219.9, 220.8, 221.7,
> > 222.6, 223.4, 224.1, 224.8, 225.6, 226.3, 227.1, 227.8, 228.5, 229.2,
> > 230, 230.8, 231.6, 232.3, 233, 233.7, 234.6, 235.5, 236.3, 237.2,
> > 238.1, 239.1, 240.3, 241.6, 242.9, 244.4, 246.1, 248, 250.6, 253.1,
> > 257.6, 262.5, 269.5, 280.4, 372.9), proc = c(0, 1, 2, 3, 4, 5, 6, 7,
> > 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25,
> > 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42,
> > 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59,
> > 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76,
> > 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93,
> > 94, 95, 96, 97, 98, 99, 100)), row.names = c(NA, 101L), class =
> > "data.frame")
> > >
> >
> > >
> > ______________________________________________
> > R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.