# [R] mixture univariate distributions fit

Bert Gunter bgunter@4567 @end|ng |rom gm@||@com
Fri Dec 31 18:57:20 CET 2021

Petr:
Please feel free to ignore and not reply if you think the following

1. Do you want to know the location of peaks (local modes) or the
parameters of the/a mixture distribution? Peaks do not have to be
located at the modes of the individual components of the mixture.

2. Do you know the number of components in the mixture? This would
simplify the problem (a lot, I believe; though those more
knowledgeable should comment on that).

3. Do you know that the points on the fitted density you get are
obtained as a mixture of normals? Or  at least of symmetric
distributions? ... or whether they are obtained by some sort of
(algorithmic) density estimation procedure?

Best and New Year's greeting to all,
Bert

On Fri, Dec 31, 2021 at 1:49 AM PIKAL Petr <petr.pikal using precheza.cz> wrote:
>
> Hallo Ivan
>
> Thanks. Yes, this approach seems to be viable. I did not consider using
> dnorm in fitting procedure. But as you pointed
>
> > (Some nonlinear least squares problems will be much harder to solve
> > though.)
>
> This simple example is quite easy. The more messy are data and the more
> distributions are mixed in them the more problematic could be the correct
> starting values selection. Errors could be quite common.
>
> x <- (0:200)/100
> y1 <- dnorm(x, mean=.3, sd=.1)
> y2 <- dnorm(x, mean=.7, sd=.2)
> y3 <- dnorm(x, mean=.5, sd=.1)
>
> ymix <- ((y1+2*y2+y3)/max(y1+2*y2+y3))+rnorm(201, sd=.001)
> plot(x, ymix)
>
> With just sd1 and sd2 slightly higher, the fit results to error.
> > fit <- minpack.lm::nlsLM(
> +  ymix ~ a1 * dnorm(x, mu1, sd1) + a2 * dnorm(x, mu2, sd2)+
> +  a3 * dnorm(x, mu3, sd3),
> +  start = c(a1 = 1, mu1 = .3, sd1=.3, a2 = 2, mu2 = .7, sd2 =.3,
> +  a3 = 1, mu3 = .5, sd3 = .1),
> +  lower = rep(0, 9) # help minpack avoid NaNs
> + )
> Error in nlsModel(formula, mf, start, wts) :
>   singular gradient matrix at initial parameter estimates
>
> If sd1 and sd2 are set to lower value, the function is no longer singular
> and arrives with result.
>
> Well, it seems that the  only way how to procced is to code such function by
> myself and take care of suitable starting values.
>
> Best regards.
> Petr
>
> > -----Original Message-----
> > From: Ivan Krylov <krylov.r00t using gmail.com>
> > Sent: Friday, December 31, 2021 9:26 AM
> > To: PIKAL Petr <petr.pikal using precheza.cz>
> > Cc: r-help mailing list <r-help using r-project.org>
> > Subject: Re: [R] mixture univariate distributions fit
> >
> > On Fri, 31 Dec 2021 07:59:11 +0000
> > PIKAL Petr <petr.pikal using precheza.cz> wrote:
> >
> > > x <- (0:100)/100
> > > y1 <- dnorm((x, mean=.3, sd=.1)
> > > y2 <- dnorm((x, mean=.7, sd=.1)
> > > ymix <- ((y1+2*y2)/max(y1+2*y2))
> >
> > > My question is if there is some package or function which could get
> > > those values ***directly from x and ymix values***, which is
> > > basically what is measured in my case.
> >
> > Apologies if I'm missing something, but, this being a peak fitting
> > problem, shouldn't nls() (or something from the minpack.lm or nlsr
> > packages) work for you here?
> >
> > minpack.lm::nlsLM(
> >  ymix ~ a1 * dnorm(x, mu1, sigma1) + a2 * dnorm(x, mu2, sigma2),
> >  start = c(a1 = 1, mu1 = 0, sigma1 = 1, a2 = 1, mu2 = 1, sigma2 = 1),
> >  lower = rep(0, 6) # help minpack avoid NaNs
> > )
> > # Nonlinear regression model
> > #  model: ymix ~ a1 * dnorm(x, mu1, sigma1) + a2 * dnorm(x, mu2, sigma2)
> > #  data: parent.frame()
> > #      a1    mu1 sigma1     a2    mu2 sigma2
> > #  0.1253 0.3000 0.1000 0.2506 0.7000 0.1000
> > # residual sum-of-squares: 1.289e-31
> > #
> > # Number of iterations to convergence: 23
> > # Achieved convergence tolerance: 1.49e-08
> >
> > (Some nonlinear least squares problems will be much harder to solve
> > though.)
> >
> > --
> > Best regards,
> > Ivan
> ______________________________________________
> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help