[R] mixture univariate distributions fit

PIKAL Petr petr@p|k@| @end|ng |rom prechez@@cz
Fri Dec 31 10:48:52 CET 2021


Hallo Ivan

Thanks. Yes, this approach seems to be viable. I did not consider using
dnorm in fitting procedure. But as you pointed

> (Some nonlinear least squares problems will be much harder to solve
> though.)

This simple example is quite easy. The more messy are data and the more
distributions are mixed in them the more problematic could be the correct
starting values selection. Errors could be quite common.

x <- (0:200)/100
y1 <- dnorm(x, mean=.3, sd=.1)
y2 <- dnorm(x, mean=.7, sd=.2)
y3 <- dnorm(x, mean=.5, sd=.1)

ymix <- ((y1+2*y2+y3)/max(y1+2*y2+y3))+rnorm(201, sd=.001)
plot(x, ymix)

With just sd1 and sd2 slightly higher, the fit results to error.
> fit <- minpack.lm::nlsLM(
+  ymix ~ a1 * dnorm(x, mu1, sd1) + a2 * dnorm(x, mu2, sd2)+
+  a3 * dnorm(x, mu3, sd3),
+  start = c(a1 = 1, mu1 = .3, sd1=.3, a2 = 2, mu2 = .7, sd2 =.3,
+  a3 = 1, mu3 = .5, sd3 = .1),
+  lower = rep(0, 9) # help minpack avoid NaNs
+ )
Error in nlsModel(formula, mf, start, wts) : 
  singular gradient matrix at initial parameter estimates

If sd1 and sd2 are set to lower value, the function is no longer singular
and arrives with result. 

Well, it seems that the  only way how to procced is to code such function by
myself and take care of suitable starting values. 

Best regards.
Petr

> -----Original Message-----
> From: Ivan Krylov <krylov.r00t using gmail.com>
> Sent: Friday, December 31, 2021 9:26 AM
> To: PIKAL Petr <petr.pikal using precheza.cz>
> Cc: r-help mailing list <r-help using r-project.org>
> Subject: Re: [R] mixture univariate distributions fit
> 
> On Fri, 31 Dec 2021 07:59:11 +0000
> PIKAL Petr <petr.pikal using precheza.cz> wrote:
> 
> > x <- (0:100)/100
> > y1 <- dnorm((x, mean=.3, sd=.1)
> > y2 <- dnorm((x, mean=.7, sd=.1)
> > ymix <- ((y1+2*y2)/max(y1+2*y2))
> 
> > My question is if there is some package or function which could get
> > those values ***directly from x and ymix values***, which is
> > basically what is measured in my case.
> 
> Apologies if I'm missing something, but, this being a peak fitting
> problem, shouldn't nls() (or something from the minpack.lm or nlsr
> packages) work for you here?
> 
> minpack.lm::nlsLM(
>  ymix ~ a1 * dnorm(x, mu1, sigma1) + a2 * dnorm(x, mu2, sigma2),
>  start = c(a1 = 1, mu1 = 0, sigma1 = 1, a2 = 1, mu2 = 1, sigma2 = 1),
>  lower = rep(0, 6) # help minpack avoid NaNs
> )
> # Nonlinear regression model
> #  model: ymix ~ a1 * dnorm(x, mu1, sigma1) + a2 * dnorm(x, mu2, sigma2)
> #  data: parent.frame()
> #      a1    mu1 sigma1     a2    mu2 sigma2
> #  0.1253 0.3000 0.1000 0.2506 0.7000 0.1000
> # residual sum-of-squares: 1.289e-31
> #
> # Number of iterations to convergence: 23
> # Achieved convergence tolerance: 1.49e-08
> 
> (Some nonlinear least squares problems will be much harder to solve
> though.)
> 
> --
> Best regards,
> Ivan


More information about the R-help mailing list