[R] Problem in optimization of Gaussian Mixture model

Jeff Newmiller jdnewmil at dcn.davis.ca.us
Fri Aug 25 19:04:12 CEST 2017


You are doing yourself no favors by posting HTML format email... the code 
shown below has extra characters in it that you probably did not put 
there. This is a plain text mailing list, so you need to send plain text 
email. Read the Posting Guide mentioned in the footer of this and every 
R-help posting.

On top of that, your code is not presented as a stand-alone reproducible 
example. Read [1], [2], and in particular [3]. Failing to make it 
reproducible means most readers will just ignore it and move on.

I cannot figure out why you have "seg" as an optimized parameter... in 
particular, you are trying to index one-dimensional vectors with two 
indexes, one of which is the floating point variable "seg".

Anyway, despite the above, here is my take on your problem.

####
myfunc <- function( x ) {
    meanval <- c( 506.8644, 672.8448, 829.902 )
    sigmaval <- c( 61.02859, 9.149168, 74.84682 )
    coeffval <- c( 0.1241933, 0.6329082, 0.2428986 )
    sapply( x
          , function( xi )
              sum( coeffval * dnorm( xi
                                   , meanval
                                   , sigmaval
                                   )
                 )
          )
}

plot( 400:1000, myfunc( 400:1000 ) )

#' ![](http://i.imgur.com/65fhqrL.png)


# fooled by local maximum
val1 <- optim( par = c( x = 800 )
              , fn = myfunc
              , method = "BFGS"
              , control = list( fnscale= -1
                              , parscale = 1/0.025
                              )
              )
val1
#> $par
#>        x
#> 829.5249
#>
#> $value
#> [1] 0.001294662
#>
#> $counts
#> function gradient
#>        5        4
#>
#> $convergence
#> [1] 0
#>
#> $message
#> NULL

val2 <- optim( par = c( x = 1000 )
              , fn = myfunc
              , method = "SANN"
              , control = list( fnscale= -1
                              , parscale = 1/0.025
                              )
              )
val2
#> $par
#>        x
#> 672.8166
#>
#> $value
#> [1] 0.02776057
#>
#> $counts
#> function gradient
#>    10000       NA
#>
#> $convergence
#> [1] 0
#>
#> $message
#> NULL
####

[1] http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example

[2] http://adv-r.had.co.nz/Reproducibility.html

[3] https://cran.r-project.org/web/packages/reprex/index.html (read the 
vignette)

On Thu, 24 Aug 2017, niharika singhal wrote:

> Hello,
>
>
> I am facing a problem with optimization in R from 2-3 weeks.
>
>
>
> I have some Gaussian mixtures parameters and I want to find the maximum in
> that
>
>
>
> *Parameters are in the form *
>
> mean1    mean2    mean3   sigma1   sigma2   sigma3   c1        c2        c3
>
> 506.8644 672.8448 829.902 61.02859 9.149168 74.84682 0.1241933
> 0.6329082 0.2428986
>
>
>
>
>
> I have used optima and optimx to find the maxima, but it gives me value
> near by the highest mean as an output, for example 830 in the above
> parameters. The code for my optim function is
>
> val1=*optim*(par=c(X=1000, *seg=1*),fn=xnorm,  opt=c(NA,seg=1),
> method="BFGS",lower=-Inf,upper=+Inf, control=list(fnscale=-1))
>
> *I am running the optim function in a loop and for different initial value
> and taking the val1$par[1] as the best value.*
>
> *I am taking this parameter since I want to run it on n number of segments
> latter*
>
> and xnorm is simply calculating the dnorm
>
> *xnorm*=function(param, opt = rep(NA, 2)){
>
>  if (any(!sapply(opt, is.na))) {
>
>    i = !sapply(opt, is.na)
>
>    # Fix non-NA values
>
>    param[i] <- opt[i]
>
>  }
>
>  xval= param[1]
>
>    seg <- param[2]
>
>  sum_prob=0
>
>  val=0
>
> l=3
>
> meanval=c(506.8644, 672.8448, 829.902)
>
> sigmaval=c(61.02859, 9.149168, 74.84682)
>
> coeffval(0.1241933, 0.6329082, 0.2428986)
>
>  for(n in 1 :l)
>
>  {
>
>    mu=meanval[seg,n]
>
>    sg=sigmaval[seg,n]
>
>    cval=coeffval[seg,n]
>
>    val=cval*(dnorm(xval,mu,sg))
>
>    #print(paste0("The dnorm value for x is.: ", val))
>
>    sum_prob=sum_prob+val
>
>  }
>
>  sum_prob
>
> }
>
>
> The output is not correct. Since I check my data using*
> UnivarMixingDistribution* from distr package and according to this the max
> should lie somewhere between 600-800
>
> Code I used to check
>
> mc0=c( 0.6329082,0.6329082,0.2428986)
>
> rv
> <-UnivarMixingDistribution(Norm(506.8644,61.02859),Norm(672.8448,9.149168),Norm(
> 829.902,74.84682), mixCoeff=mc0/sum(mc0))
>
> plot(rv, to.draw.arg="d")
>
>
>
> Can someone please help how I can solve this problem?
>
>
> Thanks & Regards
>
> Niharika Singhal
>
> 	[[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                       Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k



More information about the R-help mailing list