[R] MLE for bimodal distribution
Ravi Varadhan
rvaradhan at jhmi.edu
Wed Apr 8 22:18:44 CEST 2009
Ben,
You can also do this nicely with the function spg() in the "BB" package
set.seed(1001)
data = c(rnorm(1000, 3, 0.5), rnorm(500, 5, 0.3))
f = function(m, s, m2, s2, w)
{
-sum(log(w*dnorm(data, mean=m, sd=s)+
(1-w)*dnorm(data, mean=m2, sd=s2)))
}
start0 <- list("m"=3, "s"=0.5, "m2"=5, "s2"=0.35, "w"=0.6)
fcn <- function(x) f(x[1], x[2], x[3], x[4], x[5])
spg(par=as.numeric(start0), fn=fcn, lower=c(-Inf, 0, -Inf, 0, 0), upper=c(Inf, Inf, Inf, Inf, 1))
# of course, "L-BFGS-B" also works well
optim(par=as.numeric(start0), fn=fcn, method="L-BFGS-B", lower=c(-Inf, 0, -Inf, 0, 0), upper=c(Inf, Inf, Inf, Inf, 1))
Ravi.
____________________________________________________________________
Ravi Varadhan, Ph.D.
Assistant Professor,
Division of Geriatric Medicine and Gerontology
School of Medicine
Johns Hopkins University
Ph. (410) 502-2619
email: rvaradhan at jhmi.edu
----- Original Message -----
From: Ben Bolker <bolker at ufl.edu>
Date: Wednesday, April 8, 2009 3:49 pm
Subject: Re: [R] MLE for bimodal distribution
To: r-help at r-project.org
>
>
> _nico_ wrote:
> >
> > Hello everyone,
> >
> > I'm trying to use mle from package stats4 to fit a bi/multi-modal
> > distribution to some data, but I have some problems with it.
> > Here's what I'm doing (for a bimodal distribution):
> >
> > # Build some fake binormally distributed data, the procedure fails
> also
> > with real data, so the problem isn't here
> > data = c(rnorm(1000, 3, 0.5), rnorm(500, 5, 0.3))
> > # Just to check it's bimodal
> > plot(density(data))
> > f = function(m, s, m2, s2, w)
> > {
> > -log( sum(w*dnorm(data, mean=m, sd=s)) + sum((1-w)*dnorm(data, mean=m2,
> > sd=s2)) )
> > }
> >
> > res = mle(f, start=list("m"=3, "s"=0.5, "m2"=5, "s2"=0.35, "w"=0.6))
> >
> > This gives an error:
> > Error in optim(start, f, method = method, hessian = TRUE, ...) :
> > non-finite finite-difference value [2]
> > In addition: There were 50 or more warnings (use warnings() to see
> the
> > first 50)
> > And the warnings are about dnorm producing NaNs
> >
> > So, my questions are:
> > 1) What does "non-finite finite-difference value" mean? My guess
> would be
> > that an Inf was passed somewhere where a finite number was expected...?
> > I'm not sure how optim works though...
> > 2) Is the log likelihood function I wrote correct? Can I use the
> same type
> > of function for 3 modes?
> > 3) What should I do to avoid function failure? I tried by changing
> the
> > parameters, but it did not work.
> > 4) Can I put constraints to the parameters? In particular, I would
> like w
> > to be between 0 and 1.
> > 5) Is there a way to get the log-likelihood value, so that I can compare
> > different extimations?
> > 6) Do you have any (possibly a bit "pratically oriented") readings
> about
> > MLE to suggest?
> >
> > Thanks in advance
> > Nico
> >
>
> Here's some tweaked code that works.
> Read about the "L-BFGS-B" method in ?optim to constrain parameters,
> although you will have some difficulty making this work for more than
> two components. I think there's also a mixture model problem in
> Venables & Ripley (MASS).
>
> There are almost certainly more efficient approaches for mixture
> model fitting, although this "brute force" approach should be fine
> if you don't need to do anything too complicated.
> (RSiteSearch("mixture model"))
>
> set.seed(1001)
> data = c(rnorm(1000, 3, 0.5), rnorm(500, 5, 0.3))
> # Just to check it's bimodal
> plot(density(data))
> f = function(m, s, m2, s2, w)
> {
> -sum(log(w*dnorm(data, mean=m, sd=s)+
> (1-w)*dnorm(data, mean=m2, sd=s2)))
> }
>
> library(stats4)
> start0 <- list("m"=3, "s"=0.5, "m2"=5, "s2"=0.35, "w"=0.6)
> do.call("f",start0) ## OK
> res = mle(f, start=start0)
>
> with(as.list(coef(res)),
> curve(w*dnorm(x,m,s)+(1-w)*dnorm(x,m2,s2),add=TRUE,col=2))
>
>
> --
> View this message in context:
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help at r-project.org mailing list
>
> PLEASE do read the posting guide
> and provide commented, minimal, self-contained, reproducible code.
More information about the R-help
mailing list