[R] Estimating a Normal Mixture Distribution

Fri Apr 20 21:57:39 CEST 2007

Hi everyone,

I am using R 2.4.1 on a MacOS X ("Tiger") operating system. In the  
last few day I was trying to estimate the parameters of a mixture of  
two normal distributions using Maximum Likelihood.
The code is from Modern Applied Statistics with S (4th edition),  
chapter 16 ("Optimization"), the dataset is available under MASS in  
R. Unfortunately, when I tried out the code from the book it didn´t  
work out, although I tried to copy the code as accurately as possible:

CODE START

# loading dataset.
attach(geyser)

# defining start values for the five parameters.
p0<-c(p=mean(waiting<70),u1=50,s1=5,u2=80,s2=5)

# defining the objective function (=log-likelihood function).
mix.obj<-function(p,x)
{
	e<-p[1]*dnorm((x-p[2])/p[3])/p[3] + (1-p[1])*dnorm((x-p[4])/p[5])/p[5]
	if (any(e<=0)) Inf
	else -sum(log(e))
}

# define two functions to calculate the first derivatives of the  
objective function.
lmix2a<-deriv(~ -log(p*dnorm((x-u1)/s1)/s1 + (1-p)*dnorm((x-u2)/s2)/ 
s2), c("p","u1","s1","u2","s2"), function(x,p,u1,s1,u2,s2) NULL)

mix.gr<-function(p,x){
	p<-p[1]
	u1<-p[2]
	s1<-p[3]
	u2<-p[4]
	s2<-p[5]
	colSums(attr(lmix2a(x,p,u1,s1,u2,s2),"gradient"))}

# finally, the optimization using the Broyden-Fletcher-Goldfarb- 
Shanno method.
resultsBFGS_D=optim 
(p0,mix.obj,mix.gr,x=waiting,method="BFGS",control=list(parscale=c 
(0.1,rep(1,4))))
resultsBFGS_D$par

CODE END

The output is given as:

p                 u1                s1               
u2                 s2
0.361204 50.000000  5.000000 80.000000  5.000000

Unfortunately, these are exactly the same parameter values which I  
have specified as starting values under "p0".
Thank you very much for your help, best regards,

András