# [R] univariate normal mixtures

Spencer Graves spencer.graves at pdf.com
Thu Jul 17 19:34:43 CEST 2003

```	  Did you search "www.r-help.org" -> Search -> "R Site Search" and
S-News (Google -> S-News -> S-News Mailing List Archives)?

I recently estimated a normal mixture, 2 components, mean 0,
different variances.  I had computational difficulties until I took the
time to avoid under and overflows, preserving numerical precision.  I
got strange answers in my application, which convinced me that I had a
3-component mixture, not 2, but it was not sufficiently important for me
to modify my code to allow more than 2 components.

The code I used follows, in case it might be of interest to someone.

hope this helps.  spencer graves
#######################
dnorm2.0 <-
function(x, log.sd1=0.5*log(var(x)/10),
log.sd2=0.5*log(10*var(x)), logit.w2=0,
output.all=F, log.p=T){
# 	log(dnorm(...)) for both densities
lg.dn1 <- ((-0.5)*(log(2*pi)+(x/exp(log.sd1))^2)-log.sd1)
lg.dn2 <- ((-0.5)*(log(2*pi)+(x/exp(log.sd2))^2)-log.sd2)
#	Adjust one to use a negative exponential
#  in the denominator for
#  w2 = exp(logit.w2)/(1+exp(logit.w2))
#     = 1/(1+exp(-logit.w2))
if(logit.w2>=0){
lg.dn1 <- (lg.dn1-logit.w2)
logit.w2 <- (-logit.w2)
}
else
lg.dn2 <- (lg.dn2+logit.w2)
#	Now write log(exp(lg.dn1)+exp(lg.dn2))
#   = lg12 + log(1+exp(lg.12-lg12))
lg12 <- pmax(lg.dn1, lg.dn2)
lg.12 <- pmin(lg.dn1, lg.dn2)
#	Put it all together
lg.d2 <- (lg12 + log(1+exp(lg.12-lg12))
+ logit.w2)
if(log.p)lg.d2 else exp(lg.d2)
}

lglk.dn2 <-
function(x=c(log.sd1=log(1e-15), log.sd2=0, logit.w2=log(35/(127-35))),
data.){
#
dn <- dnorm2.0(data., log.sd1=x,
log.sd2=x, logit.w2=x)
(-sum(dn))
}

Carlos J. Gil Bellosta wrote:
> Well,
>
> If k is known, you can use maximun likelihood to fit the weights, means,
> and sd's. The EM algorithm can be of help to solve the optimization
> problem. You would have to implement it yourself for your particular
> case, but I do not think it is big trouble.
>
> Then you could estimate k using Bayesian formalism: from a reasonable a
> priory distribution on k=1, 2,... compute the posterior distributions
> using the densities obtained above, etc.
>
> Carlos J. Gil Bellosta
>
> Joke Allemeersch wrote:
>
>> Hello,
>>
>> I have a concrete statistical question:
>> I have a sample of an univariate mixture of an unknown number (k) of
>> normal distributions, each time with an unknown mean `m_i' and a
>> standard deviation `k * m_i', where k is known factor constant for all
>> the normal distributions. (The `i' is a subscript.)
>> Is there a function in R that can estimate the number of normal
>> distributions k and the means `m_i' for the different normal
>> distributions from a sample?  Or evt. a function that can estimate the
>> `m_i', when the number of distributions `k' is known?
>> So far I only found a package, called `normix'.  But at first sight it
>> only provides methods to sample from such distributions and to
>> estimate the densities; but not to fit such a distribution.
>> Can someone indicate where I can find an elegant solution?
>>
>>
>> Joke Allemeersch
>>
>> Katholieke universiteit Leuven.
>> Belgium.
>>
>> ______________________________________________
>> R-help at stat.math.ethz.ch mailing list
>> https://www.stat.math.ethz.ch/mailman/listinfo/r-help
>>
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://www.stat.math.ethz.ch/mailman/listinfo/r-help

```