[R] fitting a mixture of distributions with optim and max log likelihood ?
ben@zoo.ufl.edu
ben at zoo.ufl.edu
Tue Aug 28 23:20:18 CEST 2001
What you're doing seems reasonable, but this is a hard problem in
general. My first advice would be to check out MASS (Modern Applied
Statistics with S-PLUS), 3d ed., pp. 261-266; this gives methods for doing
the problem in S-PLUS, and several references to the literature on mixture
models. Many of the functions used in MASS to solve this problem are
specific to S-PLUS, but there are equivalents in R (deriv3 <-> D, etc.).
The online R complements at
http://www.stats.ox.ac.uk/pub/MASS3/Compl.shtml should help you out.
Ben Bolker
On Tue, 28 Aug 2001, Bob Sandefur wrote:
> hi
> Suppose I have a mixture of 2 distributions generated by
>
> rtwonormals <- function(npnt,m1,s1,m2,s2,p2){
> rv<-vector(npnt,mode="numeric")
> for( i in seq(1:npnt)){
> if(runif(1,0,1)<=p2){
> rv[i]<-rnorm(1,m2,s2)
> }
> else{
> rv[i]<-rnorm(1,m1,s1)
> }
> }
> return(rv)
> }
> x <- rtwonormals(50000,0,100,500,500,0.05)
>
> #and I try to fit these with (based on thread: [R] Estimating Weibull Distribution Parameters - very basic question)
>
> loglike<-function(p) -2*sum(log((1-p[5])*dnorm(x,p[1],p[2])+p[5]*dnorm(x,p[3],p[4])))
> optim(c(-20,150,400,600,.035),loglike)
> optim(c(20,70,600,400,.095),loglike)
> optim(c(0,100,500,500,.05),loglike)
> optim(c(-20,150,400,600,.035),loglike)
>
> # three different starting values (1 and 4 the same to check reproducablity) I get:
>
> Version 1.3.0 (2001-06-22) on windoze XP
>
> > optim(c(-20,150,400,600,.035),loglike)
> $par
> [1] 1.28597210 100.53443070 550.06070497 615.06936388 0.04563778
> $value
> [1] 622843.1
> $counts
> function gradient
> 493 NA
> $convergence
> [1] 0
> $message
> NULL
> There were 22 warnings (use warnings() to see them)
>
> > optim(c(20,70,600,400,.095),loglike)
> $par
> [1] 0.62742812 100.15891023 533.25825184 514.63882147 0.04670099
> $value
> [1] 622692.7
> $couns
> function gradient
> 501 NA
> $convergence
> [1] 1 < OPPS
> $message
> NULL
> There were 22 warnings (use warnings() to see them)
>
> > optim(c(0,100,500,500,.05),loglike)
> $par
> [1] 0.56254342 100.03881574 499.47434961 505.59785487 0.04805347
> $value
> [1] 622685
> $counts
> function gradient
> 109 NA
> $convergence
> [1] 0
> $message
> NULL
> There were 21 warnings (use warnings() to see them)
>
> > optim(c(-20,150,400,600,.035),loglike)
> $par
> [1] 1.28597210 100.53443070 550.06070497 615.06936388 0.04563778
> $value
> [1] 622843.1
> $counts
> function gradient
> 493 NA
> $convergence
> [1] 0
> $message
> NULL
> There were 22 warnings (use warnings() to see them)
>
>
> Questions:
> 1) Did I mess up anything in the formulae?
> 2) Any suggestions for converging to the same value?
> 3) Any suggestions for other methods to get means, stddevs and proportions of the mixture of distributions?
>
> Thanx
>
> Robert (Bob) L Sandefur
> Principal Geostatistician
> Pincock Allen & Holt (A Hart Crowser Company)
> International Minerals Consultants
> 274 Union Suite 200
> Lakewood CO 80228
> USA
> 303 914-4467 v
> 303 987-8907 f
> rls at pincock.com
>
>
> ~
> ~
>
> -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
> r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
> Send "info", "help", or "[un]subscribe"
> (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch
> _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
>
--
318 Carr Hall bolker at zoo.ufl.edu
Zoology Department, University of Florida http://www.zoo.ufl.edu/bolker
Box 118525 (ph) 352-392-5697
Gainesville, FL 32611-8525 (fax) 352-392-3704
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
More information about the R-help
mailing list