[R-sig-finance] Distribution Fitting and fitdistr()

Tue Apr 25 12:21:02 CEST 2006

On Mon, 2006-04-24 at 23:09 -0400, Krishna Kumar wrote:
> Lorenzo Isella wrote:
> 
> >Dear All,
> >I think I understood how to use the fitdistr() function to fit random
> >draws to a certain distribution.
> >I first tested it on a unimodal Gaussian distribution:
> >
> >  
> >
> 
> This only works for distributions that are specified below
>  >>>
> Distributions '"beta"', '"cauchy"', '"chi-squared"',
>           '"exponential"', '"f"', '"gamma"', '"geometric"',
>           '"log-normal"', '"lognormal"', '"logistic"', '"negative
>           binomial"', '"normal"', '"Poisson"', '"t"' and '"weibull"'
>           are recognised, case being ignored.
>  >>>>>>>>
> 
> >Then I tested it on a set of draws from a mixture made up of three
> >Gaussian distributions.
> >This time I used 9 fitting parameters:
> >- 3 weights N1,N2,N3 for the components of the Gaussian mixture
> >- 3 standard deviations sig1,sig2,sig3
> >- 3 means mu1,mu2,mu3
> >
> >  
> >
> 
> The package nor1Mix has density , distribution and rng for mixtures but 
> no fitting so here is a  fitting function..
> 
> mydistr<-function(x,mu,sig,wt)
> {
> sum(wt*dnorm(x,mu,sig))
> }
> 
> mydistrmle<-function(x, y = x) {
> 
> #mu goes from x1:3 sig goes x 4:6 and wt goes x 7:9
>         f = -sum(log(mydistr(y, c(x[1],x[2],x[3]),     
> c(x[4],x[5],x[6]),     c(x[7],x[8],x[9])  )))
>         cat("\n Function value:  ", -f)
>         cat("\n  Estimated parameters:       ", 
> c(x[1],x[2],x[3]),c(x[4],x[5],x[6]),c(x[7],x[8],x[9]), "\n")
> 
>         f
>     }
>    
> 
> 
> #generate 1000 samples from norMix  mixture MW.nm2
> x<-rnorMix(1000,MW.nm2)
> hist(rnorMix(1000,MW.nm2))
> #  Give it a starting values
> mu<-c(-0.2,0.4,0.9)
> sigma<-c(0.2, 0.2, 0.6)
> wt<-c(0.3,0.3,0.2)
> r<-optim(c(mu,sigma,wt),mydistrmle,control=list(maxit=50000))
> print(matrix(r$par,3,3))
> print(MW.nm2)
> 
> You will have to add additional constraints on the "wt" etc.  ( see 
> ?constrOptim)
> 
> Further I don't think there is a guaranteed unique decomposition of 
> variance between the three gaussians. Just try different starting values
> and also is there a reason to stop at 3 gaussians ?..
> 
> Krishna

Thanks. There is not a particular reason to stop at three Gaussians
apart from the fact that I know that the data I will soon be given show
a 3 modal distribution.
Apart from implementing some extra constrains, are these sort of
problems numerically stable? I get plenty of warnings when I run your
code, although it executes.
Are those warnings anything to worry about or do they simply mean that
in some iterations the optimizer rejects the solution (though they seem
to deal with dnorm())?
Cheers

Lorenzo