[R] fitdistr, mle's and gamma distribution

Lourens Olivier Walters lwalters at cs.uct.ac.za
Wed Oct 8 13:56:24 CEST 2003


Thanks, your advice worked. I don't have much experience with maths, and
therefore tried to stay away from dealing with optimization, but going
down to this level opens a lot of possibilities. For the record, the
code I used, as you suggested:

###############
shape <- mean(data)^2/var(data)
scale <- var(data)/mean(data)
gamma.param1 <- shape
gamma.param2 <- scale
log.gamma.param1 <- log(gamma.param1)
log.gamma.param2 <- log(gamma.param2)
                                                                                                                                              gammaLoglik <- function(params, negative=TRUE){
   lglk <- sum(dgamma(data, shape=exp(params[1]), scale=exp(params[2]),
log=TRUE))
   if(negative)
      return(-lglk)
   else
      return(lglk)
}

optim.list <- optim(c(log.gamma.param1, log.gamma.param2), gammaLoglik)
gamma.param1 <- exp(optim.list$par[1])
gamma.param2 <- exp(optim.list$par[2])
################

optim converges and the parameters provide a much better fit to the data
than the method of moments parameters do. 

Armed with this new knowledge, I attempted to write a log(likelihood)
optimization method for the Pareto distribution. My ignorance of math
however shows, as the code does not work. 

Here is what I did:

################
pareto.density <- function(x, alpha, beta, log=FALSE){
   # Pareto distribution not defined for x < alpha
   # Test for values x < alpha, taking into account floating point error
   test.vals <- x-alpha
   error.vals <- test.vals[test.vals<0]
   if (length(error.vals)>0)
      stop("\nERROR: x > alpha in pareto.distr(x, alpha, beta,
+ log=FALSE)")
   density <- beta * (alpha^beta) * (x^(-beta-1))
   if(log)
      return(log(density))
   else
      return(density)
}

data.sorted <- sort(data)
alpha.val <- data.sorted[1]
beta.val <- 1/((1/n) * sum(log(data.sorted/alpha.val)))
log.alpha.val <- log(alpha.val)
log.beta.val <- log(beta.val)

paretoLoglik <- function(params, negative=TRUE){
   lglk <- sum(pareto.density(data.sorted, alpha=exp(params[1]),
+ beta=exp(params[2]), log=TRUE))
   if(negative)
      return(-lglk)
   else
      return(lglk)
}

optim.list <- optim(c(log.alpha.val, log.beta.val), paretoLoglik,
+ method="L-BFGS-B", lower=c(log.alpha.val, 0), upper=c(log.alpha.val,
+ Inf))
pareto.param1 <- exp(optim.list$par[1])
pareto.param2 <- exp(optim.list$par[2])
#################

I fixed the alpha parameter as my Pareto density function produces an
error if a datavalue > alpha.

I get the following output:

> source("browsesessoffplotfitted.R")
Error in optim(c(log.alpha.val, log.beta.val), paretoLoglik, method =
+ "L-BFGS-B",  :
        non-finite finite-difference value [0]

Any ideas would be appreciated, otherwise its back to method of moments
for the Pareto distribution for me :)

Thanks
Lourens

On Tue, 2003-09-30 at 22:07, Spencer Graves wrote:

>       In my experience, the most likely cause of this problem is that 
> optim may try to test nonpositive values for shape or scale.  I avoid 
> this situation by programming the log(likelihood) in terms of log(shape) 
> and log(scale) as follows: 
> 
>  > gammaLoglik <-
> + function(x, logShape, logScale, negative=TRUE){
> + lglk <- sum(dgamma(x, shape=exp(logShape), scale=exp(logScale),
> + log=TRUE))
> + if(negative) return(-lglk) else return(lglk)
> + }
>  > tst <- rgamma(10, 1)
>  > gammaLoglik(tst, 0, 0)
> [1] 12.29849
> 
> Then I then call optim directly to minimize the negative of the 
> log(likelihood). 
> 
>       If I've guessed correctly, this should fix the problem.  If not, 
> let us know. 
> 
>       hope this helps.  spencer graves




More information about the R-help mailing list