[R] MLE Estimation of Gamma Distribution Parameters for data with 'zeros'
Fox, Aaron
Afox at golder.com
Wed Jun 11 20:22:48 CEST 2008
Greetings, all
I am having difficulty getting the fitdistr() function to return without
an error on my data. Specifically, what I'm trying to do is get a
parameter estimation for fracture intensity data in a well / borehole.
Lower bound is 0 (no fractures in the selected data interval), and upper
bound is ~ 10 - 50, depending on what scale you are conducting the
analysis on.
I read in the data from a text file, convert it to numerics, and then
calculate initial estimates of the shape and scale parameters for the
gamma distribution from moments. I then feed this back into the
fitdistr() function.
R code (to this point):
########################################
data.raw=c(readLines("FSM_C_9m_ENE.inp"))
data.num <- as.numeric(data.raw)
data.num
library(MASS)
shape.mom = ((mean(data.num))/ (sd(data.num))^2
shape.mom
med.data = mean(data.num)
sd.data = sd(data.num)
med.data
sd.data
shape.mom = (med.data/sd.data)^2
shape.mom
scale.mom = (sd.data^2)/med.data
scale.mom
fitdistr(data.num,"gamma",list(shape=shape.mom,
scale=scale.mom),lower=0)
fitdistr() returns the following error:
" Error in optim(x = c(0.402707037, 0.403483333, 0.404383704,
2.432626667, :
L-BFGS-B needs finite values of 'fn'"
Next thing I tried was to manually specify the negative log-likelihood
function and pass it straight to mle() (the method specified in Ricci's
tutorial on fitting distributions with R). Basically, I got the same
result as using fitdistr().
Finally I tried using some R code I found from someone with a similar
problem back in 2003 from the archives of this mailing list:
R code
########################################
gamma.param1 <- shape.mom
gamma.param2 <- scale.mom
log.gamma.param1 <- log(gamma.param1)
log.gamma.param2 <- log(gamma.param2)
gammaLoglik <-
function(params,
negative=TRUE){
lglk <- sum(dgamma(data, shape=exp(params[1]), scale=exp(params[2]),
log=TRUE))
if(negative)
return(-lglk)
else
return(lglk)
}
optim.list <- optim(c(log.gamma.param1, log.gamma.param2), gammaLoglik)
gamma.param1 <- exp(optim.list$par[1])
gamma.param2 <- exp(optim.list$par[2])
#########################################
If I test this function using my sample data and the estimates of shape
and scale derived from the method of moments, gammaLogLike returns as
INF. I suspect the problem is that the zeros in the data are causing the
optim solver problems when it attempts to minimize the negative
log-likelihood function.
Can anyone suggest some advice on a work-around? I have seen
suggestions online that a 'censoring' algorithm can allow one to use MLE
methods to estimate the gamma distribution for data with zero values
(Wilkes, 1990, Journal of Climate). I have not, however, found R code to
implement this, and, frankly, am not smart enough to do it myself... :-)
Any suggestions? Has anyone else run up against this and written code to
solve the problem?
Thanks in advance!
Aaron Fox
Senior Project Geologist, Golder Associates
+1 425 882 5484 || +1 425 736 3958 (mobile)
afox at golder.com || www.fracturedreservoirs.com
More information about the R-help
mailing list