[R] Est. Component Size with AIC/BIC under Gamma Distribution

Edward Wijaya ewijaya at gmail.com
Fri May 23 08:40:56 CEST 2008


Dear all,

I am trying to model number of samples from
a given series. The series are modelled according
Gamma function.

In order to estimate the # samples, I use BIC/AIC
with MLE (computed from dgamma function).

Here is the code I have.

__BEGIN__

mlogl <- function( x_func, theta_func, samp) {
   # computing log_likelihood
    return( - sum(dgamma(samp, shape = x_func, scale=theta_func, log = TRUE)))
}


find_bic <- function(mll,smpl,k) {

      bic <- (-2 * mll) + (k * log(length(smpl)))
      bic
}

find_aic <- function(mll,smpl,k) {

      aic <- (-2 * mll) + (k * 2)
      aic

}

mlogl_process <- function(smpl,error,start ) {

    # EM algorithm to estimate max loglikelihood
    thetalast <- 0
    thetacurrent <- start
    theta <- start
    difference <- start
    thetaprocess <- start
    best <- 0

    while (difference > error) {
        mlogl_out <- nlm(mlogl, mean(smpl),  theta_func=thetacurrent, samp=smpl)
        theta <- mlogl_out$estimate
        thetalast <- thetacurrent
        thetacurrent <- theta
        difference <- abs(thetalast - thetacurrent)

        # E-STEP
        new_maxlogl <- nlm(mlogl, mean(smpl),  theta_func=theta,  samp=smpl)

        # M-STEP
        if (new_maxlogl$minimum > best) {
           best <- new_maxlogl$minimum

        }
    }
    best
}

# main program

# my samples
vsamples<- c(14.7, 18.8, 14, 15.9, 9.7, 12.8)

# initialize
start_ <- 300
error_ <- 0.001
thetastart <- 1
k <- 5

# compute AIC/BIC for k component

for (nofc in 1:k {
  cat("k = ", nofc, "\n")
  maxlogl <- -(mlogl_process(vsamples,error_,thetastart))
  bic_k <- find_bic(maxlogl,vsamples,nofc)
  aic_k <- find_aic(maxlogl,vsamples,nofc)
  cat("BIC = ", bic_k, " - AIC = ", aic_k, "- MLL= ", maxlogl,"\n")
}

__ END__

We finally expect to choose K with the smallest AIC/BIC
value.

However, the problem I have is that the AIC/BIC value is always
on the increase.

Can anybody advice what's wrong with my above approach?

Regards,
Edward



More information about the R-help mailing list