[R] MLE packages

Mon Oct 21 15:58:52 CEST 2019

I'm fairly new to R. The language is amazing, but I'm having trouble
navigating packages. I have a solution that handles the problems I'm
working on, but I don't know whether they could be solved more cleanly with
mle, bbmle, maxLik, etc.

Here's an example problem first. I have run many WAV files through voice
recognition software; the software returns 50 hypotheses for each, together
with scores S_{ni} indicating how 'good' the i^th hypothesis is. I want to
map the S_{ni} to a probability distribution. So I'm using MLE to fit a
function f that maps scores to logs of relative probabilities. This means
maximising

\sum_n[   f(S_{nc_n}) - \log \sum_i \exp f(S_{ni})   ]

where c_n is the index of the correct hypothesis for the n^th sample.
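
To make the per-sample term concrete, here is a minimal sketch in base R for a single sample. The scores, the index of the correct hypothesis, and the slope 0.7 are made up for illustration, and log_sum_exp is a hypothetical helper rather than a function from any package. It shows that f(S_{nc_n}) - \log \sum_i \exp f(S_{ni}) is just the log of the softmax probability assigned to the correct hypothesis.

```r
# Hypothetical scores for one sample with 4 hypotheses (illustrative values only)
S   <- c(1.0, 3.0, 0.5, 2.0)
c_n <- 2                        # assumed index of the correct hypothesis

# Illustrative linear score map; the slope 0.7 is arbitrary
f <- function(x) 0.7 * x

# Numerically stable log(sum(exp(v))) (hypothetical helper)
log_sum_exp <- function(v) {
  m <- max(v)
  m + log(sum(exp(v - m)))
}

# Per-sample term of the objective
log_lik_n <- f(S[c_n]) - log_sum_exp(f(S))

# Equivalent view: normalise exp(f(S)) into a probability distribution
probs <- exp(f(S) - log_sum_exp(f(S)))
```

Here probs sums to 1, and log(probs[c_n]) equals log_lik_n, so maximising the sum over n is ordinary maximum likelihood for a softmax model.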

Here's the code:

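library(dplyr)  # provides %>% and filter()

# Mean log-likelihood of the correct hypotheses under the score map f
ave_log_likelihood = function(f, scores) {
    # Keep only the samples whose Sc is positive
    def = scores %>% filter(Sc > 0)
    # Per-sample term: f(S_{nc_n}) - log sum_i exp f(S_{ni})
    log_likelihoods = with(def,
        f(Sc) - matrixStats::rowLogSumExps(f(S), na.rm = TRUE))
    return(mean(log_likelihoods))
}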

library(nloptr)

nlopts = list(algorithm = "NLOPT_LN_BOBYQA", maxeval = 500, print_level = 0)

best_linear_fit = function(scores) {
  # Minimise the negative mean log-likelihood over the slope a of f(x) = a * x
  res <- nloptr(c(0.01),
                function(a) -ave_log_likelihood(function(x) a * x, scores),
                opts = nlopts)
  return(data.frame(log_likelihood = -res$objective,
                    slope = res$solution,
                    doubling = log(2) / res$solution))
}

Now, I need to write a lot of variants of this with different objectives
and with different classes of function. But there's a lot of verbiage in
best_linear_fit which would currently be copy/pasted. Also, as written it
makes it messy to fit on training data and then evaluate on test data.
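
To illustrate the kind of factoring I have in mind, here is a rough sketch in base R only. It is not taken from any of the packages above. The names row_log_sum_exp, ave_log_lik and fit_score_map, the "score_map" class, and the simulated data are all hypothetical, and stats::optim stands in for nloptr just to keep the example free of dependencies. The function family is passed as f(x, theta), so a new class of function only needs a new f and a starting value, and a predict method lets the fitted map be applied to held-out data.

```r
# Row-wise log(sum(exp(.))) in base R (stand-in for matrixStats::rowLogSumExps)
row_log_sum_exp <- function(m) {
  mx <- apply(m, 1, max)
  mx + log(rowSums(exp(m - mx)))  # mx is recycled down columns, i.e. per row
}

# Mean log-likelihood for a parametric score map f(x, theta).
# Sc: score of the correct hypothesis per sample; S: matrix of all scores.
ave_log_lik <- function(theta, f, Sc, S) {
  mean(f(Sc, theta) - row_log_sum_exp(f(S, theta)))
}

# Generic fitter: only f and start change between function families
fit_score_map <- function(f, start, Sc, S) {
  res <- optim(start, function(theta) -ave_log_lik(theta, f, Sc, S),
               method = "BFGS")
  structure(list(f = f, theta = res$par, log_likelihood = -res$value),
            class = "score_map")
}

# predict() method: one probability distribution over hypotheses per row of S
predict.score_map <- function(object, S, ...) {
  z <- object$f(S, object$theta)
  exp(z - row_log_sum_exp(z))
}

# Simulated data (assumption: true map is linear with slope 2)
set.seed(1)
N <- 200; K <- 5
S   <- matrix(rnorm(N * K), N, K)
idx <- apply(exp(2 * S), 1, function(p) sample.int(K, 1, prob = p))
Sc  <- S[cbind(seq_len(N), idx)]

train <- 1:150
test  <- 151:200

# Fit a linear map on the training rows, then evaluate on the test rows
linear   <- function(x, theta) theta[1] * x
fit      <- fit_score_map(linear, start = 0.01, Sc[train], S[train, ])
test_ll  <- ave_log_lik(fit$theta, fit$f, Sc[test], S[test, ])
test_prb <- predict(fit, S[test, ])
```

Swapping in, say, function(x, theta) theta[1] * x + theta[2] * x^2 with start = c(0.01, 0) would reuse everything else unchanged. A different objective would mean replacing ave_log_lik, which in this sketch is hard-wired into fit_score_map and would need to become an argument.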

I'd appreciate any advice on packages that might make it easier to write
this more cleanly, ideally using the idioms used in `lm`, etc., such as
formulae and `predict`. (Any pointers on writing cleaner R code would also
be lovely!)

Thanks in advance,
Mohan
