[R] Dealing with -Inf in a maximisation problem.
Rolf Turner
r.turner at auckland.ac.nz
Mon Nov 7 03:39:15 CET 2016
On 07/11/16 14:14, ProfJCNash wrote:
> Rolf, What optimizers did you try? There are a few in the optimrx package on R-forge that handle bounds, and it may be
> useful to set bounds in this case. Transformations using log or exp can be helpful if done carefully, but as you note,
> they can make the function more difficult to optimize.
I can't see how to impose bounds in my circumstances. As you will have
seen from my previous answer to Bill Dunlap's post, the b_i are
probabilities, parametrised in terms of z_i:
b_i = exp(z_i)/[sum_j exp(z_j)] .
It is not at all clear to me how to impose constraints on the z_i that
will bound the b_i away from 0.
I can constrain L <= z_i <= U for all i and get b_i >= exp(L)/[n*exp(U)]
--- I think; I may have things upsidedown and backwards --- but this
leaves an infinitude of choices for L and U.
Also the starting values at each M-step are "naturally" given in terms
of the b_i. I.e. I can calculate "reasonable" values for the b_i and
then transform these to provide starting values for the z_i. The
starting values for z_i might not satisfy a given set of constraints.
I guess I could simply truncate the starting values to fall within the
constraints, but that "feels wrong" to me.
I also worry about the impact that imposing constraints will have on
the monotonicity of the successive values of the expected log likelihood
in the EM algorithm context.
> Be cautious about using the default numerical gradient approximations. optimrx allows selection of the numDeriv grad()
> function, which is quite good. Complex step would be better, but you need a function which can be computed with complex
> arguments. Unfortunately, numerical gradients often step over the cliff edge of computability of the function. The
> bounds are not checked for the step to compute things like (f(x+h) - f(x) / h.
I think I can program up an analytic gradient function. Maybe I'll try
that. I have been reluctant to do so because I have had peculiar (bad)
experiences in the past in trying to use analytic gradients with nlm().
Of course the (analytic) gradient becomes undefined if one of the b_i is 0.
cheers,
Rolf
--
Technical Editor ANZJS
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276
More information about the R-help
mailing list