[Rd] Arima_Like() and NaN - a (possible) problem, a patch, and RFC
Mauro Andreolini
mauro.andreolini at unimore.it
Thu Jan 29 19:51:31 CET 2009
Hi,
recently I have started working with R (v. 2.7.2), and I have been using
R's internal ARIMA_Like() function (from the "stats" package) to
estimate some ARIMA models. In particular, I use ARIMA_Like() in a
function "fn()" that I feed to the optim() method; the main goal is to
find optimal ARIMA prediction models for some time series.
The ARIMA_Like() function returns a three elements vector; under some
conditions (that I could not yet spot), the second element of this
vector is a 'NaN'. Since fn() is using this value to compute its return
value, it suddenly returns 'NaN' and optim() warns me about it:
Error in optim(init[mask], armafn, method = "BFGS", hessian = TRUE,
control = optim.control, :
non-finite finite-difference value [2]
I looked into the code (file src/arima.c of the stats package) and
noticed that this second element is a sum of logarithmic terms, computed
through the following snippet of code:
gain = M[0];
for (j = 0; j < d; j++) gain += delta[j] * M[r + j];
if(gain < 1e4) {
nu++;
ssq += resid * resid / gain;
sumlog += log(gain);
}
Here, sumlog is the second element of the resulting vector. However, the
"if(gain < 1e4) {" check does not explicitly check against negative
values of the gain variable. Indeed, whenever the gain variable assumes
a negative value, the statement "sumlog += log(gain);" evalutes to NaN.
I changed the check as follows:
if (gain > 0 && gain < 1e4) {
This avoids computation of logarithms on negative values. I recompiled
and reinstalled R, and the sumlog value is no more 'NaN'. As a result,
optim() never warns about the non-finite finite-difference value.
Here is my question: does this modification make any sense? Have I
missed something big? To me, it looks reasonable to avoid computing
log(x) when x < 0, but maybe returning 'NaN' may have its purposes.
Could someone please clarify this? I searched the mailing list archives
and I could not spot anything even close to this argument, which may be
an indication that I am doing something really wrong, but I would like
to understand why.
Best regards
Mauro Andreolini
More information about the R-devel
mailing list