[Rd] Ancient C /Fortran code linpack error

Thu Feb 9 17:44:01 CET 2017

> > On 9 Feb 2017, at 16:00, Göran Broström <goran.brostrom at umu.se> wrote:
> > 
> > In my package 'glmmML' I'm using old C code and linpack in the optimizing procedure. Specifically, one part of the code looks like this:
> > 
> >    F77_CALL(dpoco)(*hessian, &bdim, &bdim, &rcond, work, info);
> >    if (*info == 0){
> >        F77_CALL(dpodi)(*hessian, &bdim, &bdim, det, &job);
> >        ........
> > 
> > This usually works OK, but with an ill-conditioned data set (from a user of glmmML) it happened that the hessian was all nan. However, dpoco returned *info = 0 (no error!) and then the call to dpodi hanged R!
> > 
> > I googled for C and nan and found a work-around: Change 'if ...' to
> > 
> >   if (*info == 0 & (hessian[0][0] == hessian[0][0])){
> > 
> > which works as a test of hessian[0][0] (not) being NaN.
> > 
> > I'm using the .C interface for calling C code.
> > 
> > Any thoughts on how to best handle the situation? Is this a bug in dpoco? Is there a simple way to test for any NaNs in a vector?

> You should/could use macro R_FINITE to test each entry of the hessian.
> In package nleqslv I test for a "correct" jacobian like this in file nleqslv.c in function fcnjac:

>     for (j = 0; j < *n; j++)
>         for (i = 0; i < *n; i++) {
>             if( !R_FINITE(REAL(sexp_fjac)[(*n)*j + i]) )
>                 error("non-finite value(s) returned by jacobian (row=%d,col=%d)",i+1,j+1);
>             rjac[(*ldr)*j + i] = REAL(sexp_fjac)[(*n)*j + i];
>         }

A minor hint  on that:  While REAL(.)  (or INTEGER(.) ...)  is really cheap in
the R sources themselves, that is not the case in package code.

Hence, not only nicer to read but even faster is

  double *fj = REAL(sexp_fjac);
  for (j = 0; j < *n; j++)
    for (i = 0; i < *n; i++) {
        if( !R_FINITE(fj[(*n)*j + i]) )
           error("non-finite value(s) returned by jacobian (row=%d,col=%d)",i+1,j+1);
           rjac[(*ldr)*j + i] = fj[(*n)*j + i];
     }

> There may be a more compact way with a macro in the R headers.
> I feel that If other code can't handle non-finite values: then test.

> Berend Hasselman

Indeed: do test.
Much better safe than going for speed and losing in rare cases! 

The latter is a recipe to airplanes falling out of the sky  ( ;-\ )
and is unfortunately used by some (in)famous "optimized" (fast but
sometimes wrong!!) Lapack/BLAS libraries.

The NEWS about the next version of R (3.4.0 due in April) has
a new 2-paragraph entry related to this:

---------------------------------------------------------------------------------

  SIGNIFICANT USER-VISIBLE CHANGES:

[...........]

    * Matrix products now consistently bypass BLAS when the inputs have
      NaN/Inf values. Performance of the check of inputs has been
      improved. Performance when BLAS is used is improved for
      matrix/vector and vector/matrix multiplication (DGEMV is now used
      instead of DGEMM).

      One can now choose from alternative matrix product
      implementations via options(matprod = ).  The "internal"
      implementation is unoptimized but consistent in precision with
      other summation in R (uses long double accumulators).  "blas"
      calls BLAS directly for best performance, yet usually with
      undefined behavior for inputs with NaN/Inf.

---------------------------------------------------------------------------------

Last but not least :

If you are not afraid of +/- Inf, but really only of NA/NaN's (as the OP said), 
then note that "THE manual" (= "Writing R Extensions") does mention
ISNAN(.) almost in the same place as the first occurence of
R_FINITE(.).

Best regards,
Martin Maechler, ETH Zurich