[R] Using .Fortran in R - how can I use file i/o flexibly OR arrays of variable length as arguments?

Sat Feb 5 12:38:41 CET 2022

On Sat, 5 Feb 2022 09:41:02 +0000
Peter Green <mapjg using bristol.ac.uk> wrote:

> Any thought on the other route I suggested, that is, binary file i/o 
> from with the Fortran code (e.g. full use of the connections 
> functionality), again perhaps via C?

I think that Fortran I/O is only forbidden when it comes to units * and
6. It should be allowed to open a unit, use unformatted I/O to write an
array into it and readBin() back on the R side.

On the other hand, the use of connections means that we're calling into
R, which means that we have to interact with the garbage collector. It's
easier to just return the values to R in that case.

> I wondered whether you (or anyone) thought there was a way to do this 
> using C wrapper routines in some way?

I've never done this myself, but I think that the following should work:

/*
 * We'll need a single pointer to everything returned by the function,
 * so we'll store the outputs in a struct. If the function returns only
 * one allocated pointer, we don't need this structure.
 */
struct calculation_result {
 double *a, *b, *c; // ...
};

/* See WRE 5.13 */
static void free_calculation_result(SEXP * ptr) {
 struct calculation_result * res = R_ExternalPtrAddr(ptr);
 if (!res) return;
 /*
  * Call the Fortran function to deallocate a, b, c here, using C
  * interoperability or the F77_CALL macro.
  */
 Free(res);
 R_ClearExternalPtr(ptr);
}

/* See WRE 5.10.1 */
SEXP run_calculation(SEXP arg1, SEXP arg2, ...) {
 /*
  * Allocate the result store and make it known to R, including how to
  * dispose of it.
  */
 SEXP sres = PROTECT(R_MakeExternalPtr(NULL, R_NilValue, R_NilValue));
 R_RegisterCFinalizerEx(sres, free_calculation_result, TRUE);
 /* NOTE: if Calloc fails, it longjmp()s instead of returning NULL */
 struct calculation_result * res = Calloc(1, struct calculation_result);
 R_SetExternalPtrAddr(sres, res);

 /*
  * Call the Fortran function to perform the calculation here, store the
  * pointers it returns inside the res structure.
  */

 /*
  * It is now safe to call into R again. If an error happens or we're
  * interrupted, free_calculation_result will prevent a memory leak.
  * Allocate a result object (e.g. a named list) and store the data from
  * `res` in R vectors.
  */

 UNPROTECT(as many as PROTECT() calls above);
 return the_result_list;
}

Alternatively, provide some place with a static lifetime for the
pointers to the calculation results (put them in a COMMON block with
the SAVE attribute?) and split the code into three functions:

1. Perform the calculations and remember the pointers in variables
   with static lifetime; return the sizes of the arrays.
2. Copy the data from the COMMON pointers to the provided arrays.
3. Deallocate the Fortran-owned COMMON arrays.

Then the R wrapper should do the following:

on.exit(.Fortran('free_fortran_arrays'))
sizes <- .Fortran('perform_calculations', sizes = integer(k))$sizes
result <- .Fortran(
 'copy_fortran_arrays',
 a = double(sizes[1]), b = double(sizes[2]), ...
)

If R is interrupted or has an allocation failure between
perform_calculations and copy_fortran_arrays, the destructor still runs
and prevents resource leaks. This makes the code non-reentrant, but so
is R itself.

A third solution would be to abstract the Fortran code behind a C++
class and use the magic of Rcpp to return the data as R objects while
keeping the promise that the destructor eventually runs. This is more
or less the same as the first solution, but most of the work is done by
Rcpp.

Now that we both had to mention Writing R Extensions, I think we're
firmly in the <r-package-devel using r-project.org> territory.

-- 
Best regards,
Ivan