[R-pkg-devel] OpenMP and CRAN checks

Rodrigo Tobar Carrizo rodrigo.tobarcarrizo using uwa.edu.au
Wed Apr 5 09:45:14 CEST 2023


Hi Dirk,

Thanks for your response. In a way this is more or less what I knew we would have to end up doing (i.e., explicitly limiting ourselves to a maximum of 2 CPUs for anything running on CRAN), but it still confuses me why this didn't just happen automatically, given that CRAN already sets OMP_THREAD_LIMIT=2, which should constrain us.

I've taken your advice on board and put in a very similar function that I now invoke in all examples, tests and vignettes. Our code uses roxygen2, and there are 150 @examples in the inline documentation, so a mildly complex sed command did the trick. The vignettes and tests were not as bad.

We'll try a resubmission and see how we fare.

Thanks again,

Rodrigo
________________________________
From: Dirk Eddelbuettel <edd using debian.org>
Sent: Tuesday, 4 April 2023 22:00
To: Rodrigo Tobar Carrizo <rodrigo.tobarcarrizo using uwa.edu.au>
Cc: r-package-devel using r-project.org <r-package-devel using r-project.org>
Subject: Re: [R-pkg-devel] OpenMP and CRAN checks


Hi Rodrigo,

This came up recently again on social media where I illustrated how the
tiledb package deals with it. So a quick recap:

First off, let's make the goals clear.

We want to _simultaneously_
 - abide by CRAN Policy rules and cap ourselves to two cores there
 - impose no limits on our users: ALL cores ALL the time

The solution we implemented a while ago is to use a function _that is an
opt-in_ and which looks at the standard OpenMP variable as well as at R's own
Ncpus option:

    limitTileDBCores <- function(ncores, verbose=FALSE) {
      if (missing(ncores)) {
        ## start with a simple fallback: 'Ncpus' (if set) or else 2
        ncores <- getOption("Ncpus", 2L)
        ## also consider OMP_THREAD_LIMIT (cf Writing R Extensions), gets NA if envvar unset
        ompcores <- as.integer(Sys.getenv("OMP_THREAD_LIMIT"))
        ## and then keep the smaller
        ncores <- min(na.omit(c(ncores, ompcores)))
      }
      stopifnot(`The 'ncores' argument must be numeric or character` = is.numeric(ncores) || is.character(ncores))
      ## for brevity omitted here how ncores propagates to TileDB library -- creates `cfg`
      if (verbose) message("Limiting TileDB to ",ncores," cores. See ?limitTileDBCores.")
      invisible(cfg)
    }

The key is that we reflect the smaller of Ncpus and OMP_THREAD_LIMIT along
with a fall-back of two in case nothing is set.
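
The resolution rule can be tried on its own, independent of TileDB. The
following is only a minimal sketch: the name `resolve_core_limit` is made up
for illustration and is not part of the tiledb package. It shows what the rule
returns when neither setting, only Ncpus, or both are present.

```r
## Hypothetical helper (illustrative name, not part of tiledb): returns the
## smaller of options("Ncpus") and OMP_THREAD_LIMIT, defaulting to 2.
resolve_core_limit <- function(fallback = 2L) {
  ncores <- getOption("Ncpus", fallback)
  ## as.integer("") is NA, so an unset variable simply drops out below
  ompcores <- suppressWarnings(as.integer(Sys.getenv("OMP_THREAD_LIMIT")))
  min(na.omit(c(ncores, ompcores)))
}

## Remember the current state so it can be restored afterwards
old_opt <- options(Ncpus = NULL)
old_env <- Sys.getenv("OMP_THREAD_LIMIT", unset = NA)

Sys.unsetenv("OMP_THREAD_LIMIT")
nothing_set <- resolve_core_limit()   # neither set: the fallback applies

options(Ncpus = 8L)
ncpus_only <- resolve_core_limit()    # only Ncpus set: Ncpus applies

Sys.setenv(OMP_THREAD_LIMIT = "2")
both_set <- resolve_core_limit()      # both set: the smaller value applies

## Restore the previous option and environment variable
options(old_opt)
if (is.na(old_env)) {
  Sys.unsetenv("OMP_THREAD_LIMIT")
} else {
  Sys.setenv(OMP_THREAD_LIMIT = old_env)
}
```

Under a CRAN-like setup the third case is the relevant one: whatever Ncpus
says, the smaller OMP_THREAD_LIMIT value is used.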

That function is then called (and feeds into the library config) at the
beginning of each
 - help file example
 - unit test file
 - vignette

An example (from a help file) is

    \dontshow{ctx <- tiledb_ctx(limitTileDBCores())}

(where ctx is a context object controlling, inter alia, the thread pool).
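
For a package documented with roxygen2, the same \dontshow{} line can go
directly into the @examples block. The following is a rough sketch only: both
`limit_pkg_threads` and `sum_squares` are hypothetical names, and how the
resolved value reaches the compiled code depends on the package.

```r
## Hypothetical package code: both function names are illustrative.

## Stand-in for a package-specific limiter. A real one would pass the value
## on to the compiled code (for OpenMP, e.g. via omp_set_num_threads()).
limit_pkg_threads <- function() {
  ncores <- getOption("Ncpus", 2L)
  ompcores <- as.integer(Sys.getenv("OMP_THREAD_LIMIT"))
  invisible(min(na.omit(c(ncores, ompcores))))
}

#' Sum of squares (placeholder for a multi-threaded routine)
#'
#' @param x A numeric vector.
#' @return A single numeric value.
#' @examples
#' \dontshow{limit_pkg_threads()}
#' sum_squares(1:10)
#' @export
sum_squares <- function(x) sum(as.numeric(x)^2)
```

The \dontshow{} line is run when the examples are checked but does not appear
in the rendered help page, so the throttle stays out of the user's view.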

By throttling it anywhere CRAN executes code, and using the prescribed
maximum of two cores, we satisfy goal one of not getting thrown off CRAN. By
making it an _explicit_ opt-in we satisfy our goal of never slowing down our
users who (presumably) do not opt in. And those who have, say, Ncpus set (as
I do to spread R's own package installations over all my cores) get the
maximum performance for examples, tests, and vignettes too as they opted in.

"Works for us" as they say.

Hope this helps,  Dirk

--
dirk.eddelbuettel.com | @eddelbuettel | edd using debian.org
