[R-pkg-devel] Trouble with long-running tests on CRAN debian server
Dirk Eddelbuettel
edd at debian.org
Mon Aug 21 15:38:02 CEST 2023
On 21 August 2023 at 16:05, Ivan Krylov wrote:
| Dirk is probably right that it's a good idea to have OMP_THREAD_LIMIT=2
| set on the CRAN check machine. Either that, or place the responsibility
| on data.table for setting the right number of threads by default. But
| that's a policy question: should a CRAN package start no more than two
| threads/child processes even if it doesn't know it's running in an
| environment where the CPU time / elapsed time limit is two?
Methinks that given this language in the CRAN Repository Policy

    If running a package uses multiple threads/cores it must never use more
    than two simultaneously: the check farm is a shared resource and will
    typically be running many checks simultaneously.

it would indeed be nice if this variable, and/or equivalent ones, were set.
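For anyone wanting to try the effect locally, here is a minimal sketch that emulates such a setting inside an R session and restores the previous state afterwards. It only shows how the variable can be set and read back from R; OpenMP runtimes typically read it at startup, so whether an already-loaded library picks up the change is an assumption not verified here, and for a real check it is probably safer to set the variable before R starts.

```r
## Sketch only: emulate the OMP_THREAD_LIMIT=2 setting discussed above.
## Remember the previous value (NA if the variable was unset).
old <- Sys.getenv("OMP_THREAD_LIMIT", unset = NA)

Sys.setenv(OMP_THREAD_LIMIT = "2")

## Environment variables are strings, so convert before using as a count.
limit <- as.integer(Sys.getenv("OMP_THREAD_LIMIT"))
message("OMP_THREAD_LIMIT is now '", limit, "'.")

## Restore the earlier state so the rest of the session is unaffected.
if (is.na(old)) {
    Sys.unsetenv("OMP_THREAD_LIMIT")
} else {
    Sys.setenv(OMP_THREAD_LIMIT = old)
}
```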
As I mentioned before, I had long added a similar throttle (not for
data.table) in a package I look after (for work, even). So a similar
throttler with optionality is below. I'll add this to my `dang` package
collecting various functions.
A usage example follows. It does nothing by default, ensuring 'full power'
but reflects the minimum of two possible options, or an explicit count:
> dang::limitDataTableCores(verbose=TRUE)
Limiting data.table to '12'.
> Sys.setenv("OMP_THREAD_LIMIT"=3); dang::limitDataTableCores(verbose=TRUE)
Limiting data.table to '3'.
> options(Ncpus=2); dang::limitDataTableCores(verbose=TRUE)
Limiting data.table to '2'.
> dang::limitDataTableCores(1, verbose=TRUE)
Limiting data.table to '1'.
>
That makes it, in my eyes, preferable to any unconditional 'always pick 1 thread'.
Dirk
##' Set threads for data.table respecting possible local settings
##'
##' This function sets the number of threads \pkg{data.table} will use
##' while reflecting two possible machine-specific settings from the
##' environment variable \sQuote{OMP_THREAD_LIMIT} as well as the R
##' option \sQuote{Ncpus} (used e.g. for parallel builds).
##' @title Set data.table threads respecting default settings
##' @param ncores A numeric or character variable with the desired
##' count of threads to use
##' @param verbose A logical value with a default of \sQuote{FALSE} to
##' operate more verbosely
##' @return The return value of the \pkg{data.table} function
##' \code{setDTthreads} which is called as a side-effect.
##' @author Dirk Eddelbuettel
##' @export
limitDataTableCores <- function(ncores, verbose = FALSE) {
    if (missing(ncores)) {
        ## start with a simple fallback: 'Ncpus' (if set) or else 2
        ncores <- getOption("Ncpus", 2L)
        ## also consider OMP_THREAD_LIMIT (cf Writing R Extensions), gets NA if envvar unset
        ompcores <- as.integer(Sys.getenv("OMP_THREAD_LIMIT"))
        ## and then keep the smaller
        ncores <- min(na.omit(c(ncores, ompcores)))
    }
    stopifnot("Package 'data.table' must be installed." = requireNamespace("data.table", quietly=TRUE))
    stopifnot("Argument 'ncores' must be numeric or character" = is.numeric(ncores) || is.character(ncores))
    if (verbose) message("Limiting data.table to '", ncores, "'.")
    data.table::setDTthreads(ncores)
}
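
The "keep the smaller value" step in the function above can also be looked at in isolation. The following is a minimal sketch, not part of `dang` or `data.table`: the name `resolveThreadLimit` and its two arguments are hypothetical, and the final fallback of 2 for the case where neither setting is usable is an assumption made here to match the policy limit. It takes both settings as explicit arguments so the logic can be checked without touching the session state or needing data.table installed.

```r
## Hypothetical helper (illustration only): pick the smaller of the 'Ncpus'
## option and the OMP_THREAD_LIMIT variable, ignoring whichever is unusable.
resolveThreadLimit <- function(ncpus = getOption("Ncpus", 2L),
                               omp = Sys.getenv("OMP_THREAD_LIMIT")) {
    ## an unset variable arrives as "" and converts to NA;
    ## non-numeric text also becomes NA (its warning is suppressed)
    omp <- suppressWarnings(as.integer(omp))
    ncpus <- suppressWarnings(as.integer(ncpus))
    candidates <- c(ncpus, omp)
    candidates <- candidates[!is.na(candidates)]
    ## assumed fallback when nothing usable is set
    if (length(candidates) == 0L) return(2L)
    min(candidates)
}

resolveThreadLimit(ncpus = 12L, omp = "")    # only 'Ncpus' is known
resolveThreadLimit(ncpus = 12L, omp = "3")   # the smaller value wins
resolveThreadLimit(ncpus = 2L,  omp = "3")
```

Passing the settings as arguments rather than reading them inside the body is a design choice for testability only; the defaults still read the same option and environment variable as the original function.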
|
| --
| Best regards,
| Ivan
|
--
dirk.eddelbuettel.com | @eddelbuettel | edd using debian.org