[R-pkg-devel] Re: CRAN Debian error installation time

Ivan Krylov
Tue Jan 28 17:09:45 CET 2025


On Tue, 28 Jan 2025 16:07:38 +0100
Guillermo Vinue <guillermovinue using gmail.com> wrote:

> Thank you for your help, Ben and Ivan. Unfortunately, the note
> persists:
> https://win-builder.r-project.org/incoming_pretest/fawir_1.0_20250128_113725/Debian/00check.log

> The public source repository from my package is here:
> https://github.com/guivivi/fawir

My mistake. The package contains no compiled code, so setting build
system flags won't help:
https://win-builder.r-project.org/incoming_pretest/fawir_1.0_20250128_113725/Debian/00install.out

Please remove the configure script. The problem must be elsewhere.

> - a zzz.R file with this content:
> .onLoad <- function(libname, pkgname) {
>
>   Sys.setenv(OMP_NUM_THREADS = 1)
>   Sys.setenv(MKL_NUM_THREADS = 1)
>   Sys.setenv(OPENBLAS_NUM_THREADS = 1)
>   Sys.setenv(R_INSTALL_NCPUS = 1)
>
>   RcppParallel::setThreadOptions(numThreads = 1)
> }

This is not a good idea. First, the OpenMP standard requires that
changes made by Sys.setenv(OMP_NUM_THREADS=...) after the R process has
started be ignored; they only take effect in newly created child
processes (e.g. system() calls). Second, if a user has set a preferred
number of threads through any of these variables, your package's
.onLoad will override it, with no way to restore the previous value.
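
For the cases where a child process really does need a thread limit,
the variable can be scoped to that child instead of being changed for
the whole session. Below is a minimal sketch using the 'env' argument
of system2(); the child command is only an illustration and is not
taken from the package.

```r
# Path to the Rscript binary that belongs to the running R installation.
rscript <- file.path(R.home("bin"), "Rscript")

# Remember what the parent session has, to show that it stays untouched.
before <- Sys.getenv("OMP_NUM_THREADS", unset = NA)

# 'env' sets the variable for the child process only.
# The child here just reports the value it sees (illustrative command).
child_value <- system2(
  rscript,
  c("-e", shQuote('cat(Sys.getenv("OMP_NUM_THREADS"))')),
  stdout = TRUE,
  env = "OMP_NUM_THREADS=1"
)

# The parent's own setting (or lack of one) is unchanged afterwards.
after <- Sys.getenv("OMP_NUM_THREADS", unset = NA)
```

This way the user's own setting, if any, is never overwritten.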

So what is creating threads while your package is being installed? I
couldn't reproduce the 260% load, but I do see an average CPU load of
150% during R CMD INSTALL on my computer.
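
One rough way to get such a number from within R is to compare CPU time
(including waited-for child processes) with elapsed time. The helper
name below is hypothetical, and I am assuming this ratio is close to
what the check reports; treat it as a sketch, not as the check's actual
implementation.

```r
# Ratio of CPU time to elapsed time for an expression.
# 'cpu_to_elapsed' is a hypothetical helper name, not part of R.
cpu_to_elapsed <- function(expr) {
  # 'expr' is a promise; it is evaluated inside system.time().
  tm <- unclass(system.time(expr))
  # Child times are NA on platforms that do not report them.
  cpu <- sum(tm[c("user.self", "sys.self", "user.child", "sys.child")],
             na.rm = TRUE)
  unname(cpu / tm[["elapsed"]])
}

# Usage (commented out because it installs the package):
# cpu_to_elapsed(
#   system2(file.path(R.home("bin"), "R"),
#           c("CMD", "INSTALL", "fawir_1.0.tar.gz"))
# )
```

A value well above 1 means that more than one core was busy on average.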

R's package installation is hard to trace directly because R uses a
series of child processes to perform the various stages of the package
installation procedure. 'ltrace' <https://ltrace.org/> can serve as a
semi-automatic debugger that can follow the child processes and deliver
a stack trace when the function with a given name is called. Let's try
'pthread_create':

ltrace -w 10 -f -e pthread_create -- \
 sh -c '~/R-build/bin/R CMD INSTALL fawir_1.0.tar.gz'

(skipping a lot of output)

[pid 10154] libgomp.so.1->pthread_create(0x7ffcc6a75678, 0x7f13bc0f1720, 0x7f13bc0c6c40, 0x7ffcc6a75600) = 0
                        omp_fulfill_event (ip = 0x7f13bc0c7391)
                        GOMP_parallel (ip = 0x7f13bc0be0b1)
                        _Z16omp_thread_countv (ip = 0x7f133153cfa1)
                        _rsparse_omp_thread_count (ip = 0x7f133151c17b)
                        R_doDotCall (ip = 0x7f13bc900712)
                        bcEval_loop (ip = 0x7f13bc94502c)
                        bcEval (ip = 0x7f13bc94c902)
                        Rf_eval (ip = 0x7f13bc94cc7b)
                        forcePromise.part.0 (ip = 0x7f13bc94d5db)
                        Rf_eval (ip = 0x7f13bc94cf70)

This is the 'rsparse' package empirically measuring the number of
OpenMP threads:
https://github.com/dselivanov/rsparse/blob/695d4ebb87209d880ddbb25c418252e85264d603/R/zzz.R#L12C19-L12C44
https://github.com/dselivanov/rsparse/blob/695d4ebb87209d880ddbb25c418252e85264d603/R/zzz.R#L41C67-L41C83
https://github.com/dselivanov/rsparse/blob/695d4ebb87209d880ddbb25c418252e85264d603/src/utils.cpp#L87

Arguably, 'rsparse' should be using omp_get_thread_limit() instead of
spawning a lot of threads and then counting them one by one. This has
caused trouble for other packages before:
https://github.com/tidymodels/textrecipes/pull/251#issuecomment-1772868032

-- 
Best regards,
Ivan


