[R-pkg-devel] rstan issue [Was: CRAN submission error when running tests in testthat]
Nathan Green
n8thangreen using yahoo.co.uk
Fri Nov 26 14:10:59 CET 2021
Thanks, everyone, for your help with this. To be honest, the technical details are more than a little above my head. What do you advise as the current solution so that I can resubmit the package to CRAN? BCEA doesn't have a strong dependency on rstan, so could I remove it as a simple solution? Thanks again for all the help!
Nathan
Dr Nathan Green
@: n8thangreen using yahoo.co.uk
Tel: 07821 318353
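On the question of dropping rstan: one possible option, short of removing it, is to keep rstan optional and guard the tests that load a stanfit object so that they do not run on macOS. The following is only a minimal sketch under that assumption. `can_run_stan_tests()` is a hypothetical helper, not part of BCEA or testthat, and it checks for rstan with `system.file()` so that the check itself does not load the rstan namespace.

```r
# Hypothetical helper (not part of BCEA): should tests that load a
# stanfit object run on this machine?
can_run_stan_tests <- function(
    sysname = Sys.info()[["sysname"]],
    rstan_installed = nzchar(system.file(package = "rstan"))) {
  # Skip on macOS, where loading rstan triggered the segfault discussed here.
  if (identical(sysname, "Darwin")) {
    return(FALSE)
  }
  # system.file() returns "" for a missing package and does not load it.
  isTRUE(rstan_installed)
}

# Possible use inside a testthat test (left as a comment because it needs
# the package's own test files):
# test_that("bcea accepts a stanfit object", {
#   testthat::skip_if_not(can_run_stan_tests())
#   load(test_path("data", "stanfit.RData"))
# })
```

The platform check comes first on purpose: because the default for `rstan_installed` is evaluated lazily, nothing rstan-related is even looked up on macOS.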
On Thursday, 25 November 2021, 22:56:45 GMT, Simon Urbanek <simon.urbanek using r-project.org> wrote:
Kevin,
thanks, that's very helpful! So this is a serious bug in rstan - apparently they only do that on macOS which explains why other platforms don't see it:
.onLoad <- function(libname, pkgname) {
  [...]
  ## the tbbmalloc_proxy is not loaded by RcppParallel which is linked
  ## in by default on macOS; unloading only works under R >= 4.0 so that
  ## this is only done for R >= 4.0
  if(R.version$major >= 4 && Sys.info()["sysname"] == "Darwin") {
    tbbmalloc_proxy <- system.file("lib/libtbbmalloc_proxy.dylib", package="RcppParallel", mustWork=FALSE)
    tbbmalloc_proxyDllInfo <<- dyn.load(tbbmalloc_proxy, local = FALSE, now = TRUE)
  }
I can confirm that commenting out that part solves the segfault and BCEA passes the tests.
@Ben, please fix and submit a new version of rstan (see discussion below).
Thanks,
Simon
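Kevin's point in the quoted message, that the proxy should not be loaded by default, could be expressed as an opt-in guard. The following is a minimal sketch only, not rstan's actual fix. The option name `mypkg.load_tbbmalloc_proxy` and the helper `should_load_tbbmalloc_proxy()` are assumptions made for illustration. The sketch also uses `getRversion()` for the version test, since `R.version$major` is a character string and comparing it with `>=` is a string comparison.

```r
# Hypothetical predicate: load the TBB malloc proxy only when the user has
# explicitly opted in. The option name is an assumption for this sketch.
should_load_tbbmalloc_proxy <- function(
    sysname = Sys.info()[["sysname"]],
    r_version = getRversion(),
    opt_in = getOption("mypkg.load_tbbmalloc_proxy", FALSE)) {
  isTRUE(opt_in) &&
    identical(sysname, "Darwin") &&
    r_version >= "4.0.0"  # numeric_version comparison, not string comparison
}

.onLoad <- function(libname, pkgname) {
  if (should_load_tbbmalloc_proxy()) {
    proxy <- system.file("lib/libtbbmalloc_proxy.dylib",
                         package = "RcppParallel", mustWork = FALSE)
    # system.file() returns "" when the file is not found.
    if (nzchar(proxy)) {
      dyn.load(proxy, local = FALSE, now = TRUE)
    }
  }
  invisible(NULL)
}
```

With the default of `FALSE`, the allocator is never swapped at runtime unless a user sets the option before the package is loaded.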
> On Nov 26, 2021, at 11:19 AM, Kevin Ushey <kevinushey using gmail.com> wrote:
>
> That shouldn't be happening, at least not by default. However, RcppParallel does ship with tbbmalloc_proxy, which is a library that, when loaded, will overload the default allocators to use TBB's allocators instead. The intention is normally that these libraries would be loaded via e.g. LD_PRELOAD or something similar, since changing the allocator at runtime would cause these sorts of issues.
>
> If I test with the following:
>
> trace(dyn.load, quote({
> if (grepl("tbbmalloc_proxy", x))
> print(rlang::trace_back())
> }), print = FALSE)
>
> devtools::test()
>
> then I see:
>
> 1. ├─base::load(test_path("data", "stanfit.RData")) at test-bcea.R:179:2
> 2. └─base::..getNamespace(`<chr>`, "stanfit")
> 3. ├─base::tryCatch(...)
> 4. │ └─base tryCatchList(expr, classes, parentenv, handlers)
> 5. │ └─base tryCatchOne(expr, names, parentenv, handlers[[1L]])
> 6. │ └─base doTryCatch(return(expr), name, parentenv, handler)
> 7. └─base::loadNamespace(name)
> 8. └─base runHook(".onLoad", env, package.lib, package)
> 9. ├─base::tryCatch(fun(libname, pkgname), error = identity)
> 10. │ └─base tryCatchList(expr, classes, parentenv, handlers)
> 11. │ └─base tryCatchOne(expr, names, parentenv, handlers[[1L]])
> 12. │ └─base doTryCatch(return(expr), name, parentenv, handler)
> 13. └─rstan fun(libname, pkgname)
> 14. └─base::dyn.load(tbbmalloc_proxy, local = FALSE, now = TRUE)
>
> My guess is that the 'rstan' package is trying to forcefully load libtbbmalloc_proxy.dylib at runtime, and that's causing the issue. IMHO 'rstan' shouldn't be doing that, at least definitely not by default.
>
> Best,
> Kevin
>
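As a complement to tracing `dyn.load()`, one could also check after the fact whether a TBB allocator library has ended up in the session. A minimal sketch, assuming the library is registered under a name containing "tbbmalloc"; `loaded_dlls_matching()` is a hypothetical helper built on base R's `getLoadedDLLs()`.

```r
# Hypothetical helper: names of currently loaded DLLs that contain `pattern`.
loaded_dlls_matching <- function(pattern,
                                 dll_names = names(getLoadedDLLs())) {
  grep(pattern, dll_names, value = TRUE, fixed = TRUE)
}

# In an affected session this would be expected to list the proxy library;
# in a clean session it returns character(0).
loaded_dlls_matching("tbbmalloc")
```

Unlike the `trace()` approach, this does not show who loaded the library, only whether it is present.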
> On Thu, Nov 25, 2021 at 12:54 PM Simon Urbanek <simon.urbanek using r-project.org> wrote:
> Nathan,
>
> testthat is notorious for obfuscation and unhelpful output as can be clearly seen in the head of testthat.Rout.fail:
>
> > library(testthat)
> > library(BCEA)
>
> Attaching package: 'BCEA'
>
> The following object is masked from 'package:graphics':
>
> contour
>
> >
> > test_check("BCEA")
>
> *** caught segfault ***
> address 0x10d492ffc, cause 'memory not mapped'
>
> However, this appears to be hard to debug, because it is fallout from some memory corruption and/or allocator mismatch: the crash happens in free() while doing GC (see below). Since it happens in the GC, many bad things happen afterwards.
> With some lldb magic I could trace that the crash happens during
> ..getNamespace(c("Matrix", "1.3-3"), "stanfit")
> load(test_path("data", "stanfit.RData"))
> but as I said that's likely too late - the memory corruption/issue likely happened before. Since BCEA itself doesn't have native code, this is likely a bug in some of the packages it depends on, but quite a serious one since it affects subsequent code in R.
>
> The list of packages loaded at the time of the crash - so one of them is the culprit:
>
> [1] "rstan" "tidyselect" "purrr" "reshape2"
> [5] "lattice" "V8" "colorspace" "vctrs"
> [9] "generics" "testthat" "stats4" "BCEA"
> [13] "loo" "grDevices" "R2jags" "utf8"
> [17] "rlang" "pkgbuild" "pillar" "glue"
> [21] "withr" "DBI" "matrixStats" "lifecycle"
> [25] "plyr" "stringr" "munsell" "gtable"
> [29] "coda" "codetools" "inline" "callr"
> [33] "ps" "parallel" "curl" "fansi"
> [37] "methods" "Rcpp" "scales" "desc"
> [41] "RcppParallel" "StanHeaders" "GrassmannOptim" "jsonlite"
> [45] "abind" "gridExtra" "winch" "rjags"
> [49] "ggplot2" "stats" "datasets" "graphics"
> [53] "stringi" "processx" "dplyr" "grid"
> [57] "rprojroot" "cli" "tools" "magrittr"
> [61] "tibble" "crayon" "pkgconfig" "Matrix"
> [65] "MASS" "ellipsis" "utils" "prettyunits"
> [69] "assertthat" "base" "boot" "R6"
> [73] "R2WinBUGS" "compiler"
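Since only packages with native code can corrupt memory in this way, a list like the one above can be narrowed to the namespaces that also have a DLL registered under the same name. This is a rough heuristic sketch: `namespaces_with_native_code()` is a hypothetical helper, and it will miss libraries loaded directly with `dyn.load()` under a different name (as turned out to be relevant in this thread).

```r
# Hypothetical helper: loaded namespaces that have a DLL of the same name.
namespaces_with_native_code <- function(
    namespaces = loadedNamespaces(),
    dll_names = names(getLoadedDLLs())) {
  sort(intersect(namespaces, dll_names))
}

# Candidates in the current session (output depends on what is loaded).
namespaces_with_native_code()
```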
>
> My guess would be that the issue could be in RcppParallel which overrides the memory allocator:
>
> * thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x11649fffc)
> * frame #0: 0x00000001097c517f libtbbmalloc.dylib`__TBB_malloc_safer_msize + 63
> frame #1: 0x00007fff76f746fd libsystem_malloc.dylib`free + 96
> frame #2: 0x00000001001c9227 libR.dylib`RunGenCollect at memory.c:1114 [opt]
> frame #3: 0x00000001001c9038 libR.dylib`RunGenCollect(size_needed=0) at memory.c:1896 [opt]
> frame #4: 0x00000001001bf769 libR.dylib`R_gc_internal(size_needed=0) at memory.c:3129 [opt]
>
> (lldb) image lookup -va 0x00000001097c517f
> Address: libtbbmalloc.dylib[0x000000000001117f] (libtbbmalloc.dylib.__TEXT.__text + 65375)
> Summary: libtbbmalloc.dylib`__TBB_malloc_safer_msize + 63
> Module: file = "/Volumes/Builds/packages/high-sierra-x86_64/Rlib/4.1/RcppParallel/lib/libtbbmalloc.dylib", arch = "x86_64"
> Symbol: id = {0x0000060c}, range = [0x00000001097c5140-0x00000001097c5290), mangled="__TBB_malloc_safer_msize"
>
> but that's just a wild guess... (CCing Kevin just in case he can shed light on whether the TBB allocator should be involved in regular R garbage collection).
>
> Cheers,
> Simon
>
>
>
> > On Nov 25, 2021, at 5:37 AM, Nathan Green via R-package-devel <r-package-devel using r-project.org> wrote:
> >
> > Hi,
> > I'm getting an ERROR when submitting a new release of our package BCEA to CRAN which I'm having problems understanding and reproducing. It's passing CHECK locally and the GitHub Actions standard check (https://github.com/n8thangreen/BCEA/actions/runs/1494595896).
> > The message is something to do with testthat. Any help would be gratefully received.
> > Thanks!
> > Nathan
> >
> > From https://cran.r-project.org/web/checks/check_results_BCEA.html
> > Here's the error message:
> > Check: tests, Result: ERROR
> > Running ‘testthat.R’ [5s/5s]
> > Running the tests in ‘tests/testthat.R’ failed.
> > Last 13 lines of output:
> > 33: tryCatch(withCallingHandlers({ eval(code, test_env) if (!handled && !is.null(test)) { skip_empty() }}, expectation = handle_expectation, skip = handle_skip, warning = handle_warning, message = handle_message, error = handle_error), error = handle_fatal, skip = function(e) { })
> > 34: test_code(NULL, exprs, env)
> > 35: source_file(path, child_env(env), wrap = wrap)
> > 36: FUN(X[[i]], ...)
> > 37: lapply(test_paths, test_one_file, env = env, wrap = wrap)
> > 38: doTryCatch(return(expr), name, parentenv, handler)
> > 39: tryCatchOne(expr, names, parentenv, handlers[[1L]])
> > 40: tryCatchList(expr, classes, parentenv, handlers)
> > 41: tryCatch(code, testthat_abort_reporter = function(cnd) { cat(conditionMessage(cnd), "\n") NULL})
> > 42: with_reporter(reporters$multi, lapply(test_paths, test_one_file, env = env, wrap = wrap))
> > 43: test_files(test_dir = test_dir, test_package = test_package, test_paths = test_paths, load_helpers = load_helpers, reporter = reporter, env = env, stop_on_failure = stop_on_failure, stop_on_warning = stop_on_warning, wrap = wrap, load_package = load_package)
> > 44: test_files(test_dir = path, test_paths = test_paths, test_package = package, reporter = reporter, load_helpers = load_helpers, env = env, stop_on_failure = stop_on_failure, stop_on_warning = stop_on_warning, wrap = wrap, load_package = load_package, parallel = parallel)
> > 45: test_dir("testthat", package = package, reporter = reporter, ..., load_package = "installed")
> > 46: test_check("BCEA")
> > An irrecoverable exception occurred. R is aborting now ...
> > See: <https://www.r-project.org/nosvn/R.check/r-release-macos-x86_64/BCEA-00check.html>,
> > <https://www.r-project.org/nosvn/R.check/r-oldrel-macos-x86_64/BCEA-00check.html>
> >
> > Dr Nathan Green
> > @: n8thangreen using yahoo.co.uk
> > Tel: 07821 318353
> >
> >
> > [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-package-devel using r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-package-devel
> >
>