[Rd] segfault issue with parallel::mclapply and download.file() on Mac OS X

Martin Maechler m@echler @ending from @t@t@m@th@ethz@ch
Thu Sep 20 10:33:38 CEST 2018


>>>>> Seth Russell 
>>>>>     on Wed, 19 Sep 2018 15:19:48 -0600 writes:

    > I have an lapply function call that I want to parallelize. Below is a very
    > simplified version of the code:

    > url_base <- "https://cloud.r-project.org/src/contrib/"
    > files <- c("A3_1.0.0.tar.gz", "ABC.RAP_0.9.0.tar.gz")
    > res <- parallel::mclapply(files, function(s) download.file(paste0(url_base,
    > s), s))

    > Instead of downloading a couple of files in parallel, I get a segfault per
    > process with a 'memory not mapped' message. I've been working with Henrik
    > Bengtsson on resolving this issue and he recommended I send a message to
    > the R-Devel mailing list.

Thank you for the simple reproducible (*) example.

If I run the above in either R-devel or R 3.5.1, it works
flawlessly [on Linux Fedora 28]. ... Ah, now I see that you say so
much further down, and also that methods other than "libcurl" work.

Note that "libcurl" is also the default method on Linux, where
things work.
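
To compare a macOS setup with a Linux one, it may help to record
which method download.file() is configured to use and which libcurl
build R is linked against. A minimal sketch using only base R
functions; the helper name download_setup() is hypothetical and not
part of R:

```r
## Hypothetical helper (not part of R): collect the settings that decide
## which download method is used and which libcurl R is linked against.
download_setup <- function() {
  list(
    ## "auto" unless the user has set options(download.file.method = ...)
    method      = getOption("download.file.method", default = "auto"),
    ## TRUE if this build of R supports method = "libcurl"
    has_libcurl = unname(capabilities("libcurl")),
    ## version string of the linked libcurl ("" if there is none)
    libcurl     = as.character(libcurlVersion()),
    ## SSL backend reported by libcurl (NULL if not available)
    ssl_backend = attr(libcurlVersion(), "ssl_version"),
    platform    = R.version$platform
  )
}

str(download_setup())
```

Posting this output from both the failing and the working machine
would show whether the two are using different libcurl or SSL builds.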

I've also tried it on the Windows server I have easy access to, and
the following code -- also explicitly using "libcurl" --

##--------------------------------------------------------------
url_base <- "https://cloud.r-project.org/src/contrib/"
files <- c("A3_1.0.0.tar.gz", "ABC.RAP_0.9.0.tar.gz")
res <- parallel::mclapply(files, function(s)
            download.file(paste0(url_base, s), s, method="libcurl"))
##--------------------------------------------------------------

works fine there too.

- So maybe this should have gone to the R-SIG-Mac mailing list
  instead of this one ??

- Can other macOS R users try this and see?
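
For anyone who wants to try on macOS, here is a rough sketch that
runs the same two downloads once per method and labels what each
forked child sent back. It assumes the two tarballs are still
available at those URLs. classify_result() and try_methods() are
hypothetical helper names. The "no result" label relies on mclapply()
returning NULL for a child that died, which is worth checking against
?mclapply for your R version. Since mclapply() relies on forking,
mc.cores > 1 is not supported on Windows, so this check is only
meaningful on a Unix-alike.

```r
url_base <- "https://cloud.r-project.org/src/contrib/"
files <- c("A3_1.0.0.tar.gz", "ABC.RAP_0.9.0.tar.gz")

## Hypothetical helper: label one element of an mclapply() result.
classify_result <- function(x) {
  if (is.null(x)) {
    "no result"   # the child delivered nothing (e.g. it crashed)
  } else if (inherits(x, "try-error")) {
    "error"       # an R-level error was caught in the child
  } else if (identical(as.integer(x), 0L)) {
    "ok"          # download.file() returns 0 on success
  } else {
    "failed"      # non-zero status from download.file()
  }
}

## Hypothetical helper: run the downloads in forked children, once per
## method, and return a character matrix (one column per method).
try_methods <- function(methods = c("libcurl", "curl", "wget"),
                        dest_dir = tempdir()) {
  sapply(methods, function(m) {
    res <- parallel::mclapply(files, function(s) {
      download.file(paste0(url_base, s), file.path(dest_dir, s),
                    method = m, quiet = TRUE)
    }, mc.cores = 2L)
    vapply(res, classify_result, character(1))
  })
}

## Not run here (needs network access and a Unix-alike):
## try_methods()
```

Downloading into tempdir() keeps the test from cluttering the
working directory.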

--
*) at least till one of the 2 packages gets updated ! ;-)

    > Here's the output:

    > trying URL 'https://cloud.r-project.org/src/contrib/A3_1.0.0.tar.gz'
    > trying URL 'https://cloud.r-project.org/src/contrib/ABC.RAP_0.9.0.tar.gz'

    > *** caught segfault ***
    > address 0x11575ba3a, cause 'memory not mapped'

    > *** caught segfault ***
    > address 0x11575ba3a, cause 'memory not mapped'

    > Traceback:
    > 1: download.file(paste0(url_base, s), s)
    > 2: FUN(X[[i]], ...)
    > 3: lapply(X = S, FUN = FUN, ...)
    > 4: doTryCatch(return(expr), name, parentenv, handler)
    > 5: tryCatchOne(expr, names, parentenv, handlers[[1L]])
    > 6: tryCatchList(expr, classes, parentenv, handlers)
    > 7: tryCatch(expr, error = function(e) {    call <- conditionCall(e)    if
    > (!is.null(call)) {        if (identical(call[[1L]], quote(doTryCatch)))
    > call <- sys.call(-4L)        dcall <- deparse(call)[1L]
    > prefix <- paste("Error in", dcall, ": ")
    > LONG <- 75LTraceback:
    > sm <- strsplit(conditionMessage(e), "\n")[[1L]] 1:         w <- 14L
    > + nchar(dcall, type = "w") + nchar(sm[1L], type = "w")        if (is.na(w))
    > download.file(paste0(url_base, s), s)            w <- 14L + nchar(dcall,
    > type = "b") + nchar(sm[1L],
    > type = "b")        if (w > LONG)  2: FUN(X[[i]], ...)
    > 3: lapply(X = S, FUN = FUN, ...)
    > 4: doTryCatch(return(expr), name, parentenv, handler)
    > 5: tryCatchOne(expr, names, parentenv, handlers[[1L]])
    > 6:             prefix <- paste0(prefix, "\n  ")tryCatchList(expr, classes,
    > parentenv, handlers)
    > }    else prefix <- "Error : " 7:     msg <- paste0(prefix,
    > conditionMessage(e), "\n")tryCatch(expr, error = function(e) {
    > .Internal(seterrmessage(msg[1L]))    call <- conditionCall(e)    if
    > (!silent && isTRUE(getOption("show.error.messages"))) {    if
    > (!is.null(call)) {        cat(msg, file = outFile)        if
    > (identical(call[[1L]], quote(doTryCatch)))
    > .Internal(printDeferredWarnings())            call <- sys.call(-4L)    }
    > dcall <- deparse(call)[1L]    invisible(structure(msg, class =
    > "try-error", condition = e))        prefix <- paste("Error in", dcall, ":
    > ")})        LONG <- 75L        sm <- strsplit(conditionMessage(e),
    > "\n")[[1L]]
    > w <- 14L + nchar(dcall, type = "w") + nchar(sm[1L], type = "w")
    > if (is.na(w))  8:             w <- 14L + nchar(dcall, type = "b") +
    > nchar(sm[1L], try(lapply(X = S, FUN = FUN, ...), silent = TRUE)
    > type = "b")
    > if (w > LONG)             prefix <- paste0(prefix, "\n  ") 9:
    > }sendMaster(try(lapply(X = S, FUN = FUN, ...), silent = TRUE))    else
    > prefix <- "Error : "
    > msg <- paste0(prefix, conditionMessage(e), "\n")
    > .Internal(seterrmessage(msg[1L]))10:     if (!silent &&
    > isTRUE(getOption("show.error.messages"))) {FUN(X[[i]], ...)        cat(msg,
    > file = outFile)
    > .Internal(printDeferredWarnings())    }11:
    > invisible(structure(msg, class = "try-error", condition =
    > e))lapply(seq_len(cores), inner.do)})

    > 12:  8: parallel::mclapply(files, function(s)
    > download.file(paste0(url_base, try(lapply(X = S, FUN = FUN, ...), silent =
    > TRUE)    s), s))

    > 9:
    > sendMaster(try(lapply(X = S, FUN = FUN, ...), silent = TRUE))Possible
    > actions:

    > 1: abort (with core dump, if enabled)
    > 2: normal R exit
    > 10: 3: exit R without saving workspace
    > FUN(X[[i]], ...)4: exit R saving workspace

    > 11: lapply(seq_len(cores), inner.do)
    > 12: parallel::mclapply(files, function(s) download.file(paste0(url_base,
    > s), s))

    > Here's my sessionInfo()

    >> sessionInfo()
    > R version 3.5.1 (2018-07-02)
    > Platform: x86_64-apple-darwin16.7.0 (64-bit)
    > Running under: macOS Sierra 10.12.6

    > Matrix products: default
    > BLAS/LAPACK: /usr/local/Cellar/openblas/0.3.3/lib/libopenblasp-r0.3.3.dylib

    > locale:
    > [1] en_US/en_US/en_US/C/en_US/en_US

    > attached base packages:
    > [1] parallel  stats     graphics  grDevices utils     datasets  methods
    > [8] base

    > loaded via a namespace (and not attached):
    > [1] compiler_3.5.1

    > The version of R I'm running was installed via Homebrew with "brew install r
    > --with-java --with-openblas"

    > Also, the provided example code works as expected on Linux. Also, if I
    > provide a non-default download method to the download.file() call such as:

    > res <- parallel::mclapply(files, function(s) download.file(paste0(url_base,
    > s), s, method="wget"))
    > res <- parallel::mclapply(files, function(s) download.file(paste0(url_base,
    > s), s, method="curl"))

    > It works correctly - no segfault. If I use method="libcurl" it does
    > segfault.

    > I'm not sure what steps to take to further narrow down the source of the
    > error.

    > Is this a known bug? If not, is this a new bug or an unexpected feature?

    > Thanks,
    > Seth
