[Bioc-devel] The story of tracing a derfinder bug on OSX that sometimes popped up, sometimes it didn't. Related to IRanges/S4Vectors '$<-'

Martin Morgan martin.morgan at roswellpark.org
Mon Mar 27 12:59:06 CEST 2017


On 03/22/2017 01:12 PM, Hervé Pagès wrote:
> Hi Martin,
>
> On 03/22/2017 03:17 AM, Martin Maechler wrote:
>>>>>>> Andrzej Oleś <andrzej.oles at gmail.com>
>>>>>>>     on Wed, 22 Mar 2017 10:29:57 +0100 writes:
>>
>>     > Just for the record, on R-3.3.2 Herve's code fails with the
>> following error:
>>     > Error in x[TRUE] <- new("A") :
>>     > incompatible types (from S4 to logical) in subassignment type fix
>>
>> yes, (of course).... and I would be interested in a small
>> reproducible example which uses _valid_ code.
>
> Looks like before performing the subassignment itself, [<- first tries
> to coerce the RHS to the "mode" of the LHS by calling as.vector() on the
> former. So if we define an as.vector S3 method for A objects:
>
>   setClass("A", representation(stuff="numeric"))
>   as.vector.A <- function (x, mode="any") x at stuff
>   a <- new("A", stuff=c(3.5, 0.1))
>   x <- numeric(10)
>   x[3:4] <- a

The relevant stack trace is

   * frame #0: 0x000000010dded77a 
libR.dylib`R_has_methods(op=<unavailable>) + 74 at objects.c:1415
     frame #1: 0x000000010ddaabf4 
libR.dylib`Rf_DispatchOrEval(call=0x00007fcea36f68a8, 
op=0x00007fcea201a178, generic=0x000000010df0a185, args=<unavailable>, 
rho=0x00007fcea2053318, ans=0x00007fff51f60c48, 
dropmissing=<unavailable>, argsevald=1) + 404 at eval.c:3150
     frame #2: 0x000000010de4e658 libR.dylib`SubassignTypeFix [inlined] 
dispatch_asvector(x=<unavailable>, call=0x00007fcea36f68a8, 
rho=0x00007fcea2053318) + 295 at subassign.c:283


The segfault is at objects.c:1415

     offset = PRIMOFFSET(op);
     if(offset > curMaxOffset || prim_methods[offset] == NO_METHODS
        || prim_methods[offset] == SUPPRESSED)

where offset is negative and prim_methods[offset] fails.

(lldb) p *op
(SEXPREC) $8 = {
   sxpinfo = (type = 0, obj = 0, named = 2, gp = 0, mark = 1, debug = 0, 
trace = 0, spare = 0, gcgen = 1, gccls = 0)
   attrib = 0x00007fcea201a178
   gengc_next_node = 0x00007fcea21874e8
   gengc_prev_node = 0x00007fcea2019ff0
   u = {
     primsxp = (offset = -1576951432)
     symsxp = {


'op' is assigned from subassign.c:287, op = R_Primitive("as.vector")

static Rboolean dispatch_asvector(SEXP *x, SEXP call, SEXP rho) {
     static SEXP op = NULL;
     SEXP args;
     Rboolean ans;
     if (op == NULL)
         op = R_Primitive("as.vector");
     PROTECT(args = list2(*x, mkString("any")));
     ans = DispatchOrEval(call, op, "as.vector", args, rho, x, 0, 1);
     UNPROTECT(1);
     return ans;
}

But as.vector is not a primitive, so gets R_NilValue. This is passed to 
DispatchOrEval, and then to R_has_methods.

It seems like dispatch_asvector() was introduced by

$ svn log -c69747
------------------------------------------------------------------------
r69747 | lawrence | 2015-12-09 09:04:56 -0500 (Wed, 09 Dec 2015) | 3 lines

subassignment of an S4 value into an atomic vector coerces the value
with as.vector

------------------------------------------------------------------------

So maybe Michael can tell us about his thinking here.

Also, should R_has_methods be robust to R_NilValue? And R_NilValue 
explicitly zero it's data?

Martin


>
> then the code is now valid and we still get the segfault on Mac.
>
> I didn't define as.vector.A in my original minimalist reproducible
> code in order to keep it as simple as possible.
>
> H.
>
>
>> We have seen such examples with something (more complicated
>> than, but basically like)
>>
>>   df <- data.frame(x=1:5, y=5:1, m=matrix(-pi*1:30, 5,6))
>>   M <- Matrix::Matrix(exp(0:3),2)
>>   df[1:2,1:2] <- M
>>
>> which actually calls `[<-`, and then `[<-.data.frame`  and
>> always works for me but does seg.fault (in the CRAN checks of
>> package FastImputation (on 3 of the dozen platforms,
>> https://urldefense.proofpoint.com/v2/url?u=https-3A__cran.r-2Dproject.org_web_checks_check-5Fresults-5FFastImputation.html&d=DwIGaQ&c=eRAMFD45gAfqt84VtBcfhQ&r=BK7q3XeAvimeWdGbWY_wJYbW0WYiZvSXAJJKaaPhzWA&m=ILfV0tHrE_BxAkWYlvUUwWcBdBdtVD7BlEljGiO3WbY&s=zUahQYlBHRwNf6lPnSA1515Rm-iL5ffQI7hUcDW-JkE&e=
>>
>>
>> one of them is
>>
>>  https://urldefense.proofpoint.com/v2/url?u=https-3A__www.r-2Dproject.org_nosvn_R.check_r-2Ddevel-2Dmacos-2Dx86-5F64-2Dclang_FastImputation-2D00check.html&d=DwIGaQ&c=eRAMFD45gAfqt84VtBcfhQ&r=BK7q3XeAvimeWdGbWY_wJYbW0WYiZvSXAJJKaaPhzWA&m=ILfV0tHrE_BxAkWYlvUUwWcBdBdtVD7BlEljGiO3WbY&s=Z7LkVlUzmdmhqxGNFl4LuMVxYwQQGHSV7KdpKCJu12k&e=
>>
>>
>> I strongly suspect this is the same bug as yours, but for a case
>> where the correct behavior is *not* giving an error.
>>
>> I have also written and shown  Herve's example  to the R-core team.
>>
>> Unfortunately, I have no platform where I can trigger the bug.
>> Martin
>>
>>
>>
>>     > Cheers,
>>     > Andrzej
>>
>>
>>
>>     > On Wed, Mar 22, 2017 at 1:28 AM, Martin Morgan <
>>     > martin.morgan at roswellpark.org> wrote:
>>
>>     >> On 03/21/2017 08:21 PM, Hervé Pagès wrote:
>>     >>
>>     >>> Hi Leonardo,
>>     >>>
>>     >>> Thanks for hunting down and isolating that bug! I tried to
>> simplify
>>     >>> your code even more and was able to get a segfault with just:
>>     >>>
>>     >>> setClass("A", representation(stuff="numeric"))
>>     >>> x <- logical(10)
>>     >>> x[TRUE] <- new("A")
>>     >>>
>>     >>> I get the segfault about 50% of the time on a fresh R session
>> on Mac.
>>     >>> I tried this with R 3.3.3 on Mavericks, and with R devel (r72372)
>>     >>> on El Capitan. I get the segfault on both.
>>     >>>
>>     >>> So it looks like a bug in the `[<-` primitive to me
>> (subassignment).
>>     >>>
>>     >>
>>     >> Any insight from
>>     >>
>>     >> R -d valgrind -f herve.R
>>     >>
>>     >> where herve.R contains the code above?
>>     >>
>>     >> Martin
>>     >>
>>     >>
>>     >>
>>     >>> Cheers,
>>     >>> H.
>>     >>>
>>     >>> On 03/21/2017 03:06 PM, Leonardo Collado Torres wrote:
>>     >>>
>>     >>>> Hi bioc-devel,
>>     >>>>
>>     >>>> This is a story about a bug that took me a long time to
>> trace. The
>>     >>>> behaviour was really weird, so I'm sharing the story in case
>> this
>>     >>>> helps others in the future. I was originally writing it to
>> request
>>     >>>> help, but then I was able to find the issue ^^. The story
>> ends right
>>     >>>> now with code that will reproduce the problem with '$<-' from
>>     >>>> IRanges/S4Vectors.
>>     >>>>
>>     >>>>
>>     >>>>
>>     >>>>
>>     >>>> During this Bioc cycle, frequently my package derfinder has
>> failed to
>>     >>>> pass R CMD check in OSX. The error is always the same when it
>> appears
>>     >>>> and sometimes it shows up in release, but not devel and
>> viceversa.
>>     >>>> Right now (3/21/2017) it's visible in both
>>     >>>> https://urldefense.proofpoint.com/v2/url?u=http-3A__biocondu
>>     >>>> ctor.org_checkResults_release_bioc-2DLATEST_derfinder_
>>     >>>> morelia-2Dchecksrc.html&d=DwIGaQ&c=eRAMFD45gAfqt84VtBcfh
>>     >>>> Q&r=BK7q3XeAvimeWdGbWY_wJYbW0WYiZvSXAJJKaaPhzWA&m=Bw-1Kqy-M_
>>     >>>> t4kmpYWTpYkt5bvj_eTpxriUM3UvtOIzQ&s=RS-lsygPtDdgWKAhjA2BcSLk
>>     >>>> Vy9RxxshXWAJaBZa_Yc&e=
>>     >>>>
>>     >>>> and
>>     >>>> https://urldefense.proofpoint.com/v2/url?u=http-3A__biocondu
>>     >>>> ctor.org_checkResults_devel_bioc-2DLATEST_derfinder_toluca
>>     >>>> 2-2Dchecksrc.html&d=DwIGaQ&c=eRAMFD45gAfqt84VtBcfhQ&r=BK7q3X
>>     >>>> eAvimeWdGbWY_wJYbW0WYiZvSXAJJKaaPhzWA&m=Bw-1Kqy-M_
>>     >>>> t4kmpYWTpYkt5bvj_eTpxriUM3UvtOIzQ&s=a_K-yK7w2LEV72lpHrpp0UoK
>>     >>>> Rru_7Aad74T5Uk0R-Fo&e=
>>     >>>> .
>>     >>>> The end of "test-all.Rout.fail" looks like this:
>>     >>>>
>>     >>>> Loading required package: foreach
>>     >>>> Loading required package: iterators
>>     >>>> Loading required package: locfit
>>     >>>> locfit 1.5-9.1 2013-03-22
>>     >>>> getSegments: segmenting
>>     >>>> getSegments: splitting
>>     >>>> 2017-03-20 02:36:52 findRegions: smoothing
>>     >>>> 2017-03-20 02:36:52 findRegions: identifying potential segments
>>     >>>> 2017-03-20 02:36:52 findRegions: segmenting information
>>     >>>> 2017-03-20 02:36:52 .getSegmentsRle: segmenting with cutoff(s)
>>     >>>> 16.3681899295041
>>     >>>> 2017-03-20 02:36:52 findRegions: identifying candidate regions
>>     >>>> 2017-03-20 02:36:52 findRegions: identifying region clusters
>>     >>>> 2017-03-20 02:36:52 findRegions: smoothing
>>     >>>> 2017-03-20 02:36:52 findRegions: identifying potential segments
>>     >>>> 2017-03-20 02:36:52 findRegions: segmenting information
>>     >>>> 2017-03-20 02:36:52 .getSegmentsRle: segmenting with cutoff(s)
>>     >>>> 19.7936614060235
>>     >>>> 2017-03-20 02:36:52 findRegions: identifying candidate regions
>>     >>>> 2017-03-20 02:36:52 findRegions: identifying region clusters
>>     >>>> 2017-03-20 02:36:52 findRegions: smoothing
>>     >>>>
>>     >>>> *** caught segfault ***
>>     >>>> address 0x7f87d2f917e0, cause 'memory not mapped'
>>     >>>>
>>     >>>> Traceback:
>>     >>>> 1: (function (y, x, cluster, weights, smoothFun, ...) {
>>     >>>> hostPackage <- environmentName(environment(smoothFun))
>>     >>>> requireNamespace(hostPackage)    smoothed <-
>> .runFunFormal(smoothFun,
>>     >>>> y = y, x = x, cluster = cluster,         weights = weights,
>> ...)    if
>>     >>>> (any(!smoothed$smoothed)) {
>> smoothed$fitted[!smoothed$smoothed]
>>     >>>> <- y[!smoothed$smoothed]    }    res <- Rle(smoothed$fitted)
>>     >>>> return(res)})(dots[[1L]][[1L]], dots[[2L]][[1L]],
>> dots[[3L]][[1L]],
>>     >>>> dots[[4L]][[1L]],     smoothFun = function (y, x = NULL,
>> cluster,
>>     >>>> weights = NULL,         minNum = 7, bpSpan = 1000, minInSpan
>> = 0,
>>     >>>> verbose = TRUE)     {        if (is.null(dim(y)))
>> y <-
>>     >>>> matrix(y, ncol = 1)        if (!is.null(weights) &&
>>     >>>> is.null(dim(weights)))             weights <- matrix(weights,
>> ncol =
>>     >>>> 1)        if (is.null(x))             x <- seq(along =
>> y)        if
>>     >>>> (is.null(weights))             weights <- matrix(1, nrow =
>> nrow(y),
>>     >>>> ncol = ncol(y))        Indexes <- split(seq(along = cluster),
>> cluster)
>>     >>>> clusterL <- sapply(Indexes, length)        smoothed <-
>>     >>>> rep(TRUE, nrow(y))        for (i in seq(along = Indexes)) {
>>     >>>> if (verbose)                 if (i%%10000 == 0)
>>     >>>> cat(".")            Index <- Indexes[[i]]            if
>> (clusterL[i]
>>     >>>>
>>     >>>>> = minNum & sum(rowSums(is.na(y[Index,                 , drop =
>>     >>>>>
>>     >>>> FALSE])) == 0) >= minNum) {                nn <-
>>     >>>> minInSpan/length(Index)                for (j in 1:ncol(y)) {
>>     >>>> sdata <- data.frame(pos = x[Index], y = y[Index,
>>     >>>> j], weights = weights[Index, j])                  fit <-
>>     >>>> locfit(y ˜ lp(pos, nn = nn, h = bpSpan),
>> data =
>>     >>>> sdata, weights = weights, family = "gaussian",
>>     >>>> maxk = 10000)                  pp <- preplot(fit, where =
>> "data", band
>>     >>>> = "local",                     newdata = data.frame(pos =
>> x[Index]))
>>     >>>> y[Index, j] <- pp$trans(pp$fit)                }
>>     >>>> }            else {                y[Index, ] <- NA
>>     >>>> smoothed[Index] <- FALSE            }        }
>>     >>>> return(list(fitted = y, smoothed = smoothed, smoother =
>> "locfit"))
>>     >>>> }, verbose = TRUE, minNum = 1435)
>>     >>>> 2: .mapply(.FUN, dots, .MoreArgs)
>>     >>>> 3: FUN(...)
>>     >>>> 4: doTryCatch(return(expr), name, parentenv, handler)
>>     >>>> 5: tryCatchOne(expr, names, parentenv, handlers[[1L]])
>>     >>>> 6: tryCatchList(expr, classes, parentenv, handlers)
>>     >>>> 7: tryCatch({    FUN(...)}, error = handle_error)
>>     >>>> 8: withCallingHandlers({    tryCatch({        FUN(...)    },
>> error =
>>     >>>> handle_error)}, warning = handle_warning)
>>     >>>> 9: FUN(X[[i]], ...)
>>     >>>> 10: lapply(X, FUN, ...)
>>     >>>> 11: bplapply(X = seq_along(ddd[[1L]]), wrap, .FUN = FUN, .ddd
>> = ddd,
>>     >>>> .MoreArgs = MoreArgs, BPREDO = BPREDO, BPPARAM = BPPARAM)
>>     >>>> 12: bplapply(X = seq_along(ddd[[1L]]), wrap, .FUN = FUN, .ddd
>> = ddd,
>>     >>>> .MoreArgs = MoreArgs, BPREDO = BPREDO, BPPARAM = BPPARAM)
>>     >>>> 13: bpmapply(.smoothFstatsFun, fstatsChunks, posChunks,
>> clusterChunks,
>>     >>>> weightChunks, MoreArgs = list(smoothFun = smoothFunction,
>>     >>>> ...), BPPARAM = BPPARAM)
>>     >>>> 14: bpmapply(.smoothFstatsFun, fstatsChunks, posChunks,
>> clusterChunks,
>>     >>>> weightChunks, MoreArgs = list(smoothFun = smoothFunction,
>>     >>>> ...), BPPARAM = BPPARAM)
>>     >>>> 15: .smootherFstats(fstats = fstats, position = position,
>> weights =
>>     >>>> weights,     smoothFunction = smoothFunction, ...)
>>     >>>> 16: findRegions(prep$position, genomeFstats, "chr21", verbose
>> = TRUE,
>>     >>>> smooth = TRUE, minNum = 1435)
>>     >>>> 17: eval(exprs, env)
>>     >>>> 18: eval(exprs, env)
>>     >>>> 19: source_file(path, new.env(parent = env), chdir = TRUE)
>>     >>>> 20: force(code)
>>     >>>> 21: with_reporter(reporter = reporter, start_end_reporter =
>>     >>>> start_end_reporter,     {
>> lister$start_file(basename(path))
>>     >>>> source_file(path, new.env(parent = env), chdir = TRUE)
>>     >>>> end_context()    })
>>     >>>> 22: FUN(X[[i]], ...)
>>     >>>> 23: lapply(paths, test_file, env = env, reporter =
>> current_reporter,
>>     >>>> start_end_reporter = FALSE, load_helpers = FALSE)
>>     >>>> 24: force(code)
>>     >>>> 25: with_reporter(reporter = current_reporter, results <-
>>     >>>> lapply(paths,     test_file, env = env, reporter =
>> current_reporter,
>>     >>>> start_end_reporter = FALSE,     load_helpers = FALSE))
>>     >>>> 26: test_files(paths, reporter = reporter, env = env, ...)
>>     >>>> 27: test_dir(test_path, reporter = reporter, env = env, filter =
>>     >>>> filter,     ...)
>>     >>>> 28: with_top_env(env, {    test_dir(test_path, reporter =
>> reporter,
>>     >>>> env = env, filter = filter,         ...)})
>>     >>>> 29: run_tests(package, test_path, filter, reporter, ...)
>>     >>>> 30: test_check("derfinder")
>>     >>>> An irrecoverable exception occurred. R is aborting now ...
>>     >>>>
>>     >>>> I was finally able to reproduce this error on my Mac OSX
>> laptop after
>>     >>>> running R CMD build and R CMD check (same options as in Bioc)
>> several
>>     >>>> times. It took me a while, but I figured out what's the exact
>> code
>>     >>>> that's failing. It can be reproduced (noting that it won't
>> always
>>     >>>> fail...) in OSX by running:
>>     >>>>
>>     >>>> library('derfinder')
>>     >>>> prep <- preprocessCoverage(genomeData, cutoff=0, scalefac=32,
>>     >>>> chunksize=1e3,
>>     >>>> colsubset=NULL)
>>     >>>> regs_s3 <- findRegions(prep$position, genomeFstats, 'chr21',
>>     >>>> verbose=TRUE, smooth = TRUE, minNum = 1435)
>>     >>>>
>>     >>>>
>>     >>>> Here is the output from my laptop one time it actually failed:
>>     >>>>
>>     >>>> library('derfinder')
>>     >>>>>
>>     >>>> prep <- preprocessCoverage(genomeData, cutoff=0, scalefac=32,
>>     >>>> chunksize=1e3,
>>     >>>> colsubset=NULL)
>>     >>>>
>>     >>>>> prep <- preprocessCoverage(genomeData, cutoff=0, scalefac=32,
>>     >>>>> chunksize=1e3,
>>     >>>>>
>>     >>>> +     colsubset=NULL)
>>     >>>>
>>     >>>>> regs_s3 <- findRegions(prep$position, genomeFstats, 'chr21',
>>     >>>>> verbose=TRUE, smooth = TRUE, minNum = 1435)
>>     >>>>>
>>     >>>> 2017-03-21 16:37:39 findRegions: smoothing
>>     >>>>
>>     >>>> *** caught segfault ***
>>     >>>> address 0x7f958dbf2be0, cause 'memory not mapped'
>>     >>>>
>>     >>>> Traceback:
>>     >>>> 1: (function (y, x, cluster, weights, smoothFun, ...) {
>>     >>>> hostPackage <- environmentName(environment(smoothFun))
>>     >>>> requireNamespace(hostPackage)    smoothed <-
>> .runFunFormal(smoothFun,
>>     >>>> y = y, x = x, cluster = cluster,         weights = weights,
>> ...)    if
>>     >>>> (any(!smoothed$smoothed)) {
>> smoothed$fitted[!smoothed$smoothed]
>>     >>>> <- y[!smoothed$smoothed]    }    res <- Rle(smoothed$fitted)
>>     >>>> return(res)})(dots[[1L]][[1L]], dots[[2L]][[1L]],
>> dots[[3L]][[1L]],
>>     >>>> dots[[4L]][[1L]],     smoothFun = function (y, x = NULL,
>> cluster,
>>     >>>> weights = NULL,         minNum = 7, bpSpan = 1000, minInSpan
>> = 0,
>>     >>>> verbose = TRUE)     {        if (is.null(dim(y)))
>> y <-
>>     >>>> matrix(y, ncol = 1)        if (!is.null(weights) &&
>>     >>>> is.null(dim(weights)))             weights <- matrix(weights,
>> ncol =
>>     >>>> 1)        if (is.null(x))             x <- seq(along =
>> y)        if
>>     >>>> (is.null(weights))             weights <- matrix(1, nrow =
>> nrow(y),
>>     >>>> ncol = ncol(y))        Indexes <- split(seq(along = cluster),
>> cluster)
>>     >>>> clusterL <- sapply(Indexes, length)        smoothed <-
>>     >>>> rep(TRUE, nrow(y))        for (i in seq(along = Indexes)) {
>>     >>>> if (verbose)                 if (i%%10000 == 0)
>>     >>>> cat(".")            Index <- Indexes[[i]]            if
>> (clusterL[i]
>>     >>>>
>>     >>>>> = minNum & sum(rowSums(is.na(y[Index,                 , drop =
>>     >>>>>
>>     >>>> FALSE])) == 0) >= minNum) {                nn <-
>>     >>>> minInSpan/length(Index)                for (j in 1:ncol(y)) {
>>     >>>> sdata <- data.frame(pos = x[Index], y = y[Index,
>>     >>>> j], weights = weights[Index, j])                  fit <-
>>     >>>> locfit(y ~ lp(pos, nn = nn, h = bpSpan),
>> data =
>>     >>>> sdata, weights = weights, family = "gaussian",
>>     >>>> maxk = 10000)                  pp <- preplot(fit, where =
>> "data", band
>>     >>>> = "local",                     newdata = data.frame(pos =
>> x[Index]))
>>     >>>> y[Index, j] <- pp$trans(pp$fit)                }
>>     >>>> }            else {                y[Index, ] <- NA
>>     >>>> smoothed[Index] <- FALSE            }        }
>>     >>>> return(list(fitted = y, smoothed = smoothed, smoother =
>> "locfit"))
>>     >>>> }, verbose = TRUE, minNum = 1435)
>>     >>>> 2: .mapply(.FUN, dots, .MoreArgs)
>>     >>>> 3: FUN(...)
>>     >>>> 4: doTryCatch(return(expr), name, parentenv, handler)
>>     >>>> 5: tryCatchOne(expr, names, parentenv, handlers[[1L]])
>>     >>>> 6: tryCatchList(expr, classes, parentenv, handlers)
>>     >>>> 7: tryCatch({    FUN(...)}, error = handle_error)
>>     >>>> 8: withCallingHandlers({    tryCatch({        FUN(...)    },
>> error =
>>     >>>> handle_error)}, warning = handle_warning)
>>     >>>> 9: FUN(X[[i]], ...)
>>     >>>> 10: lapply(X, FUN, ...)
>>     >>>> 11: bplapply(X = seq_along(ddd[[1L]]), wrap, .FUN = FUN, .ddd
>> = ddd,
>>     >>>> .MoreArgs = MoreArgs, BPREDO = BPREDO, BPPARAM = BPPARAM)
>>     >>>> 12: bplapply(X = seq_along(ddd[[1L]]), wrap, .FUN = FUN, .ddd
>> = ddd,
>>     >>>> .MoreArgs = MoreArgs, BPREDO = BPREDO, BPPARAM = BPPARAM)
>>     >>>> 13: bpmapply(.smoothFstatsFun, fstatsChunks, posChunks,
>> clusterChunks,
>>     >>>> weightChunks, MoreArgs = list(smoothFun = smoothFunction,
>>     >>>> ...), BPPARAM = BPPARAM)
>>     >>>> 14: bpmapply(.smoothFstatsFun, fstatsChunks, posChunks,
>> clusterChunks,
>>     >>>> weightChunks, MoreArgs = list(smoothFun = smoothFunction,
>>     >>>> ...), BPPARAM = BPPARAM)
>>     >>>> 15: .smootherFstats(fstats = fstats, position = position,
>> weights =
>>     >>>> weights,     smoothFunction = smoothFunction, ...)
>>     >>>> 16: findRegions(prep$position, genomeFstats, "chr21", verbose
>> = TRUE,
>>     >>>> smooth = TRUE, minNum = 1435)
>>     >>>>
>>     >>>> Possible actions:
>>     >>>> 1: abort (with core dump, if enabled)
>>     >>>> 2: normal R exit
>>     >>>> 3: exit R without saving workspace
>>     >>>> 4: exit R saving workspace
>>     >>>>
>>     >>>> The traceback information ends at's
>> bumphunter::loessByCluster().
>>     >>>>
>>     >>>>
>>     >>>> I have successfully used the following code other times (see
>> below)
>>     >>>> where I test the culprit line 100 times. By successfully, I
>> mean that
>>     >>>> the code ran without problems... so it was unsuccessful at
>> reproducing
>>     >>>> the problem.
>>     >>>>
>>     >>>> library('derfinder')
>>     >>>> prep <- preprocessCoverage(genomeData, cutoff=0, scalefac=32,
>>     >>>> chunksize=1e3,
>>     >>>> colsubset=NULL)
>>     >>>>
>>     >>>> for(i in 1:100) {
>>     >>>> print(i)
>>     >>>> regs_s3 <- findRegions(prep$position, genomeFstats, 'chr21',
>>     >>>> verbose=TRUE, smooth = TRUE, minNum = 1435)
>>     >>>> }
>>     >>>> options(width = 120)
>>     >>>> devtools::session_info()
>>     >>>>
>>     >>>>
>>     >>>> I had several R processes open the one time it did fail, but
>> well,
>>     >>>> I've had multiple of them open the times that the code didn't
>> fail. So
>>     >>>> having multiple R processes doesn't seem to be an issue.
>>     >>>>
>>     >>>> The line that triggers the segfault is used simply to test that
>>     >>>> passing the argument 'minNum' to bumphunter::loessByCluster()
>> via
>>     >>>> '...' works. It's not a relevant test for derfinder and I was
>> tempted
>>     >>>> to remove it, although before tracing the bug I talked with
>> Valerie
>>     >>>> about not removing it. With the upcoming Bioconductor release I
>>     >>>> decided to finally trace the line that triggers the segfault.
>> At this
>>     >>>> point I was feeling lost...
>>     >>>>
>>     >>>>
>>     >>>> Running the following code seems to trigger the segfault more
>> often (I
>>     >>>> got it like 4 times in a row):
>>     >>>>
>>     >>>> library('derfinder')
>>     >>>> prep <- preprocessCoverage(genomeData, cutoff=0, scalefac=32,
>>     >>>> chunksize=1e3,
>>     >>>> colsubset=NULL)
>>     >>>> regs_s1 <- findRegions(prep$position, genomeFstats, 'chr21',
>>     >>>> verbose=TRUE, smooth = TRUE)
>>     >>>> regs_s2 <- findRegions(prep$position, genomeFstats, 'chr21',
>>     >>>> verbose=TRUE, smooth = TRUE, smoothFunction =
>>     >>>> bumphunter::runmedByCluster)
>>     >>>> regs_s3 <- findRegions(prep$position, genomeFstats, 'chr21',
>>     >>>> verbose=TRUE, smooth = TRUE, minNum = 1435)
>>     >>>>
>>     >>>> But then I can still run the same code without problems on a
>> for loop
>>     >>>> for 100 times:
>>     >>>>
>>     >>>> library('derfinder')
>>     >>>> prep <- preprocessCoverage(genomeData, cutoff=0, scalefac=32,
>>     >>>> chunksize=1e3,
>>     >>>> colsubset=NULL)
>>     >>>>
>>     >>>> for(i in 1:100) {
>>     >>>> print(i)
>>     >>>> regs_s1 <- findRegions(prep$position, genomeFstats, 'chr21',
>>     >>>> verbose=TRUE, smooth = TRUE)
>>     >>>> regs_s2 <- findRegions(prep$position, genomeFstats, 'chr21',
>>     >>>> verbose=TRUE, smooth = TRUE, smoothFunction =
>>     >>>> bumphunter::runmedByCluster)
>>     >>>> regs_s3 <- findRegions(prep$position, genomeFstats, 'chr21',
>>     >>>> verbose=TRUE, smooth = TRUE, minNum = 1435)
>>     >>>> }
>>     >>>> options(width = 120)
>>     >>>> devtools::session_info()
>>     >>>>
>>     >>>>
>>     >>>>
>>     >>>>
>>     >>>> I next thought of going through findRegions() to produce simple
>>     >>>> objects that could reproduce the error. I had in mine sharing
>> these
>>     >>>> objects so it would be easier for others to help me figure
>> out what
>>     >>>> was failing. It turns out that this code segfaulted reliably
>> (all the
>>     >>>> times I tested it at least):
>>     >>>>
>>     >>>>
>>     >>>> library('derfinder')
>>     >>>> library('BiocParallel')
>>     >>>> library('IRanges')
>>     >>>> prep <- preprocessCoverage(genomeData, cutoff=0, scalefac=32,
>>     >>>> chunksize=1e3,
>>     >>>> colsubset=NULL)
>>     >>>> fstats <- genomeFstats
>>     >>>> position <- prep$position
>>     >>>> weights <- NULL
>>     >>>> cluster <- derfinder:::.clusterMakerRle(position, 300L)
>>     >>>> cluster
>>     >>>> BPPARAM <- SerialParam()
>>     >>>> iChunks <- rep(1, length(cluster))
>>     >>>>
>>     >>>> fstatsChunks <- split(fstats, iChunks)
>>     >>>> posChunks <- split(which(position), iChunks)
>>     >>>> clusterChunks <- split(cluster, iChunks)
>>     >>>> weightChunks <- vector('list', length = length(unique(iChunks)))
>>     >>>>
>>     >>>> res <- bpmapply(bumphunter::loessByCluster, fstatsChunks,
>> posChunks,
>>     >>>> clusterChunks, weightChunks, MoreArgs = list(minNum = 1435),
>>     >>>> BPPARAM = BPPARAM, SIMPLIFY = FALSE)
>>     >>>>
>>     >>>> y <- fstatsChunks[[1]]
>>     >>>> smoothed <- res[[1]]
>>     >>>>
>>     >>>> ## This segfaults:
>>     >>>> if(any(!smoothed$smoothed)) {
>>     >>>> smoothed$fitted[!smoothed$smoothed] <- y[!smoothed$smoothed]
>>     >>>> }
>>     >>>>
>>     >>>>
>>     >>>> The objects on the line that fail are a list and an Rle:
>>     >>>>
>>     >>>> y
>>     >>>>>
>>     >>>> numeric-Rle of length 1434 with 358 runs
>>     >>>> Lengths:                    1                    5
>>     >>>> ...                    1
>>     >>>> Values :       5.109484425367     3.85228949953674 ...
>>     >>>> 3.99765511645983
>>     >>>>
>>     >>>>> lapply(smoothed, head)
>>     >>>>>
>>     >>>> $fitted
>>     >>>> [,1]
>>     >>>> [1,]   NA
>>     >>>> [2,]   NA
>>     >>>> [3,]   NA
>>     >>>> [4,]   NA
>>     >>>> [5,]   NA
>>     >>>> [6,]   NA
>>     >>>>
>>     >>>> $smoothed
>>     >>>> [1] FALSE FALSE FALSE FALSE FALSE FALSE
>>     >>>>
>>     >>>> $smoother
>>     >>>> [1] "loess"
>>     >>>>
>>     >>>>> table(!smoothed$smoothed)
>>     >>>>>
>>     >>>>
>>     >>>> TRUE
>>     >>>> 1434
>>     >>>>
>>     >>>>> y[!smoothed$smoothed]
>>     >>>>>
>>     >>>> numeric-Rle of length 1434 with 358 runs
>>     >>>> Lengths:                    1                    5
>>     >>>> ...                    1
>>     >>>> Values :       5.109484425367     3.85228949953674 ...
>>     >>>> 3.99765511645983
>>     >>>>
>>     >>>> So in my derfinder code I was assigning an Rle to a matrix,
>> and that
>>     >>>> was the segfault. I have no idea why this doesn't always fail
>> on OSX
>>     >>>> and why it never failed on Linux or Windows.
>>     >>>>
>>     >>>>
>>     >>>> This is the super simplified IRanges code that fails:
>>     >>>>
>>     >>>> library('IRanges')
>>     >>>> y <- Rle(runif(10, 1, 1))
>>     >>>> smoothed <- list('fitted' = matrix(NA, ncol = 1, nrow = 10),
>>     >>>> 'smoothed' = rep(FALSE, 10), smoother = 'loess')
>>     >>>> sessionInfo()
>>     >>>> smoothed$fitted[!smoothed$smoothed] <- y[!smoothed$smoothed]
>>     >>>>
>>     >>>> ## Segfault on OSX
>>     >>>>
>>     >>>> library('IRanges')
>>     >>>>> y <- Rle(runif(10, 1, 1))
>>     >>>>> smoothed <- list('fitted' = matrix(NA, ncol = 1, nrow = 10),
>>     >>>>>
>>     >>>> +     'smoothed' = rep(FALSE, 10), smoother = 'loess')
>>     >>>>
>>     >>>>>
>>     >>>>> sessionInfo()
>>     >>>>>
>>     >>>> R Under development (unstable) (2016-10-26 r71594)
>>     >>>> Platform: x86_64-apple-darwin13.4.0 (64-bit)
>>     >>>> Running under: macOS Sierra 10.12.3
>>     >>>>
>>     >>>> locale:
>>     >>>> [1]
>> en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>>     >>>>
>>     >>>> attached base packages:
>>     >>>> [1] stats4    parallel  stats     graphics  grDevices utils
>>     >>>> datasets  methods   base
>>     >>>>
>>     >>>> other attached packages:
>>     >>>> [1] IRanges_2.9.19      S4Vectors_0.13.15   BiocGenerics_0.21.3
>>     >>>>
>>     >>>>> smoothed$fitted[!smoothed$smoothed] <- y[!smoothed$smoothed]
>>     >>>>>
>>     >>>>
>>     >>>> *** caught segfault ***
>>     >>>> address 0x7fcdc31dffe0, cause 'memory not mapped'
>>     >>>>
>>     >>>> Possible actions:
>>     >>>> 1: abort (with core dump, if enabled)
>>     >>>> 2: normal R exit
>>     >>>> 3: exit R without saving workspace
>>     >>>> 4: exit R saving workspace
>>     >>>>
>>     >>>>
>>     >>>> ## No problems on Linux
>>     >>>>
>>     >>>> library('IRanges')
>>     >>>>> y <- Rle(runif(10, 1, 1))
>>     >>>>> smoothed <- list('fitted' = matrix(NA, ncol = 1, nrow = 10),
>>     >>>>>
>>     >>>> +     'smoothed' = rep(FALSE, 10), smoother = 'loess')
>>     >>>>
>>     >>>>>
>>     >>>>> sessionInfo()
>>     >>>>>
>>     >>>> R version 3.3.1 Patched (2016-09-30 r71426)
>>     >>>> Platform: x86_64-pc-linux-gnu (64-bit)
>>     >>>> Running under: Red Hat Enterprise Linux Server release 6.6
>> (Santiago)
>>     >>>>
>>     >>>> locale:
>>     >>>> [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>>     >>>> [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>>     >>>> [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
>>     >>>> [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
>>     >>>> [9] LC_ADDRESS=C               LC_TELEPHONE=C
>>     >>>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>>     >>>>
>>     >>>> attached base packages:
>>     >>>> [1] stats4    parallel  stats     graphics  grDevices
>> datasets  utils
>>     >>>> [8] methods   base
>>     >>>>
>>     >>>> other attached packages:
>>     >>>> [1] IRanges_2.8.2       S4Vectors_0.12.2    BiocGenerics_0.20.0
>>     >>>> [4] colorout_1.1-2
>>     >>>>
>>     >>>> loaded via a namespace (and not attached):
>>     >>>> [1] tools_3.3.1
>>     >>>>
>>     >>>>> smoothed$fitted[!smoothed$smoothed] <- y[!smoothed$smoothed]
>>     >>>>>
>>     >>>>
>>     >>>>
>>     >>>> Best,
>>     >>>> Leo
>>     >>>>
>>     >>>>
>>     >>>>
>>     >>>> The session information for my first tests is below:
>>     >>>>
>>     >>>> devtools::session_info()
>>     >>>>>
>>     >>>> Session info
>>     >>>> ------------------------------------------------------------
>>     >>>> -----------------------------------------------
>>     >>>>
>>     >>>> setting  value
>>     >>>> version  R Under development (unstable) (2016-10-26 r71594)
>>     >>>> system   x86_64, darwin13.4.0
>>     >>>> ui       X11
>>     >>>> language (EN)
>>     >>>> collate  en_US.UTF-8
>>     >>>> tz       America/New_York
>>     >>>> date     2017-03-21
>>     >>>>
>>     >>>> Packages
>>     >>>> ------------------------------------------------------------
>>     >>>> ---------------------------------------------------
>>     >>>>
>>     >>>> package              * version  date       source
>>     >>>> acepack                1.4.1    2016-10-29 CRAN (R 3.4.0)
>>     >>>> AnnotationDbi          1.37.4   2017-03-10 Bioconductor
>>     >>>> assertthat             0.1      2013-12-06 CRAN (R 3.4.0)
>>     >>>> backports              1.0.5    2017-01-18 CRAN (R 3.4.0)
>>     >>>> base64enc              0.1-3    2015-07-28 CRAN (R 3.4.0)
>>     >>>> Biobase                2.35.1   2017-02-23 Bioconductor
>>     >>>> BiocGenerics         * 0.21.3 2017-01-12 Bioconductor
>>     >>>> BiocParallel           1.9.5    2017-01-24 Bioconductor
>>     >>>> biomaRt                2.31.4   2017-01-13 Bioconductor
>>     >>>> Biostrings             2.43.5   2017-03-19 cran (@2.43.5)
>>     >>>> bitops                 1.0-6 2013-08-17 CRAN (R 3.4.0)
>>     >>>> BSgenome               1.43.7   2017-02-24 Bioconductor
>>     >>>> bumphunter           * 1.15.0 2016-10-23 Bioconductor
>>     >>>> checkmate              1.8.2    2016-11-02 CRAN (R 3.4.0)
>>     >>>> cluster                2.0.6    2017-03-16 CRAN (R 3.4.0)
>>     >>>> codetools              0.2-15 2016-10-05 CRAN (R 3.4.0)
>>     >>>> colorout             * 1.1-2    2016-11-15 Github
>>     >>>> (jalvesaq/colorout at 6d84420)
>>     >>>> colorspace             1.3-2    2016-12-14 CRAN (R 3.4.0)
>>     >>>> crayon                 1.3.2    2016-06-28 CRAN (R 3.4.0)
>>     >>>> data.table             1.10.4   2017-02-01 CRAN (R 3.4.0)
>>     >>>> DBI                    0.6      2017-03-09 CRAN (R 3.4.0)
>>     >>>> DelayedArray           0.1.7 2017-02-17 Bioconductor
>>     >>>> derfinder            * 1.9.10   2017-03-17 cran (@1.9.10)
>>     >>>> derfinderHelper        1.9.4    2017-03-07 Bioconductor
>>     >>>> devtools               1.12.0 2016-12-05 CRAN (R 3.4.0)
>>     >>>> digest                 0.6.12 2017-01-27 CRAN (R 3.4.0)
>>     >>>> doRNG                  1.6      2014-03-07 CRAN (R 3.4.0)
>>     >>>> foreach              * 1.4.3    2015-10-13 CRAN (R 3.4.0)
>>     >>>> foreign                0.8-67 2016-09-13 CRAN (R 3.4.0)
>>     >>>> Formula                1.2-1    2015-04-07 CRAN (R 3.4.0)
>>     >>>> GenomeInfoDb         * 1.11.9   2017-02-08 Bioconductor
>>     >>>> GenomeInfoDbData       0.99.0 2017-02-14 Bioconductor
>>     >>>> GenomicAlignments      1.11.12  2017-03-16 cran (@1.11.12)
>>     >>>> GenomicFeatures        1.27.10  2017-03-16 cran (@1.27.10)
>>     >>>> GenomicFiles           1.11.4   2017-03-10 Bioconductor
>>     >>>> GenomicRanges        * 1.27.23  2017-02-25 Bioconductor
>>     >>>> ggplot2                2.2.1    2016-12-30 CRAN (R 3.4.0)
>>     >>>> gridExtra              2.2.1    2016-02-29 CRAN (R 3.4.0)
>>     >>>> gtable                 0.2.0 2016-02-26 CRAN (R 3.4.0)
>>     >>>> Hmisc                  4.0-2 2016-12-31 CRAN (R 3.4.0)
>>     >>>> htmlTable              1.9      2017-01-26 CRAN (R 3.4.0)
>>     >>>> htmltools              0.3.5 2016-03-21 CRAN (R 3.4.0)
>>     >>>> htmlwidgets            0.8      2016-11-09 CRAN (R 3.4.0)
>>     >>>> IRanges              * 2.9.19   2017-03-15 cran (@2.9.19)
>>     >>>> iterators            * 1.0.8    2015-10-13 CRAN (R 3.4.0)
>>     >>>> knitr                  1.15.1   2016-11-22 CRAN (R 3.4.0)
>>     >>>> lattice                0.20-34 2016-09-06 CRAN (R 3.4.0)
>>     >>>> latticeExtra           0.6-28 2016-02-09 CRAN (R 3.4.0)
>>     >>>> lazyeval               0.2.0 2016-06-12 CRAN (R 3.4.0)
>>     >>>> locfit               * 1.5-9.1  2013-04-20 CRAN (R 3.4.0)
>>     >>>> magrittr               1.5      2014-11-22 CRAN (R 3.4.0)
>>     >>>> Matrix                 1.2-8    2017-01-20 CRAN (R 3.4.0)
>>     >>>> matrixStats            0.51.0 2016-10-09 CRAN (R 3.4.0)
>>     >>>> memoise                1.0.0 2016-01-29 CRAN (R 3.4.0)
>>     >>>> munsell                0.4.3 2016-02-13 CRAN (R 3.4.0)
>>     >>>> nnet                   7.3-12   2016-02-02 CRAN (R 3.4.0)
>>     >>>> pkgmaker               0.22     2014-05-14 CRAN (R 3.4.0)
>>     >>>> plyr                   1.8.4    2016-06-08 CRAN (R 3.4.0)
>>     >>>> qvalue                 2.7.0 2016-10-23 Bioconductor
>>     >>>> R6                     2.2.0 2016-10-05 CRAN (R 3.4.0)
>>     >>>> RColorBrewer           1.1-2    2014-12-07 CRAN (R 3.4.0)
>>     >>>> Rcpp                   0.12.10  2017-03-19 CRAN (R 3.4.0)
>>     >>>> RCurl                  1.95-4.8 2016-03-01 CRAN (R 3.4.0)
>>     >>>> registry               0.3      2015-07-08 CRAN (R 3.4.0)
>>     >>>> reshape2               1.4.2    2016-10-22 CRAN (R 3.4.0)
>>     >>>> rngtools               1.2.4    2014-03-06 CRAN (R 3.4.0)
>>     >>>> rpart                  4.1-10   2015-06-29 CRAN (R 3.4.0)
>>     >>>> Rsamtools              1.27.13  2017-03-14 cran (@1.27.13)
>>     >>>> RSQLite                1.1-2    2017-01-08 CRAN (R 3.4.0)
>>     >>>> rtracklayer            1.35.9   2017-03-19 cran (@1.35.9)
>>     >>>> S4Vectors            * 0.13.15  2017-02-14 cran (@0.13.15)
>>     >>>> scales                 0.4.1 2016-11-09 CRAN (R 3.4.0)
>>     >>>> stringi                1.1.2    2016-10-01 CRAN (R 3.4.0)
>>     >>>> stringr                1.2.0 2017-02-18 CRAN (R 3.4.0)
>>     >>>> SummarizedExperiment   1.5.7    2017-02-23 Bioconductor
>>     >>>> survival               2.41-2   2017-03-16 CRAN (R 3.4.0)
>>     >>>> testthat             * 1.0.2    2016-04-23 CRAN (R 3.4.0)
>>     >>>> tibble                 1.2      2016-08-26 CRAN (R 3.4.0)
>>     >>>> VariantAnnotation      1.21.17  2017-02-12 Bioconductor
>>     >>>> withr                  1.0.2    2016-06-20 CRAN (R 3.4.0)
>>     >>>> XML                    3.98-1.5 2016-11-10 CRAN (R 3.4.0)
>>     >>>> xtable                 1.8-2    2016-02-05 CRAN (R 3.4.0)
>>     >>>> XVector                0.15.2 2017-02-02 Bioconductor
>>     >>>> zlibbioc               1.21.0 2016-10-23 Bioconductor
>>     >>>>
>>     >>>> _______________________________________________
>>     >>>> Bioc-devel at r-project.org mailing list
>>     >>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__stat.et
>>     >>>> hz.ch_mailman_listinfo_bioc-2Ddevel&d=DwIGaQ&c=eRAMFD45gAfqt
>>     >>>> 84VtBcfhQ&r=BK7q3XeAvimeWdGbWY_wJYbW0WYiZvSXAJJKaaPhzWA&m=
>>     >>>> Bw-1Kqy-M_t4kmpYWTpYkt5bvj_eTpxriUM3UvtOIzQ&s=hEBTd8bPfLVp6H
>>     >>>> oN3XSBk6ppmeRZhdLoB8VseYM_Byk&e=
>>     >>>>
>>     >>>>
>>     >>>>
>>     >>>
>>     >>
>>     >> This email message may contain legally privileged
>> and/or...{{dropped:2}}
>>     >>
>>     >>
>>     >> _______________________________________________
>>     >> Bioc-devel at r-project.org mailing list
>>     >>
>> https://urldefense.proofpoint.com/v2/url?u=https-3A__stat.ethz.ch_mailman_listinfo_bioc-2Ddevel&d=DwIGaQ&c=eRAMFD45gAfqt84VtBcfhQ&r=BK7q3XeAvimeWdGbWY_wJYbW0WYiZvSXAJJKaaPhzWA&m=ILfV0tHrE_BxAkWYlvUUwWcBdBdtVD7BlEljGiO3WbY&s=TAyV6oTRVnq_7U29cOp53zyNEu6sSL7iaaCRECw2YVs&e=
>>
>>     >>
>>
>>     > [[alternative HTML version deleted]]
>>
>>     > _______________________________________________
>>     > Bioc-devel at r-project.org mailing list
>>     >
>> https://urldefense.proofpoint.com/v2/url?u=https-3A__stat.ethz.ch_mailman_listinfo_bioc-2Ddevel&d=DwIGaQ&c=eRAMFD45gAfqt84VtBcfhQ&r=BK7q3XeAvimeWdGbWY_wJYbW0WYiZvSXAJJKaaPhzWA&m=ILfV0tHrE_BxAkWYlvUUwWcBdBdtVD7BlEljGiO3WbY&s=TAyV6oTRVnq_7U29cOp53zyNEu6sSL7iaaCRECw2YVs&e=
>>
>>
>> _______________________________________________
>> Bioc-devel at r-project.org mailing list
>> https://urldefense.proofpoint.com/v2/url?u=https-3A__stat.ethz.ch_mailman_listinfo_bioc-2Ddevel&d=DwIGaQ&c=eRAMFD45gAfqt84VtBcfhQ&r=BK7q3XeAvimeWdGbWY_wJYbW0WYiZvSXAJJKaaPhzWA&m=ILfV0tHrE_BxAkWYlvUUwWcBdBdtVD7BlEljGiO3WbY&s=TAyV6oTRVnq_7U29cOp53zyNEu6sSL7iaaCRECw2YVs&e=
>>
>>
>


This email message may contain legally privileged and/or...{{dropped:2}}



More information about the Bioc-devel mailing list