[Bioc-devel] The story of tracing a derfinder bug on OSX that sometimes popped up, sometimes it didn't. Related to IRanges/S4Vectors '$<-'
Martin Morgan
martin.morgan at roswellpark.org
Wed Mar 22 11:36:19 CET 2017
On 03/22/2017 06:17 AM, Martin Maechler wrote:
>>>>>> Andrzej Oleś <andrzej.oles at gmail.com>
>>>>>> on Wed, 22 Mar 2017 10:29:57 +0100 writes:
>
> > Just for the record, on R-3.3.2 Herve's code fails with the following error:
> > Error in x[TRUE] <- new("A") :
> > incompatible types (from S4 to logical) in subassignment type fix
>
> yes, (of course).... and I would be interested in a small
> reproducible example which uses _valid_ code.
> We have seen such examples with something (more complicated
> than, but basically like)
>
> df <- data.frame(x=1:5, y=5:1, m=matrix(-pi*1:30, 5,6))
> M <- Matrix::Matrix(exp(0:3),2)
> df[1:2,1:2] <- M
>
> which actually calls `[<-`, and then `[<-.data.frame` and
> always works for me but does seg.fault (in the CRAN checks of
> package FastImputation (on 3 of the dozen platforms,
> https://cran.r-project.org/web/checks/check_results_FastImputation.html
>
> one of them is
>
> https://www.r-project.org/nosvn/R.check/r-devel-macos-x86_64-clang/FastImputation-00check.html
>
> I strongly suspect this is the same bug as yours, but for a case
> where the correct behavior is *not* giving an error.
>
> I have also written and shown Herve's example to the R-core team.
On Linux, if I run Leonardo's or Martin's example under gctorture(TRUE)
I see a PROTECT problem; I don't see this with Herve's code, and I don't
think his code goes through this execution path.
e.g.
> library(Matrix)
> gctorture(TRUE)
> df <- data.frame(x=1:5, y=5:1, m=matrix(-pi*1:30, 5,6))
> M <- Matrix::Matrix(exp(0:3),2)
> df[1:2,1:2] <- M
Error in `[<-.data.frame`(`*tmp*`, 1:2, 1:2, value = <S4 object of class
"dgeMatrix">) :
unimplemented type 'integer' in 'coerceToInteger'
This is addressed in r72383.
R-devel$ svn diff -c72383
Index: src/main/array.c
===================================================================
--- src/main/array.c (revision 72382)
+++ src/main/array.c (revision 72383)
@@ -436,8 +436,12 @@
DispatchOrEval(call, op, "length", args, rho, &ans, 0, 1)) {
if (length(ans) == 1 && TYPEOF(ans) == REALSXP) {
double d = REAL(ans)[0];
- if (R_FINITE(d) && d >= 0. && d <= INT_MAX && floor(d) == d)
- return coerceVector(ans, INTSXP);
+ if (R_FINITE(d) && d >= 0. && d <= INT_MAX && floor(d) == d) {
+ PROTECT(ans);
+ ans = coerceVector(ans, INTSXP);
+ UNPROTECT(1);
+ return(ans);
+ }
}
return(ans);
}
Martin Morgan
>
> Unfortunately, I have no platform where I can trigger the bug.
> Martin
>
>
>
> > Cheers,
> > Andrzej
>
>
>
> > On Wed, Mar 22, 2017 at 1:28 AM, Martin Morgan <
> > martin.morgan at roswellpark.org> wrote:
>
> >> On 03/21/2017 08:21 PM, Hervé Pagès wrote:
> >>
> >>> Hi Leonardo,
> >>>
> >>> Thanks for hunting down and isolating that bug! I tried to simplify
> >>> your code even more and was able to get a segfault with just:
> >>>
> >>> setClass("A", representation(stuff="numeric"))
> >>> x <- logical(10)
> >>> x[TRUE] <- new("A")
> >>>
> >>> I get the segfault about 50% of the time on a fresh R session on Mac.
> >>> I tried this with R 3.3.3 on Mavericks, and with R devel (r72372)
> >>> on El Capitan. I get the segfault on both.
> >>>
> >>> So it looks like a bug in the `[<-` primitive to me (subassignment).
> >>>
> >>
> >> Any insight from
> >>
> >> R -d valgrind -f herve.R
> >>
> >> where herve.R contains the code above?
> >>
> >> Martin
> >>
> >>
> >>
> >>> Cheers,
> >>> H.
> >>>
> >>> On 03/21/2017 03:06 PM, Leonardo Collado Torres wrote:
> >>>
> >>>> Hi bioc-devel,
> >>>>
> >>>> This is a story about a bug that took me a long time to trace. The
> >>>> behaviour was really weird, so I'm sharing the story in case this
> >>>> helps others in the future. I was originally writing it to request
> >>>> help, but then I was able to find the issue ^^. The story ends right
> >>>> now with code that will reproduce the problem with '$<-' from
> >>>> IRanges/S4Vectors.
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> During this Bioc cycle, frequently my package derfinder has failed to
> >>>> pass R CMD check in OSX. The error is always the same when it appears
> >>>> and sometimes it shows up in release, but not devel and viceversa.
> >>>> Right now (3/21/2017) it's visible in both
> >>>> https://urldefense.proofpoint.com/v2/url?u=http-3A__biocondu
> >>>> ctor.org_checkResults_release_bioc-2DLATEST_derfinder_
> >>>> morelia-2Dchecksrc.html&d=DwIGaQ&c=eRAMFD45gAfqt84VtBcfh
> >>>> Q&r=BK7q3XeAvimeWdGbWY_wJYbW0WYiZvSXAJJKaaPhzWA&m=Bw-1Kqy-M_
> >>>> t4kmpYWTpYkt5bvj_eTpxriUM3UvtOIzQ&s=RS-lsygPtDdgWKAhjA2BcSLk
> >>>> Vy9RxxshXWAJaBZa_Yc&e=
> >>>>
> >>>> and
> >>>> https://urldefense.proofpoint.com/v2/url?u=http-3A__biocondu
> >>>> ctor.org_checkResults_devel_bioc-2DLATEST_derfinder_toluca
> >>>> 2-2Dchecksrc.html&d=DwIGaQ&c=eRAMFD45gAfqt84VtBcfhQ&r=BK7q3X
> >>>> eAvimeWdGbWY_wJYbW0WYiZvSXAJJKaaPhzWA&m=Bw-1Kqy-M_
> >>>> t4kmpYWTpYkt5bvj_eTpxriUM3UvtOIzQ&s=a_K-yK7w2LEV72lpHrpp0UoK
> >>>> Rru_7Aad74T5Uk0R-Fo&e=
> >>>> .
> >>>> The end of "test-all.Rout.fail" looks like this:
> >>>>
> >>>> Loading required package: foreach
> >>>> Loading required package: iterators
> >>>> Loading required package: locfit
> >>>> locfit 1.5-9.1 2013-03-22
> >>>> getSegments: segmenting
> >>>> getSegments: splitting
> >>>> 2017-03-20 02:36:52 findRegions: smoothing
> >>>> 2017-03-20 02:36:52 findRegions: identifying potential segments
> >>>> 2017-03-20 02:36:52 findRegions: segmenting information
> >>>> 2017-03-20 02:36:52 .getSegmentsRle: segmenting with cutoff(s)
> >>>> 16.3681899295041
> >>>> 2017-03-20 02:36:52 findRegions: identifying candidate regions
> >>>> 2017-03-20 02:36:52 findRegions: identifying region clusters
> >>>> 2017-03-20 02:36:52 findRegions: smoothing
> >>>> 2017-03-20 02:36:52 findRegions: identifying potential segments
> >>>> 2017-03-20 02:36:52 findRegions: segmenting information
> >>>> 2017-03-20 02:36:52 .getSegmentsRle: segmenting with cutoff(s)
> >>>> 19.7936614060235
> >>>> 2017-03-20 02:36:52 findRegions: identifying candidate regions
> >>>> 2017-03-20 02:36:52 findRegions: identifying region clusters
> >>>> 2017-03-20 02:36:52 findRegions: smoothing
> >>>>
> >>>> *** caught segfault ***
> >>>> address 0x7f87d2f917e0, cause 'memory not mapped'
> >>>>
> >>>> Traceback:
> >>>> 1: (function (y, x, cluster, weights, smoothFun, ...) {
> >>>> hostPackage <- environmentName(environment(smoothFun))
> >>>> requireNamespace(hostPackage) smoothed <- .runFunFormal(smoothFun,
> >>>> y = y, x = x, cluster = cluster, weights = weights, ...) if
> >>>> (any(!smoothed$smoothed)) { smoothed$fitted[!smoothed$smoothed]
> >>>> <- y[!smoothed$smoothed] } res <- Rle(smoothed$fitted)
> >>>> return(res)})(dots[[1L]][[1L]], dots[[2L]][[1L]], dots[[3L]][[1L]],
> >>>> dots[[4L]][[1L]], smoothFun = function (y, x = NULL, cluster,
> >>>> weights = NULL, minNum = 7, bpSpan = 1000, minInSpan = 0,
> >>>> verbose = TRUE) { if (is.null(dim(y))) y <-
> >>>> matrix(y, ncol = 1) if (!is.null(weights) &&
> >>>> is.null(dim(weights))) weights <- matrix(weights, ncol =
> >>>> 1) if (is.null(x)) x <- seq(along = y) if
> >>>> (is.null(weights)) weights <- matrix(1, nrow = nrow(y),
> >>>> ncol = ncol(y)) Indexes <- split(seq(along = cluster), cluster)
> >>>> clusterL <- sapply(Indexes, length) smoothed <-
> >>>> rep(TRUE, nrow(y)) for (i in seq(along = Indexes)) {
> >>>> if (verbose) if (i%%10000 == 0)
> >>>> cat(".") Index <- Indexes[[i]] if (clusterL[i]
> >>>>
> >>>>> = minNum & sum(rowSums(is.na(y[Index, , drop =
> >>>>>
> >>>> FALSE])) == 0) >= minNum) { nn <-
> >>>> minInSpan/length(Index) for (j in 1:ncol(y)) {
> >>>> sdata <- data.frame(pos = x[Index], y = y[Index,
> >>>> j], weights = weights[Index, j]) fit <-
> >>>> locfit(y ˜ lp(pos, nn = nn, h = bpSpan), data =
> >>>> sdata, weights = weights, family = "gaussian",
> >>>> maxk = 10000) pp <- preplot(fit, where = "data", band
> >>>> = "local", newdata = data.frame(pos = x[Index]))
> >>>> y[Index, j] <- pp$trans(pp$fit) }
> >>>> } else { y[Index, ] <- NA
> >>>> smoothed[Index] <- FALSE } }
> >>>> return(list(fitted = y, smoothed = smoothed, smoother = "locfit"))
> >>>> }, verbose = TRUE, minNum = 1435)
> >>>> 2: .mapply(.FUN, dots, .MoreArgs)
> >>>> 3: FUN(...)
> >>>> 4: doTryCatch(return(expr), name, parentenv, handler)
> >>>> 5: tryCatchOne(expr, names, parentenv, handlers[[1L]])
> >>>> 6: tryCatchList(expr, classes, parentenv, handlers)
> >>>> 7: tryCatch({ FUN(...)}, error = handle_error)
> >>>> 8: withCallingHandlers({ tryCatch({ FUN(...) }, error =
> >>>> handle_error)}, warning = handle_warning)
> >>>> 9: FUN(X[[i]], ...)
> >>>> 10: lapply(X, FUN, ...)
> >>>> 11: bplapply(X = seq_along(ddd[[1L]]), wrap, .FUN = FUN, .ddd = ddd,
> >>>> .MoreArgs = MoreArgs, BPREDO = BPREDO, BPPARAM = BPPARAM)
> >>>> 12: bplapply(X = seq_along(ddd[[1L]]), wrap, .FUN = FUN, .ddd = ddd,
> >>>> .MoreArgs = MoreArgs, BPREDO = BPREDO, BPPARAM = BPPARAM)
> >>>> 13: bpmapply(.smoothFstatsFun, fstatsChunks, posChunks, clusterChunks,
> >>>> weightChunks, MoreArgs = list(smoothFun = smoothFunction,
> >>>> ...), BPPARAM = BPPARAM)
> >>>> 14: bpmapply(.smoothFstatsFun, fstatsChunks, posChunks, clusterChunks,
> >>>> weightChunks, MoreArgs = list(smoothFun = smoothFunction,
> >>>> ...), BPPARAM = BPPARAM)
> >>>> 15: .smootherFstats(fstats = fstats, position = position, weights =
> >>>> weights, smoothFunction = smoothFunction, ...)
> >>>> 16: findRegions(prep$position, genomeFstats, "chr21", verbose = TRUE,
> >>>> smooth = TRUE, minNum = 1435)
> >>>> 17: eval(exprs, env)
> >>>> 18: eval(exprs, env)
> >>>> 19: source_file(path, new.env(parent = env), chdir = TRUE)
> >>>> 20: force(code)
> >>>> 21: with_reporter(reporter = reporter, start_end_reporter =
> >>>> start_end_reporter, { lister$start_file(basename(path))
> >>>> source_file(path, new.env(parent = env), chdir = TRUE)
> >>>> end_context() })
> >>>> 22: FUN(X[[i]], ...)
> >>>> 23: lapply(paths, test_file, env = env, reporter = current_reporter,
> >>>> start_end_reporter = FALSE, load_helpers = FALSE)
> >>>> 24: force(code)
> >>>> 25: with_reporter(reporter = current_reporter, results <-
> >>>> lapply(paths, test_file, env = env, reporter = current_reporter,
> >>>> start_end_reporter = FALSE, load_helpers = FALSE))
> >>>> 26: test_files(paths, reporter = reporter, env = env, ...)
> >>>> 27: test_dir(test_path, reporter = reporter, env = env, filter =
> >>>> filter, ...)
> >>>> 28: with_top_env(env, { test_dir(test_path, reporter = reporter,
> >>>> env = env, filter = filter, ...)})
> >>>> 29: run_tests(package, test_path, filter, reporter, ...)
> >>>> 30: test_check("derfinder")
> >>>> An irrecoverable exception occurred. R is aborting now ...
> >>>>
> >>>> I was finally able to reproduce this error on my Mac OSX laptop after
> >>>> running R CMD build and R CMD check (same options as in Bioc) several
> >>>> times. It took me a while, but I figured out what's the exact code
> >>>> that's failing. It can be reproduced (noting that it won't always
> >>>> fail...) in OSX by running:
> >>>>
> >>>> library('derfinder')
> >>>> prep <- preprocessCoverage(genomeData, cutoff=0, scalefac=32,
> >>>> chunksize=1e3,
> >>>> colsubset=NULL)
> >>>> regs_s3 <- findRegions(prep$position, genomeFstats, 'chr21',
> >>>> verbose=TRUE, smooth = TRUE, minNum = 1435)
> >>>>
> >>>>
> >>>> Here is the output from my laptop one time it actually failed:
> >>>>
> >>>> library('derfinder')
> >>>>>
> >>>> prep <- preprocessCoverage(genomeData, cutoff=0, scalefac=32,
> >>>> chunksize=1e3,
> >>>> colsubset=NULL)
> >>>>
> >>>>> prep <- preprocessCoverage(genomeData, cutoff=0, scalefac=32,
> >>>>> chunksize=1e3,
> >>>>>
> >>>> + colsubset=NULL)
> >>>>
> >>>>> regs_s3 <- findRegions(prep$position, genomeFstats, 'chr21',
> >>>>> verbose=TRUE, smooth = TRUE, minNum = 1435)
> >>>>>
> >>>> 2017-03-21 16:37:39 findRegions: smoothing
> >>>>
> >>>> *** caught segfault ***
> >>>> address 0x7f958dbf2be0, cause 'memory not mapped'
> >>>>
> >>>> Traceback:
> >>>> 1: (function (y, x, cluster, weights, smoothFun, ...) {
> >>>> hostPackage <- environmentName(environment(smoothFun))
> >>>> requireNamespace(hostPackage) smoothed <- .runFunFormal(smoothFun,
> >>>> y = y, x = x, cluster = cluster, weights = weights, ...) if
> >>>> (any(!smoothed$smoothed)) { smoothed$fitted[!smoothed$smoothed]
> >>>> <- y[!smoothed$smoothed] } res <- Rle(smoothed$fitted)
> >>>> return(res)})(dots[[1L]][[1L]], dots[[2L]][[1L]], dots[[3L]][[1L]],
> >>>> dots[[4L]][[1L]], smoothFun = function (y, x = NULL, cluster,
> >>>> weights = NULL, minNum = 7, bpSpan = 1000, minInSpan = 0,
> >>>> verbose = TRUE) { if (is.null(dim(y))) y <-
> >>>> matrix(y, ncol = 1) if (!is.null(weights) &&
> >>>> is.null(dim(weights))) weights <- matrix(weights, ncol =
> >>>> 1) if (is.null(x)) x <- seq(along = y) if
> >>>> (is.null(weights)) weights <- matrix(1, nrow = nrow(y),
> >>>> ncol = ncol(y)) Indexes <- split(seq(along = cluster), cluster)
> >>>> clusterL <- sapply(Indexes, length) smoothed <-
> >>>> rep(TRUE, nrow(y)) for (i in seq(along = Indexes)) {
> >>>> if (verbose) if (i%%10000 == 0)
> >>>> cat(".") Index <- Indexes[[i]] if (clusterL[i]
> >>>>
> >>>>> = minNum & sum(rowSums(is.na(y[Index, , drop =
> >>>>>
> >>>> FALSE])) == 0) >= minNum) { nn <-
> >>>> minInSpan/length(Index) for (j in 1:ncol(y)) {
> >>>> sdata <- data.frame(pos = x[Index], y = y[Index,
> >>>> j], weights = weights[Index, j]) fit <-
> >>>> locfit(y ~ lp(pos, nn = nn, h = bpSpan), data =
> >>>> sdata, weights = weights, family = "gaussian",
> >>>> maxk = 10000) pp <- preplot(fit, where = "data", band
> >>>> = "local", newdata = data.frame(pos = x[Index]))
> >>>> y[Index, j] <- pp$trans(pp$fit) }
> >>>> } else { y[Index, ] <- NA
> >>>> smoothed[Index] <- FALSE } }
> >>>> return(list(fitted = y, smoothed = smoothed, smoother = "locfit"))
> >>>> }, verbose = TRUE, minNum = 1435)
> >>>> 2: .mapply(.FUN, dots, .MoreArgs)
> >>>> 3: FUN(...)
> >>>> 4: doTryCatch(return(expr), name, parentenv, handler)
> >>>> 5: tryCatchOne(expr, names, parentenv, handlers[[1L]])
> >>>> 6: tryCatchList(expr, classes, parentenv, handlers)
> >>>> 7: tryCatch({ FUN(...)}, error = handle_error)
> >>>> 8: withCallingHandlers({ tryCatch({ FUN(...) }, error =
> >>>> handle_error)}, warning = handle_warning)
> >>>> 9: FUN(X[[i]], ...)
> >>>> 10: lapply(X, FUN, ...)
> >>>> 11: bplapply(X = seq_along(ddd[[1L]]), wrap, .FUN = FUN, .ddd = ddd,
> >>>> .MoreArgs = MoreArgs, BPREDO = BPREDO, BPPARAM = BPPARAM)
> >>>> 12: bplapply(X = seq_along(ddd[[1L]]), wrap, .FUN = FUN, .ddd = ddd,
> >>>> .MoreArgs = MoreArgs, BPREDO = BPREDO, BPPARAM = BPPARAM)
> >>>> 13: bpmapply(.smoothFstatsFun, fstatsChunks, posChunks, clusterChunks,
> >>>> weightChunks, MoreArgs = list(smoothFun = smoothFunction,
> >>>> ...), BPPARAM = BPPARAM)
> >>>> 14: bpmapply(.smoothFstatsFun, fstatsChunks, posChunks, clusterChunks,
> >>>> weightChunks, MoreArgs = list(smoothFun = smoothFunction,
> >>>> ...), BPPARAM = BPPARAM)
> >>>> 15: .smootherFstats(fstats = fstats, position = position, weights =
> >>>> weights, smoothFunction = smoothFunction, ...)
> >>>> 16: findRegions(prep$position, genomeFstats, "chr21", verbose = TRUE,
> >>>> smooth = TRUE, minNum = 1435)
> >>>>
> >>>> Possible actions:
> >>>> 1: abort (with core dump, if enabled)
> >>>> 2: normal R exit
> >>>> 3: exit R without saving workspace
> >>>> 4: exit R saving workspace
> >>>>
> >>>> The traceback information ends at's bumphunter::loessByCluster().
> >>>>
> >>>>
> >>>> I have successfully used the following code other times (see below)
> >>>> where I test the culprit line 100 times. By successfully, I mean that
> >>>> the code ran without problems... so it was unsuccessful at reproducing
> >>>> the problem.
> >>>>
> >>>> library('derfinder')
> >>>> prep <- preprocessCoverage(genomeData, cutoff=0, scalefac=32,
> >>>> chunksize=1e3,
> >>>> colsubset=NULL)
> >>>>
> >>>> for(i in 1:100) {
> >>>> print(i)
> >>>> regs_s3 <- findRegions(prep$position, genomeFstats, 'chr21',
> >>>> verbose=TRUE, smooth = TRUE, minNum = 1435)
> >>>> }
> >>>> options(width = 120)
> >>>> devtools::session_info()
> >>>>
> >>>>
> >>>> I had several R processes open the one time it did fail, but well,
> >>>> I've had multiple of them open the times that the code didn't fail. So
> >>>> having multiple R processes doesn't seem to be an issue.
> >>>>
> >>>> The line that triggers the segfault is used simply to test that
> >>>> passing the argument 'minNum' to bumphunter::loessByCluster() via
> >>>> '...' works. It's not a relevant test for derfinder and I was tempted
> >>>> to remove it, although before tracing the bug I talked with Valerie
> >>>> about not removing it. With the upcoming Bioconductor release I
> >>>> decided to finally trace the line that triggers the segfault. At this
> >>>> point I was feeling lost...
> >>>>
> >>>>
> >>>> Running the following code seems to trigger the segfault more often (I
> >>>> got it like 4 times in a row):
> >>>>
> >>>> library('derfinder')
> >>>> prep <- preprocessCoverage(genomeData, cutoff=0, scalefac=32,
> >>>> chunksize=1e3,
> >>>> colsubset=NULL)
> >>>> regs_s1 <- findRegions(prep$position, genomeFstats, 'chr21',
> >>>> verbose=TRUE, smooth = TRUE)
> >>>> regs_s2 <- findRegions(prep$position, genomeFstats, 'chr21',
> >>>> verbose=TRUE, smooth = TRUE, smoothFunction =
> >>>> bumphunter::runmedByCluster)
> >>>> regs_s3 <- findRegions(prep$position, genomeFstats, 'chr21',
> >>>> verbose=TRUE, smooth = TRUE, minNum = 1435)
> >>>>
> >>>> But then I can still run the same code without problems on a for loop
> >>>> for 100 times:
> >>>>
> >>>> library('derfinder')
> >>>> prep <- preprocessCoverage(genomeData, cutoff=0, scalefac=32,
> >>>> chunksize=1e3,
> >>>> colsubset=NULL)
> >>>>
> >>>> for(i in 1:100) {
> >>>> print(i)
> >>>> regs_s1 <- findRegions(prep$position, genomeFstats, 'chr21',
> >>>> verbose=TRUE, smooth = TRUE)
> >>>> regs_s2 <- findRegions(prep$position, genomeFstats, 'chr21',
> >>>> verbose=TRUE, smooth = TRUE, smoothFunction =
> >>>> bumphunter::runmedByCluster)
> >>>> regs_s3 <- findRegions(prep$position, genomeFstats, 'chr21',
> >>>> verbose=TRUE, smooth = TRUE, minNum = 1435)
> >>>> }
> >>>> options(width = 120)
> >>>> devtools::session_info()
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> I next thought of going through findRegions() to produce simple
> >>>> objects that could reproduce the error. I had in mine sharing these
> >>>> objects so it would be easier for others to help me figure out what
> >>>> was failing. It turns out that this code segfaulted reliably (all the
> >>>> times I tested it at least):
> >>>>
> >>>>
> >>>> library('derfinder')
> >>>> library('BiocParallel')
> >>>> library('IRanges')
> >>>> prep <- preprocessCoverage(genomeData, cutoff=0, scalefac=32,
> >>>> chunksize=1e3,
> >>>> colsubset=NULL)
> >>>> fstats <- genomeFstats
> >>>> position <- prep$position
> >>>> weights <- NULL
> >>>> cluster <- derfinder:::.clusterMakerRle(position, 300L)
> >>>> cluster
> >>>> BPPARAM <- SerialParam()
> >>>> iChunks <- rep(1, length(cluster))
> >>>>
> >>>> fstatsChunks <- split(fstats, iChunks)
> >>>> posChunks <- split(which(position), iChunks)
> >>>> clusterChunks <- split(cluster, iChunks)
> >>>> weightChunks <- vector('list', length = length(unique(iChunks)))
> >>>>
> >>>> res <- bpmapply(bumphunter::loessByCluster, fstatsChunks, posChunks,
> >>>> clusterChunks, weightChunks, MoreArgs = list(minNum = 1435),
> >>>> BPPARAM = BPPARAM, SIMPLIFY = FALSE)
> >>>>
> >>>> y <- fstatsChunks[[1]]
> >>>> smoothed <- res[[1]]
> >>>>
> >>>> ## This segfaults:
> >>>> if(any(!smoothed$smoothed)) {
> >>>> smoothed$fitted[!smoothed$smoothed] <- y[!smoothed$smoothed]
> >>>> }
> >>>>
> >>>>
> >>>> The objects on the line that fail are a list and an Rle:
> >>>>
> >>>> y
> >>>>>
> >>>> numeric-Rle of length 1434 with 358 runs
> >>>> Lengths: 1 5
> >>>> ... 1
> >>>> Values : 5.109484425367 3.85228949953674 ...
> >>>> 3.99765511645983
> >>>>
> >>>>> lapply(smoothed, head)
> >>>>>
> >>>> $fitted
> >>>> [,1]
> >>>> [1,] NA
> >>>> [2,] NA
> >>>> [3,] NA
> >>>> [4,] NA
> >>>> [5,] NA
> >>>> [6,] NA
> >>>>
> >>>> $smoothed
> >>>> [1] FALSE FALSE FALSE FALSE FALSE FALSE
> >>>>
> >>>> $smoother
> >>>> [1] "loess"
> >>>>
> >>>>> table(!smoothed$smoothed)
> >>>>>
> >>>>
> >>>> TRUE
> >>>> 1434
> >>>>
> >>>>> y[!smoothed$smoothed]
> >>>>>
> >>>> numeric-Rle of length 1434 with 358 runs
> >>>> Lengths: 1 5
> >>>> ... 1
> >>>> Values : 5.109484425367 3.85228949953674 ...
> >>>> 3.99765511645983
> >>>>
> >>>> So in my derfinder code I was assigning an Rle to a matrix, and that
> >>>> was the segfault. I have no idea why this doesn't always fail on OSX
> >>>> and why it never failed on Linux or Windows.
> >>>>
> >>>>
> >>>> This is the super simplified IRanges code that fails:
> >>>>
> >>>> library('IRanges')
> >>>> y <- Rle(runif(10, 1, 1))
> >>>> smoothed <- list('fitted' = matrix(NA, ncol = 1, nrow = 10),
> >>>> 'smoothed' = rep(FALSE, 10), smoother = 'loess')
> >>>> sessionInfo()
> >>>> smoothed$fitted[!smoothed$smoothed] <- y[!smoothed$smoothed]
> >>>>
> >>>> ## Segfault on OSX
> >>>>
> >>>> library('IRanges')
> >>>>> y <- Rle(runif(10, 1, 1))
> >>>>> smoothed <- list('fitted' = matrix(NA, ncol = 1, nrow = 10),
> >>>>>
> >>>> + 'smoothed' = rep(FALSE, 10), smoother = 'loess')
> >>>>
> >>>>>
> >>>>> sessionInfo()
> >>>>>
> >>>> R Under development (unstable) (2016-10-26 r71594)
> >>>> Platform: x86_64-apple-darwin13.4.0 (64-bit)
> >>>> Running under: macOS Sierra 10.12.3
> >>>>
> >>>> locale:
> >>>> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
> >>>>
> >>>> attached base packages:
> >>>> [1] stats4 parallel stats graphics grDevices utils
> >>>> datasets methods base
> >>>>
> >>>> other attached packages:
> >>>> [1] IRanges_2.9.19 S4Vectors_0.13.15 BiocGenerics_0.21.3
> >>>>
> >>>>> smoothed$fitted[!smoothed$smoothed] <- y[!smoothed$smoothed]
> >>>>>
> >>>>
> >>>> *** caught segfault ***
> >>>> address 0x7fcdc31dffe0, cause 'memory not mapped'
> >>>>
> >>>> Possible actions:
> >>>> 1: abort (with core dump, if enabled)
> >>>> 2: normal R exit
> >>>> 3: exit R without saving workspace
> >>>> 4: exit R saving workspace
> >>>>
> >>>>
> >>>> ## No problems on Linux
> >>>>
> >>>> library('IRanges')
> >>>>> y <- Rle(runif(10, 1, 1))
> >>>>> smoothed <- list('fitted' = matrix(NA, ncol = 1, nrow = 10),
> >>>>>
> >>>> + 'smoothed' = rep(FALSE, 10), smoother = 'loess')
> >>>>
> >>>>>
> >>>>> sessionInfo()
> >>>>>
> >>>> R version 3.3.1 Patched (2016-09-30 r71426)
> >>>> Platform: x86_64-pc-linux-gnu (64-bit)
> >>>> Running under: Red Hat Enterprise Linux Server release 6.6 (Santiago)
> >>>>
> >>>> locale:
> >>>> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
> >>>> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
> >>>> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
> >>>> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
> >>>> [9] LC_ADDRESS=C LC_TELEPHONE=C
> >>>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
> >>>>
> >>>> attached base packages:
> >>>> [1] stats4 parallel stats graphics grDevices datasets utils
> >>>> [8] methods base
> >>>>
> >>>> other attached packages:
> >>>> [1] IRanges_2.8.2 S4Vectors_0.12.2 BiocGenerics_0.20.0
> >>>> [4] colorout_1.1-2
> >>>>
> >>>> loaded via a namespace (and not attached):
> >>>> [1] tools_3.3.1
> >>>>
> >>>>> smoothed$fitted[!smoothed$smoothed] <- y[!smoothed$smoothed]
> >>>>>
> >>>>
> >>>>
> >>>> Best,
> >>>> Leo
> >>>>
> >>>>
> >>>>
> >>>> The session information for my first tests is below:
> >>>>
> >>>> devtools::session_info()
> >>>>>
> >>>> Session info
> >>>> ------------------------------------------------------------
> >>>> -----------------------------------------------
> >>>>
> >>>> setting value
> >>>> version R Under development (unstable) (2016-10-26 r71594)
> >>>> system x86_64, darwin13.4.0
> >>>> ui X11
> >>>> language (EN)
> >>>> collate en_US.UTF-8
> >>>> tz America/New_York
> >>>> date 2017-03-21
> >>>>
> >>>> Packages
> >>>> ------------------------------------------------------------
> >>>> ---------------------------------------------------
> >>>>
> >>>> package * version date source
> >>>> acepack 1.4.1 2016-10-29 CRAN (R 3.4.0)
> >>>> AnnotationDbi 1.37.4 2017-03-10 Bioconductor
> >>>> assertthat 0.1 2013-12-06 CRAN (R 3.4.0)
> >>>> backports 1.0.5 2017-01-18 CRAN (R 3.4.0)
> >>>> base64enc 0.1-3 2015-07-28 CRAN (R 3.4.0)
> >>>> Biobase 2.35.1 2017-02-23 Bioconductor
> >>>> BiocGenerics * 0.21.3 2017-01-12 Bioconductor
> >>>> BiocParallel 1.9.5 2017-01-24 Bioconductor
> >>>> biomaRt 2.31.4 2017-01-13 Bioconductor
> >>>> Biostrings 2.43.5 2017-03-19 cran (@2.43.5)
> >>>> bitops 1.0-6 2013-08-17 CRAN (R 3.4.0)
> >>>> BSgenome 1.43.7 2017-02-24 Bioconductor
> >>>> bumphunter * 1.15.0 2016-10-23 Bioconductor
> >>>> checkmate 1.8.2 2016-11-02 CRAN (R 3.4.0)
> >>>> cluster 2.0.6 2017-03-16 CRAN (R 3.4.0)
> >>>> codetools 0.2-15 2016-10-05 CRAN (R 3.4.0)
> >>>> colorout * 1.1-2 2016-11-15 Github
> >>>> (jalvesaq/colorout at 6d84420)
> >>>> colorspace 1.3-2 2016-12-14 CRAN (R 3.4.0)
> >>>> crayon 1.3.2 2016-06-28 CRAN (R 3.4.0)
> >>>> data.table 1.10.4 2017-02-01 CRAN (R 3.4.0)
> >>>> DBI 0.6 2017-03-09 CRAN (R 3.4.0)
> >>>> DelayedArray 0.1.7 2017-02-17 Bioconductor
> >>>> derfinder * 1.9.10 2017-03-17 cran (@1.9.10)
> >>>> derfinderHelper 1.9.4 2017-03-07 Bioconductor
> >>>> devtools 1.12.0 2016-12-05 CRAN (R 3.4.0)
> >>>> digest 0.6.12 2017-01-27 CRAN (R 3.4.0)
> >>>> doRNG 1.6 2014-03-07 CRAN (R 3.4.0)
> >>>> foreach * 1.4.3 2015-10-13 CRAN (R 3.4.0)
> >>>> foreign 0.8-67 2016-09-13 CRAN (R 3.4.0)
> >>>> Formula 1.2-1 2015-04-07 CRAN (R 3.4.0)
> >>>> GenomeInfoDb * 1.11.9 2017-02-08 Bioconductor
> >>>> GenomeInfoDbData 0.99.0 2017-02-14 Bioconductor
> >>>> GenomicAlignments 1.11.12 2017-03-16 cran (@1.11.12)
> >>>> GenomicFeatures 1.27.10 2017-03-16 cran (@1.27.10)
> >>>> GenomicFiles 1.11.4 2017-03-10 Bioconductor
> >>>> GenomicRanges * 1.27.23 2017-02-25 Bioconductor
> >>>> ggplot2 2.2.1 2016-12-30 CRAN (R 3.4.0)
> >>>> gridExtra 2.2.1 2016-02-29 CRAN (R 3.4.0)
> >>>> gtable 0.2.0 2016-02-26 CRAN (R 3.4.0)
> >>>> Hmisc 4.0-2 2016-12-31 CRAN (R 3.4.0)
> >>>> htmlTable 1.9 2017-01-26 CRAN (R 3.4.0)
> >>>> htmltools 0.3.5 2016-03-21 CRAN (R 3.4.0)
> >>>> htmlwidgets 0.8 2016-11-09 CRAN (R 3.4.0)
> >>>> IRanges * 2.9.19 2017-03-15 cran (@2.9.19)
> >>>> iterators * 1.0.8 2015-10-13 CRAN (R 3.4.0)
> >>>> knitr 1.15.1 2016-11-22 CRAN (R 3.4.0)
> >>>> lattice 0.20-34 2016-09-06 CRAN (R 3.4.0)
> >>>> latticeExtra 0.6-28 2016-02-09 CRAN (R 3.4.0)
> >>>> lazyeval 0.2.0 2016-06-12 CRAN (R 3.4.0)
> >>>> locfit * 1.5-9.1 2013-04-20 CRAN (R 3.4.0)
> >>>> magrittr 1.5 2014-11-22 CRAN (R 3.4.0)
> >>>> Matrix 1.2-8 2017-01-20 CRAN (R 3.4.0)
> >>>> matrixStats 0.51.0 2016-10-09 CRAN (R 3.4.0)
> >>>> memoise 1.0.0 2016-01-29 CRAN (R 3.4.0)
> >>>> munsell 0.4.3 2016-02-13 CRAN (R 3.4.0)
> >>>> nnet 7.3-12 2016-02-02 CRAN (R 3.4.0)
> >>>> pkgmaker 0.22 2014-05-14 CRAN (R 3.4.0)
> >>>> plyr 1.8.4 2016-06-08 CRAN (R 3.4.0)
> >>>> qvalue 2.7.0 2016-10-23 Bioconductor
> >>>> R6 2.2.0 2016-10-05 CRAN (R 3.4.0)
> >>>> RColorBrewer 1.1-2 2014-12-07 CRAN (R 3.4.0)
> >>>> Rcpp 0.12.10 2017-03-19 CRAN (R 3.4.0)
> >>>> RCurl 1.95-4.8 2016-03-01 CRAN (R 3.4.0)
> >>>> registry 0.3 2015-07-08 CRAN (R 3.4.0)
> >>>> reshape2 1.4.2 2016-10-22 CRAN (R 3.4.0)
> >>>> rngtools 1.2.4 2014-03-06 CRAN (R 3.4.0)
> >>>> rpart 4.1-10 2015-06-29 CRAN (R 3.4.0)
> >>>> Rsamtools 1.27.13 2017-03-14 cran (@1.27.13)
> >>>> RSQLite 1.1-2 2017-01-08 CRAN (R 3.4.0)
> >>>> rtracklayer 1.35.9 2017-03-19 cran (@1.35.9)
> >>>> S4Vectors * 0.13.15 2017-02-14 cran (@0.13.15)
> >>>> scales 0.4.1 2016-11-09 CRAN (R 3.4.0)
> >>>> stringi 1.1.2 2016-10-01 CRAN (R 3.4.0)
> >>>> stringr 1.2.0 2017-02-18 CRAN (R 3.4.0)
> >>>> SummarizedExperiment 1.5.7 2017-02-23 Bioconductor
> >>>> survival 2.41-2 2017-03-16 CRAN (R 3.4.0)
> >>>> testthat * 1.0.2 2016-04-23 CRAN (R 3.4.0)
> >>>> tibble 1.2 2016-08-26 CRAN (R 3.4.0)
> >>>> VariantAnnotation 1.21.17 2017-02-12 Bioconductor
> >>>> withr 1.0.2 2016-06-20 CRAN (R 3.4.0)
> >>>> XML 3.98-1.5 2016-11-10 CRAN (R 3.4.0)
> >>>> xtable 1.8-2 2016-02-05 CRAN (R 3.4.0)
> >>>> XVector 0.15.2 2017-02-02 Bioconductor
> >>>> zlibbioc 1.21.0 2016-10-23 Bioconductor
> >>>>
> >>>> _______________________________________________
> >>>> Bioc-devel at r-project.org mailing list
> >>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__stat.et
> >>>> hz.ch_mailman_listinfo_bioc-2Ddevel&d=DwIGaQ&c=eRAMFD45gAfqt
> >>>> 84VtBcfhQ&r=BK7q3XeAvimeWdGbWY_wJYbW0WYiZvSXAJJKaaPhzWA&m=
> >>>> Bw-1Kqy-M_t4kmpYWTpYkt5bvj_eTpxriUM3UvtOIzQ&s=hEBTd8bPfLVp6H
> >>>> oN3XSBk6ppmeRZhdLoB8VseYM_Byk&e=
> >>>>
> >>>>
> >>>>
> >>>
> >>
> >> This email message may contain legally privileged and/or...{{dropped:2}}
> >>
> >>
> >> _______________________________________________
> >> Bioc-devel at r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/bioc-devel
> >>
>
> > [[alternative HTML version deleted]]
>
> > _______________________________________________
> > Bioc-devel at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/bioc-devel
>
This email message may contain legally privileged and/or...{{dropped:2}}
More information about the Bioc-devel
mailing list