[Bioc-devel] [devteam-bioc] Error in findOverlaps

Hervé Pagès hpages at fhcrc.org
Fri Jul 26 08:48:50 CEST 2013



On 07/25/2013 11:32 PM, Maintainer wrote:
> Hi Jianhong,
>
> On 07/24/2013 08:18 AM, Ou, Jianhong wrote:
>> Dear Hervé,
>>
>> Is it possible to ignore minoverlap when people set maxgap greater than 0
>> and give a message for that?
>
> That would make sense to me but we're probably overlooking something
> because Michael has a unit test where 'minoverlap' and 'maxgap' are
> used together:
>
>     query <- IRanges(c(1, 5, 3, 4), width=c(2, 2, 4, 6))
>     subject <- IRanges(c(1, 3, 5, 6), width=c(4, 4, 5, 4))
>     tree <- IntervalTree(subject)
>     result <- findOverlaps(query, tree, type = "start", maxgap = 1L,
>                            minoverlap = 3L)
>
>     > result
>     Hits of length 3
>     queryLength: 4
>     subjectLength: 4
>       queryHits subjectHits
>        <integer>   <integer>
>      1         3           2
>      2         4           2
>      3         4           3
>
> This 'result' is what the unit test expects so we would need to
> understand what the meaning of using those 2 arguments together
> is (the man page is not really clear about it).

Checking again and paying more attention to the man page, 'maxgap'
has a special meaning when 'type' is not "any". If type="start" it
specifies the maximum difference in the starts.

So I guess the question that remains is what to do with 'minoverlap'
when type="any" and 'maxgap' is set to a value greater than 0. I'm
OK with your proposal of ignoring it with a warning (unless the
user also specifies 'minoverlap=0L', then no warning of course).

H.


>
> Michael?
>
> Thanks,
> H.
>
>>
>> Yours sincerely,
>>
>> Jianhong Ou
>>
>> LRB 670A
>> Program in Gene Function and Expression
>> 364 Plantation Street Worcester,
>> MA 01605
>>
>>
>>
>>
>> On 7/23/13 5:35 PM, "Hervé Pagès" <hpages at fhcrc.org> wrote:
>>
>>> Hi,
>>>
>>> This fails in Bioc-devel only and the following fix "seems" to address
>>> the problem:
>>>
>>>     hpages at thinkpad:~/svn/bioconductor/Rpacks/IRanges$ svn diff
>>> R/findOverlaps-methods.R
>>>     Index: R/findOverlaps-methods.R
>>>     ===================================================================
>>>     --- R/findOverlaps-methods.R	(revision 78801)
>>>     +++ R/findOverlaps-methods.R	(working copy)
>>>     @@ -142,7 +142,7 @@
>>>                  # preprocess query
>>>                  preprocRes <- .preProcess_findOverlaps_query(query,
>>> maxgap, minoverlap)
>>>                  origQuery <- preprocRes$origQuery
>>>     -            unsortedQuery <- preprocRes$origQuery
>>>     +            unsortedQuery <- preprocRes$unsortedQuery
>>>                  query <- preprocRes$query
>>>                  query_ord <- preprocRes$query_ord
>>>
>>> However a *real* fix would be to actually clarify what's the meaning of
>>> using 'maxgap' and 'minoverlap' together. Right now the result below
>>> doesn't make much sense to me (using Bioc-release):
>>>
>>>     > findOverlaps(IRanges(1, 5), IRanges(7, 10), maxgap=3, minoverlap=2)
>>>     Hits of length 0
>>>     queryLength: 1
>>>     subjectLength: 1
>>>
>>>     > findOverlaps(IRanges(1, 5), IRanges(7, 10), maxgap=4, minoverlap=2)
>>>     Hits of length 1
>>>     queryLength: 1
>>>     subjectLength: 1
>>>       queryHits subjectHits
>>>        <integer>   <integer>
>>>      1         1           1
>>>
>>> H.
>>>
>>>
>>> On 07/23/2013 12:08 PM, Maintainer wrote:
>>>> Hi,
>>>>
>>>> I got "negative widths are not allowed" error when I use findOverlaps.
>>>> Could anybody help me to figure this out? Thanks. Here is the code:
>>>>
>>>>    > library("IRanges")
>>>> Loading required package: BiocGenerics
>>>> Loading required package: parallel
>>>>
>>>> Attaching package: ŒBiocGenerics¹
>>>>
>>>> The following objects are masked from Œpackage:parallel¹:
>>>>
>>>>        clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
>>>>        clusterExport, clusterMap, parApply, parCapply, parLapply,
>>>>        parLapplyLB, parRapply, parSapply, parSapplyLB
>>>>
>>>> The following object is masked from Œpackage:stats¹:
>>>>
>>>>        xtabs
>>>>
>>>> The following objects are masked from Œpackage:base¹:
>>>>
>>>>        anyDuplicated, append, as.data.frame, as.vector, cbind, colnames,
>>>>        duplicated, eval, Filter, Find, get, intersect, lapply, Map,
>>>>        mapply, match, mget, order, paste, pmax, pmax.int, pmin, pmin.int,
>>>>        Position, rank, rbind, Reduce, rep.int, rownames, sapply, setdiff,
>>>>        sort, table, tapply, union, unique, unlist
>>>>
>>>>    > Ranges1 <- sort(IRanges(start=c(3123260, 167889600), end=c(3123360,
>>>> 167893599), names=c("Site5", "Site12")))
>>>>    > Ranges2 <- sort(IRanges(start=c(312326, 3123260, 167888600),
>>>> end=c(312586, 3123470, 167888999), names=c("t11", "t5", "t17")))
>>>>    > tree = IntervalTree(Ranges2)
>>>>    > findOverlaps(tree, query = Ranges1, maxgap = 1000, minoverlap = 20,
>>>> select = "first")
>>>> Error in .Call2("solve_user_SEW0", start, end, width, PACKAGE =
>>>> "IRanges") :
>>>>      solving row 2: negative widths are not allowed
>>>>    > traceback()
>>>> 11: .Call(.NAME, ..., PACKAGE = PACKAGE)
>>>> 10: .Call2("solve_user_SEW0", start, end, width, PACKAGE = "IRanges")
>>>> 9: solveUserSEW0(start = start, end = end, width = width)
>>>> 8: IRanges(pmax.int(qstart, sstart), pmin.int(send, qend))
>>>> 7: .local(x, ...)
>>>> 6: ranges(result, unsortedQuery, subject)
>>>> 5: ranges(result, unsortedQuery, subject)
>>>> 4: .postProcess_findOverlaps_result(result, unsortedQuery, origQuery,
>>>>           subject, type, minoverlap, maxgap, origSelect)
>>>> 3: .local(query, subject, maxgap, minoverlap, type, select, ...)
>>>> 2: findOverlaps(tree, query = Ranges1, maxgap = 1000, minoverlap = 20,
>>>>           select = "first")
>>>> 1: findOverlaps(tree, query = Ranges1, maxgap = 1000, minoverlap = 20,
>>>>           select = "first")
>>>>    > sessionInfo()
>>>> R Under development (unstable) (2013-04-30 r62697)
>>>> Platform: x86_64-apple-darwin12.3.0 (64-bit)
>>>>
>>>> locale:
>>>> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>>>>
>>>> attached base packages:
>>>> [1] parallel  stats     graphics  grDevices utils     datasets  methods
>>>> [8] base
>>>>
>>>> other attached packages:
>>>> [1] IRanges_1.19.19    BiocGenerics_0.7.3
>>>>
>>>> loaded via a namespace (and not attached):
>>>> [1] stats4_3.1.0
>>>>
>>>>
>>>> Yours sincerely,
>>>>
>>>> Jianhong Ou
>>>>
>>>> LRB 670A
>>>> Program in Gene Function and Expression
>>>> 364 Plantation Street Worcester,
>>>> MA 01605
>>>>
>>>>
>>>> ________________________________________________________________________
>>>> devteam-bioc mailing list
>>>> To unsubscribe from this mailing list send a blank email to
>>>> devteam-bioc-leave at lists.fhcrc.org
>>>> You can also unsubscribe or change your personal options at
>>>> https://lists.fhcrc.org/mailman/listinfo/devteam-bioc
>>>>
>>>
>>> --
>>> Hervé Pagès
>>>
>>> Program in Computational Biology
>>> Division of Public Health Sciences
>>> Fred Hutchinson Cancer Research Center
>>> 1100 Fairview Ave. N, M1-B514
>>> P.O. Box 19024
>>> Seattle, WA 98109-1024
>>>
>>> E-mail: hpages at fhcrc.org
>>> Phone:  (206) 667-5791
>>> Fax:    (206) 667-1319
>>
>

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319



More information about the Bioc-devel mailing list