[Bioc-devel] [devteam-bioc] Error in findOverlaps

Hervé Pagès hpages at fhcrc.org
Fri Jul 26 08:32:44 CEST 2013


Hi Jianhong,

On 07/24/2013 08:18 AM, Ou, Jianhong wrote:
> Dear Hervé,
>
> Is it possible to ignore minoverlap when people set maxgap greater than 0
> and give a message for that?

That would make sense to me but we're probably overlooking something
because Michael has a unit test where 'minoverlap' and 'maxgap' are
used together:

   query <- IRanges(c(1, 5, 3, 4), width=c(2, 2, 4, 6))
   subject <- IRanges(c(1, 3, 5, 6), width=c(4, 4, 5, 4))
   tree <- IntervalTree(subject)
   result <- findOverlaps(query, tree, type = "start", maxgap = 1L,
                          minoverlap = 3L)

   > result
   Hits of length 3
   queryLength: 4
   subjectLength: 4
     queryHits subjectHits
      <integer>   <integer>
    1         3           2
    2         4           2
    3         4           3

This 'result' is what the unit test expects so we would need to
understand what the meaning of using those 2 arguments together
is (the man page is not really clear about it).

Michael?

Thanks,
H.

>
> Yours sincerely,
>
> Jianhong Ou
>
> LRB 670A
> Program in Gene Function and Expression
> 364 Plantation Street Worcester,
> MA 01605
>
>
>
>
> On 7/23/13 5:35 PM, "Hervé Pagès" <hpages at fhcrc.org> wrote:
>
>> Hi,
>>
>> This fails in Bioc-devel only and the following fix "seems" to address
>> the problem:
>>
>>    hpages at thinkpad:~/svn/bioconductor/Rpacks/IRanges$ svn diff
>> R/findOverlaps-methods.R
>>    Index: R/findOverlaps-methods.R
>>    ===================================================================
>>    --- R/findOverlaps-methods.R	(revision 78801)
>>    +++ R/findOverlaps-methods.R	(working copy)
>>    @@ -142,7 +142,7 @@
>>                 # preprocess query
>>                 preprocRes <- .preProcess_findOverlaps_query(query,
>> maxgap, minoverlap)
>>                 origQuery <- preprocRes$origQuery
>>    -            unsortedQuery <- preprocRes$origQuery
>>    +            unsortedQuery <- preprocRes$unsortedQuery
>>                 query <- preprocRes$query
>>                 query_ord <- preprocRes$query_ord
>>
>> However a *real* fix would be to actually clarify what's the meaning of
>> using 'maxgap' and 'minoverlap' together. Right now the result below
>> doesn't make much sense to me (using Bioc-release):
>>
>>    > findOverlaps(IRanges(1, 5), IRanges(7, 10), maxgap=3, minoverlap=2)
>>    Hits of length 0
>>    queryLength: 1
>>    subjectLength: 1
>>
>>    > findOverlaps(IRanges(1, 5), IRanges(7, 10), maxgap=4, minoverlap=2)
>>    Hits of length 1
>>    queryLength: 1
>>    subjectLength: 1
>>      queryHits subjectHits
>>       <integer>   <integer>
>>     1         1           1
>>
>> H.
>>
>>
>> On 07/23/2013 12:08 PM, Maintainer wrote:
>>> Hi,
>>>
>>> I got "negative widths are not allowed" error when I use findOverlaps.
>>> Could anybody help me to figure this out? Thanks. Here is the code:
>>>
>>>   > library("IRanges")
>>> Loading required package: BiocGenerics
>>> Loading required package: parallel
>>>
>>> Attaching package: ŒBiocGenerics¹
>>>
>>> The following objects are masked from Œpackage:parallel¹:
>>>
>>>       clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
>>>       clusterExport, clusterMap, parApply, parCapply, parLapply,
>>>       parLapplyLB, parRapply, parSapply, parSapplyLB
>>>
>>> The following object is masked from Œpackage:stats¹:
>>>
>>>       xtabs
>>>
>>> The following objects are masked from Œpackage:base¹:
>>>
>>>       anyDuplicated, append, as.data.frame, as.vector, cbind, colnames,
>>>       duplicated, eval, Filter, Find, get, intersect, lapply, Map,
>>>       mapply, match, mget, order, paste, pmax, pmax.int, pmin, pmin.int,
>>>       Position, rank, rbind, Reduce, rep.int, rownames, sapply, setdiff,
>>>       sort, table, tapply, union, unique, unlist
>>>
>>>   > Ranges1 <- sort(IRanges(start=c(3123260, 167889600), end=c(3123360,
>>> 167893599), names=c("Site5", "Site12")))
>>>   > Ranges2 <- sort(IRanges(start=c(312326, 3123260, 167888600),
>>> end=c(312586, 3123470, 167888999), names=c("t11", "t5", "t17")))
>>>   > tree = IntervalTree(Ranges2)
>>>   > findOverlaps(tree, query = Ranges1, maxgap = 1000, minoverlap = 20,
>>> select = "first")
>>> Error in .Call2("solve_user_SEW0", start, end, width, PACKAGE =
>>> "IRanges") :
>>>     solving row 2: negative widths are not allowed
>>>   > traceback()
>>> 11: .Call(.NAME, ..., PACKAGE = PACKAGE)
>>> 10: .Call2("solve_user_SEW0", start, end, width, PACKAGE = "IRanges")
>>> 9: solveUserSEW0(start = start, end = end, width = width)
>>> 8: IRanges(pmax.int(qstart, sstart), pmin.int(send, qend))
>>> 7: .local(x, ...)
>>> 6: ranges(result, unsortedQuery, subject)
>>> 5: ranges(result, unsortedQuery, subject)
>>> 4: .postProcess_findOverlaps_result(result, unsortedQuery, origQuery,
>>>          subject, type, minoverlap, maxgap, origSelect)
>>> 3: .local(query, subject, maxgap, minoverlap, type, select, ...)
>>> 2: findOverlaps(tree, query = Ranges1, maxgap = 1000, minoverlap = 20,
>>>          select = "first")
>>> 1: findOverlaps(tree, query = Ranges1, maxgap = 1000, minoverlap = 20,
>>>          select = "first")
>>>   > sessionInfo()
>>> R Under development (unstable) (2013-04-30 r62697)
>>> Platform: x86_64-apple-darwin12.3.0 (64-bit)
>>>
>>> locale:
>>> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>>>
>>> attached base packages:
>>> [1] parallel  stats     graphics  grDevices utils     datasets  methods
>>> [8] base
>>>
>>> other attached packages:
>>> [1] IRanges_1.19.19    BiocGenerics_0.7.3
>>>
>>> loaded via a namespace (and not attached):
>>> [1] stats4_3.1.0
>>>
>>>
>>> Yours sincerely,
>>>
>>> Jianhong Ou
>>>
>>> LRB 670A
>>> Program in Gene Function and Expression
>>> 364 Plantation Street Worcester,
>>> MA 01605
>>>
>>>
>>> ________________________________________________________________________
>>> devteam-bioc mailing list
>>> To unsubscribe from this mailing list send a blank email to
>>> devteam-bioc-leave at lists.fhcrc.org
>>> You can also unsubscribe or change your personal options at
>>> https://lists.fhcrc.org/mailman/listinfo/devteam-bioc
>>>
>>
>> --
>> Hervé Pagès
>>
>> Program in Computational Biology
>> Division of Public Health Sciences
>> Fred Hutchinson Cancer Research Center
>> 1100 Fairview Ave. N, M1-B514
>> P.O. Box 19024
>> Seattle, WA 98109-1024
>>
>> E-mail: hpages at fhcrc.org
>> Phone:  (206) 667-5791
>> Fax:    (206) 667-1319
>

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319



More information about the Bioc-devel mailing list