[BioC] findOverlaps suggestions

Janet Young jayoung at fhcrc.org
Thu Jan 21 21:07:20 CET 2010


Hi,

I'm not sure if this is a suggestion for enhancement to IRanges, or a question
about whether efficient code already exists to do what I want - I might have
missed something.  

I have a single dataset of genomic regions (as a RangedData object), a minority
of which overlap with one another, and I'm interested in looking at which ones
overlap.

I'll illustrate what I mean using example data from the findOverlaps help page:
    query <- IRanges(c(1, 4, 9), c(5, 7, 10))

First, a trivial cosmetic comment:
    findOverlaps(query)
works fine (as it knows that I mean subject=query)

but if query is a RangedData object, it doesn't work unless I specify both 
subject and query.
    query_RD <- RangedData(query,space="chr1")
    findOverlaps(query_RD)
Error in function (classes, fdef, mtable)  : 
  unable to find an inherited method for function "findOverlaps", for signature
  "RangedData", "missing"

Instead we need to pass the same object as both query and subject, like so:
    findOverlaps(query_RD,query_RD)
(no big deal, I know, but could be good to fix it for consistency)

Second, a truly functional comment.  In the special case when query=subject, it
would be really nice to have an option not to report self-self matches.  In the
following example, only the second and third rows of the match matrix are
really interesting:

findOverlaps(query)
An object of class “RangesMatching”
Slot "matchMatrix":
     query subject
[1,]     1       1
[2,]     1       2
[3,]     2       1
[4,]     2       2
[5,]     3       3

Even nicer would be to only report each symmetrical match once, not twice (i.e.
tell me that 1 matches 2, but no need to also tell me that 2 matches 1).

I think I can figure out the code to do each of those things the long way
around, but it'd be great to have it built in. (is it already?) 
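
For what it's worth, the long way around might look roughly like the sketch
below.  It is plain base R and assumes the match matrix has already been pulled
out as an ordinary two-column matrix (here I've just typed it in by hand from
the output above).  The names mm, no_self and unique_pairs are only
illustrative, not anything that exists in IRanges.

```r
# Match matrix copied by hand from the findOverlaps(query) output above
# (assumption: a plain two-column integer matrix with these column names)
mm <- cbind(query   = c(1L, 1L, 2L, 2L, 3L),
            subject = c(1L, 2L, 1L, 2L, 3L))

# Drop self-self matches: rows where query and subject are the same range
no_self <- mm[mm[, "query"] != mm[, "subject"], , drop = FALSE]

# Report each symmetrical match once: keep only rows with query < subject,
# which also removes the self-self rows
unique_pairs <- mm[mm[, "query"] < mm[, "subject"], , drop = FALSE]
```

With the example data, no_self keeps the (1,2) and (2,1) rows, and
unique_pairs keeps only the (1,2) row.  This only makes sense when query and
subject are the same set of ranges, since it compares row indices directly.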

What do you think?  I imagine this could be useful to others too.

thanks,

Janet Young

------------------------------------------------------------------- 

Dr. Janet Young (Trask lab)

Fred Hutchinson Cancer Research Center
1100 Fairview Avenue N., C3-168, 
P.O. Box 19024, Seattle, WA 98109-1024, USA.

tel: (206) 667 1471 fax: (206) 667 6524
email: jayoung  ...at...  fhcrc.org

http://www.fhcrc.org/labs/trask/


