[Bioc-sig-seq] Add ability for `subset`ing IRanges-like objects based on their elementMetadata?

Steve Lianoglou mailinglist.honeypot at gmail.com
Fri Jun 4 23:27:49 CEST 2010


Hi,

Random question I thought I'd shoot out there ...

I'm finding myself wanting to slice and dice IRanges-like objects (I'm
playing with GRanges right now) based on some column of their
elementMetadata.

Are other people finding that they want to do this, too?
Would it make sense to add some subset-mojo to do that?

Here's a motivating example:

Say I have a GRanges object (`tags`) that looks something like:

GRanges with 2217486 ranges and 8 elementMetadata values
    seqnames           ranges strand   |    tag.id genome.hits gene.hits
       <Rle>        <IRanges>  <Rle>   | <integer>   <integer> <integer>
[1]     chr1   [ 4850,  4866]      -   |    405384          10         3
[2]     chr1   [ 7804,  7820]      -   |    405387           6         4
[3]     chr1   [13162, 13178]      -   |    405397           5         4
[4]     chr1   [16712, 16728]      +   |        35       12164      2475
[5]     chr1   [21381, 21397]      +   |        45         497        79
[6]     chr1   [21479, 21495]      -   |      1466        3823       957

And say that I want all "tags" on the "+" strand with genome.hits < 5.
I'd like to be able to write:

R> subset(tags, genome.hits < 5 & strand == '+')

and have that do the same thing as:

R> tags[elementMetadata(tags)$genome.hits < 5 & strand(tags) == '+']

I realize that using `genome.hits` (from the elementMetadata) and
`strand` (not in the metadata) is crossing some boundaries, but I just
wanted to point out one of the more "complex" cases.
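
To make the idea concrete, here is a rough sketch of what such a helper
could look like. `subsetRanges` is a hypothetical name, not something in
IRanges or GenomicRanges; it assumes the object supports
`elementMetadata()`, `strand()` and `seqnames()`, and it simply evaluates
the condition against the metadata columns plus those two "core" fields:

```r
library(GenomicRanges)

# Hypothetical helper (not part of any package): subset a GRanges-like
# object with an expression that may mix elementMetadata columns with
# strand and seqnames.
subsetRanges <- function(x, expr) {
  e <- substitute(expr)
  # Start from the metadata columns, as ordinary vectors in a list.
  cols <- as.list(as.data.frame(elementMetadata(x)))
  # Add the non-metadata fields the expression is allowed to refer to.
  cols$strand <- as.character(strand(x))
  cols$seqnames <- as.character(seqnames(x))
  # Evaluate the condition with those columns in scope; anything else is
  # looked up in the caller's environment.
  keep <- eval(e, cols, parent.frame())
  stopifnot(is.logical(keep), length(keep) == length(x))
  # Treat NA as "drop", the way base subset() does.
  x[!is.na(keep) & keep]
}
```

With that in place, `subsetRanges(tags, genome.hits < 5 & strand == '+')`
should select the same ranges as the long-hand indexing above. Like base
`subset`, it relies on non-standard evaluation, so it is meant for
interactive use more than for programming.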

Just curious,
-steve

-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact


