[Bioc-devel] GRanges Unique [actually -- `order`] Method

Steve Lianoglou mailinglist.honeypot at gmail.com
Wed Jun 15 02:06:36 CEST 2011


Hi,

(Digging up an old [related] thread, since I'm not sure of the status
of the code that Michael referred to in this context ...)

I have a suboptimal-but-working implementation of `order` (and, by
extension, `sort`) for GenomicRanges objects. For example, it calculates
the ordering of a GRanges object of length 1 million (ranges randomly
spread across all Hsapiens chromosomes and strands) in ~22 seconds[*].

The resulting/ordered ranges are sorted/grouped by seqnames, then
strand, then ranges. The caller can specify the ordering of the
seqnames; otherwise the ordering as defined by
seqlevels(your.granges.object) is used.
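
As a rough illustration of that sort key (this is not the GenomicRanges
code itself), here is a base-R sketch in which a plain data frame stands
in for a GRanges object. The helper name `order_ranges`, its
`seqlevel_order` argument, the column names, and the "+", "-", "*"
strand ranking are all assumptions made for the example.

```r
# Hypothetical stand-in for a GRanges object: one row per range.
ranges <- data.frame(
  seqnames = c("chr2", "chr1", "chr1", "chr2", "chr1"),
  strand   = c("+", "-", "+", "+", "+"),
  start    = c(100L, 5L, 50L, 10L, 1L),
  end      = c(150L, 60L, 80L, 40L, 50L),
  stringsAsFactors = FALSE
)

# order_ranges() is an illustrative helper, not a GenomicRanges function.
# It returns the permutation that sorts by seqnames, then strand, then
# start/end. seqlevel_order plays the role of a caller-supplied seqname
# ordering; by default the order of first appearance is used.
order_ranges <- function(x, seqlevel_order = unique(x$seqnames)) {
  seq_rank    <- match(x$seqnames, seqlevel_order)
  strand_rank <- match(x$strand, c("+", "-", "*"))  # assumed strand ranking
  order(seq_rank, strand_rank, x$start, x$end)
}

idx    <- order_ranges(ranges, seqlevel_order = c("chr1", "chr2"))
sorted <- ranges[idx, ]
```

Passing `seqlevel_order = c("chr2", "chr1")` instead would put the chr2
ranges first while keeping the strand and start ordering within each
seqname.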

Also, it is only defined for one GRanges object (not sure what the
appropriate result would be if multiple GRanges objects are passed in).

I can check it into SVN if that sounds good, so it can work as a
stop-gap until one of the *Ranges gurus can whip up a superior one.

[*] By the by, the runtime is dominated by iterating over the seqnames
and subselecting the appropriate ranges to work on one at a time.
Maybe the speed can be increased by using `split` a few times, but
then you have several copies of your GRanges object in memory, so I'm
not sure what's best atm, or how useful it is to talk about code in the
"abstract". We can continue the discussion if you reckon it's worth
checking in for now ...
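
To make the trade-off in [*] a little less abstract, here is a base-R
sketch of three ways to get the same ordering: looping over seqnames and
subselecting, splitting the row indices (rather than the object itself)
by seqname, and a single multi-key `order` call. A data frame again
stands in for a GRanges object, and all function names are hypothetical.
Whether any of these is actually faster on real GRanges objects is
untested here.

```r
set.seed(1)
n    <- 1000L
lvls <- c("chr1", "chr2", "chr3")  # assumed seqlevel ordering
x <- data.frame(
  seqnames = sample(lvls, n, replace = TRUE),
  start    = sample.int(10000L, n, replace = TRUE),
  stringsAsFactors = FALSE
)

# Approach 1: iterate over seqnames, subselect, and order each subset.
order_by_loop <- function(x, lvls) {
  unlist(lapply(lvls, function(lv) {
    idx <- which(x$seqnames == lv)
    idx[order(x$start[idx])]
  }), use.names = FALSE)
}

# Approach 2: split the row indices by seqname, so the object itself
# is never copied into per-seqname pieces.
order_by_split <- function(x, lvls) {
  groups <- split(seq_len(nrow(x)), factor(x$seqnames, levels = lvls))
  unlist(lapply(groups, function(idx) idx[order(x$start[idx])]),
         use.names = FALSE)
}

# Approach 3: one order() call with the seqname rank as the first key.
order_vectorised <- function(x, lvls) {
  order(match(x$seqnames, lvls), x$start)
}

o_loop  <- order_by_loop(x, lvls)
o_split <- order_by_split(x, lvls)
o_vec   <- order_vectorised(x, lvls)
```

All three return the same permutation on this toy input; they differ
only in how much subsetting and intermediate allocation they do.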

-steve

On Wed, May 25, 2011 at 9:02 AM, Michael Lawrence
<lawrence.michael at gene.com> wrote:
> Someone has to write the methods...
>
> On Tue, May 24, 2011 at 11:00 PM, Dario Strbenac
> <D.Strbenac at garvan.org.au> wrote:
>
>> >   Yes, the sort method just calls order.
>>
>> Something isn't quite working out for me.
>>
>> library(GenomicRanges) # 1.4.5
>> gr <- GRanges("chr1", IRanges(c(1, 10), c(50, 60)), '+')
>> sort(gr)
>>
>> --------------------------------------
>> Dario Strbenac
>> Research Assistant
>> Cancer Epigenetics
>> Garvan Institute of Medical Research
>> Darlinghurst NSW 2010
>> Australia
>>
>> _______________________________________________
>> Bioc-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>
>



-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact


