Getting there... Thanks for the report, Kasper!
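
In case it helps anyone following along, below is a minimal base-R sketch of
the idea behind the speedup: build one index per seqname up front, then send
each query only to the index for its own seqname. It is only an illustration,
not the C code in IRanges: build_forest and find_overlaps_forest are made-up
names, the toy data is invented, and a sorted table with a linear scan stands
in for each red-black interval tree.

```r
## Toy data (invented): closed integer intervals [start, end] on named sequences.
subject <- data.frame(
  seqname = c("chr1", "chr1", "chr2", "chr3"),
  start   = c(1L, 10L, 5L, 1L),
  end     = c(5L, 20L, 8L, 3L),
  stringsAsFactors = FALSE
)
query <- data.frame(
  seqname = c("chr1", "chr2", "chr4"),
  start   = c(4L, 1L, 1L),
  end     = c(12L, 4L, 9L),
  stringsAsFactors = FALSE
)

## Build once: one index per seqname. Each index is a start-sorted table,
## a simplified stand-in for an interval tree (hypothetical helper).
build_forest <- function(subject) {
  idx <- split(seq_len(nrow(subject)), subject$seqname)
  lapply(idx, function(i) {
    i <- i[order(subject$start[i])]
    list(id = i, start = subject$start[i], end = subject$end[i])
  })
}

## Query many times: only the partition matching the query's seqname is
## inspected; seqnames absent from the forest yield no hits (hypothetical helper).
find_overlaps_forest <- function(query, forest) {
  hits <- lapply(seq_len(nrow(query)), function(q) {
    part <- forest[[query$seqname[q]]]
    if (is.null(part)) return(NULL)
    keep <- part$start <= query$end[q] & part$end >= query$start[q]
    if (!any(keep)) return(NULL)
    data.frame(queryHits = q, subjectHits = part$id[keep])
  })
  hits <- do.call(rbind, hits)
  if (is.null(hits)) {
    hits <- data.frame(queryHits = integer(0), subjectHits = integer(0))
  }
  hits[order(hits$queryHits, hits$subjectHits), , drop = FALSE]
}

forest <- build_forest(subject)               # built once, reusable
hits   <- find_overlaps_forest(query, forest) # repeated queries reuse the forest
print(hits)
```

With real data the per-seqname scan would be replaced by a tree lookup; the
point here is only that the forest is built once and reused across queries.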


On Sun, Jul 7, 2013 at 2:10 PM, Michael Lawrence
<lawrence.michael@gene.com> wrote:

> Awesome. If Hector is finished cleaning up, I'd be glad to merge it.
>
> Michael
>
>
> On Sat, Jul 6, 2013 at 6:18 PM, Kasper Daniel Hansen <
> kasperdanielhansen@gmail.com> wrote:
>
>> A little late, but I can report that this speeds up my "many seqlevels"
>> problem by 3 orders of magnitude.
>>
>> library(IRanges, lib.loc = "library")
>> library(GenomicRanges, lib.loc = "library")
>> library(BSgenome.Amellifera.BeeBase.assembly4)
>> Un <- Amellifera$GroupUn
>> gr <- GRanges(seqnames = names(Un),
>>               ranges = IRanges(start = 1, width = width(Un)))
>>
>> ## gr has a length of 9244, but each interval is in a new seqname.
>> ## this makes traditional findOverlaps extremely slow
>>
>> system.time({
>>     findOverlaps(gr, gr)
>> })  ## roughly 240 secs
>>
>> system.time({
>>     grF <- as(gr, "GIntervalTree")
>> })
>> system.time({
>>     findOverlaps(grF, grF)
>> }) ## roughly 0.1 secs
>>
>> ## speedup (for this example): roughly 2400-fold (240 s vs 0.1 s)
>>
>> Kasper
>>
>>
>> On Thu, May 30, 2013 at 6:51 AM, Hector Corrada Bravo <
>> hcorrada@umiacs.umd.edu> wrote:
>>
>>> Great. I already have unit tests there for IntervalForest and
>>> GIntervalTree.
>>> Hector
>>>
>>>
>>> On Wed, May 29, 2013 at 8:31 PM, Vincent Carey
>>> <stvjc@channing.harvard.edu> wrote:
>>>
>>> > Fine with me, as long as he is acquainted with the build/test-before-commit
>>> > practices that we are supposed to follow. Breaking IRanges can have severe
>>> > repercussions.
>>> >
>>> > On Wed, May 29, 2013 at 6:36 PM, Michael Lawrence
>>> > <lawrence.michael@gene.com> wrote:
>>> >
>>> > > Would it be feasible/acceptable to give Hector permission to commit?
>>> > >
>>> > > Michael
>>> > >
>>> > >
>>> > > On Wed, May 29, 2013 at 2:12 PM, Hector Corrada Bravo
>>> > > <hcorrada@gmail.com> wrote:
>>> > >
>>> > > > That's great! There's some cleaning up to do there; how should we do
>>> > > > this post-merge?
>>> > > >
>>> > > >
>>> > > > On Wed, May 29, 2013 at 4:19 PM, Valerie Obenchain
>>> > > > <vobencha@fhcrc.org> wrote:
>>> > > >
>>> > > >> Hi Hector, Michael,
>>> > > >>
>>> > > >> This sounds great. Bringing these into svn is fine with us. Michael,
>>> > > >> do you want to merge these in?
>>> > > >>
>>> > > >> Val
>>> > > >>
>>> > > >> On 05/24/2013 07:30 AM, Hector Corrada Bravo wrote:
>>> > > >> > Thanks Michael,
>>> > > >> >
>>> > > >> > It has made a significant difference for our visualization
>>> > > >> > project. I would like to merge this into svn asap. Can I get a
>>> > > >> > ruling from the rest of the core group? Please let me know
>>> > > >> > if/when/how to proceed.
>>> > > >> >
>>> > > >> > Cheers,
>>> > > >> > Hector
>>> > > >> >
>>> > > >> >
>>> > > >> > On Wed, May 22, 2013 at 1:00 PM, Michael Lawrence
>>> > > >> > <lawrence.michael@gene.com> wrote:
>>> > > >> >
>>> > > >> >> *Added bioc-devel; hope you don't mind*
>>> > > >> >>
>>> > > >> >> Hector,
>>> > > >> >>
>>> > > >> >> This is great stuff. The overall design is on the right track.
>>> > > >> >> As you said, there's a bit of cleaning to do, but I think we
>>> > > >> >> should merge this into svn and work the rest out from there.
>>> > > >> >> This will really benefit performance, especially for
>>> > > >> >> visualization. Of course, I can't speak for the others.
>>> > > >> >>
>>> > > >> >> Michael
>>> > > >> >>
>>> > > >> >>
>>> > > >> >>
>>> > > >> >> On Tue, May 21, 2013 at 11:52 AM, Hector Corrada Bravo <
>>> > > >> >> hcorrada@umiacs.umd.edu> wrote:
>>> > > >> >>
>>> > > >> >>> Since the semester is over I finally finished this...
>>> > > >> >>>
>>> > > >> >>> Recall that I wanted a persistent set of IntervalTrees for
>>> > > >> >>> GRanges objects for repeated querying. (The application is this:
>>> > > >> >>> http://epiviz.cbcb.umd.edu/help/?page_id=62 which I hope to get
>>> > > >> >>> out soon). Folding this into IRanges and GenomicRanges would
>>> > > >> >>> make our life easier come installation time.
>>> > > >> >>>
>>> > > >> >>> I've implemented class 'IntervalForest' within IRanges following
>>> > > >> >>> Michael's suggestion of storing this as an array of rbTree on
>>> > > >> >>> the C side. I've implemented findOverlaps that operates with
>>> > > >> >>> this array in C. There is code duplication in IntervalTree.c
>>> > > >> >>> that could be reduced, but that's if this makes it into the
>>> > > >> >>> package.
>>> > > >> >>>
>>> > > >> >>> I've also implemented a 'GIntervalTree' that uses
>>> > > >> >>> 'IntervalForest' underneath.
>>> > > >> >>> findOverlaps-GenomicRanges-GIntervalTree-method is implemented
>>> > > >> >>> for this class. I didn't touch the existing
>>> > > >> >>> findOverlaps-GenomicRanges-GenomicRanges-method.
>>> > > >> >>>
>>> > > >> >>> You can pull these here:
>>> > > >> >>> http://github.com/hcorrada/IRanges
>>> > > >> >>> http://github.com/hcorrada/GenomicRanges
>>> > > >> >>>
>>> > > >> >>> These track the devel branch of the two packages. Let me know
>>> > > >> >>> the best way to propagate to svn if you guys want this. It needs
>>> > > >> >>> documentation, but I'll add that once implementation is settled.
>>> > > >> >>>
>>> > > >> >>> Kasper, I'm not sure if this would help with the 'too many
>>> > > >> >>> seqlevels' problem but I'd be curious to know if you try it.
>>> > > >> >>>
>>> > > >> >>> Cheers,
>>> > > >> >>> Hector
>>> > > >> >>>
>>> > > >> >>
>>> > > >> >>
>>> > > >> >
>>> > > >> > [[alternative HTML version deleted]]
>>> > > >> >
>>> > > >> > _______________________________________________
>>> > > >> > Bioc-devel@r-project.org mailing list
>>> > > >> > https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>> > > >> >
>>> > > >>
>>> > > >
>>> > > >
>>> > >
>>> > >
>>> >
>>> >
>>>
>>>
>>
>>
>


