[Bioc-devel] viewMedians

Hervé Pagès hpages at fhcrc.org
Mon Jun 2 19:48:34 CEST 2014


Hi Peter,

Seems like you have a pretty good implementation of the view* functions
in genoset. Nice work! And great to hear that there is so much room for
improvement in the implementation currently in IRanges. I'll try to
give this a shot soon, but first I want to move Rles to the S4Vectors
package.

Cheers,
H.


On 06/01/2014 07:58 PM, Peter Haverty wrote:
> I think viewMedians would be great.  While you have the hood up, there are
> some opportunities for some speedups and code simplification, I believe.
>
> I did some experimentation with view* in the genoset package. I made an
> alternate version of the C code for viewMeans and found about a 10X speedup.  I
> hoisted the branching for the different types and did the NA handling with
> arithmetic rather than branching. The search for the Rle runs covered by
> each view is now done with findInterval.  There are quite a few code
> sections that differ only in the type of the NA value and the pointers to
> the input/output vectors. I think it would be worth considering C++
> templates.
>
> On the R side, each view* function is pretty similar too. In
> genoset/R/RleDataFrame-views.R I tried to factor out all the shared pieces.
>
> While we're on the topic, I think the view* functions should have range*
> equivalents that skip the Views object and work on an Rle and an IRanges.
> If you already have a Views object around, view* are perfect. Otherwise,
> making the Views objects takes time that could be saved.
>
> Overall I found about a 90X speedup over viewMeans(RleViewsList).
>
> I hope there is some useful food for thought in these experiments. I have a
> vignette that shows some of the timings if anyone is interested.
>
> Regards,
> Pete
>
> ____________________
> Peter M. Haverty, Ph.D.
> Genentech, Inc.
> phaverty at gene.com
>
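
A minimal sketch of the ideas described in the quoted message, and not
the genoset or IRanges code: one template covers the integer and double
cases, the type is chosen once outside the loops, the runs covered by
each range are located by binary search, and NA runs get a weight of
zero instead of an early exit. The names NaTraits and range_means are
hypothetical. std::lower_bound stands in for findInterval. Integer NA is
taken to be INT_MIN (as in R), while for doubles any NaN is treated as
NA, which is a simplification. Ranges are assumed to lie within the Rle.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical per-type NA test, so the loop body is written only once.
template <typename T> struct NaTraits;
template <> struct NaTraits<int> {
    // R stores NA_integer_ as INT_MIN.
    static bool is_na(int x) { return x == std::numeric_limits<int>::min(); }
};
template <> struct NaTraits<double> {
    // Simplification: every NaN counts as NA.
    static bool is_na(double x) { return std::isnan(x); }
};

// Mean of a run-length encoded vector (values + lengths) over 1-based
// ranges [start, start + width - 1], ignoring NA runs. A range with no
// non-NA element yields NaN.
template <typename T>
std::vector<double> range_means(const std::vector<T>& values,
                                const std::vector<int>& lengths,
                                const std::vector<int>& starts,
                                const std::vector<int>& widths)
{
    assert(values.size() == lengths.size());
    assert(starts.size() == widths.size());

    // run_end[i] is the 1-based position of the last element of run i.
    std::vector<long> run_end(lengths.size());
    long pos = 0;
    for (std::size_t i = 0; i < lengths.size(); ++i) {
        pos += lengths[i];
        run_end[i] = pos;
    }

    std::vector<double> out(starts.size());
    for (std::size_t v = 0; v < starts.size(); ++v) {
        const long first = starts[v];
        const long last = first + widths[v] - 1;

        // Binary search for the first run that reaches position `first`.
        std::size_t r = std::lower_bound(run_end.begin(), run_end.end(), first)
                        - run_end.begin();
        long run_start = (r == 0) ? 1 : run_end[r - 1] + 1;

        double sum = 0.0;
        long n = 0;
        for (; r < run_end.size() && run_start <= last; ++r) {
            const long lo = std::max(first, run_start);
            const long hi = std::min(last, run_end[r]);
            const long overlap = hi - lo + 1;
            // NA handling by weighting: keep is 0 for an NA run, 1 otherwise.
            const long keep = !NaTraits<T>::is_na(values[r]);
            const double x = keep ? static_cast<double>(values[r]) : 0.0;
            sum += x * static_cast<double>(overlap);
            n += keep * overlap;
            run_start = run_end[r] + 1;
        }
        out[v] = sum / static_cast<double>(n);  // 0/0 gives NaN
    }
    return out;
}
```

Because the function takes the run values, run lengths, starts and
widths directly, it also corresponds to the range* idea of skipping the
Views object.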

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319
