[Bioc-devel] Combining Ordinary List of GRanges Optimisation

Ryan C. Thompson rct at thompsonclan.org
Mon Jan 7 03:25:37 CET 2013


Hi Dario Strbenac,

Are you asking if you can rewrite your code to work faster, or are you 
asking if the BioC devs need to improve the code to be faster? As a 
first test, I would try a few alternatives to see if they are 
significantly faster. One would be "unlist(GRangesList(blockRanges))". 
Another would be manually splitting each GRanges object into its 
components: seqnames, IRanges, strand, and metadata. Then concatenate 
these components and build one big GRanges object. Try both of these 
approaches and see if either one makes things faster.
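
As a rough, untested-on-your-data sketch of those two alternatives: the 
function names combineViaList and combineViaParts and the toy list are 
made up for illustration. I am also assuming that every element carries 
the single "conservation" metadata column shown in your head() output, 
and that mcols() is available in your IRanges version.

```r
library(GenomicRanges)

# Alternative 1: wrap the ordinary list in a GRangesList, then flatten it.
combineViaList <- function(grl) {
  unlist(GRangesList(grl), use.names = FALSE)
}

# Alternative 2: pull each component out as a plain vector, concatenate the
# vectors, and build one GRanges at the end.
# Assumes a numeric "conservation" metadata column on every element.
# Note: seqlengths and other seqinfo are not carried over by this version.
combineViaParts <- function(grl) {
  chr     <- unlist(lapply(grl, function(gr) as.character(seqnames(gr))),
                    use.names = FALSE)
  starts  <- unlist(lapply(grl, start), use.names = FALSE)
  ends    <- unlist(lapply(grl, end), use.names = FALSE)
  strands <- unlist(lapply(grl, function(gr) as.character(strand(gr))),
                    use.names = FALSE)
  cons    <- unlist(lapply(grl, function(gr) mcols(gr)$conservation),
                    use.names = FALSE)
  GRanges(seqnames = chr,
          ranges = IRanges(start = starts, end = ends),
          strand = strands,
          conservation = cons)
}

# Tiny stand-in list, only to check that both versions agree.
toyRanges <- list(
  GRanges(seqnames = rep("chr1", 3),
          ranges = IRanges(start = 10918:10920, width = 1),
          conservation = c(0.064, 0.056, 0.064)),
  GRanges(seqnames = rep("chr2", 2),
          ranges = IRanges(start = c(5, 9), width = 1),
          conservation = c(0.1, 0.2))
)
viaList  <- combineViaList(toyRanges)
viaParts <- combineViaParts(toyRanges)
```

On the real data the comparison would then be 
system.time(combineViaList(blockRanges)) against 
system.time(combineViaParts(blockRanges)) and your original do.call(c, ...).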

Alternatively, give me some code to generate a list of GRanges similar 
in size to your blockRanges object, and I'll test them myself.
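
Something along these lines would be enough, scaled up to whatever is 
representative. The seed, the number of blocks, the block sizes, and the 
single chromosome name below are placeholders, not values taken from your 
data.

```r
library(GenomicRanges)

set.seed(1)
nBlocks    <- 50                                    # placeholder; the real list has 4029 elements
blockSizes <- sample(1:5000, nBlocks, replace = TRUE)  # placeholder sizes

# Build an ordinary list of GRanges, each with width-1 ranges on one
# chromosome and a numeric "conservation" metadata column.
blockRanges <- lapply(seq_len(nBlocks), function(i) {
  n      <- blockSizes[i]
  starts <- sort(sample.int(1e6, n))
  GRanges(seqnames = rep("chr1", n),
          ranges = IRanges(start = starts, width = 1),
          conservation = round(runif(n), 3))
})
```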

-Ryan Thompson


On 01/06/2013 06:00 PM, Dario Strbenac wrote:
> Hello,
>
> For a not so large list of GRanges:
>
>> length(blockRanges)
> [1] 4029
>> class(blockRanges)
> [1] "list"
>
> Which don't have an unreasonable number of elements in them:
>
>> summary(sapply(blockRanges, length))
>     Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
>        1     961   20710   55210   77680  759600
>
> Combining them takes 15 minutes:
>
>> system.time(allRanges <- do.call(c, blockRanges))
>     user  system elapsed
> 935.770  23.657 961.952
>
>> head(blockRanges[[1]])
> GRanges with 6 ranges and 1 metadata column:
>        seqnames         ranges strand | conservation
>           <Rle>      <IRanges>  <Rle> |    <numeric>
>    [1]     chr1 [10918, 10918]      * |        0.064
>    [2]     chr1 [10919, 10919]      * |        0.056
>    [3]     chr1 [10920, 10920]      * |        0.064
>    [4]     chr1 [10921, 10921]      * |        0.056
>    [5]     chr1 [10922, 10922]      * |        0.064
>    [6]     chr1 [10923, 10923]      * |        0.064
>    ---
>    seqlengths:
>     chr1
>       NA
>
> Could this code be faster?
>
>> sessionInfo()
> R version 2.15.2 (2012-10-26)
> Platform: x86_64-pc-linux-gnu (64-bit)
>
> other attached packages:
> [1] GenomicRanges_1.10.5 IRanges_1.16.4       BiocGenerics_0.4.0
>
> --------------------------------------
> Dario Strbenac
> PhD Student
> University of Sydney
> Camperdown NSW 2050
> Australia
>
> _______________________________________________
> Bioc-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel


