[Bioc-devel] Combining Ordinary List of GRanges Optimisation
Ryan C. Thompson
rct at thompsonclan.org
Mon Jan 7 03:25:37 CET 2013
Hi Dario Strbenac,
Are you asking if you can rewrite your code to work faster, or are you
asking if the BioC devs need to improve the code to be faster? As a
first test, I would try a few alternatives to see if they are
significantly faster. One would be "unlist(GRangesList(blockRanges))".
Another would be manually splitting each GRanges object into its
components: seqnames, IRanges, strand, and metadata. Then concatenate
these components and build one big GRanges object. Try both of these
approaches and see if either one makes things faster.
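
To make the two suggestions concrete, here is a rough sketch on a small toy list rather than your data. The toy blockRanges, the helper name combineByParts, and the assumption that every element carries the same single "conservation" column are mine, and I have not timed any of this on an object of your size.

```r
library(GenomicRanges)

# Toy stand-in for blockRanges (assumption: every element has the same
# single metadata column, "conservation", as in the head() output).
blockRanges <- lapply(1:3, function(i) {
    GRanges(seqnames = "chr1",
            ranges = IRanges(start = seq(10918 + 10 * i, length.out = 4),
                             width = 1),
            strand = "*",
            conservation = round(runif(4), 3))
})

# Baseline from the original post.
viaC <- do.call(c, blockRanges)

# Alternative 1: go through a GRangesList and unlist it.
viaList <- unlist(GRangesList(blockRanges), use.names = FALSE)

# Alternative 2 (hypothetical helper): pull out each component,
# concatenate it once, and build a single GRanges at the end.
# Note this sketch does not carry over seqlengths or other seqinfo.
combineByParts <- function(grList) {
    GRanges(
        seqnames = unlist(lapply(grList, function(gr) as.character(seqnames(gr))),
                          use.names = FALSE),
        ranges = IRanges(start = unlist(lapply(grList, start), use.names = FALSE),
                         end = unlist(lapply(grList, end), use.names = FALSE)),
        strand = unlist(lapply(grList, function(gr) as.character(strand(gr))),
                        use.names = FALSE),
        conservation = unlist(lapply(grList, function(gr) mcols(gr)$conservation),
                              use.names = FALSE)
    )
}
viaParts <- combineByParts(blockRanges)
```

Wrapping each of the three calls in system.time() on the real blockRanges should show whether either alternative helps.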
Alternatively, give me some code to generate a list of GRanges similar
in size to your blockRanges object, and I'll test them myself.
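
Something along these lines would be enough for me to work with. It is only a sketch: the helper name makeBlockRanges, its arguments, and the choice of an exponential distribution for the element lengths are all assumptions on my part, not a description of how your data was produced.

```r
library(GenomicRanges)

# Hypothetical helper: builds a plain list of GRanges with single-base
# ranges and one numeric "conservation" column, like the head() output.
makeBlockRanges <- function(nBlocks, meanLength) {
    lapply(seq_len(nBlocks), function(i) {
        # Element length drawn from an exponential (assumption), at least 1.
        n <- max(1L, as.integer(rexp(1, rate = 1 / meanLength)))
        GRanges(seqnames = "chr1",
                ranges = IRanges(start = seq_len(n), width = 1),
                strand = "*",
                conservation = runif(n))
    })
}

set.seed(1)
testRanges <- makeBlockRanges(nBlocks = 40, meanLength = 500)
```

Scaling nBlocks and meanLength up toward the 4029 elements and mean length of 55210 you reported would give a more realistic test, memory permitting.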
-Ryan Thompson
On 01/06/2013 06:00 PM, Dario Strbenac wrote:
> Hello,
>
> For a not so large list of GRanges:
>
>> length(blockRanges)
> [1] 4029
>> class(blockRanges)
> [1] "list"
>
> Which don't have an unreasonable number of elements in them:
>
>> summary(sapply(blockRanges, length))
>     Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
>        1      961    20710    55210    77680   759600
>
> Combining them takes about 16 minutes:
>
>> system.time(allRanges <- do.call(c, blockRanges))
>     user   system  elapsed
>  935.770   23.657  961.952
>
>> head(blockRanges[[1]])
> GRanges with 6 ranges and 1 metadata column:
>       seqnames         ranges strand | conservation
>          <Rle>      <IRanges>  <Rle> |    <numeric>
>   [1]     chr1 [10918, 10918]      * |        0.064
>   [2]     chr1 [10919, 10919]      * |        0.056
>   [3]     chr1 [10920, 10920]      * |        0.064
>   [4]     chr1 [10921, 10921]      * |        0.056
>   [5]     chr1 [10922, 10922]      * |        0.064
>   [6]     chr1 [10923, 10923]      * |        0.064
>   ---
>   seqlengths:
>    chr1
>      NA
>
> Could this code be faster?
>
>> sessionInfo()
> R version 2.15.2 (2012-10-26)
> Platform: x86_64-pc-linux-gnu (64-bit)
>
> other attached packages:
> [1] GenomicRanges_1.10.5 IRanges_1.16.4 BiocGenerics_0.4.0
>
> --------------------------------------
> Dario Strbenac
> PhD Student
> University of Sydney
> Camperdown NSW 2050
> Australia
>
> _______________________________________________
> Bioc-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel