[Bioc-devel] poor performance of snpsByOverlaps()

Hervé Pagès hpages at fredhutch.org
Tue Jun 21 09:02:57 CEST 2016


Hi Robert,

Thanks for reporting this. I'll look into it.

H.

On 06/17/2016 09:53 AM, Robert Castelo wrote:
> hi,
>
> the performance of snpsByOverlaps() in terms of time and memory
> consumption is quite poor and i wonder whether there is some bug in the
> code. here's one example:
>
> library(GenomicRanges)
> library(SNPlocs.Hsapiens.dbSNP144.GRCh37)
>
> snps <- SNPlocs.Hsapiens.dbSNP144.GRCh37
>
> gr <- GRanges(seqnames="ch10", IRanges(123276830, 123276830))
>
> system.time(ov <- snpsByOverlaps(snps, gr))
>     user  system elapsed
>   33.768   0.124  33.955
>
> system.time(ov <- snpsByOverlaps(snps, gr))
>     user  system elapsed
>   33.150   0.281  33.494
>
>
> i've shown the call to snpsByOverlaps() twice to account for the fact
> that maybe the first call was caching data and the second could be much
> faster, but it is not the case.
>
> if i do the same but with a larger GRanges object, for instance the one
> attached to this email, then the memory consumption grows to about 20
> Gbytes. to me this, in conjunction with the previous observation,
> suggests something is wrong with the caching of the data.
>
>
>
> i look forward to your comments and possible solutions,
>
>
> thanks!!!
>
>
> robert.
> _______________________________________________
> Bioc-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fredhutch.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319


