[Bioc-sig-seq] Replace Elements of RleList is Slow

Dario Strbenac D.Strbenac at garvan.org.au
Sun Feb 13 05:00:11 CET 2011


Hello,

I have an RleList in which I sometimes need to replace a few elements before calling viewApply on the modified list. In the example below I replace just one element, and it takes almost a minute. Is there any chance of optimising this replacement in IRanges? I'd like to avoid converting the RleLists to plain lists and back again, to keep my code shorter.

> class(coverageGenes)
[1] "SimpleRleList"
attr(,"package")
[1] "IRanges"
> length(coverageGenes)
[1] 17805
> w
[1] 8007
> xxx
SimpleRleList of length 1
$chr18
'numeric' Rle of length 51001 with 15108 runs
  Lengths:               501                 1 ...             14735
  Values :  2.18311877482829  2.18461816959122 ...                 0

> system.time(coverageGenes[w]  <- xxx)
   user  system elapsed 
 55.780   2.070  57.866

> cgLIST <- as.list(coverageGenes)
> xxxL <- as.list(xxx)

> system.time(cgLIST[w]  <- xxxL)
   user  system elapsed 
      0       0       0 

So the plain-list replacement is effectively instantaneous.
Note that xxx and w can sometimes be longer than 1; I am just illustrating the simplest case. With longer replacements I ran into memory problems: with xxx and w of length only 6, RAM usage shot up by 17 GB.
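
For reference, the round trip through plain lists that I would like to avoid might look like the sketch below. The toy objects are small hypothetical stand-ins for coverageGenes, w and xxx, and coverageGenes2 is just an illustrative name. It assumes that RleList() accepts a single plain list of Rle objects and rebuilds a SimpleRleList from it when compress = FALSE; I have not checked how this behaves across IRanges versions.

```r
library(IRanges)  # provides Rle() and RleList()

# Toy stand-ins for the objects in this post (hypothetical data, far smaller).
coverageGenes <- RleList(a = Rle(c(0, 0, 1, 1, 1)),
                         b = Rle(c(2, 2, 0)),
                         c = Rle(c(3, 0, 0, 0)),
                         compress = FALSE)
w   <- 2L
xxx <- RleList(chr18 = Rle(c(5, 5, 5, 0)), compress = FALSE)

# Do the replacement on plain lists, where `[<-` is cheap ...
cgLIST <- as.list(coverageGenes)
cgLIST[w] <- as.list(xxx)

# ... then rebuild the RleList (assumption: a single list argument is
# treated as the list of elements).
coverageGenes2 <- RleList(cgLIST, compress = FALSE)
```

As with ordinary list assignment, the replaced element keeps the name it had in the target list rather than taking the name from xxx.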

> sessionInfo()
R version 2.12.0 (2010-10-15)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_AU.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_AU.UTF-8        LC_COLLATE=en_AU.UTF-8    
 [5] LC_MONETARY=C              LC_MESSAGES=en_AU.UTF-8   
 [7] LC_PAPER=en_AU.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_AU.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] aroma.affymetrix_1.7.0             aroma.apd_0.1.7                   
 [3] affxparser_1.22.0                  R.huge_0.2.0                      
 [5] aroma.core_1.7.0                   aroma.light_1.18.0                
 [7] matrixStats_0.2.2                  R.rsp_0.4.0                       
 [9] R.cache_0.3.0                      R.filesets_0.9.0                  
[11] digest_0.4.2                       R.utils_1.5.3                     
[13] R.oo_1.7.4                         R.methodsS3_1.2.1                 
[15] BSgenome.Hsapiens.UCSC.hg18_1.3.16 BSgenome_1.18.3                   
[17] Biostrings_2.18.2                  GenomicRanges_1.2.3               
[19] IRanges_1.8.9                     

loaded via a namespace (and not attached):
[1] Biobase_2.10.0 tools_2.12.0 

--------------------------------------
Dario Strbenac
Research Assistant
Cancer Epigenetics
Garvan Institute of Medical Research
Darlinghurst NSW 2010
Australia


