[Bioc-sig-seq] viewApplying Efficiently

Martin Morgan mtmorgan at fhcrc.org
Mon Feb 14 01:44:19 CET 2011


On 02/13/2011 03:00 PM, Dario Strbenac wrote:
> Hello,
> 

> I have an RleList of about 17000 Rles and I'd like to get the
> regularly spaced values out of each one of them and have a list of
> vectors of numbers as the result.

Maybe something along the lines of

library(IRanges)
elt <- Rle(as.numeric(floor(runif(50100, 1, 2.2)))) # simulated data
m <- Rle(rep(c(FALSE, TRUE), 100), rep(c(500, 1), 100)) # mask, TRUE at sampled positions
as.numeric(elt[m]) # values at the TRUE positions, as a plain numeric vector

? This could go in an lapply over the RleList; m would have to be
constructed with the same length as each element.
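
A minimal sketch of the lapply idea, with the assumptions called out:
the three-element rleList is simulated stand-in data (the real list has
about 17000 elements), pos is taken from the start column of the ranges
printed in the question, and sampleByMask is a hypothetical helper name.
Note that the rep(c(500, 1), 100) mask above marks positions 501, 1002,
1503, ... (spacing 501), so here the mask is built from pos instead.
Indexing each Rle directly by the integer positions should give the same
values without a mask; neither variant has been timed on data of the
real size.

```r
library(IRanges)  # Rle and RleList

set.seed(1)
# Simulated stand-in for the real data (assumption): 3 elements, length 51001
rleList <- do.call(RleList,
                   lapply(1:3, function(i) Rle(round(runif(51001), 1))))

# Sampling positions 501, 1001, ..., 50501 (101 of them), as in the question
pos <- seq(501, 50501, by = 500)

# Variant 1: logical Rle mask built to the length of each element
sampleByMask <- function(x, pos) {
    m <- Rle(seq_len(length(x)) %in% pos)  # TRUE only at the sampled positions
    as.numeric(x[m])
}
byMask <- lapply(rleList, sampleByMask, pos = pos)

# Variant 2: index each Rle directly by integer position
byPos <- lapply(rleList, function(x) as.numeric(x[pos]))

# Both variants give a list of numeric vectors with the same contents
stopifnot(identical(byMask, byPos))
```

Either way the result is an ordinary list with one numeric vector of
length(pos) values per element of the RleList.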

Martin
> 
> e.g. my view locations are 17000 of these:
> 
>> samplingRL[[1]] # samplingRL is a RangesList
> IRanges of length 101
>       start   end width
> [1]     501   501     1
> [2]    1001  1001     1
> [3]    1501  1501     1
> [4]    2001  2001     1
> [5]    2501  2501     1
> [6]    3001  3001     1
> [7]    3501  3501     1
> [8]    4001  4001     1
> [9]    4501  4501     1
> ...     ...   ...   ...
> [93]  46501 46501     1
> [94]  47001 47001     1
> [95]  47501 47501     1
> [96]  48001 48001     1
> [97]  48501 48501     1
> [98]  49001 49001     1
> [99]  49501 49501     1
> [100] 50001 50001     1
> [101] 50501 50501     1
> 
> and my RleList has data like :
> 
>> rleList[[1]]
> 'numeric' Rle of length 51001 with 38620 runs
>   Lengths:               501                 1 ...              1089
>   Values : 0.671728853793319 0.677726432845045 ... 0.224909214439609
> 
> I do the following to get the sampling position values in one step, but it uses up over 20 GB RAM in a matter of seconds, and I have to kill the process.
> 
> result <- viewApply(Views(rleList, samplingRL), function(samples) as.numeric(samples), simplify = TRUE)
> 
> Is there a better way ?
> 
> --------------------------------------
> Dario Strbenac
> Research Assistant
> Cancer Epigenetics
> Garvan Institute of Medical Research
> Darlinghurst NSW 2010
> Australia
> 
> _______________________________________________
> Bioc-sig-sequencing mailing list
> Bioc-sig-sequencing at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing


-- 
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109

Location: M1-B861
Telephone: 206 667-2793


