[BioC] Summing Views on coverage by base

Ivan Gregoretti ivangreg at gmail.com
Tue Mar 20 21:56:57 CET 2012


The C implementation would be highly appreciated.

I currently do this operation with tools that are not
R/Bioconductor-based for performance reasons.

Thank you,

Ivan

Ivan Gregoretti, PhD



2012/3/20 Hervé Pagès <hpages at fhcrc.org>:
> Hi Sean,
>
>
> On 03/20/2012 01:14 PM, Sean Davis wrote:
>>
>> I have a set of Views of equal width (think upstream of tss) and want
>> to sum each base across those views.  I can extract each view as an
>> integer vector and create a matrix, but this matrix can get pretty
>> large.  I'm missing the skills with SimpleRleViewsList, though, to
>> work directly on at object.  Any suggestions?
>
>
>> subject <- Rle(rep(c(0L, 1L, 3L, 2L, 18L, 0L), c(3,2,1,5,2,4)))
>> myViews <- Views(subject, start=4:11, width=5)
>> myViews
> Views on a 17-length Rle subject
>
> views:
>    start end width
> [1]     4   8     5 [1 1 3 2 2]
> [2]     5   9     5 [1 3 2 2 2]
> [3]     6  10     5 [3 2 2 2 2]
> [4]     7  11     5 [2 2 2 2 2]
> [5]     8  12     5 [ 2  2  2  2 18]
> [6]     9  13     5 [ 2  2  2 18 18]
> [7]    10  14     5 [ 2  2 18 18  0]
> [8]    11  15     5 [ 2 18 18  0  0]
>
> This maybe would be fast enough if you don't have too many columns:
>
> viewColSums <- function(x)
> {
>    sapply(seq_len(width(x)[1L]),
>           function(i)
>               sum(subject[start(x)+i-1L]))
> }
>
>> viewColSums(myViews)
> [1] 15 32 49 46 44
>
> Then if your SimpleRleViewsList object is not too long (1 elt per
> chromosome?), you can sapply( , viewColSums) on it.
>
> Maybe we should make viewColSums the "colSums" method for RleViews
> objects? (and eventually implement it in C?)
>
> Cheers,
> H.
>
>
>>
>> Thanks,
>> Sean
>>
>>> sessionInfo()
>>
>> R Under development (unstable) (2012-01-19 r58141)
>> Platform: i386-apple-darwin9.8.0/i386 (32-bit)
>>
>> locale:
>> [1] C
>>
>> attached base packages:
>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>
>> other attached packages:
>> [1] GenomicRanges_1.7.30 IRanges_1.13.28      BiocGenerics_0.1.12
>>
>> loaded via a namespace (and not attached):
>> [1] stats4_2.15.0 tools_2.15.0
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives:
>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>
>
>
> --
> Hervé Pagès
>
> Program in Computational Biology
> Division of Public Health Sciences
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N, M1-B514
> P.O. Box 19024
> Seattle, WA 98109-1024
>
> E-mail: hpages at fhcrc.org
> Phone:  (206) 667-5791
> Fax:    (206) 667-1319
>
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives:
> http://news.gmane.org/gmane.science.biology.informatics.conductor



More information about the Bioconductor mailing list