[BioC] Summing Views on coverage by base
Ivan Gregoretti
ivangreg at gmail.com
Tue Mar 20 21:56:57 CET 2012
A C implementation would be highly appreciated.
For performance reasons, I currently do this operation with tools
that are not R/Bioconductor-based.
Thank you,
Ivan
Ivan Gregoretti, PhD
2012/3/20 Hervé Pagès <hpages at fhcrc.org>:
> Hi Sean,
>
>
> On 03/20/2012 01:14 PM, Sean Davis wrote:
>>
>> I have a set of Views of equal width (think upstream of tss) and want
>> to sum each base across those views. I can extract each view as an
>> integer vector and create a matrix, but this matrix can get pretty
>> large. I'm missing the skills with SimpleRleViewsList, though, to
>> work directly on that object. Any suggestions?
>
>
>> subject <- Rle(rep(c(0L, 1L, 3L, 2L, 18L, 0L), c(3,2,1,5,2,4)))
>> myViews <- Views(subject, start=4:11, width=5)
>> myViews
> Views on a 17-length Rle subject
>
> views:
>     start end width
> [1]     4   8     5 [1 1 3 2 2]
> [2]     5   9     5 [1 3 2 2 2]
> [3]     6  10     5 [3 2 2 2 2]
> [4]     7  11     5 [2 2 2 2 2]
> [5]     8  12     5 [ 2  2  2  2 18]
> [6]     9  13     5 [ 2  2  2 18 18]
> [7]    10  14     5 [ 2  2 18 18  0]
> [8]    11  15     5 [ 2 18 18  0  0]
>
> This might be fast enough if you don't have too many columns:
>
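> viewColSums <- function(x)
> {
>     ## Take the subject from the views themselves, not from a
>     ## global 'subject' variable.
>     subj <- subject(x)
>     sapply(seq_len(width(x)[1L]),
>            function(i)
>                sum(subj[start(x) + i - 1L]))
> }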
>
>> viewColSums(myViews)
> [1] 15 32 49 46 44
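For reference, the column sums above can be cross-checked without IRanges by building the small explicit matrix that the original question wants to avoid for large inputs. This is only a sketch on the toy data: cov_vec, view_starts and view_width are illustrative names, and a plain integer vector stands in for the Rle.

```r
# Toy coverage from the thread, as a plain integer vector instead of an Rle.
cov_vec <- rep(c(0L, 1L, 3L, 2L, 18L, 0L), c(3, 2, 1, 5, 2, 4))
view_starts <- 4:11
view_width <- 5L

# One row per view, one column per base offset within the view.
view_matrix <- t(sapply(view_starts,
                        function(s) cov_vec[s:(s + view_width - 1L)]))

# Summing each column gives the per-base totals across all views.
col_sums <- colSums(view_matrix)
print(col_sums)
```

The matrix has one row per view, so its memory use grows with the number of views, which is why the thread looks for a way to skip it.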
>
> Then if your SimpleRleViewsList object is not too long (1 elt per
> chromosome?), you can sapply( , viewColSums) on it.
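A minimal sketch of that per-chromosome step, assuming the IRanges package is installed. The two-chromosome coverage and windows are made-up toy data, and this viewColSums takes the subject from the views themselves via subject(x) so that it works on each list element.

```r
suppressPackageStartupMessages(library(IRanges))

# Column sums for one RleViews object; assumes all views have equal width.
viewColSums <- function(x) {
    subj <- subject(x)
    sapply(seq_len(width(x)[1L]),
           function(i) sum(subj[start(x) + i - 1L]))
}

# Toy data (assumption): coverage and equal-width windows on two chromosomes.
cov <- RleList(
    chr1 = Rle(rep(c(0L, 1L, 3L, 2L, 18L, 0L), c(3, 2, 1, 5, 2, 4))),
    chr2 = Rle(rep(c(1L, 0L), c(10, 7)))
)
win <- IRangesList(
    chr1 = IRanges(start = 4:11, width = 5),
    chr2 = IRanges(start = 1:3, width = 5)
)
vl <- RleViewsList(rleList = cov, rangesList = win)

# One column of sums per chromosome, then a grand total per base offset.
per_chrom <- sapply(vl, viewColSums)
total <- rowSums(per_chrom)
print(per_chrom)
print(total)
```

Because every list element uses the same view width here, sapply() simplifies the result to a matrix with one column per chromosome. With unequal widths it would return a list instead.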
>
> Maybe we should make viewColSums the "colSums" method for RleViews
> objects? (and eventually implement it in C?)
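To make the idea concrete, here is one way such a single-pass version might be organized, written in plain R on ordinary vectors. It is only an illustration of the accumulation pattern (one running total per column, one pass over the views) and not the IRanges implementation. view_col_sums_acc and its arguments are hypothetical names.

```r
# Hypothetical helper: accumulate per-offset totals one view at a time,
# so only a vector of length 'view_width' is kept in memory.
view_col_sums_acc <- function(cov_vec, view_starts, view_width) {
    offsets <- seq_len(view_width) - 1L
    acc <- numeric(view_width)
    for (s in view_starts) {
        acc <- acc + cov_vec[s + offsets]
    }
    acc
}

# Toy coverage from the thread, expanded to a plain integer vector.
cov_vec <- rep(c(0L, 1L, 3L, 2L, 18L, 0L), c(3, 2, 1, 5, 2, 4))
acc_sums <- view_col_sums_acc(cov_vec, view_starts = 4:11, view_width = 5L)
print(acc_sums)
```

The trade-off is one loop iteration per view instead of one per column, so in interpreted R this only pays off when the views are wide and few. A compiled loop would not have that limitation.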
>
> Cheers,
> H.
>
>
>>
>> Thanks,
>> Sean
>>
>>> sessionInfo()
>>
>> R Under development (unstable) (2012-01-19 r58141)
>> Platform: i386-apple-darwin9.8.0/i386 (32-bit)
>>
>> locale:
>> [1] C
>>
>> attached base packages:
>> [1] stats graphics grDevices utils datasets methods base
>>
>> other attached packages:
>> [1] GenomicRanges_1.7.30 IRanges_1.13.28 BiocGenerics_0.1.12
>>
>> loaded via a namespace (and not attached):
>> [1] stats4_2.15.0 tools_2.15.0
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives:
>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>
>
>
> --
> Hervé Pagès
>
> Program in Computational Biology
> Division of Public Health Sciences
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N, M1-B514
> P.O. Box 19024
> Seattle, WA 98109-1024
>
> E-mail: hpages at fhcrc.org
> Phone: (206) 667-5791
> Fax: (206) 667-1319