[BioC] fastest way to keep score when reduce Granges data
Hervé Pagès
hpages at fhcrc.org
Wed Feb 26 02:21:32 CET 2014
Hi Jianhong,
It would help enormously if you could send code that we can actually
run. Thanks!
H.
On 02/24/2014 07:53 AM, Ou, Jianhong wrote:
> Hi ALL,
>
> I want to reduce a GRanges object by a fixed window size and keep the scores after reducing. My code is:
>
> .dat <- GRanges("chr1", IRanges(start=1:50, width=2), strand="+", score=sample(1:50, 50))
> windowSize <- 10
> GRwin <- GRanges("chr1", IRanges(start=(0:5)*windowSize+scale[1]-1,
>                                  width=windowSize), strand="+")
> ol <- findOverlaps(.dat, GRwin)
> ol <- as.data.frame(ol)
> ol <- ol[!duplicated(ol[,1]),]
> .dat <- split(.dat, ol[,2])
> reduceValue <- function(.datReduce){
>   .datReduceM <- reduce(.datReduce, with.mapping=TRUE)
>   wid <- width(.datReduce)
>   .datReduceScore <- .datReduce$score
>   .datReduceM$score <- sapply(.datReduceM$mapping, function(.idx){
>     round(sum(.datReduceScore[.idx]*wid[.idx])/sum(wid[.idx]))
>   })
>   .datReduceM$mapping <- NULL
>   .datReduceM
> }
> .dat <- lapply(.dat, reduceValue)
> .dat <- unlist(GRangesList(.dat))
>
> But this is very slow. What is the fastest way to keep scores when reducing a GRanges object by a fixed window size? Thanks for your help.
>
> Yours sincerely,
>
> Jianhong Ou
>
> LRB 670A
> Program in Gene Function and Expression
> 364 Plantation Street Worcester,
> MA 01605
>
--
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpages at fhcrc.org
Phone: (206) 667-5791
Fax: (206) 667-1319