[BioC] Lapply for two GRanges objects?
Martin Morgan
mtmorgan at fhcrc.org
Tue Mar 29 02:08:41 CEST 2011
On 03/28/2011 02:41 PM, Hollis Wright wrote:
> Hello; I apologize if this is an obvious question or more suited for
> the general R list, but I have not been able to find a good solution
> with Google: is it possible to use lapply or relatives to speed
> up overlapping of multiple GRanges objects? Specifically, I'm getting
> methylation values from bisulfite sequencing CpGs over specific
> windows in the genome and I need to calculate means across the
> windows, so for right now what I have been doing is basically:
> for (i in seq_along(windows))
>   methylmean[i] <- mean(elementMetadata(subsetByOverlaps(methyl, windows[i]))$methyl)
>
> but this is fairly slow. I tried something like:
>
>
> m <- lapply(windows, methylmeans(methyl, windows))
>
> and defining:
>
>
> methylmeans<- function(methyl, windows)
> {
> return(mean(subsetByOverlaps(methyl, windows)@elementMetadata$methyl));
> }
>
Hi Hollis -- maybe along the lines of (over all ranges in methyl and
windows)
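  olaps <- findOverlaps(methyl, windows)
  v <- elementMetadata(methyl)[queryHits(olaps), "methyl"]
  i <- subjectHits(olaps)
  ## windows with no overlapping CpG keep the initial value of 0
  means <- numeric(length(windows))
  ## split() returns groups in sorted order of i, so index in that order too
  means[sort(unique(i))] <- sapply(split(v, i), mean)
  elementMetadata(windows)[["MethylMeans"]] <- means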
?
Martin
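
The grouping step in the suggestion above can be tried without Bioconductor. The following is only a sketch on made-up data: `v` and `i` are hypothetical stand-ins for the methylation values and the window indices that findOverlaps would produce, and `n_windows` stands in for length(windows). It uses NA rather than 0 for windows with no overlapping CpG, which is an assumption about what is wanted and not part of the original suggestion.

```r
# Hypothetical toy data standing in for the overlap results
v <- c(0.2, 0.8, 0.5, 0.9)   # methylation value of each overlapping CpG
i <- c(2L, 1L, 2L, 4L)       # index of the window each CpG falls in
n_windows <- 4L              # stand-in for length(windows)

# Start with NA so that windows without any CpG are easy to spot
means <- rep(NA_real_, n_windows)

# One mean per window that has at least one CpG; the names of the
# result are the window indices, so use them to place the values
m <- vapply(split(v, i), mean, numeric(1))
means[as.integer(names(m))] <- m

means
```

Placing the values by name avoids relying on the order in which the window indices happen to appear in `i`.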
> but this doesn't work. Mapply doesn't work since the window and methyl
> sizes aren't the same. Any thoughts? Is there anything inbuilt into
> GRanges for this kind of case?
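
One likely reason the lapply attempt fails is that `methylmeans(methyl, windows)` is evaluated immediately, so lapply is handed a number where it expects a function. A minimal base R sketch of the usual pattern, on hypothetical toy data (`pos`, `val`, `win` and `window_mean` are made up for illustration and are not GRanges objects):

```r
# Hypothetical toy data: CpG positions with values, and three windows
# given as c(start, end) pairs
pos <- c(1, 5, 12, 18, 25)
val <- c(0.1, 0.3, 0.5, 0.7, 0.9)
win <- list(c(1, 10), c(11, 20), c(21, 30))

# Mean of the values whose position falls inside one window
window_mean <- function(w, pos, val) {
  mean(val[pos >= w[1] & pos <= w[2]])
}

# Pass the function itself; lapply supplies each window as the first
# argument and forwards the remaining arguments unchanged
m <- lapply(win, window_mean, pos = pos, val = val)
```

This still does one lookup per window, so for many windows a single findOverlaps call followed by a grouped mean is probably the faster route.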
>
> Hollis Wright, PhD
> Knight Cancer Center
> Oregon Health and Science University
--
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109
Location: M1-B861
Telephone: 206 667-2793