[BioC] bug: subsetByOverlaps(GRangesList, GRanges)
Patrick Aboyoun
paboyoun at fhcrc.org
Fri Jul 16 18:26:39 CEST 2010
Cei,
The Details section of the relevant man page for subsetByOverlaps
explains this behavior:
help("subsetByOverlaps,GRangesList,GRanges-method")
In short, the subsetByOverlaps function operates at the object's "top"
level, not the within-element level, just as the "[" operator does for a
standard R list. So in the GRangesList, GRanges case, each list element
is either kept exactly as it appears in the original object or dropped
entirely.
I am glad to see that you have mentioned using the endoapply function,
because that is exactly what I would have recommended.
endoapply(grl, subsetByOverlaps, gr3)
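To make the contrast concrete, here is a minimal sketch. It varies the example from the original report so that gr2 lies only on "b" and therefore misses gr3 (that change is an assumption made for illustration). It also uses elementNROWS(), which is the accessor name in current S4Vectors releases and may differ in older versions.

```r
library(GenomicRanges)

gr1 <- GRanges(seqnames = c("a", "b"), ranges = IRanges(c(1, 11), c(5, 15)))
# Variation on the reported example (assumption): gr2 sits on "b" only,
# so none of its ranges overlap gr3.
gr2 <- GRanges(seqnames = "b", ranges = IRanges(11, 15))
gr3 <- GRanges(seqnames = "a", ranges = IRanges(1, 5))
grl <- GRangesList(gr1, gr2)

# Top level: a list element is kept whole if any of its ranges overlaps gr3,
# and dropped otherwise.
top_level <- subsetByOverlaps(grl, gr3)
length(top_level)

# Within element: every list element is kept, but each one is reduced to
# the ranges that overlap gr3 (possibly none).
within_elt <- endoapply(grl, subsetByOverlaps, gr3)
elementNROWS(within_elt)
```

With this input the top-level call should return only the first element, with both of its ranges intact, while the endoapply call should return both elements, each trimmed to its overlapping ranges.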
If your use case involves a GRangesList with over a hundred elements,
however, this may not be fast enough, and I can provide you with
lower-level code that will be much faster. If this is a common use case,
we could add a new function that works for IRangesList,IRanges;
GRangesList,GRanges; etc. pairings and avoids the endoapply framework.
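As a rough illustration of what avoiding endoapply could look like (a sketch under assumptions, not the lower-level code referred to above): flatten the list, test all ranges for overlap in one vectorized call, then regroup. The helper names flat, group, keep and fast are hypothetical, and overlapsAny() and elementNROWS() are the names used in current GenomicRanges/S4Vectors releases.

```r
library(GenomicRanges)

gr1 <- GRanges(seqnames = c("a", "b"), ranges = IRanges(c(1, 11), c(5, 15)))
# Illustrative input (assumption): gr2 has no range overlapping gr3.
gr2 <- GRanges(seqnames = "b", ranges = IRanges(11, 15))
gr3 <- GRanges(seqnames = "a", ranges = IRanges(1, 5))
grl <- GRangesList(gr1, gr2)

# Flatten the list into one GRanges and record which list element
# each range came from.
flat  <- unlist(grl, use.names = FALSE)
group <- factor(rep(seq_along(grl), elementNROWS(grl)),
                levels = seq_along(grl))

# One vectorized overlap test over all ranges at once.
keep <- overlapsAny(flat, gr3)

# Regroup the surviving ranges; factor levels with no surviving ranges
# become empty list elements, so the result has one element per input element.
fast <- split(flat[keep], group[keep])
elementNROWS(fast)
```

The result is named by element index ("1", "2", ...) rather than carrying over the original names, so names(fast) <- names(grl) would be needed if grl were named.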
Patrick
On 7/16/10 4:01 AM, Cei Abreu-Goodger wrote:
> Hello,
>
> I think I've found another bug. If you use subsetByOverlaps with a
> GRangesList as query, the full object is returned, instead of the
> subset that overlaps:
>
> library(GenomicRanges)
> gr1 <- GRanges(seqnames=c("a","b"),ranges=IRanges(c(1,11), c(5,15)))
> gr2 <- GRanges(seqnames=c("a","b"),ranges=IRanges(c(1,11), c(5,15)))
> gr3 <- GRanges(seqnames=c("a"),ranges=IRanges(1,5))
> grl <- GRangesList(gr1,gr2)
>
> identical(grl,subsetByOverlaps(grl, gr3))
> [1] TRUE
>
> To get the behavior that I was expecting, you can do:
> endoapply(grl, subsetByOverlaps, gr3)
>
> Cheers,
>
> Cei
>
> > sessionInfo()
> R version 2.11.0 (2010-04-22)
> i386-apple-darwin9.8.0
>
> locale:
> [1] en_GB.UTF-8/en_GB.UTF-8/C/C/en_GB.UTF-8/en_GB.UTF-8
>
> attached base packages:
> [1] stats graphics grDevices datasets utils methods base
>
> other attached packages:
> [1] GenomicRanges_1.0.6 IRanges_1.6.8 Biobase_2.8.0
>
> loaded via a namespace (and not attached):
> [1] tools_2.11.0
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives:
> http://news.gmane.org/gmane.science.biology.informatics.conductor