[BioC] GenomicRanges Use Cases - subsetByOverlaps

James Perkins j.perkins at ucl.ac.uk
Tue Nov 8 10:27:01 CET 2011


Hi,

I am having some problems following the example in the vignette for
GenomicRanges, specifically:

3.4 Identifying reads that do NOT overlap known annotation
...
> filtData <- subsetByOverlaps(aligns, exonRanges)
> length(filtData)
[1] 17311
At this point, the filtData object only contains ranges that did not
overlap with any of the known exons from Saccharomyces cerevisiae.

My understanding of subsetByOverlaps is that it would bring back
exactly the ranges that DO overlap with the known exons?

'subsetByOverlaps(query, subject, maxgap = 0L, minoverlap = 1L, type =
          c("any", "start", "end", "within", "equal"))': Returns the
          subset of 'query' that has an overlap hit with a range in
          'subject' using the specified 'findOverlaps' parameters.
          Both 'query' and 'subject' should be 'Ranges', 'RangesList'
          or 'RangedData' objects.

I don't see how this gets the reads mapping to non-exon ranges. Surely
it gets the reads mapping to the exon ranges, since exonRanges is
obtained using:

exonRanges <- exonsBy(txdb, "tx")

Shouldn't I be looking for the subset that *doesn't* overlap?
Something like subsetByOverlaps(! aligns, exonRanges)? Or have I
missed something obvious (quite likely!)?
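
To make the question concrete, here is a minimal sketch on toy data of
what I would expect the two subsets to look like. The 'reads' and
'exons' objects are made up for illustration (they are not the
vignette's yeast data, and 'exons' is a plain GRanges rather than the
GRangesList that exonsBy() returns). Using countOverlaps() to drop
every read with at least one hit is only my guess at one way to get the
non-overlapping reads, not necessarily what the vignette intended:

```r
library(GenomicRanges)

# Hypothetical toy reads: 1-10, 50-59 and 200-209 on chrI
reads <- GRanges("chrI", IRanges(start = c(1, 50, 200), width = 10))

# Hypothetical toy exons: 5-20 and 190-205 on chrI
exons <- GRanges("chrI", IRanges(start = c(5, 190), end = c(20, 205)))

# subsetByOverlaps() keeps the reads that DO overlap an exon
inExons <- subsetByOverlaps(reads, exons)

# Reads with zero overlap hits are the ones that do NOT overlap any exon
notInExons <- reads[countOverlaps(reads, exons) == 0]

length(inExons)     # reads overlapping at least one exon
length(notInExons)  # reads overlapping no exon
```

On this toy input the first and third reads overlap an exon and only
the read at 50-59 does not, which is the behaviour I would expect from
the help page quoted above.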

Many thanks,

Jim

--
James Perkins, PhD student
Institute of Structural and Molecular Biology
Division of Biosciences
University College London
Gower Street
London, WC1E 6BT
UK

email: j.perkins at ucl.ac.uk
phone: 0207 679 2198


