[BioC] Question about working with GRanges objects

Steve Lianoglou lianoglou.steve at gene.com
Fri Mar 21 22:49:03 CET 2014


Hi,

On Fri, Mar 21, 2014 at 1:50 PM, Jeremy Ng <jeremy.ng.wk1990 at gmail.com> wrote:
> Hi there,
>
> I'm relatively new to the GRanges class of objects and have some
> questions; hopefully I'll come away with a better understanding of
> what's going on here.
>
> I have 2 GRanges objects, which are data from GEO. I want to find where they
> overlap, and then, after that, the signals from each set. Here's what I have
> so far:
>
>>gsm97tf.ranges
> GRanges with 171378 ranges and 1 metadata column:
>            seqnames                 ranges strand   |    signal
>               <Rle>              <IRanges>  <Rle>   | <numeric>
>        [1]    chr10   [54828986, 54829035]      +   |      0.79
>        [2]    chr10   [54829024, 54829073]      +   |      0.05
>        [3]    chr10   [54829176, 54829225]      +   |      0.04
>        [4]    chr10   [54829746, 54829795]      +   |      0.15
>        [5]    chr10   [54829898, 54829947]      +   |      0.24
>        ...      ...                    ...    ... ...       ...
>>gsm94tf.ranges
> GRanges with 171249 ranges and 1 metadata column:
>            seqnames                 ranges strand   |    signal
>               <Rle>              <IRanges>  <Rle>   | <numeric>
>        [1]    chr10   [54828834, 54828883]      +   |      0.65
>        [2]    chr10   [54828986, 54829035]      +   |      0.73
>        [3]    chr10   [54829024, 54829073]      +   |      0.33
>        [4]    chr10   [54829138, 54829187]      +   |      0.02
>        [5]    chr10   [54829176, 54829225]      +   |      0.02
>
> In order to find the regions of the genome where both sets overlap, I use
> the following:
>
> overlaps<-intersect(gsm94tf.ranges,gsm97tf.ranges)
>
> This will give me a GRanges object containing the coordinates where both
> sets intersect. The result looks like this:
>>overlaps
> GRanges with 72012 ranges and 0 metadata columns:
>           seqnames                 ranges strand
>              <Rle>              <IRanges>  <Rle>
>       [1]     chr1 [148374757, 148374806]      +
>       [2]     chr1 [148374833, 148374996]      +
>       [3]     chr1 [148375061, 148375148]      +
>       [4]     chr1 [148375821, 148375870]      +
>       [5]     chr1 [148376087, 148376212]      +
>
> My next question seems pretty trivial, but I'm stuck on what to do next.
> I want to map the overlaps back to the original sets, to find their signal
> values. I was wondering, how do I do this?
>
> Sorry if this question is pretty simple - I'm trying to get a better handle
> on the GRanges classes!

Look at the `?subsetByOverlaps` help file. All of the functions
documented there (as well as the pages that are linked to in the "See
Also" section) are worth internalizing.
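
As a rough sketch of how those functions fit together (the two toy
objects below are rebuilt from the first few rows you posted, not your
real data, and `paired` is just an illustrative name), something along
these lines should get you the signal from both sets for every
overlapping pair of ranges:

```r
library(GenomicRanges)

# Toy stand-ins for the two GEO tracks: first three rows of each printout,
# every range 50 bp wide on the "+" strand of chr10.
gsm97tf.ranges <- GRanges("chr10",
    IRanges(start = c(54828986, 54829024, 54829176), width = 50),
    strand = "+", signal = c(0.79, 0.05, 0.04))
gsm94tf.ranges <- GRanges("chr10",
    IRanges(start = c(54828834, 54828986, 54829024), width = 50),
    strand = "+", signal = c(0.65, 0.73, 0.33))

# Ranges from one set that overlap anything in the other.
# Unlike intersect(), this keeps the metadata columns (here: signal).
in97 <- subsetByOverlaps(gsm97tf.ranges, gsm94tf.ranges)

# Pairwise matches: one row per (query, subject) pair that overlaps.
hits <- findOverlaps(gsm97tf.ranges, gsm94tf.ranges)

# Pull the signal from each side of every overlapping pair.
paired <- data.frame(
    signal97 = gsm97tf.ranges$signal[queryHits(hits)],
    signal94 = gsm94tf.ranges$signal[subjectHits(hits)])
```

Note that a range can show up in more than one pair if it overlaps
several ranges in the other set, so how you summarize `paired` depends
on what you want to do with those cases.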

HTH,
-steve

-- 
Steve Lianoglou
Computational Biologist
Genentech


