[BioC] subset GRanges object via ElementMetadata
Hervé Pagès
hpages at fhcrc.org
Sat Feb 23 02:33:00 CET 2013
Hi Michael,
On 02/22/2013 12:56 PM, Michael Lawrence wrote:
> Btw, I hacked together a subset() method for GenomicRanges yesterday. It
> respects the metadata columns. Someone could probably come up with some
> reason why that violates the conceptual foundations of something, but I
> find it useful.
>
> So you could do:
> subset(gr, over == 2)
Sounds good to me. Hopefully you set the method on Vector objects,
rather than just GenomicRanges objects.
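
For reference, here is a minimal sketch comparing the two forms on the example object from the original post. The object is rebuilt with the GRanges() constructor from the values in the quoted dput() output. Two assumptions: the installed GenomicRanges/S4Vectors must be recent enough to provide the subset() method discussed here, and the names by_bracket, by_mcols and by_subset are only illustrative.

```r
library(GenomicRanges)

# Rebuild the example object from the values in the original post.
test.gr <- GRanges(
  seqnames = rep("chr1", 6),
  ranges = IRanges(
    start = c(713844L, 762136L, 780124L, 780533L, 781104L, 793830L),
    width = c(644L, 1064L, 166L, 145L, 284L, 567L)
  ),
  strand = rep("*", 6),
  edensity = c(1000L, 1000L, 519L, 516L, 601L, 610L),
  epeak = c(256L, 771L, 74L, 68L, 140L, 290L),
  over = c(1L, 2L, 0L, 0L, 0L, 0L)
)

# Bracket subsetting with a logical vector built from a metadata column.
by_bracket <- test.gr[test.gr$over == 2]

# The same selection, going through mcols() explicitly.
by_mcols <- test.gr[mcols(test.gr)$over == 2]

# subset() evaluates the condition in the context of the object,
# so metadata columns can be referred to by name.
by_subset <- subset(test.gr, over == 2)
```

All three calls should select the same single range (the one starting at 762136). The bracket form only needs a logical vector, so it works wherever `$` or mcols() is available; subset() is a convenience for interactive use.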
Thanks,
H.
>
> Will commit shortly.
>
> Michael
>
> On Fri, Feb 22, 2013 at 10:10 AM, Tim Triche, Jr. <tim.triche at gmail.com> wrote:
>
>> the shorthand method would be
>>
>> GR[ GR$over == 2 ]
>>
>> and in your example,
>>
>> R> test.gr
>> GRanges with 6 ranges and 3 metadata columns:
>>       seqnames           ranges strand |  edensity     epeak      over
>>          <Rle>        <IRanges>  <Rle> | <integer> <integer> <integer>
>>   [1]     chr1 [713844, 714487]      * |      1000       256         1
>>   [2]     chr1 [762136, 763199]      * |      1000       771         2
>>   [3]     chr1 [780124, 780289]      * |       519        74         0
>>   [4]     chr1 [780533, 780677]      * |       516        68         0
>>   [5]     chr1 [781104, 781387]      * |       601       140         0
>>   [6]     chr1 [793830, 794396]      * |       610       290         0
>>   ---
>>   seqlengths:
>>     chr1 chr10 chr11 chr12 chr13 chr14 ...  chr6  chr7  chr8  chr9  chrX  chrY
>>       NA    NA    NA    NA    NA    NA ...    NA    NA    NA    NA    NA    NA
>> R> test.gr[ test.gr$over == 2 ]
>> GRanges with 1 range and 3 metadata columns:
>>       seqnames           ranges strand |  edensity     epeak      over
>>          <Rle>        <IRanges>  <Rle> | <integer> <integer> <integer>
>>   [1]     chr1 [762136, 763199]      * |      1000       771         2
>>   ---
>>   seqlengths:
>>     chr1 chr10 chr11 chr12 chr13 chr14 ...  chr6  chr7  chr8  chr9  chrX  chrY
>>       NA    NA    NA    NA    NA    NA ...    NA    NA    NA    NA    NA    NA
>>
>> On Fri, Feb 22, 2013 at 7:33 AM, Hermann Norpois <hnorpois at gmail.com> wrote:
>>
>>> Hello,
>>>
>>> I am looking for a way to subset a GRanges object by the values of an
>>> ElementMetadata column, for instance over==2.
>>>
>>> How does it work?
>>>
>>> Thanks
>>> Hermann
>>>
>>>
>>> > test.gr
>>> GRanges with 6 ranges and 3 metadata columns:
>>>       seqnames           ranges strand |  edensity     epeak      over
>>>          <Rle>        <IRanges>  <Rle> | <integer> <integer> <integer>
>>>   [1]     chr1 [713844, 714487]      * |      1000       256         1
>>>   [2]     chr1 [762136, 763199]      * |      1000       771         2
>>>   [3]     chr1 [780124, 780289]      * |       519        74         0
>>>   [4]     chr1 [780533, 780677]      * |       516        68         0
>>>   [5]     chr1 [781104, 781387]      * |       601       140         0
>>>   [6]     chr1 [793830, 794396]      * |       610       290         0
>>>   ---
>>>   seqlengths:
>>>     chr1 chr10 chr11 chr12 chr13 chr14 ...  chr6  chr7  chr8  chr9  chrX  chrY
>>>       NA    NA    NA    NA    NA    NA ...    NA    NA    NA    NA    NA    NA
>>> > dput(test.gr)
>>> new("GRanges"
>>> , seqnames = new("Rle"
>>> , values = structure(1L, .Label = c("chr1", "chr10", "chr11", "chr12",
>>> "chr13", "chr14", "chr15", "chr16", "chr17", "chr18", "chr19", "chr2",
>>> "chr20", "chr21", "chr22", "chr3", "chr4", "chr5", "chr6", "chr7",
>>> "chr8", "chr9", "chrX", "chrY"), class = "factor")
>>> , lengths = 6L
>>> , elementMetadata = NULL
>>> , metadata = list()
>>> )
>>> , ranges = new("IRanges"
>>> , start = c(713844L, 762136L, 780124L, 780533L, 781104L, 793830L)
>>> , width = c(644L, 1064L, 166L, 145L, 284L, 567L)
>>> , NAMES = NULL
>>> , elementType = "integer"
>>> , elementMetadata = NULL
>>> , metadata = list()
>>> )
>>> , strand = new("Rle"
>>> , values = structure(3L, .Label = c("+", "-", "*"), class = "factor")
>>> , lengths = 6L
>>> , elementMetadata = NULL
>>> , metadata = list()
>>> )
>>> , elementMetadata = new("DataFrame"
>>> , rownames = NULL
>>> , nrows = 6L
>>> , listData = structure(list(edensity = c(1000L, 1000L, 519L, 516L,
>>> 601L, 610L
>>> ), epeak = c(256L, 771L, 74L, 68L, 140L, 290L), over = c(1L,
>>> 2L, 0L, 0L, 0L, 0L)), .Names = c("edensity", "epeak", "over"))
>>> , elementType = "ANY"
>>> , elementMetadata = NULL
>>> , metadata = list()
>>> )
>>> , seqinfo = new("Seqinfo"
>>> , seqnames = c("chr1", "chr10", "chr11", "chr12", "chr13", "chr14",
>>> "chr15", "chr16", "chr17", "chr18", "chr19", "chr2", "chr20", "chr21",
>>> "chr22", "chr3", "chr4", "chr5", "chr6", "chr7", "chr8", "chr9",
>>> "chrX", "chrY")
>>> , seqlengths = c(NA_integer_, NA_integer_, NA_integer_, NA_integer_,
>>> NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_,
>>> NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_,
>>> NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_,
>>> NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_)
>>> , is_circular = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
>>> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA)
>>> , genome = c(NA_character_, NA_character_, NA_character_, NA_character_,
>>> NA_character_, NA_character_, NA_character_, NA_character_,
>>> NA_character_, NA_character_, NA_character_, NA_character_,
>>> NA_character_, NA_character_, NA_character_, NA_character_,
>>> NA_character_, NA_character_, NA_character_, NA_character_,
>>> NA_character_, NA_character_, NA_character_, NA_character_
>>> )
>>> )
>>> , metadata = list()
>>> )
>>>
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor at r-project.org
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>> Search the archives:
>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>
>>
>>
>>
>> --
>> *A model is a lie that helps you see the truth.*
>> *
>> *
>> Howard Skipper<
>> http://cancerres.aacrjournals.org/content/31/9/1173.full.pdf>
>>
>>
>
>
--
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpages at fhcrc.org
Phone: (206) 667-5791
Fax: (206) 667-1319