[Bioc-devel] as.character method for GenomicRanges?

Michael Lawrence lawrence.michael at gene.com
Mon Apr 27 23:15:06 CEST 2015


It would be nice to have a single function call that would hide these
details. It could probably also be made more efficient by avoiding multiple
matching, unnecessary revmap lists, etc. tableAsGRanges() is not a good
name but it conveys what I mean (does that make it actually good?).
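
A minimal sketch of what such a wrapper might look like, assuming it simply
wraps the unique()/countMatches() idiom quoted below. tableAsGRanges() is a
hypothetical name taken from this thread, not an existing GenomicRanges
function, and this version makes no attempt at the efficiency gains
mentioned above (it still matches twice):

```r
library(GenomicRanges)

# Hypothetical helper (name from this thread, not part of any package):
# returns unique(x) with a "Freq" metadata column holding the number of
# times each unique range occurs in x.
tableAsGRanges <- function(x) {
    ux <- unique(x)
    mcols(ux)$Freq <- countMatches(ux, x)
    ux
}

# Toy input: chr1:1-10 appears twice, the other two ranges once each.
gr <- GRanges(c("chr1", "chr1", "chr2", "chr1"),
              IRanges(start = c(1, 1, 5, 10), end = c(10, 10, 8, 20)))
res <- tableAsGRanges(gr)
```

Here res has one element per distinct range in gr, in order of first
appearance, with the counts in mcols(res)$Freq.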

On Mon, Apr 27, 2015 at 12:23 PM, Hervé Pagès <hpages at fredhutch.org> wrote:

> On 04/24/2015 11:41 AM, Michael Lawrence wrote:
>
>> Taking this a bit off topic but it would be nice if we could get the
>> GRanges equivalent of as.data.frame(table(x)), i.e., unique(x) with a
>> count mcol. Should be easy to support but what should the API be like?
>>
>
> This was actually the motivating use case for introducing
> findMatches/countMatches a couple of years ago:
>
>   ux <- unique(x)
>   mcols(ux)$Freq <- countMatches(ux, x)
>
> Don't know what a good API would be to make this even more
> straightforward though. Maybe via some extra argument to unique()
> e.g. 'with.freq'? This is kind of similar to the 'with.revmap'
> argument of reduce(). Note that unique() could also support the
> 'with.revmap' arg. Once it does, the 'with.freq' arg can also
> be implemented by just calling elementLengths() on the "revmap"
> metadata column.
>
> H.
>
>
>> On Fri, Apr 24, 2015 at 10:54 AM, Hervé Pagès <hpages at fredhutch.org> wrote:
>>
>>     On 04/24/2015 10:18 AM, Michael Lawrence wrote:
>>
>>         It is a great idea, but I'm not sure I would use it to implement
>>         table(). Allocating those strings will be costly. Don't we
>>         already have the 4-way int hash? Of course, my intuition might
>>         be completely off here.
>>
>>
>>     It does use the 4-way int hash internally. as.character() is only used
>>     at the very-end to stick the names on the returned table object.
>>
>>     H.
>>
>>
>>
>>         On Fri, Apr 24, 2015 at 9:59 AM, Hervé Pagès
>>         <hpages at fredhutch.org> wrote:
>>
>>              Hi Pete,
>>
>>              Excellent idea. That will make things like table() work
>>              out-of-the-box on GenomicRanges objects. I'll add that.
>>
>>              Thanks,
>>              H.
>>
>>
>>
>>              On 04/24/2015 09:43 AM, Peter Haverty wrote:
>>
>>                  Would people be interested in having this:
>>
>>                  setMethod("as.character", "GenomicRanges",
>>                              function(x) {
>>                                  paste0(seqnames(x), ":", start(x), "-", end(x))
>>                              })
>>
>>                  ?
>>
>>                  I find myself doing that a lot to make unique names or
>>                  for output that goes to collaborators.  I suppose we
>>                  might want to tack on the strand if it isn't "*".  I have
>>                  some code for going the other direction too, if there is
>>                  interest.
>>
>>
>>
>>                  Pete
>>
>>                  ____________________
>>                  Peter M. Haverty, Ph.D.
>>                  Genentech, Inc.
>>                  phaverty at gene.com
>>
>>                  _______________________________________________
>>                  Bioc-devel at r-project.org mailing list
>>                  https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>
>>
>>              --
>>              Hervé Pagès
>>
>>              Program in Computational Biology
>>              Division of Public Health Sciences
>>              Fred Hutchinson Cancer Research Center
>>              1100 Fairview Ave. N, M1-B514
>>              P.O. Box 19024
>>              Seattle, WA 98109-1024
>>
>>              E-mail: hpages at fredhutch.org
>>              Phone: (206) 667-5791
>>              Fax: (206) 667-1319
>>
>>
>>
