[Bioc-devel] as.character method for GenomicRanges?

Hervé Pagès hpages at fredhutch.org
Fri Apr 24 20:26:09 CEST 2015


On 04/24/2015 11:08 AM, Peter Haverty wrote:
> Good catch. We'll want the strand in case we need to go back to a GRanges.
> I would make the strand addition optional with the default of FALSE.
> It's nice to have a column of strings you can paste right into a genome
> browser (sorry Michael :-) ).  I often pass my bench collaborators a
> spreadsheet with such a column.

as.character(unstrand(gr)) ?

3 reasons I'm not too keen about 'ignore.strand=TRUE' being the default:

(1) Many functions and methods in GenomicRanges/GenomicAlignments
     have an 'ignore.strand' argument. For consistency, the default
     value has been set to FALSE everywhere. Note that this was done
     even if this default doesn't reflect the most common use case
     (e.g. summarizeOverlaps).

(2) I think it's good to have the default behavior of as.character()
     allow going back and forth between GRanges and character vector
     without losing the strand information.

(3) The "table" method for Vector would break if as.character was
     ignoring the strand by default. Can be worked-around by
     implementing a method for GenomicRanges objects but...

Hope that makes sense.

H.


>
> Pete
>
> ____________________
> Peter M. Haverty, Ph.D.
> Genentech, Inc.
> phaverty at gene.com <mailto:phaverty at gene.com>
>
> On Fri, Apr 24, 2015 at 10:50 AM, Hervé Pagès <hpages at fredhutch.org
> <mailto:hpages at fredhutch.org>> wrote:
>
>     On 04/24/2015 10:21 AM, Michael Lawrence wrote:
>
>         Sorry, one more concern, if you're thinking of using as a range
>         key, you
>         will need the strand, but many use cases might not want the
>         strand on
>         there. Like for pasting into a genome browser.
>
>
>     What about appending the strand only for GRanges objects that
>     have at least one range that is not on *?
>
>     setMethod("as.character", "GenomicRanges",
>          function(x)
>          {
>              if (length(x) == 0L)
>                  return(character(0))
>              ans <- paste0(seqnames(x), ":", start(x), "-", end(x))
>              if (any(strand(x) != "*"))
>                    ans <- paste0(ans, ":", strand(x))
>              ans
>          }
>     )
>
>      > as.character(gr)
>       [1] "chr1:1-10"  "chr2:2-10"  "chr2:3-10"  "chr2:4-10"  "chr1:5-10"
>       [6] "chr1:6-10"  "chr3:7-10"  "chr3:8-10"  "chr3:9-10"  "chr3:10-10"
>
>      > strand(gr)[2:3] <- c("-", "+")
>      > as.character(gr)
>       [1] "chr1:1-10:*"  "chr2:2-10:-"  "chr2:3-10:+"  "chr2:4-10:*"
>     "chr1:5-10:*"
>       [6] "chr1:6-10:*"  "chr3:7-10:*"  "chr3:8-10:*"  "chr3:9-10:*"
>     "chr3:10-10:*"
>
>     H.
>
>
>         On Fri, Apr 24, 2015 at 10:18 AM, Michael Lawrence
>         <michafla at gene.com <mailto:michafla at gene.com>
>         <mailto:michafla at gene.com <mailto:michafla at gene.com>>> wrote:
>
>              It is a great idea, but I'm not sure I would use it to
>         implement
>              table(). Allocating those strings will be costly. Don't we
>         already
>              have the 4-way int hash? Of course, my intuition might be
>         completely
>              off here.
>
>
>              On Fri, Apr 24, 2015 at 9:59 AM, Hervé Pagès
>         <hpages at fredhutch.org <mailto:hpages at fredhutch.org>
>              <mailto:hpages at fredhutch.org
>         <mailto:hpages at fredhutch.org>>> wrote:
>
>                  Hi Pete,
>
>                  Excellent idea. That will make things like table() work
>                  out-of-the-box
>                  on GenomicRanges objects. I'll add that.
>
>                  Thanks,
>                  H.
>
>
>
>                  On 04/24/2015 09:43 AM, Peter Haverty wrote:
>
>                      Would people be interested in having this:
>
>                      setMethod("as.character", "GenomicRanges",
>                                  function(x) {
>                                      paste0(seqnames(x), ":", start(x),
>         "-", end(x))
>                                  })
>
>                      ?
>
>                      I find myself doing that a lot to make unique names
>         or for
>                      output that
>                      goes to collaborators.  I suppose we might want to
>         tack on
>                      the strand if it
>                      isn't "*".  I have some code for going the other
>         direction
>                      too, if there is
>                      interest.
>
>
>
>                      Pete
>
>                      ____________________
>                      Peter M. Haverty, Ph.D.
>                      Genentech, Inc.
>         phaverty at gene.com <mailto:phaverty at gene.com>
>         <mailto:phaverty at gene.com <mailto:phaverty at gene.com>>
>
>                               [[alternative HTML version deleted]]
>
>                      _______________________________________________
>         Bioc-devel at r-project.org <mailto:Bioc-devel at r-project.org>
>         <mailto:Bioc-devel at r-project.org <mailto:Bioc-devel at r-project.org>>
>                      mailing list
>         https://stat.ethz.ch/mailman/listinfo/bioc-devel
>
>
>                  --
>                  Hervé Pagès
>
>                  Program in Computational Biology
>                  Division of Public Health Sciences
>                  Fred Hutchinson Cancer Research Center
>                  1100 Fairview Ave. N, M1-B514
>                  P.O. Box 19024
>                  Seattle, WA 98109-1024
>
>                  E-mail: hpages at fredhutch.org
>         <mailto:hpages at fredhutch.org> <mailto:hpages at fredhutch.org
>         <mailto:hpages at fredhutch.org>>
>                  Phone: (206) 667-5791 <tel:%28206%29%20667-5791>
>         <tel:%28206%29%20667-5791>
>                  Fax: (206) 667-1319 <tel:%28206%29%20667-1319>
>         <tel:%28206%29%20667-1319>
>
>
>                  _______________________________________________
>         Bioc-devel at r-project.org <mailto:Bioc-devel at r-project.org>
>         <mailto:Bioc-devel at r-project.org <mailto:Bioc-devel at r-project.org>>
>                  mailing list
>         https://stat.ethz.ch/mailman/listinfo/bioc-devel
>
>
>
>
>     --
>     Hervé Pagès
>
>     Program in Computational Biology
>     Division of Public Health Sciences
>     Fred Hutchinson Cancer Research Center
>     1100 Fairview Ave. N, M1-B514
>     P.O. Box 19024
>     Seattle, WA 98109-1024
>
>     E-mail: hpages at fredhutch.org <mailto:hpages at fredhutch.org>
>     Phone: (206) 667-5791 <tel:%28206%29%20667-5791>
>     Fax: (206) 667-1319 <tel:%28206%29%20667-1319>
>
>

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fredhutch.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319



More information about the Bioc-devel mailing list