[BioC] Confusion over inconsistencies with showMethods('Rle') when loaded via GenomicRanges
    Steve Lianoglou 
    mailinglist.honeypot at gmail.com
       
    Wed Jan 30 12:39:22 CET 2013
    
    
  
Hi,
On Wed, Jan 30, 2013 at 4:57 AM,  <hickey at wehi.edu.au> wrote:
> There appear to be different methods available for 'Rle' when loaded via the GenomicRanges package depending on whether a GRanges object has been created. Specifically, prior to a GRanges object being created there are no 'values = character' methods for 'Rle'. This doesn't make sense to me and is causing me problems in code I am developing.
>
> The following code highlights the cause of my confusion:
>
> | > library(GenomicRanges)
> | Loading required package: BiocGenerics
> |
> | Attaching package: ‘BiocGenerics’
> |
> | The following object(s) are masked from ‘package:stats’:
> |
> |     xtabs
> |
> | The following object(s) are masked from ‘package:base’:
> |
> |    anyDuplicated, cbind, colnames, duplicated, eval, Filter, Find,
> |    get, intersect, lapply, Map, mapply, mget, order, paste, pmax,
> |    pmax.int, pmin, pmin.int, Position, rbind, Reduce, rep.int,
> |    rownames, sapply, setdiff, table, tapply, union, unique
>
> | Loading required package: IRanges
> | > showMethods('Rle')
> | Function: Rle (package IRanges)
> | values="missing", lengths="missing"
> | values="vectorORfactor", lengths="integer"
> | values="vectorORfactor", lengths="missing"
> | values="vectorORfactor", lengths="numeric"
>
> ## Only 4 methods are available for Rle
>
> | > seqinfo <- Seqinfo(paste0("chr", 1:3), c(1000, 2000, 1500), NA, "mock1")
> | > gr <- GRanges(seqnames = Rle(c("chr1", "chr2", "chr1", "chr3"), c(1, 3, 2, 4)),
> | +               ranges = IRanges(1:10, width = 10:1, names = head(letters,10)),
> | +               strand = Rle(strand(c("-", "+", "*", "+", "-")),c(1, 2, 2, 3, 2)),
> | +               score = 1:10, GC = seq(1, 0, length=10),
> | +               seqinfo = seqinfo)
> | > gr
> | GRanges with 10 ranges and 2 metadata columns:
> |     seqnames    ranges strand |     score                GC
> |        <Rle> <IRanges>  <Rle> | <integer>         <numeric>
> |   a     chr1  [ 1, 10]      - |         1                 1
> |   b     chr2  [ 2, 10]      + |         2 0.888888888888889
> |   c     chr2  [ 3, 10]      + |         3 0.777777777777778
> |   d     chr2  [ 4, 10]      * |         4 0.666666666666667
> |   e     chr1  [ 5, 10]      * |         5 0.555555555555556
> |   f     chr1  [ 6, 10]      + |         6 0.444444444444444
> |   g     chr3  [ 7, 10]      + |         7 0.333333333333333
> |   h     chr3  [ 8, 10]      + |         8 0.222222222222222
> |   i     chr3  [ 9, 10]      - |         9 0.111111111111111
> |   j     chr3  [10, 10]      - |        10                 0
> |   ---
> |   seqlengths:
> |   chr1 chr2 chr3
> |   1000 2000 1500
> | > showMethods('Rle')
> | Function: Rle (package IRanges)
> | values="character", lengths="integer"
> |     (inherited from: values="vectorORfactor", lengths="integer")
> | values="character", lengths="numeric"
> |     (inherited from: values="vectorORfactor", lengths="numeric")
> | values="factor", lengths="integer"
> |     (inherited from: values="vectorORfactor", lengths="integer")
> | values="factor", lengths="numeric"
> |     (inherited from: values="vectorORfactor", lengths="numeric")
> | values="missing", lengths="missing"
> | values="vectorORfactor", lengths="integer"
> | values="vectorORfactor", lengths="missing"
> | values="vectorORfactor", lengths="numeric"
>
> ## Now, there are 8 methods available for Rle
>
> Is this a bug or am I missing something? If I'm just missing something, can someone please explain how I can ensure that the methods involving 'values = character' are available to me upon loading of the GenomicRanges package?

It's not a bug. It just means that Rle was called on a character
vector, which doesn't have its own method signature, and showMethods
is telling you that R used the method defined for
values="vectorORfactor", since that is the closest match for the
inputs provided, given the methods already defined and the class
hierarchy.

For instance, let's get on the same initial page:
R> library(GenomicRanges)
R> showMethods("Rle")
Function: Rle (package IRanges)
values="missing", lengths="missing"
values="vectorORfactor", lengths="integer"
values="vectorORfactor", lengths="missing"
values="vectorORfactor", lengths="numeric"

Ok -- now I try to create an Rle from a character vector:
R> set.seed(123)
R> x <- Rle(sample(letters[1:5], 100, replace=TRUE))
R> showMethods("Rle")
Function: Rle (package IRanges)
values="character", lengths="integer"
    (inherited from: values="vectorORfactor", lengths="integer")
values="character", lengths="missing"
    (inherited from: values="vectorORfactor", lengths="missing")
values="missing", lengths="missing"
values="vectorORfactor", lengths="integer"
values="vectorORfactor", lengths="missing"
values="vectorORfactor", lengths="numeric"

It looks like the initial call to Rle(x) dispatches on the
c(values="character", lengths="missing") signature. There is no
method defined directly for that signature, so R uses the closest
match (inherited from c(values="vectorORfactor", lengths="missing")).
That method most likely calls Rle again internally with a signature
like c(values="character", lengths="integer"), which also has no
direct implementation, so R falls back to the second inherited
method, the one defined for vectorORfactor and integer inputs.
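
The same behaviour can be reproduced without IRanges, using only the
methods package. This is a minimal sketch, not the IRanges code: the
class union "charOrNumDemo" and the generic "describeDemo" are
made-up names standing in for "vectorORfactor" and Rle. It shows that
the inherited method is usable before it ever appears in the
showMethods listing.

```r
library(methods)

# Hypothetical class union standing in for IRanges' "vectorORfactor".
setClassUnion("charOrNumDemo", c("character", "numeric"))

# Hypothetical generic with a single method, defined on the union only.
setGeneric("describeDemo", function(x) standardGeneric("describeDemo"))
setMethod("describeDemo", "charOrNumDemo", function(x) {
    paste("union method called on a", class(x)[1])
})

# No method is defined directly for "character" ...
direct <- existsMethod("describeDemo", "character")
# ... but one is reachable through inheritance.
reachable <- hasMethod("describeDemo", "character")

# The call works right away; dispatch then caches the inherited method.
result <- describeDemo("a")

# The listing now also includes the cached entry, marked as inherited.
showMethods("describeDemo")

# selectMethod() reports which definition dispatch uses for a signature.
chosen <- selectMethod("describeDemo", "character")
```

So the extra entries in the listing are a cache of dispatch decisions
already made, not newly available functionality. As far as I know,
showMethods("describeDemo", inherited = FALSE) should restrict the
listing to the directly defined methods.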

The question is -- what makes you think there is no version of Rle
that accepts just a character vector as its first argument when you
load GenomicRanges from the get go? Does the above example not work
for you in a clean R session?

I'm guessing something else is going wrong with your code, but we'll
need some sort of minimal reproducible example to help sort that out.

HTH,
-steve

-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
    
    