[Bioc-sig-seq] GRangesList with duplicate names

Pages, Herve hpages at fhcrc.org
Fri Feb 25 09:08:47 CET 2011

Hi Dario,

A GRangesList object with duplicated names is apparently
considered broken:

> grl <- GRangesList(GRanges(), GRanges())
> names(grl) <- c("a", "a")
> validObject(grl)
Error in `rownames<-`(`*tmp*`, value = c("a", "a")) : 
  duplicate rownames not allowed

If we are OK with this restriction (names must be unique), we should
fix the "names<-" method (and any other code around that lets the user
generate broken objects).
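
In the meantime, a user-side guard could look roughly like the sketch
below. It is base R only and does not touch GenomicRanges; the helper
name check_names is hypothetical, made up for illustration. It either
refuses a character vector with duplicates or repairs it with
make.unique() before the result is handed to "names<-":

```r
# Hypothetical helper (not part of GenomicRanges or IRanges):
# stop on duplicated names, or make them unique when fix = TRUE.
check_names <- function(nms, fix = FALSE) {
  # anyDuplicated() returns 0 when all values are unique
  if (anyDuplicated(nms) == 0L) {
    return(nms)
  }
  if (!fix) {
    dups <- unique(nms[duplicated(nms)])
    stop("duplicated names: ", paste(dups, collapse = ", "))
  }
  # make.unique() appends ".1", ".2", ... to the repeated values
  make.unique(nms)
}

fixed <- check_names(c("Cancer", "Cancer", "Normal"), fix = TRUE)
```

With fix = TRUE the second "Cancer" is renamed, so this only helps
when the duplicated names do not need to be kept as they are.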

But if we are not OK with requiring unique names, we should modify
the validity method for GRangesList objects. I tend to prefer
this solution for 3 reasons:

  1. Consistency with ordinary vectors: the names of a vector
     in R are not required to be unique.

  2. It's not uncommon to see the same name used for 2 different
     genes. One might still want to be able to stick those names
     on a GRangesList object where each top-level element corresponds
     to a gene (e.g. exons grouped by gene).

  3. It's easier to modify the validity method than to go around
     trying to find and fix every piece of code in GenomicRanges
     (and maybe other places) that can potentially produce a
     GRangesList object with duplicated names.
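
To illustrate reason 1, here is a small base R sketch (ordinary
vectors and lists only, no GRangesList involved; the object names are
made up for the example). It shows that duplicated names are accepted
and survive re-ordering, and that subsetting by a duplicated name
returns only the first match:

```r
# An ordinary named vector with a duplicated name is fine.
x <- c(a = 1, a = 2, b = 3)

# Re-ordering keeps the duplicated names.
y <- x[3:1]

# Subsetting by name picks the first element called "a" only.
first_a <- x["a"]

# anyDuplicated() gives the index of the first repeated name (0 if none).
dup_index <- anyDuplicated(names(x))

# The same holds for an ordinary list.
lst <- list(Cancer = 1:2, Cancer = 3:4, Normal = 5:6)
rev_names <- names(lst[3:1])
```

The first-match behaviour of x["a"] is the main thing to keep in mind
if duplicated names become legal: subsetting by name would not
necessarily return every element carrying that name.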

How do our power users feel about this?


----- Original Message -----
From: "Dario Strbenac" <D.Strbenac at garvan.org.au>
To: bioc-sig-sequencing at r-project.org
Sent: Thursday, February 24, 2011 10:00:11 PM
Subject: [Bioc-sig-seq] GRangesList with duplicate names


It is possible to create a GRangesList with duplicated names, but not to re-order it.

> summary(grl)
     Length       Class        Mode 
          3 GRangesList          S4 
> names(grl) <- c("Cancer", "Cancer", "Normal")
> grl[3:1]
Error in `rownames<-`(`*tmp*`, value = c("Normal", "Cancer", "Cancer")) : 
  duplicate rownames not allowed
> sessionInfo()
R version 2.12.0 (2010-10-15)
Platform: x86_64-unknown-linux-gnu (64-bit)

 [1] LC_CTYPE=en_AU.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_AU.UTF-8        LC_COLLATE=en_AU.UTF-8    
 [5] LC_MONETARY=C              LC_MESSAGES=en_AU.UTF-8   
 [7] LC_PAPER=en_AU.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] GenomicRanges_1.2.3 IRanges_1.8.9      

Dario Strbenac
Research Assistant
Cancer Epigenetics
Garvan Institute of Medical Research
Darlinghurst NSW 2010
