[Bioc-sig-seq] GRangesList with duplicate names - WORKAROUND

Hervé Pagès hpages at fhcrc.org
Wed Aug 3 21:46:59 CEST 2011


Hi Malcolm,

The requirement that names of a GRangesList must be unique has actually
been dropped in BioC 2.9 (current devel):

   > library(GenomicRanges)
   > grl <- GRangesList(a=GRanges(), a=GRanges())
   > validObject(grl)
   [1] TRUE

Note that before BioC 2.9 this requirement applied not only to GRangesList
objects but also to GRanges objects. In BioC 2.9, it has been dropped for
both containers:

   > gr <- GRanges(c("chr1", "chr1"), IRanges(1:2, 4:5, names=c("a", "a")))
   > validObject(gr)
   [1] TRUE
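
As a minimal sketch of what this means for the subsetting case in the quoted report below, assuming GenomicRanges from BioC 2.9 or later: repeating an index should now give a valid GRangesList with duplicated names. The coordinates and the object names gr_a, gr_b and dup are made up for illustration.

```r
library(GenomicRanges)

# Two small GRanges objects; the coordinates are illustrative only.
gr_a <- GRanges("chr1", IRanges(start = c(10, 20, 30), end = c(15, 25, 35)))
gr_b <- GRanges("chr1", IRanges(start = c(100, 200, 300), end = c(150, 250, 350)))
grl <- GRangesList(a = gr_a, b = gr_b)

# Repeating an index produces duplicated names ("a", "a").
# Assumption: with the uniqueness requirement dropped, this no longer errors.
dup <- grl[rep(1, 2)]
names(dup)

# Signals an error if the object is invalid.
validObject(dup)
```

With older releases the names(grl) <- NULL workaround described below is still needed before this kind of subsetting.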

Cheers,
H.

 > sessionInfo()
R Under development (unstable) (2011-07-31 r56578)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
  [1] LC_CTYPE=en_CA.UTF-8       LC_NUMERIC=C
  [3] LC_TIME=en_CA.UTF-8        LC_COLLATE=en_CA.UTF-8
  [5] LC_MONETARY=en_CA.UTF-8    LC_MESSAGES=en_CA.UTF-8
  [7] LC_PAPER=C                 LC_NAME=C
  [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_CA.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] GenomicFeatures_1.5.16 GenomicRanges_1.5.21   IRanges_1.11.17

loaded via a namespace (and not attached):
  [1] Biobase_2.13.7      biomaRt_2.9.2       Biostrings_2.21.7
  [4] BSgenome_1.21.3     DBI_0.2-5           RCurl_1.6-6
  [7] RSQLite_0.9-4       rtracklayer_1.13.10 XML_3.4-0
[10] zlibbioc_0.1.7


On 11-08-03 09:23 AM, Cook, Malcolm wrote:
> Hi,
>
> re: https://stat.ethz.ch/pipermail/bioc-sig-sequencing/2011-February/001867.html
>
> It appears that the requirement that names of a GRangesList must be unique has won the day.
>
> I'm not sure I agree with this result: many operations that are optimized for GRangesList have use cases that do not depend on such uniqueness.
>
> However, I find that such operations can proceed after setting the names to NULL, as demonstrated in the R session following my signature, in which I create a GRangesList with entirely duplicate (unnamed) elements.
>
> I hope this workaround proves useful to others...
>
> Malcolm Cook
> Computational Biology - Stowers Institute for Medical Research
>
>
>
> Example:
>
>> grl=GRangesList(a=GRanges(c(10,20,30),c(15,25,35)),b=GRanges(c(100,200,300),c(150,250,350)))
>> grl
> GRangesList of length 2
> $a
> GRanges with 3 ranges and 0 elementMetadata values
>      seqnames    ranges strand |
>         <Rle>  <IRanges>   <Rle>  |
> [1]       10  [15, 15]      * |
> [2]       20  [25, 25]      * |
> [3]       30  [35, 35]      * |
>
> $b
> GRanges with 3 ranges and 0 elementMetadata values
>      seqnames     ranges strand |
>         <Rle>   <IRanges>   <Rle>  |
> [1]      100 [150, 150]      * |
> [2]      200 [250, 250]      * |
> [3]      300 [350, 350]      * |
>
>
> seqlengths
>    10  20  30 100 200 300
>    NA  NA  NA  NA  NA  NA
>> grl[rep(1,2)]
> Error in `rownames<-`(`*tmp*`, value = c("a", "a")) :
>    duplicate rownames not allowed
>> names(grl)=NULL
>> grl[rep(1,2)]
> GRangesList of length 2
> [[1]]
> GRanges with 3 ranges and 0 elementMetadata values
>      seqnames    ranges strand |
>         <Rle>  <IRanges>   <Rle>  |
> [1]       10  [15, 15]      * |
> [2]       20  [25, 25]      * |
> [3]       30  [35, 35]      * |
>
> [[2]]
> GRanges with 3 ranges and 0 elementMetadata values
>      seqnames    ranges strand |
>         <Rle>  <IRanges>   <Rle>  |
> [1]       10  [15, 15]      * |
> [2]       20  [25, 25]      * |
> [3]       30  [35, 35]      * |
>
>
> seqlengths
>    10  20  30 100 200 300
>    NA  NA  NA  NA  NA  NA
>>
>


-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319


