[Bioc-sig-seq] GRangesList with duplicate names - WORKAROUND

Cook, Malcolm MEC at stowers.org
Wed Aug 3 22:08:01 CEST 2011


Good to hear, and thanks for the update. Dropping the uniqueness requirement makes the most sense to me, and it is most consistent with how names behave elsewhere in R.
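
For anyone finding this thread later, here is a minimal sketch of the behaviour described in the quoted reply, assuming GenomicRanges from BioC 2.9 or later is installed. The chromosome names and coordinates are made up for illustration, and has_unique_names() is a hypothetical helper, not part of GenomicRanges. The last lines repeat the names-to-NULL workaround from the original message, which should only be needed on older releases.

```r
# Sketch only: assumes GenomicRanges from BioC 2.9 or later.
library(GenomicRanges)

# Illustrative data: two small GRanges on made-up chromosomes.
gr_a <- GRanges("chr1", IRanges(start = c(10, 20, 30), end = c(15, 25, 35)))
gr_b <- GRanges("chr2", IRanges(start = c(100, 200, 300), end = c(150, 250, 350)))
grl <- GRangesList(a = gr_a, b = gr_b)

# Repeating an element by index produces duplicate names ("a", "a").
dup <- grl[rep(1, 2)]
names(dup)
validObject(dup)  # signals an error if the object is invalid

# Hypothetical helper: TRUE when names are absent or all distinct.
has_unique_names <- function(x) {
  nms <- names(x)
  is.null(nms) || anyDuplicated(nms) == 0L
}
has_unique_names(grl)
has_unique_names(dup)

# Workaround for releases that still reject duplicate names:
# drop the names before subsetting with repeated indices.
unnamed <- grl
names(unnamed) <- NULL
unnamed_dup <- unnamed[rep(1, 2)]
```

If downstream code does rely on unique names, a check like has_unique_names() lets it fail early rather than silently picking the first match.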

~Malcolm


> -----Original Message-----
> From: Hervé Pagès [mailto:hpages at fhcrc.org]
> Sent: Wednesday, August 03, 2011 2:47 PM
> To: Cook, Malcolm
> Cc: 'bioc-sig-sequencing at r-project.org'
> Subject: Re: [Bioc-sig-seq] GRangesList with duplicate names -
> WORKAROUND
> 
> Hi Malcolm,
> 
> The requirement that names of a GRangesList must be unique has actually
> been dropped in BioC 2.9 (current devel):
> 
>    > library(GenomicRanges)
>    > grl <- GRangesList(a=GRanges(), a=GRanges())
>    > validObject(grl)
>    [1] TRUE
> 
> Note that before BioC 2.9 this requirement applied not only to GRangesList
> but also to GRanges objects. In BioC 2.9, it has been dropped for both
> containers:
> 
>    > gr <- GRanges(c("chr1", "chr1"), IRanges(1:2, 4:5, names=c("a", "a")))
>    > validObject(gr)
>    [1] TRUE
> 
> Cheers,
> H.
> 
>  > sessionInfo()
> R Under development (unstable) (2011-07-31 r56578)
> Platform: x86_64-unknown-linux-gnu (64-bit)
> 
> locale:
>   [1] LC_CTYPE=en_CA.UTF-8       LC_NUMERIC=C
>   [3] LC_TIME=en_CA.UTF-8        LC_COLLATE=en_CA.UTF-8
>   [5] LC_MONETARY=en_CA.UTF-8    LC_MESSAGES=en_CA.UTF-8
>   [7] LC_PAPER=C                 LC_NAME=C
>   [9] LC_ADDRESS=C               LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_CA.UTF-8 LC_IDENTIFICATION=C
> 
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
> 
> other attached packages:
> [1] GenomicFeatures_1.5.16 GenomicRanges_1.5.21   IRanges_1.11.17
> 
> loaded via a namespace (and not attached):
>   [1] Biobase_2.13.7      biomaRt_2.9.2       Biostrings_2.21.7
>   [4] BSgenome_1.21.3     DBI_0.2-5           RCurl_1.6-6
>   [7] RSQLite_0.9-4       rtracklayer_1.13.10 XML_3.4-0
> [10] zlibbioc_0.1.7
> 
> 
> On 11-08-03 09:23 AM, Cook, Malcolm wrote:
> > Hi,
> >
> > re: https://stat.ethz.ch/pipermail/bioc-sig-sequencing/2011-February/001867.html
> >
> > It appears that the requirement that names of a GRangesList must be
> > unique has won the day.
> >
> > I'm not sure I agree with this result, as there are many operations that are
> > optimized for GRangesList that have use cases that do not depend upon such
> > uniqueness.
> >
> > However, I find that such operations can proceed after setting the names
> > to NULL, as is demonstrated in the R session following my signature, in which
> > I am creating a GRangesList with entirely duplicate (unnamed) elements.
> >
> > I hope this workaround proves useful to others...
> >
> > Malcolm Cook
> > Computational Biology - Stowers Institute for Medical Research
> >
> >
> >
> > Example:
> >
> >> grl=GRangesList(a=GRanges(c(10,20,30),c(15,25,35)),b=GRanges(c(100,200,300),c(150,250,350)))
> >> grl
> > GRangesList of length 2
> > $a
> > GRanges with 3 ranges and 0 elementMetadata values
> >      seqnames    ranges strand |
> >         <Rle>  <IRanges>   <Rle>  |
> > [1]       10  [15, 15]      * |
> > [2]       20  [25, 25]      * |
> > [3]       30  [35, 35]      * |
> >
> > $b
> > GRanges with 3 ranges and 0 elementMetadata values
> >      seqnames     ranges strand |
> >         <Rle>   <IRanges>   <Rle>  |
> > [1]      100 [150, 150]      * |
> > [2]      200 [250, 250]      * |
> > [3]      300 [350, 350]      * |
> >
> >
> > seqlengths
> >    10  20  30 100 200 300
> >    NA  NA  NA  NA  NA  NA
> >> grl[rep(1,2)]
> > Error in `rownames<-`(`*tmp*`, value = c("a", "a")) :
> >    duplicate rownames not allowed
> >> names(grl)=NULL
> >> grl[rep(1,2)]
> > GRangesList of length 2
> > [[1]]
> > GRanges with 3 ranges and 0 elementMetadata values
> >      seqnames    ranges strand |
> >         <Rle>  <IRanges>   <Rle>  |
> > [1]       10  [15, 15]      * |
> > [2]       20  [25, 25]      * |
> > [3]       30  [35, 35]      * |
> >
> > [[2]]
> > GRanges with 3 ranges and 0 elementMetadata values
> >      seqnames    ranges strand |
> >         <Rle>  <IRanges>   <Rle>  |
> > [1]       10  [15, 15]      * |
> > [2]       20  [25, 25]      * |
> > [3]       30  [35, 35]      * |
> >
> >
> > seqlengths
> >    10  20  30 100 200 300
> >    NA  NA  NA  NA  NA  NA
> >>
> >
> > _______________________________________________
> > Bioc-sig-sequencing mailing list
> > Bioc-sig-sequencing at r-project.org
> > https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
> 
> 
> --
> Hervé Pagès
> 
> Program in Computational Biology
> Division of Public Health Sciences
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N, M1-B514
> P.O. Box 19024
> Seattle, WA 98109-1024
> 
> E-mail: hpages at fhcrc.org
> Phone:  (206) 667-5791
> Fax:    (206) 667-1319


