[Bioc-sig-seq] generic for strand in genomeIntervals and GenomicRanges

Nicolas Delhomme delhomme at embl.de
Tue Mar 29 15:24:18 CEST 2011


Hi Michael,

I'll try to give you feedback whenever I encounter a "broken" gff.

Got your point about the gaps function, so let me rephrase: it only recently came to my attention in a way that I found useful for my purposes :-)

Cheers,

Nico

---------------------------------------------------------------
Nicolas Delhomme

High Throughput Functional Genomics Center

European Molecular Biology Laboratory

Tel: +49 6221 387 8310
Email: nicolas.delhomme at embl.de
Meyerhofstrasse 1 - Postfach 10.2209
69102 Heidelberg, Germany
---------------------------------------------------------------




On 29 Mar 2011, at 15:18, Michael Lawrence wrote:

> 
> 
> On Tue, Mar 29, 2011 at 5:57 AM, Nicolas Delhomme <delhomme at embl.de> wrote:
> Thanks Michael!
> 
> I did not know about the `magic` :-)
> 
> There are 2 reasons I use genomeIntervals in parallel to IRanges and rtracklayer.
> 
> First, in my opinion and experience, the readGff3 function is more robust to incorrectly formatted GFF files. In addition, accessing the attributes is extremely fast using the parseGffAttributes and getGffAttribute functions.
> 
> 
> I see. rtracklayer has recently gained an argument for choosing specific attributes from GFF, which might help. The support would obviously be better if I actually encountered many GFF files in my daily work. Feedback on e.g. broken files would be appreciated.
>  
> Second, genomeIntervals has had the interval_complement function from the beginning, whereas this has only recently been addressed in IRanges through the gaps function.
> 
> 
> 'gaps' has been in there since before IRanges even existed, as a method on the IRanges data structure in Biostrings.
> 
>  
> Cheers,
> 
> Nico
> 
> 
> On 29 Mar 2011, at 14:40, Michael Lawrence wrote:
> 
> >
> >
> > On Tue, Mar 29, 2011 at 3:17 AM, Nicolas Delhomme <delhomme at embl.de> wrote:
> > Hi Martin,
> >
> > But how would you do this for the "replacement" functions, e.g. `strand<-`?
> >
> > The following does not work:
> >
> > library(genomeIntervals)
> > j <- new(
> >               "Genome_intervals_stranded",
> >               matrix(
> >                      c(1,2,
> >                        3,5,
> >                        4,6,
> >                        8,9
> >                        ),
> >                      byrow = TRUE,
> >                      ncol = 2
> >               ),
> >               closed = matrix(
> >                                      c(
> >                                              FALSE, FALSE,
> >                                              TRUE, FALSE,
> >                                              TRUE, TRUE,
> >                                              TRUE, FALSE
> >                                       ),
> >                                      byrow = TRUE,
> >                              ncol = 2
> >                              ),
> >           annotation = data.frame(
> >                                      seq_name = factor( c("chr01","chr01", "chr02","chr02") ),
> >                                              strand = factor( c("+", "+", "+", "-") ),
> >                                              inter_base = c(FALSE,FALSE,FALSE,TRUE)
> >                                              )
> >               )
> >
> > > genomeIntervals::strand(j)<-factor(rep("+",4),levels=c("+","-"))
> > Error in genomeIntervals::strand(j) <- factor(rep("+", 4), levels = c("+",  :
> >  invalid function in complex assignment
> >
> >
> >
> > genomeIntervals::`strand<-`(j, factor(rep("+", 4), levels=c("+", "-")))
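A minimal sketch of the backtick form, using base R's `diag<-` as a stand-in so that it runs without either package installed (the matrix and variable names are illustrative, not from the thread). The `pkg::f(x) <- value` syntax was rejected by the R version shown in this thread, but the replacement function can be called by its namespace-qualified, backtick-quoted name. Called this way it returns a modified copy rather than updating the object in place, so the result presumably has to be assigned back to `j` in the genomeIntervals case as well.

```r
# Stand-in object; in the thread this would be the Genome_intervals_stranded object j.
m <- matrix(1:4, nrow = 2)

# Call the replacement function directly by its qualified, backtick-quoted name.
# This returns a modified copy and leaves m itself unchanged.
m2 <- base::`diag<-`(m, 0L)

# To get the effect of diag(m) <- 0L, assign the result back to the same name.
m_updated <- m
m_updated <- base::`diag<-`(m_updated, 0L)
```

By analogy, the call in this thread would be written as j <- genomeIntervals::`strand<-`(j, value), with the new strand factor as `value`.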
> >
> > Btw, I'm kind of curious as to why people are using both packages at the same time. What are the use-cases for using one vs. the other?
> >
> > Cheers,
> >
> > Nico
> >
> >
> > > sessionInfo()
> > R version 2.12.2 (2011-02-25)
> > Platform: x86_64-unknown-linux-gnu (64-bit)
> >
> > locale:
> >  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
> >  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
> >  [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
> >  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
> >  [9] LC_ADDRESS=C               LC_TELEPHONE=C
> > [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
> >
> > attached base packages:
> > [1] stats     graphics  grDevices utils     datasets  methods   base
> >
> > other attached packages:
> > [1] genomeIntervals_1.6.0 Biobase_2.10.0        intervals_0.13.3
> >
> > loaded via a namespace (and not attached):
> > [1] tools_2.12.2
> > >
> >
> >
> > On 28 Mar 2011, at 19:42, Martin Morgan wrote:
> >
> > > On 03/28/2011 05:11 AM, Julien Gagneur wrote:
> > >> Hi,
> > >>
> > >> both genomeIntervals and the more recent GenomicRanges define a
> > >> generic method 'strand'. There is the same issue for the method
> > > >> 'reduce' between IRanges and 'intervals' (which is on CRAN, not on
> > > >> Bioconductor). This leads to conflicts for users that load both
> > > >> packages. Sample code is below (the same happens on R 2.13 devel).
> > >>
> > >> How shall we solve that?
> > >
> > > > In general, specify the package that the generic comes from:
> > >
> > > GenomicRanges::strand
> > > genomeIntervals::strand
> > >
> > > It would in general be nice to coordinate generics across packages, but the prospects for that in this particular case are unclear -- genomeIntervals and GenomicRanges have pretty independent and more-or-less mutually exclusive dependencies.
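A small sketch of the masking problem and the `::` workaround, written against base R only so that it runs without GenomicRanges or genomeIntervals installed. The masking `rev()` defined here is purely hypothetical and stands in for the later-attached `strand` generic that has no method for the object at hand.

```r
# A later definition masks an earlier one of the same name, much as one
# package's strand() masks the other's when both packages are attached.
# Here a hypothetical user-defined rev() masks base::rev().
rev <- function(x) stop("no method for this input")

# The unqualified call finds the masking definition and fails.
res <- tryCatch(rev(1:3), error = function(e) conditionMessage(e))

# The namespace-qualified call reaches the intended function, regardless
# of the order in which things were attached or defined.
ok <- base::rev(1:3)
```

The same idea applies to GenomicRanges::strand(grngs) and IRanges::reduce(grngs): qualifying the call bypasses whatever happens to be first on the search path.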
> > >
> > > Martin
> > >
> > >>
> > > >> Thanks for your advice,
> > >>
> > >> Julien Gagneur
> > >>
> > >>
> > >>
> > >>
> > >>> library(GenomicRanges)
> > >> Loading required package: IRanges
> > >>
> > >> Attaching package: 'IRanges'
> > >>
> > >> The following object(s) are masked from 'package:base':
> > >>
> > >> Map, cbind, eval, mapply, order, paste, pmax, pmax.int, pmin,
> > >> pmin.int, rbind, rep.int, table
> > >>
> > >>> library(genomeIntervals)
> > >> Loading required package: intervals
> > >>
> > >> Attaching package: 'intervals'
> > >>
> > >> The following object(s) are masked from 'package:IRanges':
> > >>
> > >> reduce
> > >>
> > >>
> > >> Attaching package: 'genomeIntervals'
> > >>
> > >> The following object(s) are masked from 'package:GenomicRanges':
> > >>
> > >> strand, strand<-
> > >>
> > >>> grngs = GRanges(seqnames=c("chr1", "chr2"),
> > >>> ranges=IRanges(start=1:2, end=2:3), strand=c("+","-"))
> > >>> strand(grngs)
> > >> Error in function (classes, fdef, mtable)  : unable to find an
> > >> inherited method for function "strand", for signature "GRanges"
> > >>> reduce(grngs)
> > >> Error in function (classes, fdef, mtable)  : unable to find an
> > >> inherited method for function "reduce", for signature "GRanges"
> > >>
> > >>> sessionInfo()
> > > >> R version 2.12.1 (2010-12-16)
> > > >> Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
> > > >>
> > > >> locale:
> > > >> [1] C
> > > >>
> > > >> attached base packages:
> > > >> [1] stats     graphics  grDevices utils     datasets  methods   base
> > > >>
> > > >> other attached packages:
> > > >> [1] intervals_0.13.1    GenomicRanges_1.2.3 IRanges_1.8.9
> > > >>
> > > >> loaded via a namespace (and not attached):
> > > >> [1] Biobase_2.10.0        genomeIntervals_1.7.4 tools_2.12.1
> > >>
> > >> _______________________________________________ Bioc-sig-sequencing
> > >> mailing list Bioc-sig-sequencing at r-project.org
> > >> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
> > >
> > >
> > > --
> > > Computational Biology
> > > Fred Hutchinson Cancer Research Center
> > > 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109
> > >
> > > Location: M1-B861
> > > Telephone: 206 667-2793
> > >
> >
> 
> 


