[Bioc-sig-seq] generic for strand in genomeIntervals and GenomicRanges
Nicolas Delhomme
delhomme at embl.de
Tue Mar 29 15:24:18 CEST 2011
Hi Michael,
I'll try to give you feedback whenever I encounter a "broken" gff.
Point taken about the gaps function, so let me rephrase: it only recently came to my attention in a form that I found useful for my purposes :-)
Cheers,
Nico
---------------------------------------------------------------
Nicolas Delhomme
High Throughput Functional Genomics Center
European Molecular Biology Laboratory
Tel: +49 6221 387 8310
Email: nicolas.delhomme at embl.de
Meyerhofstrasse 1 - Postfach 10.2209
69102 Heidelberg, Germany
---------------------------------------------------------------
On 29 Mar 2011, at 15:18, Michael Lawrence wrote:
>
>
> On Tue, Mar 29, 2011 at 5:57 AM, Nicolas Delhomme <delhomme at embl.de> wrote:
> Thanks Michael!
>
> I did not know about the `magic` :-)
>
> There are 2 reasons I use genomeIntervals in parallel to IRanges and rtracklayer.
>
> First, in my opinion and experience, the readGff3 function is more robust to incorrectly formatted GFF files. In addition, accessing the attributes is extremely fast using the parseGffAttributes and getGffAttribute functions.
>
>
> I see. rtracklayer has recently gained an argument for choosing specific attributes from GFF, which might help. The support would obviously be better if I actually encountered many GFF files in my daily work. Feedback on e.g. broken files would be appreciated.
>
> Second, genomeIntervals has had the interval_complement function from the beginning, whereas this has only recently been addressed in IRanges through the gaps function.
>
>
> 'gaps' has been in there since before IRanges even existed, as a method on the IRanges data structure in Biostrings.
>
>
> Cheers,
>
> Nico
>
> On 29 Mar 2011, at 14:40, Michael Lawrence wrote:
>
> >
> >
> > On Tue, Mar 29, 2011 at 3:17 AM, Nicolas Delhomme <delhomme at embl.de> wrote:
> > Hi Martin,
> >
> > But how would you handle the "replacement" functions, i.e. strand<- ?
> >
> > The following does not work:
> >
> > library(genomeIntervals)
> > j <- new(
> >     "Genome_intervals_stranded",
> >     matrix(
> >         c(1, 2,
> >           3, 5,
> >           4, 6,
> >           8, 9),
> >         byrow = TRUE,
> >         ncol = 2
> >     ),
> >     closed = matrix(
> >         c(FALSE, FALSE,
> >           TRUE, FALSE,
> >           TRUE, TRUE,
> >           TRUE, FALSE),
> >         byrow = TRUE,
> >         ncol = 2
> >     ),
> >     annotation = data.frame(
> >         seq_name = factor(c("chr01", "chr01", "chr02", "chr02")),
> >         strand = factor(c("+", "+", "+", "-")),
> >         inter_base = c(FALSE, FALSE, FALSE, TRUE)
> >     )
> > )
> >
> > > genomeIntervals::strand(j)<-factor(rep("+",4),levels=c("+","-"))
> > Error in genomeIntervals::strand(j) <- factor(rep("+", 4), levels = c("+", :
> > invalid function in complex assignment
> >
> >
> >
> > genomeIntervals::`strand<-`(j, factor(rep("+", 4), levels=c("+", "-")))
> >
> > Btw, I'm kind of curious as to why people are using both packages at the same time. What are the use-cases for using one vs. the other?
> >
> > Cheers,
> >
> > Nico
> >
> >
> > > sessionInfo()
> > R version 2.12.2 (2011-02-25)
> > Platform: x86_64-unknown-linux-gnu (64-bit)
> >
> > locale:
> > [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
> > [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
> > [5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8
> > [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
> > [9] LC_ADDRESS=C LC_TELEPHONE=C
> > [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
> >
> > attached base packages:
> > [1] stats graphics grDevices utils datasets methods base
> >
> > other attached packages:
> > [1] genomeIntervals_1.6.0 Biobase_2.10.0 intervals_0.13.3
> >
> > loaded via a namespace (and not attached):
> > [1] tools_2.12.2
> > >
> >
> > On 28 Mar 2011, at 19:42, Martin Morgan wrote:
> >
> > > On 03/28/2011 05:11 AM, Julien Gagneur wrote:
> > >> Hi,
> > >>
> > >> Both genomeIntervals and the more recent GenomicRanges define a
> > >> generic function 'strand'. The same issue arises for 'reduce',
> > >> defined in both IRanges and 'intervals' (which is on CRAN, not on
> > >> Bioconductor). This leads to conflicts for users who load both
> > >> packages. Sample code is below (the same happens on R 2.13 devel).
> > >>
> > >> How shall we solve that?
> > >
> > > In general, specify the package that the generic comes from:
> > >
> > > GenomicRanges::strand
> > > genomeIntervals::strand
> > >
> > > It would in general be nice to coordinate generics across packages, but the prospects for that in this particular case are unclear -- genomeIntervals and GenomicRanges have pretty independent and more-or-less mutually exclusive dependencies.
> > >
> > > Martin
> > >
> > >>
> > >> Thanks for your advice,
> > >>
> > >> Julien Gagneur
> > >>
> > >>
> > >>
> > >>
> > >>> library(GenomicRanges)
> > >> Loading required package: IRanges
> > >>
> > >> Attaching package: 'IRanges'
> > >>
> > >> The following object(s) are masked from 'package:base':
> > >>
> > >> Map, cbind, eval, mapply, order, paste, pmax, pmax.int, pmin,
> > >> pmin.int, rbind, rep.int, table
> > >>
> > >>> library(genomeIntervals)
> > >> Loading required package: intervals
> > >>
> > >> Attaching package: 'intervals'
> > >>
> > >> The following object(s) are masked from 'package:IRanges':
> > >>
> > >> reduce
> > >>
> > >>
> > >> Attaching package: 'genomeIntervals'
> > >>
> > >> The following object(s) are masked from 'package:GenomicRanges':
> > >>
> > >> strand, strand<-
> > >>
> > >>> grngs = GRanges(seqnames=c("chr1", "chr2"),
> > >>> ranges=IRanges(start=1:2, end=2:3), strand=c("+","-"))
> > >>> strand(grngs)
> > >> Error in function (classes, fdef, mtable) : unable to find an
> > >> inherited method for function "strand", for signature "GRanges"
> > >>> reduce(grngs)
> > >> Error in function (classes, fdef, mtable) : unable to find an
> > >> inherited method for function "reduce", for signature "GRanges"
> > >>
> > >>> sessionInfo()
> > >> R version 2.12.1 (2010-12-16)
> > >> Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
> > >>
> > >> locale:
> > >> [1] C
> > >>
> > >> attached base packages:
> > >> [1] stats graphics grDevices utils datasets methods base
> > >>
> > >> other attached packages:
> > >> [1] intervals_0.13.1 GenomicRanges_1.2.3 IRanges_1.8.9
> > >>
> > >> loaded via a namespace (and not attached):
> > >> [1] Biobase_2.10.0 genomeIntervals_1.7.4 tools_2.12.1
> > >>
> > >> _______________________________________________ Bioc-sig-sequencing
> > >> mailing list Bioc-sig-sequencing at r-project.org
> > >> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
> > >
> > >
> > > --
> > > Computational Biology
> > > Fred Hutchinson Cancer Research Center
> > > 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109
> > >
> > > Location: M1-B861
> > > Telephone: 206 667-2793
> > >
>
>
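A note on the workaround quoted above: calling a replacement function by its backquoted, package-qualified name, as in genomeIntervals::`strand<-`(j, value), returns a modified copy and does not change j itself, so the result still has to be assigned back. The sketch below illustrates this with base R's `names<-` as a stand-in, on the assumption that `strand<-` follows the usual replacement-function convention; it does not need either package to run.

```r
# Stand-in object; in the thread this would be the
# Genome_intervals_stranded object j.
x <- c(1, 2)

# Calling the replacement function directly returns a modified copy ...
y <- base::`names<-`(x, c("a", "b"))

# ... and leaves the original untouched.
is.null(names(x))   # TRUE

# To get the effect of names(x) <- value, assign the result back.
x <- base::`names<-`(x, c("a", "b"))
names(x)
```

Applied to the example in the thread, this would presumably read j <- genomeIntervals::`strand<-`(j, factor(rep("+", 4), levels = c("+", "-"))).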