[Bioc-devel] 'semantically rich' subsetting of SummarizedExperiments

Gabe Becker becker.gabe at gene.com
Sat Sep 20 21:11:41 CEST 2014


Hey all,

We are in the (very) early stages of experimenting with something that
seems relevant here: classed identifiers. We are using them for
database/mart queries, but the same concept could be useful for the cases
you're describing, I think.

E.g.

> mysyms = GeneSymbol(c("BRAF", "BRCA1"))
> mysyms
An object of class "GeneSymbol"
[1] "BRAF"  "BRCA1"
> yourSE[mysyms, ]
...


This approach has the benefit of being declarative instead of heuristic
(people won't be able to accidentally invoke it), while still giving most
of the convenience I believe you are looking for.

The object classes inherit directly from character, so should "just work"
most of the time, but as I said it's early days; lots more testing for
functionality and usefulness is needed.
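
As a rough illustration of the idea (a sketch only, not the actual
implementation, which is not shown in this message), the following uses
just the methods package. ToySE is a made-up stand-in for
SummarizedExperiment, and the slot names and the GeneSymbol constructor
are assumptions chosen for the example:

```r
library(methods)

# Hypothetical classed identifier: a character vector tagged as gene symbols.
setClass("GeneSymbol", contains = "character")
GeneSymbol <- function(x) new("GeneSymbol", as.character(x))

# Toy stand-in for a SummarizedExperiment (assumption): an assay matrix
# plus one gene symbol per row.
setClass("ToySE", representation(assay = "matrix", symbol = "character"))

# Symbol lookup happens only when the row index is explicitly a GeneSymbol.
# Rows are returned in container order; the column index j is ignored here.
setMethod("[", signature(x = "ToySE", i = "GeneSymbol"),
  function(x, i, j, ..., drop = TRUE) {
    keep <- which(x@symbol %in% i@.Data)
    new("ToySE",
        assay  = x@assay[keep, , drop = FALSE],
        symbol = x@symbol[keep])
  })

se <- new("ToySE",
          assay  = matrix(1:6, nrow = 3,
                          dimnames = list(c("g1", "g2", "g3"), c("s1", "s2"))),
          symbol = c("BRAF", "TP53", "BRCA1"))

mysyms <- GeneSymbol(c("BRAF", "BRCA1"))
sub <- se[mysyms, ]
```

Because the lookup is keyed on the class of the index rather than on its
contents, the caller has to opt in by wrapping the identifiers.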

~G


On Sat, Sep 20, 2014 at 11:38 AM, Vincent Carey <stvjc at channing.harvard.edu>
wrote:

> OK by me to leave [ alone.  We could start with subsetByEntrez,
> subsetByKEGG, subsetBySymbol, subsetByGOTERM, subsetByGOID.
>
> Utilities to generate GRanges for queries in each of these vocabularies
> should, perhaps, be in the OrganismDb space?  Once those are in place
> no additional infrastructure is necessary?
>
> On Sat, Sep 20, 2014 at 12:49 PM, Tim Triche, Jr. <tim.triche at gmail.com>
> wrote:
>
> > Agreed with Sean, having tried implementing the "magical" alternative.
> >
> > --t
> >
> > > On Sep 20, 2014, at 9:31 AM, Sean Davis <sdavis2 at mail.nih.gov> wrote:
> > >
> > > Hi, Vince.
> > >
> > > I'm coming a little late to the party, but I agree with Kasper's
> > > sentiment that the less "magical" approach of using subsetByXXX might
> > > be the cleaner way to go for the time being.
> > >
> > > Sean
> > >
> > >
> > > On Sat, Sep 20, 2014 at 10:42 AM, Vincent Carey
> > > <stvjc at channing.harvard.edu> wrote:
> > >
> > >>
> > >> https://github.com/vjcitn/biocMultiAssay/blob/master/vignettes/SEresolver.Rnw
> > >>
> > >> shows some modifications to [ that allow subsetting of SE by
> > >> gene or pathway name
> > >>
> > >> it may be premature to work at the [ level.  Kasper suggested defining
> > >> a suite of subsetBy operations that would accomplish this
> > >>
> > >> i think we could get something along these lines into the release
> > >> without too much more work.  votes?
> > >>
> > >>        [[alternative HTML version deleted]]
> > >>
> > >> _______________________________________________
> > >> Bioc-devel at r-project.org mailing list
> > >> https://stat.ethz.ch/mailman/listinfo/bioc-devel
> > >
> >
>
>



-- 
Computational Biologist
Genentech Research



