[Bioc-devel] 'semantically rich' subsetting of SummarizedExperiments

Gabe Becker becker.gabe at gene.com
Sun Sep 21 05:34:32 CEST 2014


Sean and Vincent,

What we are doing builds on what Martin has in GSEABase. We wanted to see
how much benefit we can get from something lighter-weight that sits between
plain character vectors, which carry no record of what kind of identifier
they hold, and the full machinery of GeneSets.

Either way, it seems like formalizing the semantic information is a way to
do what you want. Furthermore, these classed id objects can be created
automatically when there is contextual information e.g. during queries to
databases (or db-like objects), and then simply added to metadata
DataFrames and re-used.
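
To make the idea concrete, here is a minimal sketch using only the methods
package. It is not our actual implementation: resolveRows, subsetRows, the toy
matrix and the symbol-to-row lookup are all hypothetical names and made-up
values, standing in for a SummarizedExperiment and a real annotation query.

```r
library(methods)

# Hypothetical lightweight identifier class: a character vector that also
# records which vocabulary its values come from.
setClass("GeneSymbol", contains = "character")
GeneSymbol <- function(x) new("GeneSymbol", as.character(x))

# Toy stand-in for an experiment object: rows keyed by made-up primary ids.
toy <- matrix(1:6, nrow = 3,
              dimnames = list(c("id1", "id2", "id3"), c("s1", "s2")))

# Made-up lookup from gene symbol to row key (stand-in for a db query).
symbol_map <- c(BRAF = "id1", BRCA1 = "id2", TP53 = "id3")

# Translate identifiers into row keys; dispatch is on the identifier class,
# so a translation only happens when the caller declared the vocabulary.
setGeneric("resolveRows", function(ids, map) standardGeneric("resolveRows"))

setMethod("resolveRows", "GeneSymbol", function(ids, map) {
  hits <- map[ids@.Data]          # look up each symbol by name
  unname(hits[!is.na(hits)])      # drop symbols with no match
})

# Plain character vectors are taken to be row keys already.
setMethod("resolveRows", "character", function(ids, map) ids)

subsetRows <- function(x, ids, map) x[resolveRows(ids, map), , drop = FALSE]

mysyms <- GeneSymbol(c("BRAF", "BRCA1"))
res <- subsetRows(toy, mysyms, symbol_map)
```

Because GeneSymbol extends character, is(mysyms, "character") is TRUE and the
object can be passed to code that expects a character vector, while the
GeneSymbol method is only reached when someone constructs the object
explicitly.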

~G




On Sat, Sep 20, 2014 at 12:19 PM, Sean Davis <sdavis2 at mail.nih.gov> wrote:

>
>
> On Sat, Sep 20, 2014 at 3:11 PM, Gabe Becker <becker.gabe at gene.com> wrote:
>
>> Hey all,
>>
>> We are in the (very) early stages of experimenting with something that
>> seems relevant here: classed identifiers. We are using them for
>> database/mart queries, but the same concept could be useful for the cases
>> you're describing I think.
>>
>> E.g.
>>
>> > mysyms = GeneSymbol(c("BRAF", "BRCA1"))
>> > mysyms
>> An object of class "GeneSymbol"
>> [1] "BRAF"  "BRCA1"
>> > yourSE[mysyms, ]
>> ...
>>
>>
> This approach has the flavor of some of the functionality that Martin put
> together for the GSEABase package (EntrezIdentifier, etc.).
>
> Sean
>
>
>
>>
>> This approach has the benefit of being declarative instead of heuristic
>> (people won't be able to accidentally invoke it), while still giving most
>> of the convenience I believe you are looking for.
>>
>> The object classes inherit directly from character, so should "just work"
>> most of the time, but as I said it's early days; lots more testing for
>> functionality and usefulness is needed.
>>
>> ~G
>>
>>
>> On Sat, Sep 20, 2014 at 11:38 AM, Vincent Carey <
>> stvjc at channing.harvard.edu>
>> wrote:
>>
>> > OK by me to leave [ alone.  We could start with subsetByEntrez,
>> > subsetByKEGG, subsetBySymbol, subsetByGOTERM, subsetByGOID.
>> >
>> > Utilities to generate GRanges for queries in each of these vocabularies
>> > should, perhaps, be in the OrganismDb space?  Once those are in place
>> > no additional infrastructure is necessary?
>> >
>> > On Sat, Sep 20, 2014 at 12:49 PM, Tim Triche, Jr. <tim.triche at gmail.com
>> >
>> > wrote:
>> >
>> > > Agreed with Sean, having tried implementing the "magical" alternative
>> > >
>> > > --t
>> > >
>> > > > On Sep 20, 2014, at 9:31 AM, Sean Davis <sdavis2 at mail.nih.gov>
>> wrote:
>> > > >
>> > > > Hi, Vince.
>> > > >
>> > > > I'm coming a little late to the party, but I agree with Kasper's
>> > > sentiment
>> > > > that the less "magical" approach of using subsetByXXX might be the
>> > > cleaner
>> > > > way to go for the time being.
>> > > >
>> > > > Sean
>> > > >
>> > > >
>> > > > On Sat, Sep 20, 2014 at 10:42 AM, Vincent Carey <
>> > > stvjc at channing.harvard.edu>
>> > > > wrote:
>> > > >
>> > > >>
>> > > >>
>> > >
>> >
>> https://github.com/vjcitn/biocMultiAssay/blob/master/vignettes/SEresolver.Rnw
>> > > >>
>> > > >> shows some modifications to [ that allow subsetting of SE by
>> > > >> gene or pathway name
>> > > >>
>> > > >> it may be premature to work at the [ level.  Kasper suggested
>> defining
>> > > >> a suite of subsetBy operations that would accomplish this
>> > > >>
>> > > >> i think we could get something along these lines into the release
>> > > without
>> > > >> too much more work.  votes?
>> > > >>
>> > > >>        [[alternative HTML version deleted]]
>> > > >>
>> > > >> _______________________________________________
>> > > >> Bioc-devel at r-project.org mailing list
>> > > >> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>> > > >
>> > >
>> >
>> >
>>
>>
>>
>> --
>> Computational Biologist
>> Genentech Research
>>
>>
>
>


-- 
Computational Biologist
Genentech Research



