[Bioc-devel] 'semantically rich' subsetting of SummarizedExperiments

Vincent Carey stvjc at channing.harvard.edu
Sat Oct 11 17:41:31 CEST 2014


Is there anything on the order of as([GeneSet], "GRanges") around?

On Sat, Sep 20, 2014 at 11:34 PM, Gabe Becker <becker.gabe at gene.com> wrote:

> Sean and Vincent,
>
> The goal of what we are doing builds off of what Martin has in GSEABase.
> We were looking to see how much benefit we can get with something
> lighter-weight that lies between indistinguishable character vectors and
> the full machinery of GeneSets.
>
> Either way, it seems like formalizing the semantic information is a way to
> do what you want. Furthermore, these classed id objects can be created
> automatically when there is contextual information e.g. during queries to
> databases (or db-like objects), and then simply added to metadata
> DataFrames and re-used.
>
> ~G
>
>
>
>
> On Sat, Sep 20, 2014 at 12:19 PM, Sean Davis <sdavis2 at mail.nih.gov> wrote:
>
>>
>>
>> On Sat, Sep 20, 2014 at 3:11 PM, Gabe Becker <becker.gabe at gene.com>
>> wrote:
>>
>>> Hey all,
>>>
>>> We are in the (very) early stages of experimenting with something that
>>> seems relevant here: classed identifiers. We are using them for
>>> database/mart queries, but the same concept could be useful for the cases
>>> you're describing, I think.
>>>
>>> E.g.
>>>
>>> > mysyms = GeneSymbol(c("BRAF", "BRCA1"))
>>> > mysyms
>>> An object of class "GeneSymbol"
>>> [1] "BRAF"  "BRCA1"
>>> > yourSE[mysyms, ]
>>> ...
>>>
>>>
>> This approach has the flavor of some of the functionality that Martin put
>> together for the GSEABase package (EntrezIdentifier, etc.).
>>
>> Sean
>>
>>
>>
>>>
>>> This approach has the benefit of being declarative instead of heuristic
>>> (people won't be able to accidentally invoke it), while still giving most
>>> of the convenience I believe you are looking for.
>>>
>>> The object classes inherit directly from character, so should "just work"
>>> most of the time, but as I said it's early days; lots more testing for
>>> functionality and usefulness is needed.
>>>
>>> ~G
>>>
>>>
>>> On Sat, Sep 20, 2014 at 11:38 AM, Vincent Carey <stvjc at channing.harvard.edu> wrote:
>>>
>>> > OK by me to leave [ alone.  We could start with subsetByEntrez,
>>> > subsetByKEGG, subsetBySymbol, subsetByGOTERM, subsetByGOID.
>>> >
>>> > Utilities to generate GRanges for queries in each of these vocabularies
>>> > should, perhaps, be in the OrganismDb space?  Once those are in place
>>> > no additional infrastructure is necessary?
>>> >
>>> > On Sat, Sep 20, 2014 at 12:49 PM, Tim Triche, Jr. <tim.triche at gmail.com> wrote:
>>> >
>>> > > Agreed with Sean, having tried implementing the "magical" alternative
>>> > >
>>> > > --t
>>> > >
>>> > > > On Sep 20, 2014, at 9:31 AM, Sean Davis <sdavis2 at mail.nih.gov> wrote:
>>> > > >
>>> > > > Hi, Vince.
>>> > > >
>>> > > > I'm coming a little late to the party, but I agree with Kasper's sentiment
>>> > > > that the less "magical" approach of using subsetByXXX might be the cleaner
>>> > > > way to go for the time being.
>>> > > >
>>> > > > Sean
>>> > > >
>>> > > >
>>> > > > On Sat, Sep 20, 2014 at 10:42 AM, Vincent Carey <stvjc at channing.harvard.edu> wrote:
>>> > > >
>>> > > >> https://github.com/vjcitn/biocMultiAssay/blob/master/vignettes/SEresolver.Rnw
>>> > > >>
>>> > > >> shows some modifications to [ that allow subsetting of SE by
>>> > > >> gene or pathway name
>>> > > >>
>>> > > >> it may be premature to work at the [ level.  Kasper suggested defining
>>> > > >> a suite of subsetBy operations that would accomplish this
>>> > > >>
>>> > > >> i think we could get something along these lines into the release without
>>> > > >> too much more work.  votes?
>>> > > >>
>>> > > >>        [[alternative HTML version deleted]]
>>> > > >>
>>> > > >> _______________________________________________
>>> > > >> Bioc-devel at r-project.org mailing list
>>> > > >> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>> > > >
>>> > >
>>> >
>>> >
>>>
>>>
>>>
>>> --
>>> Computational Biologist
>>> Genentech Research
>>>
>>>
>>
>>
>
>
> --
> Computational Biologist
> Genentech Research
>
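
A minimal sketch of the two ideas quoted above (classed identifiers that inherit from character, and an explicit subsetBy-style operation instead of overloading `[`). This is not code from any of the posters: the `subsetBy` generic, the toy matrix standing in for a SummarizedExperiment, and the `sym2row` lookup are all hypothetical and written in base R only, so the real packages may look quite different.

```r
library(methods)

# Hypothetical classed identifier: a character vector that records its vocabulary.
setClass("GeneSymbol", contains = "character")
GeneSymbol <- function(x) new("GeneSymbol", as.character(x))

# Toy stand-in for an assay whose rows are keyed by some other identifier.
mat <- matrix(1:6, nrow = 3,
              dimnames = list(c("673", "672", "7157"), c("s1", "s2")))

# Toy symbol -> row-name map; a real one would come from an annotation resource.
sym2row <- c(BRAF = "673", BRCA1 = "672", TP53 = "7157")

# Explicit subsetting that dispatches on the identifier class, leaving `[` alone.
setGeneric("subsetBy", function(x, ids, map) standardGeneric("subsetBy"))
setMethod("subsetBy", c("matrix", "GeneSymbol"), function(x, ids, map) {
  rows <- unname(map[ids@.Data])           # translate symbols to row names
  x[rows[!is.na(rows)], , drop = FALSE]    # drop symbols with no mapping
})

mysyms <- GeneSymbol(c("BRAF", "BRCA1"))
sub <- subsetBy(mat, mysyms, map = sym2row)
```

Because the method is registered only for `GeneSymbol`, a plain character vector cannot trigger the translation by accident, which is the "declarative instead of heuristic" property described in the thread.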



