[Bioc-devel] SummarizedExperiment vs ExpressionSet

Michael Lawrence lawrence.michael at gene.com
Thu Nov 27 01:04:29 CET 2014


The two objects have conflicting APIs. For example, 1D extraction indexes
into the ranges for a GRanges, but into the columns for a table. So I would
not recommend multiple inheritance. Instead, define something new with the
semantics you want and use composition. Maybe just a subclass of DataFrame
that adds a GenomicRanges slot.
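
The conflict and the composition alternative can be illustrated with a minimal, language-neutral sketch, written in Python rather than R/S4. The names Ranges, Table, RangedTable, and rows() are hypothetical stand-ins, not Bioconductor classes or methods; the sketch only assumes that a range vector indexes by element while a table indexes by column.

```python
class Ranges:
    """Hypothetical stand-in for a GRanges-like vector of ranges."""

    def __init__(self, starts, ends):
        if len(starts) != len(ends):
            raise ValueError("starts and ends must have the same length")
        self.starts = list(starts)
        self.ends = list(ends)

    def __len__(self):
        return len(self.starts)

    def __getitem__(self, i):
        # 1D extraction on a vector of ranges selects ranges.
        idx = range(len(self))[i] if isinstance(i, slice) else [i]
        return Ranges([self.starts[k] for k in idx],
                      [self.ends[k] for k in idx])


class Table:
    """Hypothetical stand-in for a DataFrame-like table."""

    def __init__(self, columns):
        self.columns = {name: list(col) for name, col in columns.items()}

    def __getitem__(self, name):
        # 1D extraction on a table selects a column, not a row.
        return self.columns[name]


class RangedTable(Table):
    """A table that *has* ranges (composition) instead of also *being* one."""

    def __init__(self, columns, ranges):
        super().__init__(columns)
        if any(len(col) != len(ranges) for col in self.columns.values()):
            raise ValueError("every column must have one entry per range")
        self.ranges = ranges  # the extra slot holding the ranges

    def rows(self, i):
        # Row subsetting is explicit, so x[...] keeps its table meaning.
        idx = range(len(self.ranges))[i] if isinstance(i, slice) else [i]
        cols = {n: [c[k] for k in idx] for n, c in self.columns.items()}
        return RangedTable(cols, self.ranges[i])
```

Because RangedTable inherits only from Table, x["score"] unambiguously means a column, and the ranges are reached through the slot (x.ranges) or kept in sync by rows().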

On Wed, Nov 26, 2014 at 1:55 PM, Peter Haverty <haverty.peter at gene.com>
wrote:

> OK, GRanges as vector that does overlap stuff makes sense, but I think
> putting a DataFrame of metadata on that confuses the purpose of the
> object.  How about a "GRangesTable" that inherits from both GenomicRanges
> and DataTable?  It would be a DataFrame with a fancy index.  The DataFrame
> API would make stuff like colnames work (rather than needing
> colnames(mcols(x)) ). If this were used as the rowData for
> SummarizedExperiment, then a plain DataFrame could be made to work too.
>
> Pete
>
> ____________________
> Peter M. Haverty, Ph.D.
> Genentech, Inc.
> phaverty at gene.com
>
> On Wed, Nov 26, 2014 at 9:33 AM, Michael Lawrence <
> lawrence.michael at gene.com> wrote:
>
>>
>>
>> On Wed, Nov 26, 2014 at 9:07 AM, Peter Haverty <haverty.peter at gene.com>
>> wrote:
>>
>>> Hi all,
>>>
>>> I believe there is a strong need for an object that organizes a
>>> collection of rectangular data (matrices, etc.) with metadata on the
>>> rows and columns.  Can SummarizedExperiment inherit from something
>>> simpler that has a DataFrame as rowData?
>>>
>>> (I believe GenomicRanges should inherit from DataTable, rather than
>>> Vector, and subset as x[i,j], but maybe that's getting a bit off topic.)
>>
>>
>> Have to disagree on that. A GRanges is a vector of ranges; a table is a
>> list of vectors all of the same length. Different things. There was a lot
>> of thought invested in that. But it does subset as x[i,j], so in theory
>> SummarizedExperiment could be generalized to contain something with the
>> contract of 2D extraction.
>>
>>
>>> I often see people stuffing arbitrary data into
>>> an ExpressionSet and calling one of the assays "exprs" as a work-around.
>>>
>>> Regards,
>>>
>>> Pete
>>>
>>> ____________________
>>> Peter M. Haverty, Ph.D.
>>> Genentech, Inc.
>>> phaverty at gene.com
>>>
>>> On Wed, Nov 26, 2014 at 7:19 AM, Laurent Gatto <lg390 at cam.ac.uk> wrote:
>>>
>>> >
>>> > On 26 November 2014 14:59, Wolfgang Huber wrote:
>>> >
>>> > > A colleague and I are designing a package for quantitative proteomics
>>> > > data, and we are debating whether to base it on the
>>> > > SummarizedExperiment or the ExpressionSet class.
>>> > >
>>> > > There is no immediate use for the ranges aspect of
>>> > > SummarizedExperiment, so that would have to be carried around with
>>> > > NAs, and this is a parsimony argument for using ExpressionSet
>>> > > instead. OTOH, the interface of SummarizedExperiment is cleaner, its
>>> > > code more modern and more likely to be updated, and users of the
>>> > > Bioconductor project are likely to benefit from having to deal with a
>>> > > single interface that works the same or similarly across packages,
>>> > > rather than a variety of formats; which argues that new packages
>>> > > should converge towards SummarizedExperiment('s interface).
>>> > >
>>> > > Are there any pertinent insights from this group?
>>> >
>>> > Instead of ExpressionSet, you could use MSnbase::MSnSet, which is
>>> > essentially an ExpressionSet for quantitative proteomics (i.e., it has a
>>> > MIAPE slot instead of MIAME, for example).
>>> >
>>> > Ideally, a SummarizedExperiment for proteomics would use peptide/protein
>>> > ranges, which is in the pipeline, as far as I am concerned. When that
>>> > becomes available, there should be infrastructure to coerce an MSnSet
>>> > (and/or other relevant data) into a SummarizedExperiment.
>>> >
>>> > Hope this helps.
>>> >
>>> > Best wishes,
>>> >
>>> > Laurent
>>> >
>>> > > Thanks and best wishes
>>> > > Wolfgang
>>> > >
>>> > > _______________________________________________
>>> > > Bioc-devel at r-project.org mailing list
>>> > > https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>> >
>>> > --
>>> > Laurent Gatto
>>> > http://cpu.sysbiol.cam.ac.uk/
>>> >
>>
>>
>
