[Bioc-devel] SummarizedExperiment vs ExpressionSet
haverty.peter at gene.com
Wed Nov 26 22:55:26 CET 2014
OK, GRanges as a vector that does overlap stuff makes sense, but I think
putting a DataFrame of metadata on it confuses the purpose of the
object. How about a "GRangesTable" that inherits from both GenomicRanges
and DataTable? It would be a DataFrame with a fancy index. The DataFrame
API would make stuff like colnames work (rather than needing
colnames(mcols(x))). If this were used as the rowData for
SummarizedExperiment, then a plain DataFrame could be made to work too.
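
To make the interface difference concrete, here is a minimal sketch using
only existing GenomicRanges/S4Vectors calls. The toy coordinates and the
column names "score" and "symbol" are made up for illustration, and
"GRangesTable" itself is only a proposal, so it does not appear in the code.

```r
library(GenomicRanges)  # attaches IRanges and S4Vectors as dependencies

# Toy ranges with two metadata columns (all values are invented).
gr <- GRanges(
  seqnames = c("chr1", "chr1", "chr2"),
  ranges   = IRanges(start = c(1, 50, 100), width = 10),
  strand   = c("+", "-", "*"),
  score    = c(0.1, 0.5, 0.9),
  symbol   = c("A", "B", "C")
)

# Current API: column names are reached through the mcols() DataFrame.
colnames(mcols(gr))   # "score" "symbol"

# GRanges already accepts two-index subsetting: i selects ranges,
# j selects metadata columns.
sub <- gr[1:2, "score"]

# A plain DataFrame answers colnames() directly; this is the interface
# a table-like rowData would share.
df <- DataFrame(score = c(0.1, 0.5, 0.9), symbol = c("A", "B", "C"))
colnames(df)          # "score" "symbol"
```

Here sub keeps the first two ranges and only the "score" column, so the
2D-extraction contract is already there; what is missing is the
DataFrame-style accessor layer on top.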
Peter M. Haverty, Ph.D.
phaverty at gene.com
On Wed, Nov 26, 2014 at 9:33 AM, Michael Lawrence <lawrence.michael at gene.com> wrote:
> On Wed, Nov 26, 2014 at 9:07 AM, Peter Haverty <haverty.peter at gene.com> wrote:
>> Hi all,
>> I believe there is a strong need for an object that organizes a collection
>> of rectangular data (matrices, etc.) with metadata on the rows and
>> columns. Can SummarizedExperiment inherit from something simpler that has
>> a DataFrame as rowData?
>> (I believe GenomicRanges should inherit from
>> DataTable, rather than Vector, and subset as x[i,j], but maybe that's
>> getting a bit off topic.)
> Have to disagree on that. A GRanges is a vector of ranges; a table is a
> list of vectors all of the same length. Different things. There was a lot
> of thought invested in that. But it does subset as x[i,j], so in theory
> SummarizedExperiment could be generalized to contain something with the
> contract of 2D extraction.
>> I often see people stuffing arbitrary data into
>> an ExpressionSet and calling one of the assays "exprs" as a work-around.
>> Peter M. Haverty, Ph.D.
>> Genentech, Inc.
>> phaverty at gene.com
>> On Wed, Nov 26, 2014 at 7:19 AM, Laurent Gatto <lg390 at cam.ac.uk> wrote:
>> > On 26 November 2014 14:59, Wolfgang Huber wrote:
>> > > A colleague and I are designing a package for quantitative proteomics
>> > > data, and we are debating whether to base it on the
>> > > SummarizedExperiment or the ExpressionSet class.
>> > >
>> > > There is no immediate use for the ranges aspect of
>> > > SummarizedExperiment, so that would have to be carried around with
>> > > NAs, and this is a parsimony argument for using ExpressionSet
>> > > instead. OTOH, the interface of SummarizedExperiment is cleaner, its
>> > > code more modern and more likely to be updated, and users of the
>> > > Bioconductor project are likely to benefit from having to deal with a
>> > > single interface that works the same or similarly across packages,
>> > > rather than a variety of formats; which argues that new packages
>> > > should converge towards SummarizedExperiment('s interface).
>> > >
>> > > Are there any pertinent insights from this group?
>> > Instead of ExpressionSet, you could use MSnbase::MSnSet, which is
>> > essentially an ExpressionSet for quantitative proteomics (i.e., it has a
>> > MIAPE slot instead of MIAME, for example).
>> > Ideally, a SummarizedExperiment for proteomics would use peptide/protein
>> > ranges, which is in the pipeline, as far as I am concerned. When that
>> > becomes available, there should be infrastructure to coerce an MSnSet
>> > (and/or other relevant data) into a SummarizedExperiment.
>> > Hope this helps.
>> > Best wishes,
>> > Laurent
>> > > Thanks and best wishes
>> > > Wolfgang
>> > >
>> > > _______________________________________________
>> > > Bioc-devel at r-project.org mailing list
>> > > https://stat.ethz.ch/mailman/listinfo/bioc-devel
>> > --
>> > Laurent Gatto
>> > http://cpu.sysbiol.cam.ac.uk/