[Bioc-devel] SummarizedExperiment vs ExpressionSet

Tim Triche, Jr. tim.triche at gmail.com
Wed Nov 26 18:37:29 CET 2014


So, as a simple experiment, I did the following:

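library(GenomicRanges)
# 10 x 10 matrix of random normal values with row and column names
bar <- matrix(rnorm(100), ncol=10)
colnames(bar) <- as.character(1:10)
rownames(bar) <- letters[1:10]
# no rowData or colData supplied, so the defaults are used
foo <- SummarizedExperiment(assays=list(bar=bar))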

rowData(foo)
## GRangesList object of length 10:
## $a
## GRanges object with 0 ranges and 0 metadata columns:
##    seqnames    ranges strand
##       <Rle> <IRanges>  <Rle>
##
## $b
## GRanges object with 0 ranges and 0 metadata columns:
##      seqnames ranges strand
##
## $c
## GRanges object with 0 ranges and 0 metadata columns:
##      seqnames ranges strand
##
## ...
## <7 more elements>

colData(foo)
## DataFrame with 10 rows and 0 columns

This got me thinking: why not have an emptyRanges class, or else the
ability to index a bunch of NULL ranges without a lot of hoohah?  The
defaults mostly do what they're supposed to; why not have a compact
representation of empty rowData, as for empty colData (i.e., a DataFrame
with 0 columns)?  Or is a GRangesList of empty GRanges as compact as it is
practicable to get for this purpose?

Just pondering what the lowest-impact solution to the problem at hand might
be.
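
To make the question concrete, here is a minimal base-R sketch of the
kind of compact default I have in mind. It is not GenomicRanges code:
make_container is a hypothetical helper, and plain data.frames stand in
for DataFrame. Both the row and the column metadata are stored as
zero-column tables that carry only the names.

```r
# Hypothetical helper, not part of GenomicRanges: bundle a list of matrices
# with per-row and per-column metadata held in zero-column data.frames.
make_container <- function(assays) {
  first <- assays[[1]]
  # every assay must share the dimensions of the first one
  same_dim <- vapply(assays, function(a) identical(dim(a), dim(first)),
                     logical(1))
  stopifnot(all(same_dim))
  list(
    assays  = assays,
    # zero-column data.frames: only the row names are stored
    rowData = data.frame(row.names = rownames(first)),
    colData = data.frame(row.names = colnames(first))
  )
}

bar <- matrix(rnorm(100), ncol = 10)
colnames(bar) <- as.character(1:10)
rownames(bar) <- letters[1:10]
foo <- make_container(list(bar = bar))

dim(foo$rowData)  # 10 rows, 0 columns
dim(foo$colData)  # 10 rows, 0 columns
```

Whether something like this could sit underneath SummarizedExperiment
without breaking the ranges-based accessors is exactly the open question.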


Statistics is the grammar of science.
Karl Pearson <http://en.wikipedia.org/wiki/The_Grammar_of_Science>

On Wed, Nov 26, 2014 at 9:07 AM, Peter Haverty <haverty.peter at gene.com>
wrote:

> Hi all,
>
> I believe there is a strong need for an object that organizes a collection
> of rectangular data (matrices, etc.) with metadata on the rows and
> columns.  Can SummarizedExperiment inherit from something simpler that has
> a DataFrame as rowData?  (I believe GenomicRanges should inherit from
> DataTable, rather than Vector, and subset as x[i,j], but maybe that's
> getting a bit off topic.)  I often see people stuffing arbitrary data into
> an ExpressionSet and calling one of the assays "exprs" as a work-around.
>
> Regards,
>
> Pete
>
> ____________________
> Peter M. Haverty, Ph.D.
> Genentech, Inc.
> phaverty at gene.com
>
> On Wed, Nov 26, 2014 at 7:19 AM, Laurent Gatto <lg390 at cam.ac.uk> wrote:
>
> >
> > On 26 November 2014 14:59, Wolfgang Huber wrote:
> >
> > > A colleague and I are designing a package for quantitative proteomics
> > > data, and we are debating whether to base it on the
> > > SummarizedExperiment or the ExpressionSet class.
> > >
> > > There is no immediate use for the ranges aspect of
> > > SummarizedExperiment, so that would have to be carried around with
> > > NAs, and this is a parsimony argument for using ExpressionSet
> > > instead. OTOH, the interface of SummarizedExperiment is cleaner, its
> > > code more modern and more likely to be updated, and users of the
> > > Bioconductor project are likely to benefit from having to deal with a
> > > single interface that works the same or similarly across packages,
> > > rather than a variety of formats; which argues that new packages
> > > should converge towards SummarizedExperiment('s interface).
> > >
> > > Are there any pertinent insights from this group?
> >
> > Instead of ExpressionSet, you could use MSnbase::MSnSet, which is
> > essentially an ExpressionSet for quantitative proteomics (i.e., it has a
> > MIAPE slot instead of a MIAME slot, for example).
> >
> > Ideally, a SummarizedExperiment for proteomics would use peptide/protein
> > ranges, which is in the pipeline, as far as I am concerned. When that
> > becomes available, there should be infrastructure to coerce an MSnSet
> > (and/or other relevant data) into a SummarizedExperiment.
> >
> > Hope this helps.
> >
> > Best wishes,
> >
> > Laurent
> >
> > > Thanks and best wishes
> > > Wolfgang
> > >
> > > _______________________________________________
> > > Bioc-devel at r-project.org mailing list
> > > https://stat.ethz.ch/mailman/listinfo/bioc-devel
> >
> > --
> > Laurent Gatto
> > http://cpu.sysbiol.cam.ac.uk/
> >
