[Bioc-devel] Behaviour of rbind/cbind on assays slot of SummarizedExperiment with multidimensional assays

Peter Hickey peter.hickey at gmail.com
Thu Mar 3 19:49:56 CET 2016


Hi Herve,

I agree, the abind::abind() signature is rather verbose and much of it is not
required in the context of a SummarizedExperiment. Perhaps "overriding"
abind::abind() with an S4 generic with a different signature isn't a good idea
and it would be better to have our own generic.

I quite like arbind() and acbind() as names. I guess these would live in the
SummarizedExperiment package?
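
For the sake of discussion, here is a rough sketch of what such generics might look like for ordinary arrays. This is only an illustration under my own assumptions, not existing SummarizedExperiment code: the helper .bind_along() is a hypothetical name, dimnames are ignored, and other array-like classes would need to supply their own methods.

```r
library(methods)

# Hypothetical helper: bind arrays along dimension `along`,
# keeping every other dimension unchanged (dimnames are dropped).
.bind_along <- function(..., along) {
  objects <- list(...)
  dims <- lapply(objects, dim)
  ndim <- length(dims[[1L]])
  stopifnot(all(lengths(dims) == ndim), along >= 1L, along <= ndim)
  # All dimensions other than `along` must agree across objects.
  other <- lapply(dims, function(d) d[-along])
  stopifnot(all(vapply(other, identical, logical(1), other[[1L]])))
  # Move `along` to the last position so that binding is plain concatenation.
  perm <- c(seq_len(ndim)[-along], along)
  pieces <- lapply(objects, aperm, perm = perm)
  new_len <- sum(vapply(dims, `[`, integer(1), along))
  ans <- array(unlist(pieces), dim = c(dims[[1L]][-along], new_len))
  # Undo the permutation to restore the original dimension order.
  aperm(ans, order(perm))
}

# Generics dispatching on "...", with a much simpler contract than abind().
setGeneric("arbind", function(...) standardGeneric("arbind"))
setGeneric("acbind", function(...) standardGeneric("acbind"))

setMethod("arbind", "array", function(...) .bind_along(..., along = 1L))
setMethod("acbind", "array", function(...) .bind_along(..., along = 2L))

x <- array(rnorm(100), dim = c(4, 5, 5))
dim(arbind(x, x))  # 8 5 5
dim(acbind(x, x))  # 4 10 5
```

With something like this, the rbind/cbind methods for Assays would only need to call arbind()/acbind() on each assay, and any array-like class with those two methods could be used as an assay.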

Happy to do further work on this but I won't have time until the weekend or
next week.

Cheers,
Pete

On Thu, 3 Mar 2016 at 13:31 Hervé Pagès <hpages at fredhutch.org> wrote:
>
> Hi Pete,
>
> On 03/02/2016 12:42 PM, Peter Hickey wrote:
> > This is mostly directed to Herve and/or Martin, but I'd be interested
> > in others' input too.
> >
> > The SummarizedExperiment package defines rbind,Assays-method and
> > cbind,Assays-method that are called when rbind() or cbind() is called
> > on a SummarizedExperiment object. In the case of a two-dimensional assay
> > (a matrix), these work much as if rbind/cbind were called on the matrix:
> >
> >> library(SummarizedExperiment)
> >> m <- matrix(rnorm(100), nrow = 4, ncol = 25)
> >> se1 <- SummarizedExperiment(m)
> >> dim(assay(rbind(se1, se1)))
> > [1]  8 25
> >> dim(rbind(assay(se1), assay(se1)))
> > [1]  8 25
> >> dim(assay(cbind(se1, se1)))
> > [1]  4 50
> >> dim(cbind(assay(se1), assay(se1)))
> > [1]  4 50
> >
> > When an assay is an array with more than 2 dimensions, however, the
> > result of the rbind,Assays-method (resp. cbind,Assays-method) differs
> > from the rbind,array-method (resp. cbind,array-method). This is for a
> > good reason because it preserves the dimensionality of the assay in
> > the SummarizedExperiment object. So in fact the "rbind(...)" of the
> > assay is more like abind::abind(..., along = 1) and the "cbind(...)"
> > of the assay is more like abind::abind(..., along = 2):
> >
> >> x <- array(rnorm(100), dim = c(4, 5, 5))
> >> se2 <- SummarizedExperiment(x)
> >> dim(assay(rbind(se2, se2)))
> > [1] 8 5 5
> >> dim(rbind(assay(se2), assay(se2)))
> > [1]   2 100
> >> dim(abind::abind(assay(se2), assay(se2), along = 1))
> > [1] 8 5 5
> >> identical(assay(rbind(se2, se2)), abind::abind(assay(se2), assay(se2), along = 1))
> > [1] TRUE
> >> dim(assay(cbind(se2, se2)))
> > [1]  4 10  5
> >> dim(cbind(assay(se2), assay(se2)))
> > [1] 100   2
> >> dim(abind::abind(assay(se2), assay(se2), along = 2))
> > [1]  4 10  5
> >> identical(assay(cbind(se2, se2)), abind::abind(assay(se2), assay(se2), along = 2))
> > [1] TRUE
> >
> > rbind/cbind does not work for other "array-like" objects with > 2
> > dimensions in the assays slot of a SummarizedExperiment because the
> > internal function SummarizedExperiment:::.bind_assay_elements()
> > constructs a new array via array() if the assay has more than 2
> > dimensions, thus destroying the original class of the array-like
> > object.
> >
> > What I'm wondering is whether there is a way to generalise rbind/cbind
> > of Assays to other array-like objects provided that they have a suitable
> > method defined. It seems to me that a good candidate would be to
> > require that an object in the assays slot has an abind(..., along = 1)
> > and abind(..., along = 2) method defined if it has more than 2
> > dimensions. It might even be worth using abind::abind() when the
> > assay is an array with more than 2 dimensions to simplify the code
> > somewhat.
> >
> > Thoughts? I'd be happy to work on a patch.
>
> Requiring that abind(..., along=1) and abind(..., along=2) work on
> assays of dim > 2 would work. Note that abind() has a complicated
> signature (many extra arguments) but the "abind" methods that one
> would need to implement wouldn't need to satisfy the full abind()
> contract (in the context of SummarizedExperiment assays, satisfying
> the full contract is not needed and would be too much work).
>
> Alternatively we can introduce our own generics for that e.g.
> abind1() and abind2(), or arbind() and acbind() (for "assay rbind"
> and "assay cbind"). Advantages: the signatures would be cleaner,
> the contracts simpler, and the methods easier to implement. Also
> we wouldn't need to depend on the abind package.
>
> What do you think?
>
> H.
>
> >
> > Cheers,
> > Pete
> >
> > _______________________________________________
> > Bioc-devel at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/bioc-devel
> >
>
> --
> Hervé Pagès
>
> Program in Computational Biology
> Division of Public Health Sciences
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N, M1-B514
> P.O. Box 19024
> Seattle, WA 98109-1024
>
> E-mail: hpages at fredhutch.org
> Phone:  (206) 667-5791
> Fax:    (206) 667-1319
