[Rd] Proper way to define cbind, rbind for s4 classes in package

Michael Lawrence lawrence.michael at gene.com
Sat Jan 24 15:39:37 CET 2015


On Sat, Jan 24, 2015 at 12:58 AM, Mario Annau <mario.annau at gmail.com> wrote:
> Hi all,
> this question has already been posted on stackoverflow, however without
> success, see also
> http://stackoverflow.com/questions/27886535/proper-way-to-use-cbind-rbind-with-s4-classes-in-package.
>
> I have written a package using S4 classes and would like to use the
> functions rbind, cbind with these defined classes.
>
> Since it does not seem to be possible to define rbind and cbind directly
> as S4 methods (see ?cBind) I defined rbind2 and cbind2 instead:
>

This needs some clarification. It certainly is possible to define
cbind and rbind methods. The BiocGenerics package defines generics for
those and many methods are defined by e.g. S4Vectors, IRanges, etc.
The issue is that dispatch on "..." is singular, i.e., you can only
specify one class that all args in "..." must share (potentially
through inheritance). Thus, trying to combine objects from different
hierarchies (or S4 objects with non-S4 objects) will not work. This has
not been a huge
problem for us in practice. For example, we have a DataFrame object
that mimics data.frame. To cbind a data.frame with a DataFrame, the
user can just call the DataFrame() constructor. rbind() between
different data structures is much less common.
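
To illustrate the single-class dispatch on "..." described above, here
is a minimal sketch. The generic name bindCols and the slot name m are
made up for illustration (BiocGenerics defines its generic on cbind
itself); ClassA is borrowed from the original post.

```r
library(methods)

# ClassA wraps a plain matrix; the slot name 'm' is an assumption.
setClass("ClassA", representation(m = "matrix"))

# Hypothetical generic that dispatches on "..." only.
setGeneric("bindCols", function(...) standardGeneric("bindCols"),
           signature = "...")

# One class is named for "...": every argument must be (or extend) ClassA.
setMethod("bindCols", "ClassA", function(...) {
  mats <- lapply(list(...), function(obj) obj@m)
  new("ClassA", m = do.call(cbind, mats))
})

a <- new("ClassA", m = matrix(1:4, nrow = 2))
b <- new("ClassA", m = matrix(5:8, nrow = 2))
ab <- bindCols(a, b)            # dispatches to the ClassA method

# Mixing in a plain matrix leaves no common class with a method,
# so dispatch is expected to fail rather than pick the ClassA method.
mixed <- tryCatch(bindCols(a, matrix(0, 2, 1)), error = function(e) e)
```

In the mixed case the user would first convert the matrix, e.g. with
new("ClassA", m = ...), which mirrors the DataFrame() advice.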

The cBind and rBind functions in Matrix (and the r/cbind replacements
that bind_activation installs; the code is shared) work by recursing:
they drop the first argument until two are left, and then combine
with r/cbind2(). The Biobase package uses a similar strategy to mimic
c() via its non-standard combine() generic. The nice thing about the
combine() approach is that the user entry point and the generic are
the same, instead of having methods on rbind2() and the user calling
rBind().
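
A rough sketch of that recursive strategy, not the actual Matrix or
methods code: bindAll, asMat and bindPair are hypothetical names, and
the slot m on ClassA is assumed. Only cbind2() is a real generic, from
the methods package.

```r
library(methods)

setClass("ClassA", representation(m = "matrix"))

# Hypothetical helper: unwrap ClassA, coerce anything else to a matrix.
asMat <- function(obj) if (is(obj, "ClassA")) obj@m else as.matrix(obj)

# Pairwise combination; any pair involving a ClassA yields a ClassA.
bindPair <- function(x, y) new("ClassA", m = cbind(asMat(x), asMat(y)))

setMethod("cbind2", signature(x = "ClassA", y = "ClassA"), bindPair)
setMethod("cbind2", signature(x = "ClassA", y = "ANY"), bindPair)
setMethod("cbind2", signature(x = "ANY", y = "ClassA"), bindPair)

# Hypothetical user entry point: peel off the first argument, combine
# the remainder recursively, and finish each step with cbind2().
bindAll <- function(...) {
  args <- list(...)
  if (length(args) == 0L) return(NULL)
  if (length(args) == 1L) return(args[[1L]])
  if (length(args) == 2L) return(cbind2(args[[1L]], args[[2L]]))
  cbind2(args[[1L]], do.call(bindAll, args[-1L]))
}

a <- new("ClassA", m = matrix(1:4, nrow = 2))
b <- new("ClassA", m = matrix(5:8, nrow = 2))
res <- bindAll(a, matrix(0, 2, 1), b)   # mixed S4 and non-S4 arguments
```

Because every step is a two-argument dispatch, the pairwise methods
decide the result type, which is how mixed arguments get handled here.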

I would argue that bind_activation(TRUE) should be discouraged,
because it replaces the native rbind and cbind with recursive variants
that are going to cause problems, performance and otherwise. This is
why it is hidden. Perhaps a reasonable compromise would be for the
native cbind and rbind to check whether any arguments are S4 and if
so, resort to recursion. Recursion does seem to be a clean way to
implement "type promotion", i.e., to answer the question "which type
should the result be when faced with mixed-type args?".

Hopefully others have better ideas.

Michael




> setMethod("rbind2", signature(x="ClassA", y = "ANY"),
>     function(x, y) {
>       # Do stuff ...
> })
>
> setMethod("cbind2", signature(x="ClassA", y = "ANY"),
>     function(x, y) {
>       # Do stuff ...
> })
>
> From ?cbind2 I learned that these functions need to be activated using
> methods:::bind_activation to replace rbind and cbind from base.
>
> I included the call in the package file R/zzz.R using the .onLoad function:
>
> .onLoad <- function(...) {
>   # Bind activation of cbind(2) and rbind(2) for S4 classes
>   methods:::bind_activation(TRUE)
> }
>
> This works as expected. However, running R CMD check I am now getting
> the following NOTE since I am using an unexported function in methods:
>
> * checking dependencies in R code ... NOTE
> Unexported object imported by a ':::' call: 'methods:::bind_activation'
>   See the note in ?`:::` about the use of this operator.
>
> How can I get rid of the NOTE and what is the proper way to define the
> methods cbind and rbind for S4 classes in a package?
>
> Best,
> mario
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel