[Rd] Proper way to define cbind, rbind for s4 classes in package
Martin Maechler
maechler at lynne.stat.math.ethz.ch
Mon Feb 2 12:32:48 CET 2015
>>>>> Michael Lawrence <lawrence.michael at gene.com>
>>>>> on Sun, 1 Feb 2015 19:23:06 -0800 writes:
> I've implemented the proposed changes in
> R-devel. Minimally tested, so please try it. It should
> delegate to r/cbind2 when there is at least one S4
> argument and S3 dispatch fails (so you'll probably want to
> add an S3 method for your class to introduce a conflict,
> otherwise it will dispatch to cbind.data.frame if one of
> the args is a data.frame). There may no longer be a need
> for cBind() and rBind().
> Michael
This sounds great! Thank you very much, Michael!
:-) :-)
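
For readers following along: a minimal sketch of the two-argument
building block that the new cbind()/rbind() would delegate to. 'ClassA'
mirrors the placeholder class name used further down in this thread; its
'data' slot and the helper cbind_all() are purely hypothetical. The
right-to-left fold only illustrates the recursion idea discussed in this
thread and is not the code that went into R-devel.

```r
library(methods)

# Hypothetical toy class: a thin S4 wrapper around a base matrix.
setClass("ClassA", representation(data = "matrix"))

# cbind2() is the S4 generic in 'methods' that combines exactly two objects.
# Three methods are defined so that no pair of arguments is ambiguous.
setMethod("cbind2", signature(x = "ClassA", y = "ANY"),
          function(x, y, ...)
              new("ClassA", data = cbind(x@data, y, deparse.level = 0)))
setMethod("cbind2", signature(x = "ANY", y = "ClassA"),
          function(x, y, ...)
              new("ClassA", data = cbind(x, y@data, deparse.level = 0)))
setMethod("cbind2", signature(x = "ClassA", y = "ClassA"),
          function(x, y, ...)
              new("ClassA", data = cbind(x@data, y@data, deparse.level = 0)))

# Hypothetical helper (not part of R): fold cbind2() over the arguments
# from the right, i.e. combine the last two first and work backwards.
cbind_all <- function(...) {
    args <- list(...)
    Reduce(function(x, acc) cbind2(x, acc), args, right = TRUE)
}

a <- new("ClassA", data = matrix(1:6, 3, 2))
r <- cbind_all(0L, a, 7:9)   # mixes a base vector, a ClassA and a vector
dim(r@data)
```

Because every step goes through cbind2(), the result class is decided
pairwise by S4 dispatch, which is the "type promotion" idea from the
earlier messages.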
... but .... :-( experiments with the Matrix package (and R-devel
with your change) show a remaining buglet in the treatment of dimnames:
> M1 <- Matrix(m1 <- matrix(1:12, 3,4))
> cbind(m1, MM = -1)
              MM
[1,] 1 4 7 10 -1
[2,] 2 5 8 11 -1
[3,] 3 6 9 12 -1
> cbind(M1, MM = -1) ## ---- notice the "..."
3 x 5 Matrix of class "dgeMatrix"
...
[1,] 1 4 7 10 -1
[2,] 2 5 8 11 -1
[3,] 3 6 9 12 -1
> rbind(R1 = 10:11, m1)
   [,1] [,2] [,3] [,4]
R1   10   11   10   11
      1    4    7   10
      2    5    8   11
      3    6    9   12
> rbind(R1 = 10:11, M1) ## --- notice the 'deparse.level'
4 x 4 Matrix of class "dgeMatrix"
              [,1] [,2] [,3] [,4]
deparse.level   10   11   10   11
                 1    4    7   10
                 2    5    8   11
                 3    6    9   12
>
Also, it seems you are not observing the 'deparse.level'
argument at all:
Looking at the last three lines of the example in ?cbind,
rbind(1:4, c = 2, "a++" = 10, dd, deparse.level = 0) # middle 2 rownames
rbind(1:4, c = 2, "a++" = 10, dd, deparse.level = 1) # 3 rownames (default)
rbind(1:4, c = 2, "a++" = 10, dd, deparse.level = 2) # 4 rownames
but using a Matrix matrix 'dd', we see that (row)names
construction needs to be amended:
> (dd <- Matrix(rbind(c(0:1,0,0))))
1 x 4 sparse Matrix of class "dgCMatrix"
[1,] . 1 . .
> rbind(1:4, c = 2, "a++" = 10, dd, deparse.level = 0) # middle 2 rownames
4 x 4 sparse Matrix of class "dgCMatrix"
deparse.level  1  2  3  4
c              2  2  2  2
a++           10 10 10 10
               .  1  .  .
> rbind(1:4, c = 2, "a++" = 10, dd, deparse.level = 1) # 3 rownames (default)
4 x 4 sparse Matrix of class "dgCMatrix"
deparse.level  1  2  3  4
c              2  2  2  2
a++           10 10 10 10
               .  1  .  .
> rbind(1:4, c = 2, "a++" = 10, dd, deparse.level = 2) # 4 rownames
4 x 4 sparse Matrix of class "dgCMatrix"
deparse.level  1  2  3  4
c              2  2  2  2
a++           10 10 10 10
               .  1  .  .
>
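
For reference, these are the row names base R constructs when every
argument is a plain vector, using dd <- 10 as in ?cbind. This is only
the baseline that the S4 path would need to reproduce, not a statement
about what the R-devel code currently does.

```r
dd <- 10   # plain numeric vector, as in the ?cbind example

r0 <- rbind(1:4, c = 2, "a++" = 10, dd, deparse.level = 0)
r1 <- rbind(1:4, c = 2, "a++" = 10, dd, deparse.level = 1)
r2 <- rbind(1:4, c = 2, "a++" = 10, dd, deparse.level = 2)

rownames(r0)   # ""    "c" "a++" ""    (only the explicitly named args)
rownames(r1)   # ""    "c" "a++" "dd"  (symbols are deparsed as well)
rownames(r2)   # "1:4" "c" "a++" "dd"  (expressions are deparsed too)

# 'deparse.level' is an option of rbind(), never a row of the result.
"deparse.level" %in% c(rownames(r0), rownames(r1), rownames(r2))
```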
> On Mon, Jan 26, 2015 at 3:55 AM, Martin Maechler <
> maechler at lynne.stat.math.ethz.ch> wrote:
>> >>>>> Michael Lawrence <lawrence.michael at gene.com>
>> >>>>>     on Sat, 24 Jan 2015 06:39:37 -0800 writes:
>>
>> > On Sat, Jan 24, 2015 at 12:58 AM, Mario Annau
>> > <mario.annau at gmail.com> wrote:
>> >> Hi all, this question has already been posted on
>> >> stackoverflow, however without success, see also
>> >> http://stackoverflow.com/questions/27886535/proper-way-to-use-cbind-rbind-with-s4-classes-in-package
>> >>
>> >> I have written a package using S4 classes and would like
>> >> to use the functions rbind, cbind with these defined
>> >> classes.
>> >>
>> >> Since it does not seem to be possible to define rbind and
>> >> cbind directly as S4 methods (see ?cBind) I defined
>> >> rbind2 and cbind2 instead:
>> >>
>>
>> > This needs some clarification. It certainly is possible to
>> > define cbind and rbind methods. The BiocGenerics package
>> > defines generics for those and many methods are defined by
>> > e.g. S4Vectors, IRanges, etc. The issue is that dispatch
>> > on "..." is singular, i.e., you can only specify one class
>> > that all args in "..." must share (potentially through
>> > inheritance).
>>
>> > Thus, trying to combine objects from a different
>> > hierarchy (or non-S4 objects) will not work.
>>
>> Yes, indeed, that's the drawback
>>
>> I've been there almost surely before everyone else, with
>> the Matrix package... and I have been the author of
>> cbind2(), rbind2(), and of course, of cBind(), and
>> rBind().
>>
>> At the time when I introduced these, the above
>> possibility of writing S4 methods for '...' was not
>> yet part of R.
>>
>> > This has not been a huge problem for us in practice.
>> > For example, we have a DataFrame object that mimics
>> > data.frame. To cbind a data.frame with a DataFrame,
>> > the user can just call the DataFrame() constructor.
>> > rbind() between different data structures is much
>> > less common.
>>
>> well... yes and no. Think of using the Matrix package,
>> maybe with another package that defines another
>> generalized matrix class... It would be nice if things
>> worked automatically / perfectly there.
>>
>> > The cBind and rBind functions in Matrix (and the r/cbind
>> > that get installed by bind_activation, the code is shared)
>> > work by recursing, dropping the first argument until two
>> > are left, and then combining with r/cbind2(). The Biobase
>> > package uses a similar strategy to mimic c() via its
>> > non-standard combine() generic. The nice thing about the
>> > combine() approach is the user entry point and the generic
>> > are the same, instead of having methods on rbind2() and
>> > the user calling rBind().
>>
>> > I would argue that bind_activation(TRUE) should be
>> > discouraged,
>>
>> Yes, you are right Michael; it should be discouraged at
>> least to be run in a *package*. One could think of its
>> use by an explicit user call.
>>
>> > because it replaces the native rbind and cbind with
>> > recursive variants that are going to cause problems,
>> > performance and otherwise. This is why it is hidden.
>> > Perhaps a reasonable compromise would be for the
>> > native cbind and rbind to check whether any arguments
>> > are S4 and if so, resort to recursion. Recursion does
>> > seem to be a clean way to implement "type promotion",
>> > i.e., to answer the question "which type should the
>> > result be when faced with mixed-type args?".
>>
>> Exactly. That has been my idea at the time .. ((yes,
>> I'm also the author of the bind_activation()
>> "(mis)functionality".))
>>
>> > Hopefully others have better ideas.
>>
>> that would be great.
>>
>> And even if not, it would be great if we could implement
>> your idea
>>
>> > Perhaps a reasonable compromise would be for the
>> > native cbind and rbind to check whether any
>> > arguments are S4 and if so, resort to recursion.
>>
>> without a noticeable performance penalty in the case of no
>> S4 arguments.
>>
>> Martin
>>
>>
>> > Michael
>>
>> >> setMethod("rbind2", signature(x = "ClassA", y = "ANY"),
>> >>           function(x, y) {
>> >>               # Do stuff ...
>> >>           })
>> >>
>> >> setMethod("cbind2", signature(x = "ClassA", y = "ANY"),
>> >>           function(x, y) {
>> >>               # Do stuff ...
>> >>           })
>> >>
>> >> From ?cbind2 I learned that these functions need to be
>> >> activated using methods:::bind_activation to replace
>> >> rbind and cbind from base.
>> >>
>> >> I included the call in the package file R/zzz.R using the
>> >> .onLoad function:
>> >>
>> >> .onLoad <- function(...) {
>> >>     # Bind activation of cbind(2) and rbind(2) for S4 classes
>> >>     methods:::bind_activation(TRUE)
>> >> }
>> >>
>> >> This works as expected. However, running R CMD check I am now
>> >> getting the following NOTE since I am using an unexported
>> >> function in methods:
>> >>
>> >> * checking dependencies in R code ... NOTE
>> >>   Unexported object imported by a ':::' call: 'methods:::bind_activation'
>> >>   See the note in ?`:::` about the use of this operator.
>> >>
>> >> How can I get rid of the NOTE and what is the proper way to
>> >> define the methods cbind and rbind for S4 classes in a package?
>> >>
>> >> Best, mario
>> >>
>> >> ______________________________________________
>> >> R-devel at r-project.org mailing list
>> >> https://stat.ethz.ch/mailman/listinfo/r-devel
>>