[Bioc-devel] Pushing towards a better home for matrix generics

Martin Maechler m@ech|er @end|ng |rom @t@t@m@th@ethz@ch
Tue Jan 29 12:08:01 CET 2019


>>>>> Michael Lawrence 
>>>>>     on Mon, 28 Jan 2019 19:00:59 -0800 writes:

    > I agree (2) is a good compromise. CC'ing Martin for his
    > perspective.  Michael

Hmm....  there's quite a bit more to it, really:

I'd not be unwilling to do so myself, but over the long history of
Matrix development -- most of which happened quite a few
years ago --
I came to the pretty strong conclusion that you should *NOT*
do setGeneric() for functions in base R, but rather use the
*implicit* generics in the methods package, and then in Matrix
only do setMethod() on those.

E.g., this commit to R's sources of the methods package:
------------------------------------------------------------------------
r49869 | maechler | 2009-09-29 10:34:48 +0200 (Tue, 29 Sep 2009) | 1 line

make chol2inv, rcond, (col|row)(Means|Sums) implicit Generics
------------------------------------------------------------------------
had made rowSums etc. into implicit generics,
to be used in Matrix but in other packages as well,
the whole idea being that this would be "the best way", at least
according to the knowledge back then, including advice from John
Chambers, unless my memory is wrong.

Having many "base R" functions be implicit generics via the
methods package has been *the* approach to achieve the most
interoperability between different R packages, and was strongly
recommended at the time by, e.g., John Chambers.
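
A minimal sketch of that pattern, for illustration only: the class
"ToyMatrix" below is hypothetical and not part of Matrix or any other
package. The point is that there is no setGeneric() call; setMethod()
on a base function picks up the implicit generic kept by the methods
package.

```r
library(methods)

# Hypothetical toy class, used only for this illustration.
setClass("ToyMatrix", representation(data = "matrix"))

# No setGeneric() here: setMethod() on a base function turns it into
# a generic using the implicit generic from the methods package.
# The method's formals follow those of that implicit generic.
setMethod("colSums", "ToyMatrix",
          function(x, na.rm = FALSE, dims = 1, ...) {
              base::colSums(x@data, na.rm = na.rm, dims = dims)
          })

m  <- matrix(1:6, nrow = 2)
tm <- new("ToyMatrix", data = m)

res_toy  <- colSums(tm)  # dispatches to the ToyMatrix method
res_base <- colSums(m)   # plain matrices still reach base::colSums
```

Any other package that also uses only setMethod() on colSums ends up
adding methods to the same generic, which is where the
interoperability comes from.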

In "vanilla" R, only standard packages loaded, including
'methods' of course, calling  implicitGeneric("colSums")  gives

  > implicitGeneric("colSums")
  standardGeneric for "colSums" defined from package "base"

  function (x, na.rm = FALSE, dims = 1, ...) 
  standardGeneric("colSums")
  <bytecode: 0x5893758>
  <environment: 0x588e308>
  Methods may be defined for arguments: x, na.rm, dims
  Use  showMethods("colSums")  for currently available ones.
  > 

And it seems you would want to limit dispatch to 'x' and hence
have the following line in the above be changed from
   Methods may be defined for arguments: x, na.rm, dims
to
   Methods may be defined for arguments: x
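
The two signatures can be compared directly. This is only a sketch,
meant for a fresh session with just the standard packages attached:
the setGeneric() call is the one quoted from BiocGenerics further
down, and running it at top level creates a new generic in the global
environment that masks base::colSums for the rest of the session.

```r
library(methods)

# Dispatch arguments of the implicit generic kept by 'methods'.
ig <- implicitGeneric("colSums")
ig_sig <- ig@signature

# An explicit generic with dispatch restricted to 'x'. Because its
# signature differs from the implicit one, this creates a *new*
# generic (here in the global environment).
setGeneric("colSums", signature = "x")
new_sig <- getGeneric("colSums")@signature

print(ig_sig)   # arguments the implicit generic dispatches on
print(new_sig)  # arguments the restricted generic dispatches on
```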

We should probably move this to R-devel -- as indeed it
concerns code more in R than in Matrix, I think.

Martin

    > On Mon, Jan 28, 2019 at 6:58 PM Pages, Herve
    > <hpages using fredhutch.org> wrote:
    >> 
    >> Hi Aaron,
    >> 
    >> The 4 matrix summarization generics currently defined in
    >> BiocGenerics are defined as follows:
    >> 
    >> setGeneric("rowSums", signature="x")
    >> setGeneric("colSums", signature="x")
    >> setGeneric("rowMeans", signature="x")
    >> setGeneric("colMeans", signature="x")
    >> 
    >> The only reason for having these definitions in
    >> BiocGenerics is to restrict dispatch to the first
    >> argument. This is cleaner than what we would get with the
    >> implicit generics where dispatch is on all arguments (it
    >> doesn't really make sense to dispatch on toggles like
    >> 'na.rm' or 'dims'). Sticking to simple dispatch when
    >> possible makes life easier for the developer (especially
    >> in times of troubleshooting) and for the user (methods
    >> are easier to discover and their man pages easier to
    >> access).
    >> 
    >> However, the 4 statements above create new generics that
    >> mask the implicit generics defined in the Matrix package
    >> (Matrix doesn't contain any setGeneric statements for
    >> these generics, only setMethod statements). This is a
    >> very unsatisfying situation and it has hit me repeatedly
    >> over the last couple of years.
    >> 
    >> We have basically 3 ways to go. From simpler to more
    >> complicated:
    >> 
    >> 1) Give up on single dispatch for these generics. That
    >> is, we remove the 4 statements above from
    >> BiocGenerics. Then we use setMethod() in package code
    >> like Matrix does.
    >> 
    >> 2) Convince the Matrix folks to put the 4 statements
    >> above in Matrix.  Then any BioC package that needs to
    >> define methods for these generics would just need to
    >> import them from the Matrix package. Maybe we could even
    >> push this one step further by having BiocGenerics import
    >> and re-export these generics. This would make them
    >> "available" in BioC as soon as BiocGenerics is loaded
    >> (and any package that needs to define methods on them
    >> would just need to import them from BiocGenerics).
    >> 
    >> 3) Put the 4 statements above in a MatrixGenerics
    >> package. Then convince the Matrix folks to define methods
    >> on the generics defined in MatrixGenerics. Very unlikely
    >> to happen!
    >> 
    >> IMO 2) is the best compromise. Want to give it a shot?
    >> 
    >> H.
    >> 
    >> 
    >> On 1/27/19 13:45, Aaron Lun wrote:
    >> > This is a resurrection of some old threads:
    >> >
    >> >
    >> https://urldefense.proofpoint.com/v2/url?u=https-3A__stat.ethz.ch_pipermail_bioc-2Ddevel_2017-2DNovember_012273.html&d=DwIDaQ&c=eRAMFD45gAfqt84VtBcfhQ&r=BK7q3XeAvimeWdGbWY_wJYbW0WYiZvSXAJJKaaPhzWA&m=O21AQgvbUp3XRwM4jf0WeZA2ePj9yT3fc2X5hOsKNJk&s=pcpUyjpkQe6U79lZ_n2SANNp6Zj_s6i1Sq2yZx2NSjw&e=
    >> >
    >> >
    >> https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_Bioconductor_MatrixGenerics_issues&d=DwIDaQ&c=eRAMFD45gAfqt84VtBcfhQ&r=BK7q3XeAvimeWdGbWY_wJYbW0WYiZvSXAJJKaaPhzWA&m=O21AQgvbUp3XRwM4jf0WeZA2ePj9yT3fc2X5hOsKNJk&s=NrmcVnmvgkDp3p64J-izZU9VD5nFsFCWOTI-TsnzCpY&e=
    >> >
    >> > For those who are unfamiliar with this, the basic issue is that various
    >> > Matrix and BiocGenerics functions mask each other. This is mildly
    >> > frustrating in interactive sessions:
    >> >
    >> > > library(Matrix)
    >> > > library(DelayedArray)
    >> > > x <- rsparsematrix(10, 10, 0.1)
    >> > > colSums(x) # fails
    >> > > Matrix::colSums(x) # okay
    >> >
    >> > ... but quite annoying during package development, requiring code like
    >> > this:
    >> >
    >> > if (is(x, "Matrix")) {
    >> >     z <- Matrix::colSums(x)
    >> > } else {
    >> >     z <- colSums(x) # assuming DelayedArray does the masking.
    >> > }
    >> >
    >> > ... which defeats the purpose of using S4 dispatch in the first place.
    >> >
    >> > I have been encountering this issue with increasing frequency in my
    >> > packages, as a lot of my code base needs to be able to interface with
    >> > both Matrix and Bioconductor objects (e.g., DelayedMatrices) at the
    >> > same time. What needs to happen so that I can just write:
    >> >
    >> > z <- colSums(x)
    >> >
    >> > ... and everything will work for both Matrix and Bioconductor classes?
    >> > It seems that many of these function names are implicit generics
    >> > anyway, can BiocGenerics take advantage of that for the time being?
    >> >
    >> > Best,
    >> >
    >> > Aaron
    >> >
    >> > _______________________________________________
    >> > Bioc-devel using r-project.org mailing list
    >> https://urldefense.proofpoint.com/v2/url?u=https-3A__stat.ethz.ch_mailman_listinfo_bioc-2Ddevel&d=DwIDaQ&c=eRAMFD45gAfqt84VtBcfhQ&r=BK7q3XeAvimeWdGbWY_wJYbW0WYiZvSXAJJKaaPhzWA&m=O21AQgvbUp3XRwM4jf0WeZA2ePj9yT3fc2X5hOsKNJk&s=JtgGBnaZJH44fV8OUp-SwnHxhD_i_mdVkqoMfUoA5tM&e=
    >> 
    >> --
    >> Hervé Pagès
    >> 
    >> Program in Computational Biology
    >> Division of Public Health Sciences
    >> Fred Hutchinson Cancer Research Center
    >> 1100 Fairview Ave. N, M1-B514
    >> P.O. Box 19024
    >> Seattle, WA 98109-1024
    >> 
    >> E-mail: hpages using fredhutch.org
    >> Phone: (206) 667-5791
    >> Fax: (206) 667-1319