[Bioc-devel] Pushing towards a better home for matrix generics

Martin Maechler m@ech|er @end|ng |rom @t@t@m@th@ethz@ch
Tue Jan 29 12:16:20 CET 2019


>>>>> Michael Lawrence 
>>>>>     on Mon, 28 Jan 2019 20:47:58 -0800 writes:

    > That will have some consequences; for example,
    > documentation aliases will need to change. Not sure how
    > many packages will need to be fixed outside of Matrix, but
    > it's not an isolated change. Martin might comment on the
    > rationale for the full signature, since he defined those
    > generics.

    > On Mon, Jan 28, 2019 at 7:21 PM Pages, Herve
    > <hpages using fredhutch.org> wrote:
    >> 
    >> Actually there is a 4th solution which is to modify the
    >> definition of the implicit generics in the methods
    >> package (file makeBasicFunsList.R) to make them dispatch
    >> on their 1st arg only. Should be easy. Then no package
    >> will need to use a setGeneric statement
    >> anymore. Everybody will automatically get a clean
    >> implicit generic. Including the Matrix package which
    >> shouldn't need to be touched (except maybe for some
    >> aliases in its man page that might need to be changed
    >> from \alias{colSums,dgCMatrix,ANY,ANY-method} to
    >> \alias{colSums,dgCMatrix-method}).
    >> 
    >> Anybody wants to try to make a patch for this?

    >> H.

I've already replied without having read the above two messages.
In my reply I had indeed more or less argued as Hervé does above.

Michael, Hervé, .. : Why is it really so much better to disallow
dispatch for the other compulsory arguments?
Dispatch there allows to use methods for class "missing" which
is nicer in my eyes than the traditional default argument + missing() "tricks".

Is it mainly speed you are concerned about.  If yes, do we have
data (and data analysis) about performance here?

Martin

    >> 
    >> On 1/28/19 19:00, Michael Lawrence wrote: > I agree (2)
    >> is a good compromise. CC'ing Martin for his perspective.
    >> >
    >> > Michael
    >> >
    >> > On Mon, Jan 28, 2019 at 6:58 PM Pages, Herve
    >> <hpages using fredhutch.org> wrote: >> Hi Aaron,
    >> >>
    >> >> The 4 matrix summarization generics currently defined
    >> in BiocGenerics >> are defined as followed:
    >> >>
    >> >> setGeneric("rowSums", signature="x") >>
    >> setGeneric("colSums", signature="x") >>
    >> setGeneric("rowMeans", signature="x") >>
    >> setGeneric("colMeans", signature="x")
    >> >>
    >> >> The only reason for having these definitions in
    >> BiocGenerics is to >> restrict dispatch the first
    >> argument. This is cleaner than what we would >> get with
    >> the implicit generics where dispatch is on all arguments
    >> (it >> doesn't really make sense to dispatch on toggles
    >> like 'na.rm' or >> 'dims'). Sticking to simple dispatch
    >> when possible makes life easier for >> the developer
    >> (especially in times of troubleshooting) and for the user
    >> >> (methods are easier to discover and their man pages
    >> easier to access).
    >> >>
    >> >> However, the 4 statements above create new generics
    >> that mask the >> implicit generics defined in the Matrix
    >> package (Matrix doesn't contain >> any setGeneric
    >> statements for these generics, only setMethod >>
    >> statements). This is a very unsatisfying situation and it
    >> has hit me >> repeatedly over the last couple of years.
    >> >>
    >> >> We have basically 3 ways to go. From simpler to more
    >> complicated:
    >> >>
    >> >> 1) Give up on single dispatch for these generics. That
    >> is, we remove the >> 4 statements above from
    >> BiocGenerics. Then we use setMethod() in package >> code
    >> like Matrix does.
    >> >>
    >> >> 2) Convince the Matrix folks to put the 4 statements
    >> above in Matrix.  >> Then any BioC package that needs to
    >> define methods for these generics >> would just need to
    >> import them from the Matrix package. Maybe we could >>
    >> even push this one step further by having BiocGenerics
    >> import and >> re-export these generics. This would make
    >> them "available" in BioC as >> soon as the BiocGenerics
    >> is loaded (and any package that needs to define >>
    >> methods on them would just need to import them from
    >> BiocGenerics).
    >> >>
    >> >> 3) Put the 4 statements above in a MatrixGenerics
    >> package. Then convince >> the Matrix folks to define
    >> methods on the generics defined in >>
    >> MatrixGenerics. Very unlikely to happen!
    >> >>
    >> >> IMO 2) is the best compromise. Want to give it a shot?
    >> >>
    >> >> H.
    >> >>
    >> >>
    >> >> On 1/27/19 13:45, Aaron Lun wrote: >>> This is a
    >> resurrection of some old threads:
    >> >>>
    >> >>>
    >> https://urldefense.proofpoint.com/v2/url?u=https-3A__stat.ethz.ch_pipermail_bioc-2Ddevel_2017-2DNovember_012273.html&d=DwIDaQ&c=eRAMFD45gAfqt84VtBcfhQ&r=BK7q3XeAvimeWdGbWY_wJYbW0WYiZvSXAJJKaaPhzWA&m=O21AQgvbUp3XRwM4jf0WeZA2ePj9yT3fc2X5hOsKNJk&s=pcpUyjpkQe6U79lZ_n2SANNp6Zj_s6i1Sq2yZx2NSjw&e=
    >> >>>
    >> >>>
    >> https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_Bioconductor_MatrixGenerics_issues&d=DwIDaQ&c=eRAMFD45gAfqt84VtBcfhQ&r=BK7q3XeAvimeWdGbWY_wJYbW0WYiZvSXAJJKaaPhzWA&m=O21AQgvbUp3XRwM4jf0WeZA2ePj9yT3fc2X5hOsKNJk&s=NrmcVnmvgkDp3p64J-izZU9VD5nFsFCWOTI-TsnzCpY&e=
    >> >>>
    >> >>> For those who are unfamiliar with this, the basic
    >> issue is that various >>> Matrix and BiocGenerics
    >> functions mask each other. This is mildly >>> frustrating
    >> in interactive sessions:
    >> >>>
    >> >>>> library(Matrix) >>>> library(DelayedArray) >>>> x <-
    >> rsparsematrix(10, 10, 0.1) >>>> colSums(x) # fails >>>>
    >> Matrix::colSums(x) # okay >>> ... but quite annoying
    >> during package development, requiring code like >>> this:
    >> >>>
    >> >>> if (is(x, "Matrix")) { >>> z <- Matrix::colSums(x)
    >> >>> } else { >>> z <- colSums(x) # assuming DelayedArray
    >> does the masking.  >>> }
    >> >>>
    >> >>> ... which defeats the purpose of using S4 dispatch in
    >> the first place.
    >> >>>
    >> >>> I have been encountering this issue with increasing
    >> frequency in my >>> packages, as a lot of my code base
    >> needs to be able to interface with >>> both Matrix and
    >> Bioconductor objects (e.g., DelayedMatrices) at the >>>
    >> same time. What needs to happen so that I can just write:
    >> >>>
    >> >>> z <- colSums(x)
    >> >>>
    >> >>> ... and everything will work for both Matrix and
    >> Bioconductor classes?  >>> It seems that many of these
    >> function names are implicit generics >>> anyway, can
    >> BiocGenerics take advantage of that for the time being?
    >> >>>
    >> >>> Best,
    >> >>>
    >> >>> Aaron
    >> >>>
    >> >>> _______________________________________________ >>>
    >> Bioc-devel using r-project.org mailing list >>>
    >> https://urldefense.proofpoint.com/v2/url?u=https-3A__stat.ethz.ch_mailman_listinfo_bioc-2Ddevel&d=DwIDaQ&c=eRAMFD45gAfqt84VtBcfhQ&r=BK7q3XeAvimeWdGbWY_wJYbW0WYiZvSXAJJKaaPhzWA&m=O21AQgvbUp3XRwM4jf0WeZA2ePj9yT3fc2X5hOsKNJk&s=JtgGBnaZJH44fV8OUp-SwnHxhD_i_mdVkqoMfUoA5tM&e=
    >> >> --
    >> >> Hervé Pagès
    >> >>
    >> >> Program in Computational Biology >> Division of Public
    >> Health Sciences >> Fred Hutchinson Cancer Research Center
    >> >> 1100 Fairview Ave. N, M1-B514 >> P.O. Box 19024 >>
    >> Seattle, WA 98109-1024
    >> >>
    >> >> E-mail: hpages using fredhutch.org >> Phone: (206) 667-5791
    >> >> Fax: (206) 667-1319
    >> >>
    >> >> _______________________________________________ >>
    >> Bioc-devel using r-project.org mailing list >>
    >> https://urldefense.proofpoint.com/v2/url?u=https-3A__stat.ethz.ch_mailman_listinfo_bioc-2Ddevel&d=DwIFaQ&c=eRAMFD45gAfqt84VtBcfhQ&r=BK7q3XeAvimeWdGbWY_wJYbW0WYiZvSXAJJKaaPhzWA&m=c-Mmi30ouubEEHC5W9_X6DIwxblt1nQlIfgCaK8uCJU&s=U8Hu1kzglD_RP7t_eR5w_nYAIaupBgrEKx11geSZwVg&e=
    >> 
    >> --
    >> Hervé Pagès
    >> 
    >> Program in Computational Biology Division of Public
    >> Health Sciences Fred Hutchinson Cancer Research Center
    >> 1100 Fairview Ave. N, M1-B514 P.O. Box 19024 Seattle, WA
    >> 98109-1024
    >> 
    >> E-mail: hpages using fredhutch.org Phone: (206) 667-5791 Fax:
    >> (206) 667-1319
    >>



More information about the Bioc-devel mailing list