[Bioc-devel] Pushing towards a better home for matrix generics

Pages, Herve hpages at fredhutch.org
Tue Jan 29 03:57:36 CET 2019


Hi Aaron,

The 4 matrix summarization generics currently defined in BiocGenerics 
are defined as follows:

   setGeneric("rowSums", signature="x")
   setGeneric("colSums", signature="x")
   setGeneric("rowMeans", signature="x")
   setGeneric("colMeans", signature="x")

The only reason for having these definitions in BiocGenerics is to 
restrict dispatch to the first argument. This is cleaner than what we would 
get with the implicit generics where dispatch is on all arguments (it 
doesn't really make sense to dispatch on toggles like 'na.rm' or 
'dims'). Sticking to simple dispatch when possible makes life easier for 
the developer (especially in times of troubleshooting) and for the user 
(methods are easier to discover and their man pages easier to access).
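
To make the difference concrete, here is a minimal sketch. The generics 
'rowTotalAll' and 'rowTotalX' are hypothetical names made up for this 
illustration (they are not in BiocGenerics or Matrix); they only borrow 
the formals of base::rowSums so that the two dispatch signatures can be 
compared side by side.

```r
library(methods)

# Hypothetical generic: no 'signature' argument, so dispatch is allowed
# on every formal argument (x, na.rm, dims).
setGeneric("rowTotalAll",
    function(x, na.rm = FALSE, dims = 1L) standardGeneric("rowTotalAll"))

# Hypothetical generic with the same formals, but dispatch restricted
# to the first argument, as in the 4 statements above.
setGeneric("rowTotalX",
    function(x, na.rm = FALSE, dims = 1L) standardGeneric("rowTotalX"),
    signature = "x")

# Inspect the arguments each generic dispatches on.
sig_all <- as.character(getGeneric("rowTotalAll")@signature)
sig_x   <- as.character(getGeneric("rowTotalX")@signature)
print(sig_all)
print(sig_x)

# A method for the single-dispatch generic only needs to name a class for 'x'.
setMethod("rowTotalX", "matrix",
    function(x, na.rm = FALSE, dims = 1L) {
        base::rowSums(x, na.rm = na.rm, dims = dims)
    })
res <- rowTotalX(matrix(1:4, nrow = 2))
print(res)
```

With single dispatch, showMethods() and ?-style lookups only have to 
deal with the class of 'x'.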

However, the 4 statements above create new generics that mask the 
implicit generics defined in the Matrix package (Matrix doesn't contain 
any setGeneric statements for these generics, only setMethod 
statements). This is a very unsatisfying situation and it has hit me 
repeatedly over the last couple of years.

We have basically 3 ways to go. From simpler to more complicated:

1) Give up on single dispatch for these generics. That is, we remove the 
4 statements above from BiocGenerics. Then we use setMethod() in package 
code like Matrix does.

2) Convince the Matrix folks to put the 4 statements above in Matrix. 
Then any BioC package that needs to define methods for these generics 
would just need to import them from the Matrix package. Maybe we could 
even push this one step further by having BiocGenerics import and 
re-export these generics. This would make them "available" in BioC as 
soon as BiocGenerics is loaded (and any package that needs to define 
methods on them would just need to import them from BiocGenerics).

3) Put the 4 statements above in a MatrixGenerics package. Then convince 
the Matrix folks to define methods on the generics defined in 
MatrixGenerics. Very unlikely to happen!
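
For reference, option 1) amounts to something like the sketch below. The 
class 'ToyMatrix' is a hypothetical stand-in for a real package class and 
is not taken from any existing package. There is no setGeneric() call: 
setMethod() creates the implicit generic from base::colSums, which is my 
understanding of how Matrix does it.

```r
library(methods)

# Hypothetical class wrapping an ordinary matrix (for illustration only).
setClass("ToyMatrix", representation(data = "matrix"))

# No setGeneric() here: setMethod() turns base::colSums into an
# implicit generic and registers the method on it.
setMethod("colSums", "ToyMatrix",
    function(x, na.rm = FALSE, dims = 1L) {
        base::colSums(x@data, na.rm = na.rm, dims = dims)
    })

toy <- new("ToyMatrix", data = matrix(1:6, nrow = 2))
toy_sums <- colSums(toy)                        # dispatches to the ToyMatrix method
plain_sums <- colSums(matrix(1:6, nrow = 2))    # falls back to base::colSums
print(toy_sums)
print(plain_sums)
```

The price is that the generic created this way dispatches on all of its 
arguments rather than on 'x' alone.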

IMO 2) is the best compromise. Want to give it a shot?

H.


On 1/27/19 13:45, Aaron Lun wrote:
> This is a resurrection of some old threads:
>
> https://stat.ethz.ch/pipermail/bioc-devel/2017-November/012273.html
>
> https://github.com/Bioconductor/MatrixGenerics/issues
>
> For those who are unfamiliar with this, the basic issue is that various
> Matrix and BiocGenerics functions mask each other. This is mildly
> frustrating in interactive sessions:
>
> > library(Matrix)
> > library(DelayedArray)
> > x <- rsparsematrix(10, 10, 0.1)
> > colSums(x) # fails
> > Matrix::colSums(x) # okay
>
> ... but quite annoying during package development, requiring code like
> this:
>
>      if (is(x, "Matrix")) {
>          z <- Matrix::colSums(x)
>      } else {
>          z <- colSums(x) # assuming DelayedArray does the masking.
>      }
>
> ... which defeats the purpose of using S4 dispatch in the first place.
>
> I have been encountering this issue with increasing frequency in my
> packages, as a lot of my code base needs to be able to interface with
> both Matrix and Bioconductor objects (e.g., DelayedMatrices) at the
> same time. What needs to happen so that I can just write:
>
>      z <- colSums(x)
>
> ... and everything will work for both Matrix and Bioconductor classes?
> It seems that many of these function names are implicit generics
> anyway, can BiocGenerics take advantage of that for the time being?
>
> Best,
>
> Aaron
>

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages using fredhutch.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319


