[Bioc-devel] any interest in a BiocMatrix core package?

Peter Hickey peter.hickey at gmail.com
Thu Nov 2 23:20:10 CET 2017


As Michael notes, I think the scope here is broader than considering S4
generics for functions in base R. To summarise, I think we would be looking
to have S4 generics for the following:

- All(?) the row*/col* functions in matrixStats (NB: matrixStats uses plain
old functions with no S3 or S4, which I believe was to avoid any overhead
of method dispatch since it is explicitly targeting ordinary matrix objects
as input)
- Potentially new row*/col* summaries (i.e. that don't currently exist in
matrixStats)
- Perhaps moving from BiocGenerics the S4 generics defined in
R/matrix-summary.R?
- Perhaps apply() (E.g., DelayedArray defines an S4 generic for this)
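
For illustration, a minimal sketch of what one such generic could look like.
This is not the code from the package skeleton mentioned below. The choice to
dispatch on 'x' only, and the base-R method for ordinary matrices, are
assumptions made for the example; a real method would presumably call
matrixStats instead of apply().

```r
library(methods)

# Hypothetical generic: the signature (dispatch on 'x' only) is an assumption.
setGeneric("rowVars", function(x, ...) standardGeneric("rowVars"),
           signature = "x")

# Illustrative method for ordinary matrices, using base R only.
setMethod("rowVars", "matrix", function(x, na.rm = FALSE, ...) {
    apply(x, 1L, stats::var, na.rm = na.rm)
})

m <- matrix(c(1, 2, 3, 4, 5, 9), nrow = 2)
rowVars(m)
```

Other packages could then add methods for their own matrix-like classes
without redefining the generic.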

Having these as part of base R or in a recommended package would be great,
but of course comes with its own challenges. The alternative is a
lightweight package, likely better hosted on CRAN than BioC to assist with
wider adoption and integration with Matrix, matrixStats, and other non-BioC
packages.

As Michael notes, getting the generic signature 'right' will be important
and there are undoubtedly other challenges ahead (I've started a TODO).
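
To make the signature question concrete, here is a small sketch of a method
defined on the implicit generic for colSums() that is quoted further down.
The class name WrappedMatrix is hypothetical. The method has to repeat the
generic's formal arguments (x, na.rm, dims, ...) even though it only
dispatches on 'x'.

```r
library(methods)

# Hypothetical wrapper class, used only for illustration.
setClass("WrappedMatrix", representation(m = "matrix"))

# No setGeneric() call here: setMethod() derives the generic from the
# implicit generic for colSums.
setMethod("colSums", "WrappedMatrix",
          function(x, na.rm = FALSE, dims = 1, ...) {
              base::colSums(x@m, na.rm = na.rm, dims = dims, ...)
          })

w <- new("WrappedMatrix", m = matrix(1:6, nrow = 2))
colSums(w)
```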

Might Bioconductor open up a GitHub repo (MatrixGenerics?) where this can
be discussed with accompanying code? I've made the skeleton of a
MatrixGenerics package that I could upload to kick things off, along with
adding my TODOs as Issues on GitHub for further discussion.

Cheers,
Pete


On Thu, 2 Nov 2017 at 13:10 Michael Lawrence <lawrence.michael at gene.com>
wrote:

> I'm pretty sure we're also considering generics for functions that do not
> exist in base R. Like rowVars() and colVars(). This sort of suggests that
> matrixStats should be part of base R.
>
> As an aside, we should think about the signature on those implicit
> generics. Should they really include na.rm and dims? The simpler the
> signature, the easier to understand the API.
>
>
> On Thu, Nov 2, 2017 at 10:38 AM, Martin Maechler
> <maechler at stat.math.ethz.ch> wrote:
>
> > >>>>> Martin Morgan <martin.morgan at roswellpark.org>
> > >>>>>     on Thu, 2 Nov 2017 06:17:19 -0400 writes:
> >
> >     > On 11/02/2017 05:00 AM, Martin Maechler wrote:
> >     >>>>>>> "ML" == Michael Lawrence <lawrence.michael at gene.com>
> >     >>>>>>> on Wed, 1 Nov 2017 14:13:54 -0700 writes:
> >     >>
> >     >> > Probably way easier to add the generics to the Matrix
> >     >> > package and everyone just depends on that.
> >     >>
> >     >> Yes!  It is 'Recommended' and comes with every R
> >     >> installation, and has had many such matrix S4 methods in
> >     >> place for > 10 years, notably for dealing with (large)
> >     >> sparse matrices.
> >     >>
> >     >> Honestly, I (as co-maintainer of Matrix, principal
> >     >> maintainer for several years now) had been a bit
> >     >> surprised and frustrated that the 'matrixStats'
> >     >> initiative had started w/o any contact with the Matrix
> >     >> package maintainers and initially has not ever tried to
> >     >> use Matrix package classes or functionality (and this is
> >     >> still the case now AFAICS).
> >     >>
> >     >> I'm happy to coordinate with maintainers of bioc packages
> >     >> about which generics (and classes !) to use and export,
> >     >> etc.
> >
> >     > One issue is that Matrix is a relatively large package
> >     > (well, I wonder if that's a reasonable statement, given
> >     > the Bioc dependencies and data involved, but perhaps in
> >     > general...) and hence 'overkill' to obtain a collection of
> >     > generics. Is there any prospect for factoring out the
> >     > definition of the generics from implementation of the
> >     > methods?  Re-purposing stats4 ?
> >
> >     > Martin Morgan
> >
> > Hmm..  we have quite a few  setGenericImplicit()  statements in
> > the methods package already, notably for  'colSums' and friends,
> > and so other decent citizen packages do *NOT*  setGeneric() at
> > all on these ... and of course, Matrix _is_ a decent citizen in
> > the R package universe.
> >
> > Instead of to stats4, I'm pretty sure we should only consider
> > what functions should be added to the implicit generics already
> > provided by the 'methods' package itself.
> >
> > Could it be that (some of) you are not properly aware of
> > implicit generics?
> >
> > If you start 'R --vanilla' you can say
> >
> > > implicitGeneric("colSums")
> > standardGeneric for "colSums" defined from package "base"
> >
> > function (x, na.rm = FALSE, dims = 1, ...)
> > standardGeneric("colSums")
> > <bytecode: 0x6cb4798>
> > <environment: 0x6cab560>
> > Methods may be defined for arguments: x, na.rm, dims
> > Use  showMethods("colSums")  for currently available ones.
> > ---------
> >
> > so I think it is clear how *any* decent package has to define
> > methods for colSums(), and if they do, there should not be any conflicts.
> >
> > I think the problem is with S3 methods, not with S4 ones, where
> > the implicit generics, as I understand it, were made for dealing with
> > several packages writing methods for the same generic without
> > one of the packages taking precedence.
> >
> > Martin Mächler
> >
> >
> >
> >     >>
> >     >> Best, Martin Maechler ETH Zurich (and R core team)
> >     >>
> >     >>
> >     >>
> >     >> > On Wed, Nov 1, 2017 at 1:59 PM, Hervé Pagès
> >     >> > <hpages at fredhutch.org> wrote:
> >     >>
> >     >> >> That's probably a good idea but a clean solution would
> >     >> >> need to involve all players, including the Matrix
> >     >> >> package. Right now there are conflicts for some S4
> >     >> >> generics defined in Matrix and in BiocGenerics
> >     >> >> (e.g. rowSums). I'm not sure that moving rowSums from
> >     >> >> BiocGenerics to a new MatrixGenerics package would
> >     >> >> address this.  Unless MatrixGenerics is on CRAN and
> >     >> >> Matrix depends on it ;-)
> >     >> >>
> >     >> >> How likely is this to happen?
> >     >> >>
> >     >> >> H.
> >     >> >>
> >     >> >>
> >     >> [............]
> >     >>
> >     >> _______________________________________________
> >     >> Bioc-devel at r-project.org mailing list
> >     >> https://stat.ethz.ch/mailman/listinfo/bioc-devel
> >     >>
> >
> >
> >
>
