[Bioc-devel] any interest in a BiocMatrix core package?
Peter Hickey
peter.hickey at gmail.com
Thu Nov 2 23:20:10 CET 2017
As Michael notes, I think the scope here is broader than considering S4
generics for functions in base R. To summarise, I think we would be looking
to have S4 generics for the following:
- All(?) the row*/col* functions in matrixStats (NB: matrixStats uses plain
old functions with no S3 or S4, which I believe was to avoid any overhead
of method dispatch since it is explicitly targeting ordinary matrix objects
as input)
- Potentially new row*/col* summaries (i.e. that don't currently exist in
matrixStats)
- Perhaps moving from BiocGenerics the S4 generics defined in
R/matrix-summary.R?
- Perhaps apply() (E.g., DelayedArray defines an S4 generic for this)
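
For illustration only, here is a minimal sketch of what one such generic might look like. It assumes a generic that dispatches on 'x' alone; the ToyMatrix class is hypothetical, and the ordinary-matrix method uses base R rather than matrixStats so that the example has no dependencies. This is not the API of any existing package.

```r
library(methods)

# Hypothetical generic; dispatch is restricted to `x` to keep the
# signature simple (an assumption, not a settled design).
setGeneric("rowVars", function(x, ...) standardGeneric("rowVars"),
           signature = "x")

# Method for ordinary matrices. A real package would probably delegate
# to matrixStats::rowVars(); base R is used here to stay dependency-free.
setMethod("rowVars", "matrix", function(x, na.rm = FALSE, ...) {
    apply(x, 1L, var, na.rm = na.rm)
})

# ToyMatrix is a made-up class standing in for a matrix-like container.
setClass("ToyMatrix", representation(data = "matrix"))

# The container's method unwraps the data and re-dispatches.
setMethod("rowVars", "ToyMatrix", function(x, ...) {
    rowVars(x@data, ...)
})

m <- matrix(c(1, 2, 3, 4, 5, 9), nrow = 2)
rowVars(m)
rowVars(new("ToyMatrix", data = m))
```

Both calls return the same per-row variances, so code written against the generic would not need to know which matrix-like class it was given.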
Having these as part of base R or in a recommended package would be great,
but of course comes with its own challenges. The alternative is a
lightweight package, likely better hosted on CRAN than BioC to assist with
wider adoption and integration with Matrix, matrixStats, and other non-BioC
packages.
As Michael notes, getting the generic signature 'right' will be important
and there are undoubtedly other challenges ahead (I've started a TODO).
Might Bioconductor open up a GitHub repo (MatrixGenerics?) where this can
be discussed with accompanying code? I've made the skeleton of a
MatrixGenerics package that I could upload to kick things off, along with
adding my TODOs as Issues on GitHub for further discussion.
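
To make the signature question concrete, here is a small sketch built on the implicit generic for colSums() that Martin Maechler shows in the quoted message below. ToyWrapper is a hypothetical class used only for this example. The point is that a package can call setMethod() without any setGeneric(), and the generic it gets inherits whatever signature the implicit generic declares.

```r
library(methods)

# The implicit generic that 'methods' provides for base::colSums().
# Its signature slot lists the arguments that methods may dispatch on.
implicitGeneric("colSums")@signature

# ToyWrapper is a hypothetical class used only for this illustration.
setClass("ToyWrapper", representation(data = "matrix"))

# No setGeneric() call: setMethod() derives the generic from the
# implicit one, so packages doing the same agree on the signature.
setMethod("colSums", "ToyWrapper",
          function(x, na.rm = FALSE, dims = 1, ...) {
              base::colSums(x@data, na.rm = na.rm, dims = dims)
          })

w <- new("ToyWrapper", data = matrix(1:6, nrow = 2))
colSums(w)
colSums(matrix(1:6, nrow = 2))  # ordinary matrices still reach base::colSums()
```

Any new generics for functions outside base R would need their signatures chosen up front in the same way, which is why it seems worth settling early.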
Cheers,
Pete
On Thu, 2 Nov 2017 at 13:10 Michael Lawrence <lawrence.michael at gene.com>
wrote:
> I'm pretty sure we're also considering generics for functions that do not
> exist in base R. Like rowVars() and colVars(). This sort of suggests that
> matrixStats should be part of base R.
>
> As an aside, we should think about the signature on those implicit
> generics. Should they really include na.rm and dims? The simpler the
> signature, the easier to understand the API.
>
>
> On Thu, Nov 2, 2017 at 10:38 AM, Martin Maechler
> <maechler at stat.math.ethz.ch> wrote:
>
> > >>>>> Martin Morgan <martin.morgan at roswellpark.org>
> > >>>>> on Thu, 2 Nov 2017 06:17:19 -0400 writes:
> >
> > > On 11/02/2017 05:00 AM, Martin Maechler wrote:
> > >>>>>>> "ML" == Michael Lawrence <lawrence.michael at gene.com>
> > >>>>>>> on Wed, 1 Nov 2017 14:13:54 -0700 writes:
> > >>
> > >> > Probably way easier to add the generics to the Matrix
> > >> > package and everyone just depends on that.
> > >>
> > >> Yes! It is 'Recommended' and comes with every R
> > >> installation, and has had many such matrix S4 methods in
> > >> place for > 10 years, notably for dealing with (large)
> > >> sparse matrices.
> > >>
> > >> Honestly, I (as co-maintainer of Matrix, principal
> > >> maintainer for several years now) had been a bit
> > >> surprised and frustrated that the 'matrixStats'
> > >> initiative had started w/o any contact with the Matrix
> > >> package maintainers and initially has not ever tried to
> > >> use Matrix package classes or functionality (and this is
> > >> still the case now AFAICS).
> > >>
> > >> I'm happy to coordinate with maintainers of bioc packages
> > >> about which generics (and classes !) to use and export,
> > >> etc.
> >
> > > One issue is that Matrix is a relatively large package
> > > (well, I wonder if that's a reasonable statement, given
> > > the Bioc dependencies and data involved, but perhaps in
> > > general...) and hence 'overkill' to obtain a collection of
> > > generics. Is there any prospect for factoring out the
> > > definition of the generics from implementation of the
> > > methods? Re-purposing stats4 ?
> >
> > > Martin Morgan
> >
> > Hmm.. we have quite a few setGenericImplicit() statements in
> > the methods package already, notably for 'colSums' and friends,
> > and so other decent citizen packages do *NOT* setGeneric() at
> > all on these ... and of course, Matrix _is_ a decent citizen in
> > the R package universe.
> >
> > Instead of to stats4, I'm pretty sure we should only consider
> > what functions should be added to the implicit generics already
> > provided by the 'methods' package itself.
> >
> > Could it be that (some of) you are not properly aware of
> > implicit generics?
> >
> > If you start 'R --vanilla' you can say
> >
> > > implicitGeneric("colSums")
> > standardGeneric for "colSums" defined from package "base"
> >
> > function (x, na.rm = FALSE, dims = 1, ...)
> > standardGeneric("colSums")
> > <bytecode: 0x6cb4798>
> > <environment: 0x6cab560>
> > Methods may be defined for arguments: x, na.rm, dims
> > Use showMethods("colSums") for currently available ones.
> > ---------
> >
> > so I think it is clear how *any* decent package has to define
> > methods for colSums(), and if they do, there should not be any conflicts.
> >
> > I think the problem is with S3 methods, not with S4 ones, where
> > the implicit generics, as I understand it, were made for dealing with
> > several packages writing methods for the same generic without
> > one of the packages taking precedence.
> >
> > Martin Mächler
> >
> >
> >
> > >>
> > >> Best, Martin Maechler ETH Zurich (and R core team)
> > >>
> > >>
> > >>
> > >> > On Wed, Nov 1, 2017 at 1:59 PM, Hervé Pagès
> > >> > <hpages at fredhutch.org> wrote:
> > >>
> > >> >> That's probably a good idea but a clean solution would
> > >> >> need to involve all players, including the Matrix
> > >> >> package. Right now there are conflicts for some S4
> > >> >> generics defined in Matrix and in BiocGenerics
> > >> >> (e.g. rowSums). I'm not sure that moving rowSums from
> > >> >> BiocGenerics to a new MatrixGenerics package would
> > >> >> address this. Unless MatrixGenerics is on CRAN and
> > >> >> Matrix depends on it ;-)
> > >> >>
> > >> >> How likely is this to happen?
> > >> >>
> > >> >> H.
> > >> >>
> > >> >>
> > >> [............]
> > >>
> > >> _______________________________________________
> > >> Bioc-devel at r-project.org mailing list
> > >> https://stat.ethz.ch/mailman/listinfo/bioc-devel
> > >>