[Bioc-devel] any interest in a BiocMatrix core package?

Martin Morgan martin.morgan at roswellpark.org
Fri Nov 3 15:16:52 CET 2017


On 11/02/2017 06:20 PM, Peter Hickey wrote:
> As Michael notes, I think the scope here is broader than considering S4
> generics for functions in base R. To summarise, I think we would be looking
> to have S4 generics for the following:
> 
> - All(?) the row*/col* functions in matrixStats (NB: matrixStats uses plain
> old functions with no S3 or S4, which I believe was to avoid any overhead
> of method dispatch since it is explicitly targeting ordinary matrix objects
> as input)
> - Potentially new row*/col* summaries (i.e. that don't currently exist in
> matrixStats)
> - Perhaps moving from BiocGenerics the S4 generics defined in
> R/matrix-summary.R?
> - Perhaps apply() (E.g., DelayedArray defines an S4 generic for this)
> 
> Having these as part of base R or in a recommended package would be great,
> but of course comes with its own challenges. The alternative is a
> lightweight package, likely better hosted on CRAN than BioC to assist with
> wider adoption and integration with Matrix, matrixStats, and other non-BioC
> packages.
> 
> As Michael notes, getting the generic signature 'right' will be important
> and there are undoubtedly other challenges ahead (I've started a TODO).
> 
> Might Bioconductor open up a GitHub repo (MatrixGenerics?) where this can
> be discussed with accompanying code? I've made the skeleton of a
> MatrixGenerics package that I could upload to kick things off, along with
> adding my TODOs as Issues on GitHub for further discussion.

I did start this repository as a place to develop more concrete ideas. I 
think that a Bioconductor MatrixGenerics solution would not be optimal, 
so I see the repository as a place to work out ideas rather than as a 
precursor to an actual package.

I invited Pete as a Collaborator with 'Admin' privileges, so I think he 
should be able to extend Collaborator invites to other interested parties.

Martin

> 
> Cheers,
> Pete
> 
> 
> On Thu, 2 Nov 2017 at 13:10 Michael Lawrence <lawrence.michael at gene.com>
> wrote:
> 
>> I'm pretty sure we're also considering generics for functions that do not
>> exist in base R. Like rowVars() and colVars(). This sort of suggests that
>> matrixStats should be part of base R.
>>
>> As an aside, we should think about the signature on those implicit
>> generics. Should they really include na.rm and dims? The simpler the
>> signature, the easier to understand the API.
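
The signature question raised above can be made concrete with a small sketch. It is only an illustration under stated assumptions: this rowVars() generic is a hypothetical stand-in (it is not the matrixStats function and not an agreed API), and the "matrix" method simply uses apply() with stats::var. The generic dispatches on 'x' alone, and options such as na.rm reach the method through '...'.

```r
library(methods)

# Hypothetical generic for illustration: dispatch on 'x' only.
# Everything else (e.g. na.rm) is passed through '...'.
setGeneric("rowVars", function(x, ...) standardGeneric("rowVars"),
           signature = "x")

# A simple method for ordinary matrices. The na.rm argument is added by
# the method; it is not part of the dispatch signature.
setMethod("rowVars", "matrix", function(x, na.rm = FALSE, ...) {
    apply(x, 1, stats::var, na.rm = na.rm)
})

m <- matrix(c(1, 2, 3,
              4, 6, 8), nrow = 2, byrow = TRUE)
rowVars(m)
```

With dispatch restricted to 'x', a package that adds a method only has to name one class; whether na.rm or dims should take part in dispatch at all is exactly the open question.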
>>
>>
>> On Thu, Nov 2, 2017 at 10:38 AM, Martin Maechler <maechler at stat.math.ethz.ch> wrote:
>>
>>>>>>>> Martin Morgan <martin.morgan at roswellpark.org>
>>>>>>>>      on Thu, 2 Nov 2017 06:17:19 -0400 writes:
>>>
>>>      > On 11/02/2017 05:00 AM, Martin Maechler wrote:
>>>      >>>>>>> "ML" == Michael Lawrence <lawrence.michael at gene.com>
>>>      >>>>>>> on Wed, 1 Nov 2017 14:13:54 -0700 writes:
>>>      >>
>>>      >> > Probably way easier to add the generics to the Matrix
>>>      >> > package and everyone just depends on that.
>>>      >>
>>>      >> Yes!  It is 'Recommended' and comes with every R
>>>      >> installation, and has had many such matrix S4 methods in
>>>      >> place for > 10 years, notably for dealing with (large)
>>>      >> sparse matrices.
>>>      >>
>>>      >> Honestly, I (as co-maintainer of Matrix, principal
>>>      >> maintainer for several years now) had been a bit
>>>      >> surprised and frustrated that the 'matrixStats'
>>>      >> initiative had started w/o any contact with the Matrix
>>>      >> package maintainers and initially has not ever tried to
>>>      >> use Matrix package classes or functionality (and this is
>>>      >> still the case now AFAICS).
>>>      >>
>>>      >> I'm happy to coordinate with maintainers of bioc packages
>>>      >> about which generics (and classes !) to use and export,
>>>      >> etc.
>>>
>>>      > One issue is that Matrix is a relatively large package
>>>      > (well, I wonder if that's a reasonable statement, given
>>>      > the Bioc dependencies and data involved, but perhaps in
>>>      > general...) and hence 'overkill' to obtain a collection of
>>>      > generics. Is there any prospect for factoring out the
>>>      > definition of the generics from implementation of the
>>>      > methods?  Re-purposing stats4 ?
>>>
>>>      > Martin Morgan
>>>
>>> Hmm..  we have quite a few  setGenericImplicit()  statements in
>>> the methods package already, notably for  'colSums' and friends,
>>> and so other decent citizen packages do *NOT*  setGeneric() at
>>> all on these ... and of course, Matrix _is_ a decent citizen in
>>> the R package universe.
>>>
>>> Rather than re-purposing stats4, I'm pretty sure we should only
>>> consider which functions should be added to the implicit generics
>>> already provided by the 'methods' package itself.
>>>
>>> Could it be that (some of) you are not properly aware of
>>> implicit generics?
>>>
>>> If you start 'R --vanilla' you can say
>>>
>>>> implicitGeneric("colSums")
>>> standardGeneric for "colSums" defined from package "base"
>>>
>>> function (x, na.rm = FALSE, dims = 1, ...)
>>> standardGeneric("colSums")
>>> <bytecode: 0x6cb4798>
>>> <environment: 0x6cab560>
>>> Methods may be defined for arguments: x, na.rm, dims
>>> Use  showMethods("colSums")  for currently available ones.
>>> ---------
>>>
>>> so I think it is clear how *any* decent package has to define
>>> methods for colSums(), and if they do, there should not be any conflicts.
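
A minimal sketch of the pattern described above, assuming a toy class invented for this illustration (ToyMatrix is hypothetical and not part of any package): the method is registered with setMethod() and no setGeneric() call, so the generic is derived from the implicit generic that 'methods' provides for colSums.

```r
library(methods)

# Hypothetical toy class wrapping an ordinary matrix (illustration only).
setClass("ToyMatrix", representation(data = "matrix"))

# No setGeneric() here: setMethod() on the base function creates the
# generic from the implicit generic for colSums, so the method uses the
# same formals (x, na.rm, dims, ...).
setMethod("colSums", "ToyMatrix",
    function(x, na.rm = FALSE, dims = 1, ...) {
        base::colSums(x@data, na.rm = na.rm, dims = dims)
    })

tm <- new("ToyMatrix", data = matrix(1:6, nrow = 2))
colSums(tm)   # column sums of the wrapped matrix: 3 7 11
```

Because every package following this pattern ends up with the same generic derived from base, methods from different packages attach to one generic rather than masking one another.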
>>>
>>> I think the problem is with S3 methods, not with S4 ones, where
>>> the implicit generics, as I understand it, were made for dealing with
>>> several packages writing methods for the same generic without
>>> one of the packages taking precedence.
>>>
>>> Martin Mächler
>>>
>>>
>>>
>>>      >>
>>>      >> Best, Martin Maechler ETH Zurich (and R core team)
>>>      >>
>>>      >>
>>>      >>
>>>      >> > On Wed, Nov 1, 2017 at 1:59 PM, Hervé Pagès
>>>      >> > <hpages at fredhutch.org> wrote:
>>>      >>
>>>      >> >> That's probably a good idea but a clean solution would
>>>      >> >> need to involve all players, including the Matrix
>>>      >> >> package. Right now there are conflicts for some S4
>>>      >> >> generics defined in Matrix and in BiocGenerics
>>>      >> >> (e.g. rowSums). I'm not sure that moving rowSums from
>>>      >> >> BiocGenerics to a new MatrixGenerics package would
>>>      >> >> address this.  Unless MatrixGenerics is on CRAN and
>>>      >> >> Matrix depends on it ;-)
>>>      >> >>
>>>      >> >> How likely is this to happen?
>>>      >> >>
>>>      >> >> H.
>>>      >> >>
>>>      >> >>
>>>      >> [............]
>>>      >>
>>>      >> _______________________________________________
>>>      >> Bioc-devel at r-project.org mailing list
>>>      >> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>>      >>
>>>
>>>
>>>      > This email message may contain legally privileged and/or
>>>      > confidential information.  If you are not the intended
>>>      > recipient(s), or the employee or agent responsible for the
>>>      > delivery of this message to the intended recipient(s), you
>>>      > are hereby notified that any disclosure, copying,
>>>      > distribution, or use of this email message is prohibited.
>>>      > If you have received this message in error, please notify
>>>      > the sender immediately by e-mail and delete this email
>>>      > message from your computer. Thank you.
>>>
>>
> 
> 

