[Bioc-devel] avoiding clashes of different S4 methods with the same generic
lawrence.michael at gene.com
Tue Apr 26 23:28:46 CEST 2016
On Tue, Apr 26, 2016 at 2:16 PM, Martin Morgan
<martin.morgan at roswellpark.org> wrote:
> On 04/26/2016 04:47 PM, Michael Lawrence wrote:
>> On Tue, Apr 26, 2016 at 11:00 AM, Aaron Lun <alun at wehi.edu.au> wrote:
>>> Dear List,
>>> When an S4 method for the same class is defined in two separate packages
>>> (i.e., under the same generic), and both packages are loaded into an R
>>> session, it seems that the method from the package loaded later clobbers
>>> the method from the package loaded first. Is it possible to specifically
>>> call the method in the first package when both packages are loaded? If not,
>>> should we protect against this?
>>> To give some context: the csaw package currently defines a normalize()
>>> method for SummarizedExperiment objects, using the generic from
>>> BiocGenerics. However, if some other hypothetical package (I'll call it
>>> "swings", for argument's sake) were to define a normalize() method with the
>>> same signature, and if the swings package were to be loaded after csaw, then
>>> it seems that all calls to normalize() would use the method defined by
>>> swings rather than that defined by csaw.
>>> Now, for ordinary functions, disambiguation would be easy with "::", but I
>>> don't know whether this can be done in the S4 system, given that the
>>> details of dispatch are generally hidden away. The only solution I can see
>>> is for csaw (and/or swings) to define an SE subclass; define the normalize()
>>> method using the subclass as the signature, such that S4 dispatch will now
>>> go to the correct method; and hope that no other package redefines
>>> normalize() for the subclass.
>>> Is this what I should be doing routinely, i.e., define subclasses and
>>> methods for those subclasses in all my packages? Or am I missing something
>>> obvious? I would have expected such clashes to be more of a problem, given
>>> how many new packages are being added to BioC at every release.
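
The clash described above can be imitated inside a single R session, without
building two packages. This is only a minimal sketch: "ToySE" is a hypothetical
stand-in for SummarizedExperiment, the generic is defined locally rather than
taken from BiocGenerics, and the two setMethod() calls play the roles of csaw
and swings being loaded one after the other.

```r
library(methods)

# Hypothetical toy class standing in for SummarizedExperiment.
setClass("ToySE", representation(counts = "numeric"))

# Local stand-in for the BiocGenerics normalize() generic.
setGeneric("normalize", function(object, ...) standardGeneric("normalize"))

# First registration, as if csaw had been loaded.
setMethod("normalize", "ToySE", function(object, ...) "csaw method")

# Second registration with the same signature, as if swings were loaded later.
# It replaces the earlier entry in the generic's method table.
setMethod("normalize", "ToySE", function(object, ...) "swings method")

se <- new("ToySE", counts = c(1, 2, 3))
result <- normalize(se)  # "swings method"
```

Because both methods are stored under the same generic and signature, there is
only one slot for them in the method table, which is why "::" on the method
itself is not available as a way out.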
>> I would recommend against defining subclasses of basic data structures
>> that differ only in their behavior. The purpose of
>> SummarizedExperiment is to store data. One might use inheritance to
>> modify how the data are stored, or to store new types of data,
>> although the latter may be best addressed through composition.
>> To extend behavior, define methods. The generic represents the verb
>> and thus the semantics of the operation. In general, method conflicts
>> indicate that the design is broken. In this case, the normalize()
>> generic has a very general name. There is no one way to "normalize" a
>> SummarizedExperiment. It would be difficult for the reader to
>> understand such ambiguous code. To indicate a specific normalization
>> algorithm, we either need a more specific generic or we need to
>> parameterize it further.
>> One way to make more specific generics would be to give them the same
>> name, "normalize", but define them in different namespaces and require
>> :: qualification. That would mean abandoning the BiocGenerics generic
>> and it would only work if each package provides only one way to
>> normalize. Or, one could give them different names, but it would be
>> difficult to select a natural name, and it's not clear whether the
>> abstract notion of normalization should be always coupled with the
>> A more flexible/modular approach would be to augment the signature of
>> BiocGenerics::normalize to indicate a normalization method and rely on
>> dual dispatch, e.g.:
>>   normalize(se, WithSwings())
>>   normalize(se, WithCSaw())
>> Roughly, one example of this approach is
>> VariantAnnotation::locateVariants() and its variant type argument.
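
A rough sketch of how the dual dispatch suggestion might look in plain S4
follows. Everything here is an assumption made for illustration: the "ToySE"
class, the "scale" and "offset" slots, and the arithmetic inside each method are
placeholders, not the csaw algorithm or any real package's API. The generic is
defined locally with a second "method" argument instead of modifying
BiocGenerics::normalize.

```r
library(methods)

# Hypothetical toy data class standing in for SummarizedExperiment.
ToySE <- setClass("ToySE", representation(counts = "numeric"))

# Marker classes that name the algorithm; each may carry its own parameters.
WithCSaw <- setClass("WithCSaw",
                     representation(scale = "numeric"),
                     prototype(scale = 1))
WithSwings <- setClass("WithSwings",
                       representation(offset = "numeric"),
                       prototype(offset = 0))

# Generic that dispatches on both the data and the method object.
setGeneric("normalize",
           function(object, method, ...) standardGeneric("normalize"))

# Placeholder computation: scale counts to proportions.
setMethod("normalize", signature("ToySE", "WithCSaw"),
          function(object, method, ...) {
            object@counts <- object@counts / sum(object@counts) * method@scale
            object
          })

# Placeholder computation: centre counts around an offset.
setMethod("normalize", signature("ToySE", "WithSwings"),
          function(object, method, ...) {
            object@counts <- object@counts - mean(object@counts) + method@offset
            object
          })

se <- ToySE(counts = c(1, 2, 3, 4))
by_csaw <- normalize(se, WithCSaw())
by_swings <- normalize(se, WithSwings())
```

Since each package would own its marker class, the two methods have different
signatures and no longer compete for the same entry in the method table.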
> I like the dual dispatch method quite a bit (but wonder why we get several
> swings but only one csaw? Maybe a csaw implies two participants [though I
> think I once in a while csaw-ed alone], so a singular csaw and a pair of
> swings balance out?), partly because it's very easy to extend (write another
> method) and the second argument can be either lightweight or parameterized.
I could go along with the dual dispatch. "Swings" is short for "Set of
swings". Usually, there are several swings in a row, but only one
> From a user perspective normalizeCsaw / normalizeSwings makes the available
> options only a tab key away; maybe that's why Michael suggested With*?
That's a good point.
>> The affy package (or something around it) auto-qualifies the generic
>> via a method argument; something like S3 around S4. For example
>> normalize(se, "swings") would call normalize.swings(se), where
>> normalize.swings itself could be generic. Another way to effect
>> cascading dispatch is through composition, where the method object
>> either is a function or can provide one to implement the normalization
>> (emulating message passing OOP), which would allow normalize() to be
>> implemented simply as:
>>   normalize <- function(x, method, ...) normalizer(method)(x, ...)
>> One issue is that the syntax is a bit unconventional and users might
>> end up preferring the affy approach, with a normalize_csaw() and
>> normalize_swings(). But I like the modular, dynamic approach outlined
>> above.
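
The two alternatives in the quoted text above can be sketched side by side. This
is an illustration under stated assumptions, not affy's actual code: the
"Normalizer" class, its "fun" slot, the normalize_by_name() helper, and the
arithmetic are all hypothetical, and plain numeric vectors stand in for
SummarizedExperiment objects.

```r
library(methods)

## Composition: the method object carries the function that does the work.

# Hypothetical class whose only job is to hold an implementation.
setClass("Normalizer", representation(fun = "function"))

# Accessor that extracts the implementation from the method object.
normalizer <- function(method) method@fun

# The one-line normalize() from the message.
normalize <- function(x, method, ...) normalizer(method)(x, ...)

# Constructors that each package could export (placeholder arithmetic).
WithCSaw <- function() new("Normalizer", fun = function(x, ...) x / sum(x))
WithSwings <- function() new("Normalizer", fun = function(x, ...) x - mean(x))

x <- c(1, 2, 3, 4)
csaw_out <- normalize(x, WithCSaw())
swings_out <- normalize(x, WithSwings())

## Name-qualified lookup: normalize(se, "swings") calls normalize.swings(se).

normalize.csaw <- function(x, ...) x / sum(x)
normalize.swings <- function(x, ...) x - mean(x)

# Hypothetical helper that builds the function name from the string.
normalize_by_name <- function(x, method, ...) {
  f <- get(paste0("normalize.", method), mode = "function")
  f(x, ...)
}

named_out <- normalize_by_name(x, "swings")
```

In the composition variant, adding an algorithm means exporting one more
constructor; in the name-qualified variant it means defining one more function
that follows the naming convention.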