[Bioc-devel] avoiding clashes of different S4 methods with the same generic

Martin Morgan martin.morgan at roswellpark.org
Tue Apr 26 23:16:12 CEST 2016



On 04/26/2016 04:47 PM, Michael Lawrence wrote:
> On Tue, Apr 26, 2016 at 11:00 AM, Aaron Lun <alun at wehi.edu.au> wrote:
>> Dear List,
>>
>> When an S4 method for the same class is defined in two separate packages
>> (i.e., under the same generic), and both packages are loaded into an R
>> session, it seems that the method from the package loaded later clobbers the
>> method from the package loaded first. Is it possible to specifically call
>> the method in the first package when both packages are loaded? If not, how
>> should we protect against this?
>>
>> To give some context: the csaw package currently defines a normalize()
>> method for SummarizedExperiment objects, using the generic from
>> BiocGenerics. However, if some other hypothetical package (I'll call it
>> "swings", for argument's sake) were to define a normalize() method with an SE
>> signature, and if the swings package were to be loaded after csaw, then it
>> seems that all calls to normalize() would use the method defined by swings,
>> rather than that defined by csaw.
>>
>> Now, for usual functions, disambiguation would be easy with "::", but I
>> don't know whether this can be done in the S4 system, given that the details
>> of dispatch are generally hidden away. The only solution I can see is for
>> csaw (and/or swings) to define a SE subclass; define the normalize() method
>> using the subclass as the signature, such that S4 dispatch will now go to
>> the correct method; and hope that no other package redefines normalize() for
>> the subclass.
>>
>> Is this what I should be doing routinely, i.e., define subclasses and
>> methods for those subclasses in all my packages? Or am I missing something
>> obvious? I would have expected such clashes to be more of a problem, given
>> how many new packages are being added to BioC at every release.
>>
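
The clobbering described above can be reproduced in a single session without
any packages. This is only a toy sketch: "ToySE" is a hypothetical stand-in
for SummarizedExperiment, the generic is a local stand-in for
BiocGenerics::normalize(), and the two setMethod() calls play the role of
csaw and swings being loaded one after the other.

```r
library(methods)

# Hypothetical stand-in for SummarizedExperiment.
setClass("ToySE", slots = c(counts = "numeric"))

# Local stand-in for the shared generic.
setGeneric("normalize", function(object, ...) standardGeneric("normalize"))

se <- new("ToySE", counts = c(1, 2, 3))

# First definition, as if registered when csaw is loaded.
setMethod("normalize", "ToySE", function(object, ...) "csaw method")
before <- normalize(se)

# Same generic and same signature, as if registered when swings is loaded.
# This replaces the earlier entry in the generic's method table.
setMethod("normalize", "ToySE", function(object, ...) "swings method")
after <- normalize(se)
```

Because both definitions are attached to the one generic, "::" on the generic
has nothing to choose between: dispatch only ever sees the surviving method.
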
>
> I would recommend against defining subclasses of basic data structures
> that differ only in their behavior. The purpose of
> SummarizedExperiment is to store data. One might use inheritance to
> modify how the data are stored, or to store new types of data,
> although the latter may be best addressed through composition.
>
> To extend behavior, define methods. The generic represents the verb
> and thus the semantics of the operation. In general, method conflicts
> indicate that the design is broken. In this case, the normalize()
> generic has a very general name. There is no one way to "normalize" a
> SummarizedExperiment. It would be difficult for the reader to
> understand such ambiguous code. To indicate a specific normalization
> algorithm, we either need a more specific generic or we need to
> parameterize it further.
>
> One way to make more specific generics would be to give them the same
> name, "normalize", but define them in different namespaces and require
> :: qualification. That would mean abandoning the BiocGenerics generic
> and it would only work if each package provides only one way to
> normalize. Or, one could give them different names, but it would be
> difficult to select a natural name, and it's not clear whether the
> abstract notion of normalization should be always coupled with the
> method.
>
> A more flexible/modular approach would be to augment the signature of
> BiocGenerics::normalize to indicate a normalization method and rely on
> dual-dispatch:
>
> normalize(se, WithSwings())
> normalize(se, WithCSaw())
>
> Roughly, one example of this approach is
> VariantAnnotation::locateVariants() and its variant type argument.

I like the dual dispatch method quite a bit (but wonder why we get 
several swings but only one csaw? Maybe a csaw implies two participants 
[though I think I once in a while csaw-ed alone], so a singular csaw and 
a pair of swings balance out?), partly because it's very easy to extend 
(write another method) and the second argument can be either lightweight 
or parameterized.
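
A minimal sketch of the dual-dispatch idea, under toy assumptions: "ToySE" is
a hypothetical stand-in for SummarizedExperiment, the generic here has an
extra `method` argument that BiocGenerics::normalize() does not currently
have, and the arithmetic inside each method is a placeholder rather than
anything csaw or swings actually does. The `total` and `shift` slots are
invented to show that the second argument can carry parameters.

```r
library(methods)

# Hypothetical stand-in for SummarizedExperiment.
setClass("ToySE", slots = c(counts = "numeric"))

# Small classes whose only job is to select (and parameterize) a method.
setClass("WithCSaw", slots = c(total = "numeric"))
setClass("WithSwings", slots = c(shift = "numeric"))
WithCSaw <- function(total = 1) new("WithCSaw", total = total)
WithSwings <- function(shift = 0) new("WithSwings", shift = shift)

# Generic with a second dispatch argument (an assumed signature).
setGeneric("normalize",
           function(object, method, ...) standardGeneric("normalize"))

# Placeholder computation: rescale so the counts sum to `total`.
setMethod("normalize", c("ToySE", "WithCSaw"),
          function(object, method, ...) {
              object@counts <- object@counts / sum(object@counts) * method@total
              object
          })

# Placeholder computation: center the counts, then add `shift`.
setMethod("normalize", c("ToySE", "WithSwings"),
          function(object, method, ...) {
              object@counts <- object@counts - mean(object@counts) + method@shift
              object
          })

se <- new("ToySE", counts = c(1, 2, 3))
by_csaw <- normalize(se, WithCSaw())
by_swings <- normalize(se, WithSwings())
```

Each package owns its own method class, so the two methods have different
signatures and neither replaces the other, whatever the load order.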

From a user perspective normalizeCsaw / normalizeSwings makes the 
available options only a tab key away; maybe that's why Michael 
suggested With*?

Martin

>
> The affy package (or something around it) auto-qualifies the generic
> via a method argument; something like S3 around S4. For example
> normalize(se, "swings") would call normalize.swings(se), where
> normalize.swings itself could be generic. Another way to effect
> cascading dispatch is through composition, where the method object
> either is a function or can provide one to implement the normalization
> (emulating message passing OOP), which would allow normalize() to be
> implemented simply as:
>
> normalize <- function(x, method, ...) normalizer(method)(x, ...)
>
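
One way the composition one-liner above could be fleshed out, as a sketch
only: normalizer() is not an existing function, so it is defined here as a
hypothetical S3 generic that returns the method object itself when it is
already a function, or the function it carries otherwise. "ToyMethod",
scale_to_sum and centering are invented names, and plain numeric vectors
stand in for the data.

```r
# Hypothetical accessor: turn a "method object" into a function.
normalizer <- function(method) UseMethod("normalizer")
normalizer.function <- function(method) method       # already a function
normalizer.ToyMethod <- function(method) method$fun  # object carrying one

# The one-liner from the message above.
normalize <- function(x, method, ...) normalizer(method)(x, ...)

# A method given directly as a function.
scale_to_sum <- function(x, total = 1) x / sum(x) * total

# A method given as an object that provides its function.
centering <- structure(list(fun = function(x, ...) x - mean(x)),
                       class = "ToyMethod")

a <- normalize(c(1, 2, 3), scale_to_sum, total = 6)
b <- normalize(c(1, 2, 3), centering)
```
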
> One issue is that the syntax is a bit unconventional and users might
> end up preferring the affy approach, with a normalize_csaw() and
> normalize_swings(). But I like the modular, dynamic approach outlined
> above.
>
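
For comparison, the string-qualified style, where normalize(se, "swings")
forwards to normalize.swings(se), might look roughly like the sketch below.
This is not affy's actual code. The function bodies are placeholders, and
numeric vectors stand in for the data.

```r
# Placeholder implementations; each could itself be a generic.
normalize.csaw <- function(x, ...) x / sum(x)
normalize.swings <- function(x, ...) x - mean(x)

# Build the function name from the method string and call it.
normalize <- function(x, method, ...) {
    f <- get(paste("normalize", method, sep = "."), mode = "function")
    f(x, ...)
}

out_swings <- normalize(c(1, 2, 3), "swings")
out_csaw <- normalize(c(1, 2, 3), "csaw")
```

The lookup is by name at call time, so a misspelled method string only fails
when the call is made.
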
> Thoughts?
>
> Michael
>
>> Cheers,
>>
>> Aaron
>>
>> _______________________________________________
>> Bioc-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>
>

