[Bioc-devel] avoiding clashes of different S4 methods with the same generic

Hervé Pagès hpages at fredhutch.org
Wed Apr 27 19:52:03 CEST 2016


On 04/27/2016 04:24 AM, Michael Lawrence wrote:
> On Tue, Apr 26, 2016 at 11:12 PM, Hervé Pagès <hpages at fredhutch.org> wrote:
>> Hi,
>>
>> I would not dismiss defining a SummarizedExperiment subclass so quickly.
>> SummarizedExperiment is very generic and can contain any kind of data.
>> IIUC the csaw package uses SummarizedExperiment to store a particular
>> kind of data (ChIP-seq data) and I believe specialization is a
>> legitimate situation for defining a subclass, even if the subclass is
>> a "straight" subclass i.e. a subclass that doesn't add new slots or
>> doesn't touch the existing slots.
>>
>> OTOH introducing a "straight" subclass only to define one specialized
>> method on it (the "normalize" method in this case) might not be worth
>> it since there is a cost for such a class, even if that cost is minimal:
>> a cost for the user (one new container/constructor to deal with) and a
>> cost for the developer (e.g. multiplication of coerce methods).
>>
>
> If the data are more specialized, specialize the data structure,

Isn't that what I'm doing when I define a "straight" subclass? The fact
that I don't need to alter the internal representation is an
implementation detail (and it could change at some point) but what's
important is that from a user's point of view my container is now
tagged/specialized. I might only have one specialized method for it
at the moment but I might have more in the future, and/or other package
developers might build on top of my specialized container and add
specialized methods in the future (and I cross my fingers that since
I own the specialized container and already implemented a "normalize"
method for it, nobody will redefine that method in their package).
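
To make the "straight" subclass idea concrete, here is a minimal sketch.
It assumes toy stand-ins: ToyExperiment plays the role of
SummarizedExperiment, ToyChIPExperiment is a hypothetical specialized
container, and the generic is defined locally instead of being imported
from BiocGenerics. The column scaling is only a placeholder for a real
normalization.

```r
library(methods)

# Toy stand-in for SummarizedExperiment (hypothetical; the real class
# lives in the SummarizedExperiment package).
setClass("ToyExperiment", slots = c(assay = "matrix"))

# A "straight" subclass: no new slots, existing slots untouched.
setClass("ToyChIPExperiment", contains = "ToyExperiment")

# Local stand-in for BiocGenerics::normalize().
setGeneric("normalize", function(object, ...) standardGeneric("normalize"))

# The specialized method is attached to the subclass only.
setMethod("normalize", "ToyChIPExperiment", function(object, ...) {
    # Placeholder normalization: scale each column to sum to 1.
    object@assay <- sweep(object@assay, 2, colSums(object@assay), "/")
    object
})

se <- new("ToyChIPExperiment", assay = matrix(1:4, nrow = 2))
norm <- normalize(se)
```

The result is still a ToyChIPExperiment (and therefore a ToyExperiment),
while a plain ToyExperiment has no "normalize" method of its own, so
another package's method on the parent class would not collide with
this one.
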

H.

> but
> the fact that the specialization solves the normalize() ambiguity is a
> mere coincidence. There are two different concerns.
>
>> Changing the signature of the normalize() generic in BiocGenerics and
>> introducing dual dispatch is of course doable but that means the
>> maintainers of the packages that define methods on this generic are
>> ok with the dual dispatch game and are willing to make the required
>> modifications to their packages. It's an important change and I don't
>> see an easy way to make it happen smoothly (i.e. through a
>> deprecated/defunct cycle).
>>
>
> In conjunction with what Martin said, you could define an
> "ANY","missing" method that emits a deprecation warning, and then
> calls the generic again using NULL or something for the second argument
> so that it falls through. Packages would only need to fix the formals of
> their method definition.
>
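
The transitional method described above could look roughly like the
sketch below. It assumes a generic whose second dispatch argument is
called 'param' (a hypothetical name), a toy ToyData class, and NULL as
the placeholder value; none of this is the actual BiocGenerics code.

```r
library(methods)

# Stand-in for the generic after its signature gains a second dispatch
# argument ('param' is a hypothetical argument name).
setGeneric("normalize",
    function(object, param, ...) standardGeneric("normalize"))

setClass("ToyData", slots = c(x = "numeric"))

# What an existing package would provide once its formals are updated:
# a method that accepts (and here ignores) the second argument.
setMethod("normalize", c("ToyData", "NULL"), function(object, param, ...) {
    object@x <- object@x / sum(object@x)
    object
})

# Transitional method: old-style one-argument calls land here, trigger
# a deprecation warning, and are re-dispatched with an explicit NULL.
setMethod("normalize", c("ANY", "missing"), function(object, param, ...) {
    .Deprecated(msg = "calling normalize() without 'param' is deprecated")
    normalize(object, NULL, ...)
})

d <- new("ToyData", x = c(1, 3))
new_style <- normalize(d, NULL)               # no warning
old_style <- suppressWarnings(normalize(d))   # warns, then falls through
```

Both calls reach the same ("ToyData", "NULL") method; only the old-style
call goes through the warning first.
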
>> Here is the list of packages that currently define methods for
>> BiocGenerics::normalize():
>>
>>    affyPLM
>>    Cardinal
>>    codelink
>>    CopyNumber450k
>>    csaw
>>    diffHic
>>    EBImage
>>    epigenomix
>>    MSnbase
>>    oligo
>>    qpcrNorm
>>    scran
>>
>> [Interestingly the scran package defines a default "normalize" method
>> (i.e. a normalize,ANY method)].
>>
>> Whether we make the second argument lightweight or parameterized (which
>> is something that would need to be decided at the level of the generic)
>> these packages will break as soon as we change the signature of the
>> generic. So we'll need to wait until after the release before this happens.
>>
>> Personally I find the lightweight second argument not particularly
>> intuitive, elegant, or user-friendly. I'd rather type
>> normalizeSwing(se, ...) or normalize(se, SwingParam(...)) than
>> normalize(se, WithSwing(), ...).
>>
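
For comparison, the two calling styles can be sketched side by side.
This assumes a two-argument generic with a hypothetical 'param'
argument, a made-up 'span' setting, and plain numeric vectors as the
data; SwingParam and WithSwing are the illustrative names used in this
thread, not existing classes.

```r
library(methods)

# Stand-in for a two-argument normalize() generic ('param' is a
# hypothetical argument name).
setGeneric("normalize",
    function(object, param, ...) standardGeneric("normalize"))

# Parameterized style: the second argument carries the settings.
setClass("SwingParam", slots = c(span = "numeric"))
SwingParam <- function(span = 0.5) new("SwingParam", span = span)

setMethod("normalize", c("numeric", "SwingParam"),
    function(object, param, ...) object * param@span)

# Lightweight style: the second argument is only a tag and the settings
# travel through '...'. The tag extends "character" because a class
# with no slots and no superclass would be virtual.
setClass("WithSwing", contains = "character")
WithSwing <- function() new("WithSwing")

setMethod("normalize", c("numeric", "WithSwing"),
    function(object, param, ..., span = 0.5) object * span)

x <- c(2, 4)
a <- normalize(x, SwingParam(span = 0.25))
b <- normalize(x, WithSwing(), span = 0.25)
```

Both calls compute the same thing here; the difference is where the
settings live and whether they can be validated and passed around as a
single object.
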
>
> Sure, WithSwing() could hold arguments as well, but I agree that the
> Param suffix is more consistent. The Param naming is not great for
> autocompletion. Though I guess the interface could provide hints based
> on the defined methods.
>
>> Last thing: In case of a parameterized second argument, do we really
>> need a virtual normalizeParam class as parent of all the concrete
>> normalizeParam* classes? If so then I guess we would need to have it
>> defined in BiocGenerics but I think we should try hard to not start
>> defining classes in this package (that could take us too far...)
>>
>
> I would say no, no real need for a base class.
>
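
A small sketch of why no base class is needed: S4 dispatch only looks
at the concrete class of the second argument, so two unrelated
parameter classes can coexist. The class names follow Aaron's message;
their slots, the 'param' argument name, and the returned strings are
made up for illustration.

```r
library(methods)

# Stand-in for a two-argument normalize() generic ('param' is a
# hypothetical argument name).
setGeneric("normalize",
    function(object, param, ...) standardGeneric("normalize"))

# Two unrelated parameter classes, as if defined in separate packages.
# Neither extends a common "normalizeParam" parent.
setClass("csawNormParam", slots = c(binned = "logical"))
setClass("swingsNormParam", slots = c(span = "numeric"))

setMethod("normalize", c("numeric", "csawNormParam"),
    function(object, param, ...) "csaw-style normalization")
setMethod("normalize", c("numeric", "swingsNormParam"),
    function(object, param, ...) "swings-style normalization")

r1 <- normalize(1, new("csawNormParam", binned = TRUE))
r2 <- normalize(1, new("swingsNormParam", span = 0.5))
```

Each call is routed to the method registered for its own parameter
class, and a second argument of any other class fails with the usual
"unable to find an inherited method" error.
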
>> H.
>>
>>
>> On 04/26/2016 03:03 PM, Aaron Lun wrote:
>>>
>>> Yes, but "monkeyBars" doesn't have quite the same pithiness for a
>>> package name.
>>>
>>> Anyway, the dual dispatch mechanism sounds most interesting. I assume
>>> that means we'd have to define some sort of base "normalizeParam" class,
>>> and then derive "csawNormParam" and "swingsNormParam" subclasses, so
>>> that specific methods can be defined for each signature.
>>>
>>> - Aaron
>>>
>>> Martin Morgan wrote:
>>>>
>>>>
>>>> On 04/26/2016 05:28 PM, Michael Lawrence wrote:
>>>>>
>>>>> On Tue, Apr 26, 2016 at 2:16 PM, Martin Morgan
>>>>> <martin.morgan at roswellpark.org>  wrote:
>>>>>>> On 04/26/2016 04:47 PM, Michael Lawrence wrote:
>>>>>>>>>
>>>>>>>>> On Tue, Apr 26, 2016 at 11:00 AM, Aaron Lun <alun at wehi.edu.au> wrote:
>>>> ...
>>>>>>>>>>> BiocGenerics. However, if some other hypothetical package (I'll call it
>>>>>>>>>>> "swings", for argument's sake) were to define a normalize() method with a
>>>> ...
>>>>>>>
>>>>>>> I like the dual dispatch method quite a bit (but wonder why we get several
>>>>>>> swings but only one csaw? Maybe a csaw implies two participants [though I
>>>>>>> think I once in a while csaw-ed alone], so a singular csaw and a pair of
>>>>>>> swings balance out?), partly because it's very easy to extend (write another
>>>>>>> method) and the second argument can be either lightweight or parameterized.
>>>>>>>
>>>>> I could go along with the dual dispatch. "Swings" is short for "Set of
>>>>> swings". Usually, there are several swings in a row, but only one
>>>>> see-saw.
>>>>>
>>>>
>>>> Googling for "how many swings per see-saw" took me to
>>>>
>>>>     https://www.cpsc.gov//PageFiles/108601/playgrnd.pdf
>>>>
>>>> where it is apparent that swings are much more dangerous than see-saws
>>>> (e.g., 51 matches for "swing" versus 4 for "see-saw"; "Swings ... were
>>>> involved in about 19 ... percent of injuries ... See-saws accounted
>>>> for about three percent"; "Homemade rope, tire, or tree swings were
>>>> also involved in a number of hanging deaths" [no mention of death by
>>>> see-saw]).
>>>>
>>>> I think for the sake of our users, especially our younger users, we do
>>>> not want to consider swings, or even methods on swings, further.
>>>>
>>>> Martin
>>>>
>>>
>>

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fredhutch.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319


