[Bioc-devel] Making non-bioconductor packages play nice with bioc (and vice versa)

Martin Morgan mtmorgan at fhcrc.org
Tue Sep 6 01:39:10 CEST 2011


On 09/05/2011 03:57 PM, Steve Lianoglou wrote:
> Hi Martin,
>
> First -- thanks for taking the time to reply to Q's over the holiday.
>
> Second -- I don't want this to come across as a rant or anything, and
> clearly this isn't a bioconductor thing, but more of a core R thing so
> feel free to tune out if you like. If there was an R-philosophy list,
> perhaps that'd be the best venue for further discussion, since what
> follows from here on out doesn't amount to much more than navel gazing
> at this point.

probably R-devel and cc'ing the maintainer of methods would be the 
closest to R-philosophy.

> Your reply is interesting since it shows how people think of what it
> means to be a "generic function" in different ways ... and when I say
> "people" I guess I really mean me :-)
>
> In short, I've always thought "the idea" of generic functions is that
> there isn't an IRanges version of cbind, and an xxx version of cbind,
> but rather "cbind" is a function that different classes can register
> functionality against if they so choose. Maybe a better example would
> be for the function `length` -- it's a function that is just defined

'length' is a good example

   > getGeneric("length")
   standardGeneric for "length" defined from package "base"

   function (x)
   standardGeneric("length", .Primitive("length"))
   <environment: 0x177cab8>
   Methods may be defined for arguments: x
   Use  showMethods("length")  for currently available ones.

the methods package has made it so that length behaves as a generic: it
dispatches across all methods, and packages that exportMethods(length) add
methods to the length generic. There is no IRanges::length generic, just
methods exposed via exportMethods.
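
To make that concrete, here is a minimal sketch of adding a method to the existing length generic. The class "Toy" and its slot "items" are made up for illustration and do not come from IRanges or any other package mentioned in this thread; in a package, the NAMESPACE would then carry exportMethods(length).

```r
library(methods)

# Hypothetical class, used only for illustration.
setClass("Toy", representation(items = "list"))

# No setGeneric() call here: 'length' already behaves as a generic,
# so setMethod() simply adds a method to it.
setMethod("length", "Toy", function(x) length(x@items))

toy <- new("Toy", items = list("a", "b", "c"))

length(toy)                     # dispatches to the Toy method
length(1:5)                     # ordinary vectors behave as before
existsMethod("length", "Toy")   # the method is registered on the generic
```

Nothing is masked in this case; the new class just gains a method on the one length that everybody already sees.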

> in different ways for different classes of things, but you expect most
> all types of classes to respond to `length`, so they should be free to
> implement this generic.
>
> After all, you are provided with a warning when you create a generic
> (say from a base function) that differs in signature from one that's
> already loaded in the working environment. True that warnings aren't
> errors, but still, I take it as something you're not *really* supposed
> to do.
>
> And actually, I think the way exporting of S3 methods from packages
> works is more in line with how I imagine S4 exports *should* work, i.e.
> in the NAMESPACE you could have `S3method(cbind, data.table)` and it
> just "registers" it to the generic `cbind` without masking `cbind`
> from base, or whatever other package wants to register a cbind method
> (too bad there are no "dummy" S4 versions of some of the functions
> found in base R (cbind, rbind, length, plot, etc.), as I reckon that
> would solve all these problems).

I think the S3 world is just a bit more comprehensive in what people
have lobbied for / R-core has chosen to implement as a generic -- cbind is
an S3 generic, but for example 'cat', 'args' and 'as.factor' aren't, and
a package wanting these as S3 generics would have to create a generic.
And two packages could create two separate generics. And...
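
A rough sketch of the difference, assuming a made-up S3 class "toy" (the class and the return values are purely illustrative). For cbind a method is enough; for as.factor a generic has to be created first, and that new generic masks the base function wherever it is visible.

```r
# cbind already dispatches on S3 classes, so defining a method suffices.
# "toy" is a hypothetical class used only for illustration.
cbind.toy <- function(..., deparse.level = 1) "cbind.toy called"
x <- structure(list(), class = "toy")
cbind(x)                  # reaches cbind.toy

# as.factor is an ordinary function, so a generic must be created.
# This definition masks base::as.factor for code that sees it.
as.factor <- function(x, ...) UseMethod("as.factor")
as.factor.default <- function(x, ...) base::as.factor(x)
as.factor.toy <- function(x, ...) factor("toy")

as.factor(x)              # reaches as.factor.toy
as.factor(c("a", "b"))    # falls through to the base implementation
```

A second package doing the same thing would end up with its own, separate as.factor generic, which is the situation described above.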

I don't know what has motivated S3 or S4 generic-ization of functions. 
There is work involved, and efficiency concerns. I don't think it's 
enough to say 'wouldn't it be nice if...' or 'for consistency...' but I 
don't have the magic recipe. Here's my last attempt

http://tolstoy.newcastle.edu.au/R/e13/devel/11/04/0976.html

Martin

> So -- with that in mind, I would like to just "add" my definition of
> `cbind` or `length` or whatever to the S4 generic that's already
> loaded (or make one, if there is no such generic). Particularly since
> these types of functions are broadly applicable to different types of
> objects.
>
> at the risk of being redundant:
>
> On Mon, Sep 5, 2011 at 5:50 PM, Martin Morgan <mtmorgan at fhcrc.org> wrote:
>> Hi Steve --
>>
>> On 09/05/2011 11:29 AM, Steve Lianoglou wrote:
> [trim]
>
>>> I've tried to "guard" against an already defined cbind generic (say,
>>> from IRanges), like:
>>>
>>> if (!isGeneric("cbind")) {
>>>    setGeneric("cbind",
>>>               function(..., deparse.level=1) standardGeneric("cbind"),
>>>               signature="...")
>>> }
>>
>> I don't really like this idea. It implies you'll take whatever cbind comes
>> along, whereas you're really interested in only specific cbinds (the one in
>> base, for instance).
>> You've already used a NAMESPACE to determine what
>> functions are reliably available, so either there's a cbind generic that
>> you're interested in attaching methods to or there isn't.
>
> That's the nut right there. I don't see it as taking "whatever cbind
> comes along" (even though that's what I'm doing I guess).
>
> I'm not interested in a specific cbind, I just want to add my specific
> implementation of cbind to the list of classes that a call to "cbind" will
> dispatch to ... making a user call data.table::cbind(...) makes me sad
> ... especially since I'll be one of those users ;-)
>
> I mean -- ultimately that's the whole point of generic functions (S3,
> S4, whatever) vs. "normal" functions, no? If I didn't use any generic
> function (S3, S4, or otherwise), I would expect to have to do
> IRanges::something vs. data.table::something, but in my mind generics
> are supposed to solve that problem.
>
> Anyway ... thanks again for taking the time to reply,
>
> -steve
>
>> Likely you want to
>> promote base cbind to a generic (since as you say it doesn't make sense to
>> introduce IRanges as a dependency).
>>
>>   setGeneric("cbind")
>>
>>> but exporting cbind from data.table's NAMESPACE will trample the cbind
>>> methods from IRanges (if IRanges is loaded first). Loading IRanges
>>> after data.table will hose the cbind/rbind defined in data.table.
>>
>> Conceptually, and my story might depart from reality here, I'd think that
>> data.table::cbind and IRanges::cbind are distinct generics, and that when
>> you say 'trample' you mean mask, i.e., IRanges generic and methods are still
>> available, just further down the search path than is found by default. So
>> the user has to IRanges::cbind or data.table::cbind. Within either package,
>> they both know the appropriate cbind generic and hence cbind methods, so no
>> need for pkg::cbind. Likewise, in a package that uses
>> importFrom(IRanges, cbind), dispatch within the package is unaffected by
>> what data.table does.
>>
>>> I guess one way to do it is to importFrom(IRanges, cbind, rbind) in
>>> data.table's NAMESPACE, but it wouldn't really be appropriate to
>>> introduce an IRanges dependency to data.table.
>>
>> agreed, this doesn't sound appropriate.
>>
>>> From reading through the R-exts doc, I get the feeling that what I'm
>>> trying to do is not possible, but maybe there's a way for me to do
>>> this that I'm overlooking?
>>>
>>> Will two different packages that declare and implement S4 functions by
>>> the same name always trample over one another unless one of them
>>> imports these functions from the other?
>>>
>>> And a random (somehow related), stylistic question. Out of curiosity:
>>> why doesn't IRanges "export" rbind, cbind through the exportMethods()
>>> directive, since it's an S4/generic (instead of just export())?
>>
>> I'm not the author so don't know for sure, and again my story might depart
>> from reality. But I think of the setGeneric in IRanges as producing a new
>> generic function, and the function is exported. This is different from
>> adding methods to an existing, perhaps already visible (e.g., 'show')
>> generic.
>>
>>> Thanks,
>>>
>>> -steve
>>>
>>> [1] data.table:
>>> http://cran.r-project.org/web/packages/data.table/index.html
>>>
>>
>>
>> --
>> Computational Biology
>> Fred Hutchinson Cancer Research Center
>> 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109
>>
>> Location: M1-B861
>> Telephone: 206 667-2793
>>
>
>
>




