[Bioc-devel] Making non-bioconductor packages play nice with bioc (and vice versa)

Kasper Daniel Hansen kasperdanielhansen at gmail.com
Tue Sep 6 01:41:50 CEST 2011


Steve,

I can totally see where you are coming from and I agree it could be a lot nicer.

Let me start by explaining one possible issue: whoever makes the call
to setGeneric decides what the basic arguments are to the generic
method.  Most of the S4 stuff in the core Bioconductor packages
handles this by having the signature be something like
  function(x, y, ...)
so that specific methods may add additional arguments.  But you cannot
always know this.  So if someone decides that length should be a
generic with a base signature of
  function(x)
there is no way you as a user can add arguments to this method.  This
is just an example of why the call to setGeneric matters.  [ In the
old days, this was "addressed" in Biobase by a construction like
   if(!isGeneric("foo"))
      setGeneric("foo"  ETC)
]
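
To make the signature point concrete, here is a minimal sketch. The
names narrowLen, wideLen and Track are made up for illustration and do
not come from any Bioconductor package. It assumes only the methods
package that ships with R.

```r
library(methods)

# Hypothetical generic whose formals are just (x).
setGeneric("narrowLen", function(x) standardGeneric("narrowLen"))

# Hypothetical generic with (x, ...): methods may add named arguments.
setGeneric("wideLen", function(x, ...) standardGeneric("wideLen"))

# Hypothetical class used only for this example.
setClass("Track", representation(pos = "numeric"))

# Works: the generic has '...', so the method can add 'na.rm'.
setMethod("wideLen", "Track", function(x, na.rm = FALSE, ...) {
  p <- x@pos
  if (na.rm) p <- p[!is.na(p)]
  length(p)
})

tr <- new("Track", pos = c(1, NA, 3))
wideLen(tr)                # counts all three positions
wideLen(tr, na.rm = TRUE)  # drops the NA first

# Fails: the generic has no '...', so the method cannot add 'na.rm'.
res <- tryCatch(
  setMethod("narrowLen", "Track", function(x, na.rm = FALSE) length(x@pos)),
  error = function(e) e
)
inherits(res, "error")
```

So whoever calls setGeneric first with the narrow form has fixed what
every later method is allowed to accept.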

In order to address this, currently you need to depend on (or import) the
package defining exactly the generic you want.
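
As a rough sketch, the NAMESPACE of a package doing this might look as
follows. The names myPkg, genericsPkg, foo and MyClass are placeholders
I made up, not real packages or generics.

```r
# NAMESPACE of a hypothetical package 'myPkg' (sketch only).
# 'genericsPkg', 'foo' and 'MyClass' are placeholder names.
importFrom(genericsPkg, foo)   # bind to the generic that genericsPkg defines
exportMethods(foo)             # export the methods myPkg adds to that generic
exportClasses(MyClass)         # export the class those methods dispatch on
```

With the import in place, setMethod("foo", ...) inside myPkg attaches
to that one generic rather than creating a second one.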

Now, this is rather irritating for all the standard R things like
length, dim, ...  If any of these are already S4 generics in R, you
are happy; it is clear what to do.  If not, there is quite a chance
that several of the packages you work with may define different
generics.  And while package writers can fix this with NAMESPACEs, it
sucks for users at the prompt.  Here I agree with you that all of this
combines to something less desirable.  Unfortunately I think that
there is no interest in making all of these basic functions into
generics in base R.

Now, one solution for Bioconductor, is to have a single package just
containing standard generics for the project.  This used to be
Biobase, so that all packages using some basic S4 depended on Biobase.
 Now with IRanges etc. and the many additions this is not true
anymore.  One possible "solution" going forward is to get some
consensus about the signatures and then design a single base
Bioconductor package containing simply a lot of calls to setGeneric.  Of
course, this will not necessarily help your use case a lot, but it
might at least provide more sanity within Bioc.
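
A minimal sketch of what I mean, run in a single session rather than as
real packages. The generic name "combine" and the classes Ranges2 and
Table2 are only illustrative stand-ins, and the (x, y, ...) signature
is an assumption about what people would agree on.

```r
library(methods)

# Part 1: what the shared, generics-only package would contain.
# It is nothing but setGeneric calls with the agreed signature.
setGeneric("combine", function(x, y, ...) standardGeneric("combine"))

# Part 2: two unrelated "packages" each register a method on that
# one generic. The classes are hypothetical stand-ins.
setClass("Ranges2", representation(start = "numeric"))  # "package A"
setClass("Table2", representation(rows = "list"))       # "package B"

setMethod("combine", signature("Ranges2", "Ranges2"), function(x, y, ...) {
  new("Ranges2", start = c(x@start, y@start))
})

setMethod("combine", signature("Table2", "Table2"), function(x, y, ...) {
  new("Table2", rows = c(x@rows, y@rows))
})

# One generic, dispatching to whichever method matches the classes.
r  <- combine(new("Ranges2", start = c(1, 2)), new("Ranges2", start = 3))
tb <- combine(new("Table2", rows = list("a")), new("Table2", rows = list("b")))
```

Neither class has to know about the other, and nothing gets masked,
because both sets of methods hang off the same generic.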

Kasper




On Mon, Sep 5, 2011 at 6:57 PM, Steve Lianoglou
<mailinglist.honeypot at gmail.com> wrote:
> Hi Martin,
>
> First -- thanks for taking the time to reply to Q's over the holiday.
>
> Second -- I don't want this to come across as a rant or anything, and
> clearly this isn't a bioconductor thing, but more of a core R thing so
> feel free to tune out if you like. If there was an R-philosophy list,
> perhaps that'd be the best venue for further discussion, since what
> follows from here on out doesn't amount to much more than navel gazing
> at this point.
>
> Your reply is interesting since it shows how people think of what it
> means to be a "generic function" in different ways ... and when I say
> "people" I guess I really mean me :-)
>
> In short, I've always thought "the idea" of generic functions is that
> there isn't an IRanges version of cbind, and an xxx version of cbind,
> but rather "cbind" is a function that different classes can register
> functionality against if they so choose. Maybe a better example would
> be for the function `length` -- it's a function that is just defined
> in different ways for different classes of things, but you expect most
> all types of classes to respond to `length`, so they should be free to
> implement this generic.
>
> After all, you are provided with a warning when you create a generic
> (say from a base function) that differs in signature from one that's
> already loaded in the working environment. True that warnings aren't
> errors, but still, I take it as something you're not *really* supposed
> to do.
>
> And actually, I think the way exporting of S3 methods from packages
> works is more in line with how I imagine S4 exports *should* work, i.e.
> in the NAMESPACE you could have `S3method(cbind, data.table)` and it
> just "registers" it to the generic `cbind` without masking `cbind`
> from base, or whatever other package wants to register a cbind method
> (too bad there are no "dummy" S4 versions of some of the functions
> found in base R (cbind, rbind, length, plot, etc.), as I reckon that
> would solve all these problems).
>
> So -- with that in mind, I would like to just "add" my definition of
> `cbind` or `length` or whatever to the S4 generic that's already
> loaded (or make one, if there is no such generic). Particularly since
> these types of functions are broadly applicable to different types of
> objects.
>
> At the risk of being redundant:
>
> On Mon, Sep 5, 2011 at 5:50 PM, Martin Morgan <mtmorgan at fhcrc.org> wrote:
>> Hi Steve --
>>
>> On 09/05/2011 11:29 AM, Steve Lianoglou wrote:
> [trim]
>
>>> I've tried to "guard" against an already defined cbind generic (say,
>>> from IRanges), like:
>>>
>>> if (!isGeneric("cbind")) {
>>>   setGeneric("cbind",
>>>              function(..., deparse.level=1) standardGeneric("cbind"),
>>>              signature="...")
>>> }
>>
>> I don't really like this idea. It implies you'll take whatever cbind comes
>> along, whereas you're really interested in only specific cbinds (the one in
>> base, for instance).
>> You've already used a NAMESPACE to determine what
>> functions are reliably available, so either there's a cbind generic that
>> you're interested in attaching methods to or there isn't.
>
> That's the nut right there. I don't see it as taking "whatever cbind
> comes along" (even though that's what I'm doing I guess).
>
> I'm not interested in a specific cbind, I just want to add my specific
> implementation of cbind to the list of classes that a call to "cbind" will
> dispatch to ... making a user call data.table::cbind(...) makes me sad
> ... especially since I'll be one of those users ;-)
>
> I mean -- ultimately that's the whole point of generic functions (S3,
> S4, whatever) vs. "normal" functions, no? If I didn't use any generic
> function (S3, S4, or otherwise), I would expect to have to do
> IRanges::something vs. data.table::something, but in my mind generics
> are supposed to solve that problem.
>
> Anyway ... thanks again for taking the time to reply,
>
> -steve
>
>> Likely you want to
>> promote base cbind to a generic (since as you say it doesn't make sense to
>> introduce IRanges as a dependency).
>>
>>  setGeneric("cbind")
>>
>>> but exporting cbind from data.table's NAMESPACE will trample the cbind
>>> methods from IRanges (if IRanges is loaded first). Loading IRanges
>>> after data.table will hose the cbind/rbind defined in data.table.
>>
>> Conceptually, and my story might depart from reality here, I'd think that
>> data.table::cbind and IRanges::cbind are distinct generics, and that when
>> you say 'trample' you mean mask, i.e., IRanges generic and methods are still
>> available, just further down the search path than is found by default. So
>> the user has to IRanges::cbind or data.table::cbind. Within either package,
>> they both know the appropriate cbind generic and hence cbind methods, so no
>> need for pkg::cbind. Likewise, in a package that import(IRanges, cbind),
>> dispatch within the package is un-affected by what data.table does.
>>
>>> I guess one way to do it is to importFrom(IRanges, cbind, rbind) in
>>> data.table's NAMESPACE, but it wouldn't really be appropriate to
>>> introduce an IRanges dependency to data.table.
>>
>> agreed, this doesn't sound appropriate.
>>
>>> From reading through the R-exts doc, I get the feeling that what I'm
>>> trying to do is not possible, but maybe there's a way for me to do
>>> this that I'm overlooking?
>>>
>>> Will two different packages that declare and implement S4 functions by
>>> the same name always trample over one another unless one of them
>>> imports these functions from the other?
>>>
>>> And a random (somehow related), stylistic question. Out of curiosity:
>>> why doesn't IRanges "export" rbind,cbind through the exportMethods()
>>> directive, since it's an S4/generic (instead of just export())?
>>
>> I'm not the author so don't know for sure, and again my story might depart
>> from reality. But I think of the setGeneric in IRanges as producing a new
>> generic function, and the function is exported. This is different from
>> adding methods to an existing, perhaps already visible (e.g., 'show')
>> generic.
>>
>>> Thanks,
>>>
>>> -steve
>>>
>>> [1] data.table:
>>> http://cran.r-project.org/web/packages/data.table/index.html
>>>
>>
>>
>> --
>> Computational Biology
>> Fred Hutchinson Cancer Research Center
>> 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109
>>
>> Location: M1-B861
>> Telephone: 206 667-2793
>>
>
>
>
> --
> Steve Lianoglou
> Graduate Student: Computational Systems Biology
>  | Memorial Sloan-Kettering Cancer Center
>  | Weill Medical College of Cornell University
> Contact Info: http://cbio.mskcc.org/~lianos/contact
>
> _______________________________________________
> Bioc-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>


