[Bioc-devel] implementing interfaces

Robert Gentleman rgentlem at fhcrc.org
Wed Nov 26 06:59:21 CET 2008


Leaping in - where it might be best not to,
the point that James is making is that, if we have a class, foo, on which we
have defined a number of operations (e.g., exprs(x), pData(x), etc.) that retrieve
somewhat well-defined entities, then an alternative to a class-centric method of
dispatch is to rely on the accessors (what James is calling an interface).

In that case, there is not a lot to be gained by defining an interface class;
one could just have a regular function (and indeed code reuse could be obtained
by having the methods just be interfaces to that function).
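
To make that concrete, here is a minimal sketch. All of the names in it
(intensities, subtractColumnMin, bgCorrect, SlotBacked, ListBacked) are
hypothetical and are not taken from affy or Biobase, and the computation is only
a stand-in for a real background correction. It shows one ordinary function
written purely against an accessor, two unrelated classes that supply that
accessor, and a method that is just a thin interface to the function.

```r
library(methods)

# Hypothetical accessor generic: the only thing the algorithm relies on.
setGeneric("intensities", function(object, ...) standardGeneric("intensities"))

# Ordinary function written against the accessor, never against slots.
# Illustrative computation only: subtract each column's minimum.
subtractColumnMin <- function(x) {
    m <- intensities(x)
    sweep(m, 2, apply(m, 2, min))
}

# One representation: intensities stored directly in a matrix slot.
setClass("SlotBacked", representation(mat = "matrix"))
setMethod("intensities", "SlotBacked", function(object, ...) object@mat)

# An unrelated representation: intensities stored as a list of columns.
setClass("ListBacked", representation(cols = "list"))
setMethod("intensities", "ListBacked",
          function(object, ...) do.call(cbind, object@cols))

# Code reuse: a method that is only an interface to the plain function.
setGeneric("bgCorrect", function(object, ...) standardGeneric("bgCorrect"))
setMethod("bgCorrect", "SlotBacked",
          function(object, ...) subtractColumnMin(object))

a <- new("SlotBacked", mat = matrix(c(3, 5, 4, 10, 12, 11), ncol = 2))
b <- new("ListBacked", cols = list(c(3, 5, 4), c(10, 12, 11)))

# Both objects work with the same function, with no shared superclass.
resA <- subtractColumnMin(a)
resB <- subtractColumnMin(b)
```

The only thing a new class has to provide in this sketch is the accessor; how
it stores its data is left to whoever writes the class.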

I disagree a little with the characterization that it is a more
developer-friendly paradigm; that depends on which developer we are talking
about: the one doing the original implementation, or the one who wants to add
to it.  I think that the approach James suggests makes more work initially, and
less later on.  But in any event, it is a good thing for folks to think about.

The good thing is that if things are as James suggests, I don't think he needs
to write a very big patch to get the behavior he wants :-). And I at least don't
see the discussion as complaining.

 best wishes
   Robert


Sean Davis wrote:
> On Tue, Nov 25, 2008 at 9:04 PM, James Bullard <bullard at berkeley.edu> wrote:
> 
>> On Nov 25, 2008, at 5:27 PM, Sean Davis wrote:
>>
>>
>>> On Tue, Nov 25, 2008 at 7:22 PM, James Bullard <bullard at berkeley.edu>
>>> wrote:
>>> Thanks all for the comments. I somehow never happened on
>>> showMethods(class="AffyBatch"). Comments below.
>>>
>>>
>>> On Nov 25, 2008, at 1:08 PM, Robert Gentleman wrote:
>>>
>>> Hi,
>>>
>>> Martin Morgan wrote:
>>> James Bullard wrote:
>>> hi all, this will probably demonstrate my lack of knowledge concerning
>>> OOP in R, but I am hoping for some quick answers. This is a problem I
>>> have faced before.
>>>
>>> I want to use the method bg.correct.mas, this method takes as its
>>> object an AffyBatch. I don't have an AffyBatch nor do I want to
>>> massage my data structures into such an object, so I want to implement
>>> the AffyBatch interface. However, I can see no way to determine the
>>> list of generics which have methods defined on AffyBatch (and
>>> superclasses). I understand that things are method-centric, however I
>>> assume that being method-centric still leaves room for a way to know
>>> the methods specialized for a class/interface so that I as a
>>> programmer can define the suitable methods on a new class without
>>> having to dig around all over the place determining what I need to
>>> define. My question is:
>>>
>>> is there a function to determine all the methods that are specialized
>>> for a certain class? Also, would it be possible to write a function
>>> adheresTo(classA, classB), which tells me that classB satisfies the
>>> calling requirements of classA (forget for a moment that we have
>>> public member variables).
>>>
>>> note, i don't want to make a subclass of AffyBatch.
>>>
>>>
>>> showMethods(class="AffyBatch")
>>>
>>> gets you some of the way there. But in a brand spanking new session it
>>> shows nothing (because the packages where methods are defined, e.g.,
>>> affy, has not been loaded) and this illustrates a fundamental problem:
>>> the interface to AffyBatch is dynamically determined by loaded packages.
>>>
>>> Likely you'd invoke
>>>
>>> showMethods(class="AffyBatch", where=getNamespace("affy"))
>>>
>>> to get a kind of base-line set of expected methods, i.e., those visible
>>> from affy.
>>>
>>>  But this all seems like a weird way to go.  I guess I don't really see the
>>> use case for:
>>>  I know this class (AffyBatch), and I want to find all generics that have
>>> methods for it, so I can implement my own class and the same set of
>>> methods.
>>>
>>>
>>>  Why not figure out what operations you think you want to support, and
>>> support those? What is so special about the set of generic functions that
>>> have AffyBatch methods that you would want to support those?
>>>
>>> Nothing is special about the set of generic functions that have AffyBatch
>>> methods except that the bg.correct.mas is defined to take an AffyBatch
>>> object and if I want to call this function I have to subclass AffyBatch or
>>> construct an AffyBatch. I don't have a CDF file, but I do have a notion of
>>> PM/MM probes. The real issue is that methods which are defined on classes
>>> are often unusable when you can't construct the object. The ideal way to do
>>> this would be (in my opinion)
>>>
>>> setMethod("indexProbes", signature(object = "AffyInterface", which =
>>> "character"), function(object, which, ...) {
>>>       stop("No suitable method defined.")
>>> })
>>>
>>> setMethod("bg.correct.mas", signature(object = "AffyInterface"),
>>> function(object, ...) {
>>>       # current definition goes through.
>>> })
>>>
>>> Where AffyInterface is a virtual class (also no member variables) but has
>>> a set of methods associated with it. This is a much more "developer"
>>> friendly way to do it because then I don't have to concern myself with the
>>> internals of AffyBatch, just the interface that I need to implement to call
>>> a particular generic that I am interested in. I then would subclass the
>>> AffyInterface class which would impose no stricture on how I represent my
>>> class members, it would only (although not explicitly) proclaim that I
>>> implement the set of methods in AffyInterface.
>>>
>>> As Martin and I pointed out, perhaps in less than transparent words, there
>>> is no static concept of an interface to implement.  The "interface" is
>>> defined dynamically based on what methods are available in the search path.
>>> If there is a particular method for an AffyBatch that you want to
>>> implement for your class, you are free to do so by simply creating that
>>> method for your class.  If you think about it, what you are proposing is to
>>> do exactly that, since the methods really do need to access the internals of
>>> the class in order to operate.
>>>
>> I think you are not understanding. What you say above is incorrect
>> ("methods really do need to access the internals of the class");
>> bg.correct.mas is a perfect example. If it *really* needed to access the
>> internals of a class then you would have a lot of @s in the code, but you
>> have none. This is because you can abstract the concepts of bg.correct.mas
>> away from a particular representation of the data. This is what I am trying
>> to drive home.
>>
>>
> The reason that there are no @s in bg.correct.mas is that other methods are
> used (notice that indexProbes() and intensity() are both methods).  Of
> course, those other methods (or methods they are built on) need to access
> the internals.  If you want to have methods that abstract out the @s, you
> have to write them for your class.  Alternatively, you need to make your
> object look like an AffyBatch so that you benefit from its methods via
> inheritance or directly.
> 
> Sean
> 
> 
>>
>>>  The class has no methods "built into" it like other languages where the
>>> methods belong to the class directly.
>>>
>> This is not at issue here. If you have an API which is object oriented then
>> what is the right way to expose the functionality of that library? If you
>> tie the logic methods into a particular representation then the only thing
>> you gain from object orientation is encapsulation, but you have lost the
>> benefit of dispatch because it is *too* hard to subclass when
>> representations become complicated, in other words you tie your contract
>> (the set of methods that something needs to implement) to an implementation
>> -- this might not be a bad way to design code for an in-house project, but
>> it is a bad way to make/expose libraries to other programmers.
>>
>> Also, I do understand that I could use setMethod downstream on classes thus
>> making the interface a moving target, but I think that is irrelevant here.
>> The core classes and generics of bioconductor could be explicitly defined at
>> release time and I could depend on a set of functionality exposed. Also, I
>> understand that some of the particulars of R make this a little challenging,
>> but I do want to stress that I think it is an error to assume that I somehow
>> need to have a class to gain access to functionality -- that is an
>> implementation issue.
>>
>> thanks, jim
>>
>>
>>>  As Robert was pointing out, the simplest way to reuse the code written for
>>> AffyBatch is to make your object look like an AffyBatch.  You can, of
>>> course, rewrite all the methods for your own data structure which might be
>>> more efficient if you have only one or two methods that you want to
>>> implement.
>>>
>>> Hope that helps a bit.
>>>
>>> Sean
>>>
>>>
>>>
>>>
>>>
>>>  And then, typically the simplest way to get there is to define a coerce
>>> method.  I know you say you don't want to do that, but basically, if the
>>> class (AffyBatch) is widely used (I mean here that lots of methods are
>>> defined for it, not that people use it) that exercise its components you
>>> will need to have a class that is basically isometric to it, and then why
>>> would you want the additional code base that comes with having your own
>>> methods?
>>>
>>>
>>> I agree that I basically need a class which is isometric to AffyBatch and
>>> that is what I believe the problem to be. I cannot get at some functionality
>>> because in a practical sense the generics have been tied to a class. There
>>> is only one way to use the functionality of bg.correct.mas and that is be an
>>> AffyBatch. I think that this is fine when it is strictly necessary or when
>>> you are designing end-user code, but if you want programmers to build on top
>>> of things then it is limiting.
>>>
>>> So I know that affy is not part of Biobase and I can solve the problem
>>> without too much trouble in a number of ways and that the primary goal of
>>> the package is for the end user, but I guess I am advocating for a design by
>>> interface approach so that particular choice of representation is left to
>>> the programmer. Thanks for the feedback and please understand that I
>>> appreciate all of the hard work even though I am complaining a bit.
>>>
>>>
>>> jim
>>>
>>>
>>>
>>> showMethods returns a connection (!) which is basically useless for
>>> programmatic purposes, e.g., adheresTo().
>>>
>>>  true, but you can get a slightly (only slightly) more helpful answer if you
>>> set printTo=FALSE (in a recent revision of R 2.9.0 candidate)
>>>
>>>  Robert
>>>
>>>
>>> Martin
>>>
>>>
>>> thanks, jim
>>>
>>> _______________________________________________
>>> Bioc-devel at stat.math.ethz.ch mailing list
>>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>>
>>>
>>>
>>> --
>>> Robert Gentleman, PhD
>>> Program in Computational Biology
>>> Division of Public Health Sciences
>>> Fred Hutchinson Cancer Research Center
>>> 1100 Fairview Ave. N, M2-B876
>>> PO Box 19024
>>> Seattle, Washington 98109-1024
>>> 206-667-7700
>>> rgentlem at fhcrc.org
>>>
>>
> 
> 

-- 
Robert Gentleman, PhD
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M2-B876
PO Box 19024
Seattle, Washington 98109-1024
206-667-7700
rgentlem at fhcrc.org


