[Rd] S4 accessors

Henrik Bengtsson hb at stat.berkeley.edu
Thu Sep 28 00:00:06 CEST 2006

On 9/27/06, John Chambers <jmc at r-project.org> wrote:
> There is a point that needs to be remembered in discussions of accessor
> functions (and more generally).
> We're working with a class/method mechanism in a _functional_ language.
> Simple analogies made from class-based languages such as Java are not
> always good guides.
> In the example below, "a function foo that only operates on that class"
> is not usually a meaningful concept in R.   Whereas in Java a method can
> only be invoked "on" an object, given the syntax of the Java language, R
> (that is, the S language) is different.  You can intend a function to be
> used only on one class, but that isn't normally the way to think about R
> software.
> Functions are first-class objects and in principle every function should
> have a "function", a purpose.  Methods implement that purpose for
> particular combinations of arguments.
> Accessor functions are therefore a bit anomalous.  If they had a
> standard syntactic pattern, say get_foo(object), then it would be more
> reasonable to think that you're just defining a method for that function
> for a given class that happens to have a slot with the particular name,
> "foo".
> Also, slot-setting functions will be different in R because we deal with
> objects, not object references as in Java.  An R-like naming convention
> would be something along the lines of
>   set_foo(object) <- value
> but in any case one will need to use replacement functions to conform to
> the way assignments work.

In the Object class system of the R.oo package I have for years worked
successfully with what I call virtual fields.  I find them really
useful and convenient to work with.

These work as follows: if there is a get<Field>(object) function, it
is called whenever object$<field> is evaluated.  If there is no such
function, the internal field '<field>' is accessed (from the
environment in which all fields live).  Similarly, object$<field> <-
value checks for set<Field>(object, value), which is called if
available. [I work with environments/references, so my set functions
don't really have to be replacement functions, but there is nothing
preventing them from being such.]
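
The lookup rule above can be imitated in a few lines of base R.  This
is only a rough sketch of the idea, not the R.oo implementation: the
class name VObject, the constructor newVObject() and the helper
capitalize() are made up for illustration, and a real version would
want a stricter way of finding accessors than a plain name lookup.

```r
# Sketch only: base-R imitation of "virtual fields"; not R.oo's code.

# Hypothetical helper: "fullname" -> "Fullname"
capitalize <- function(name) {
  paste0(toupper(substring(name, 1, 1)), substring(name, 2))
}

# Fields live in an environment; the class attribute enables S3
# dispatch on `$` and `$<-`.
newVObject <- function(...) {
  obj <- list2env(list(...), envir = new.env(parent = emptyenv()))
  class(obj) <- "VObject"
  obj
}

# object$<field>: call get<Field>(object) if such a function exists,
# otherwise read the stored field from the environment.
`$.VObject` <- function(x, name) {
  getter <- paste0("get", capitalize(name))
  if (exists(getter, mode = "function")) {
    f <- get(getter, mode = "function")
    return(f(x))
  }
  get(name, envir = x, inherits = FALSE)
}

# object$<field> <- value: call set<Field>(object, value) if such a
# function exists, otherwise store the value directly.
`$<-.VObject` <- function(x, name, value) {
  setter <- paste0("set", capitalize(name))
  if (exists(setter, mode = "function")) {
    f <- get(setter, mode = "function")
    f(x, value)
  } else {
    assign(name, value, envir = x)
  }
  x
}

# A virtual field "fullname" backed by two stored fields.
getFullname <- function(this) paste(this$first, this$last)
setFullname <- function(this, value) {
  parts <- strsplit(value, " ", fixed = TRUE)[[1]]
  assign("first", parts[1], envir = this)
  assign("last", parts[2], envir = this)
  invisible(this)
}

obj <- newVObject(first = "Ada", last = "Lovelace")
obj$first                     # no getFirst(): stored field is read
obj$fullname                  # getFullname() exists: it is called
obj$fullname <- "Grace Hopper"
obj$last                      # updated through setFullname()
```

Because the object is an environment, the set function modifies it in
place and need not be a replacement function.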

There are several advantages to doing it this way.  You can protect
fields behind a set function, for instance to prevent assignment of
negative values and the like, e.g.

  circle$radius <- -5
  Error: Negative radius: -5

You can also provide redundant fields in your API, e.g.

  circle$radius <- 5
  circle$area <- 4

and so on. How the circle is represented internally does not matter
and may change over time. With such a design you, as a software
developer, don't have to worry about that; the API stays stable.  I
think this scheme carries over perfectly to S4 and '@'.
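
For S4, one possible shape of this is accessor generics plus
replacement methods, in line with John's remark about replacement
functions.  The following is a sketch under assumptions: the class
Circle, its slot 'radius' and the generics radius()/area() are invented
here, and the interception of '@' itself is not shown.

```r
library(methods)

# Hypothetical class: only the radius is stored.
setClass("Circle", representation(radius = "numeric"))

setGeneric("radius", function(object) standardGeneric("radius"))
setGeneric("radius<-",
           function(object, value) standardGeneric("radius<-"))
setGeneric("area", function(object) standardGeneric("area"))
setGeneric("area<-",
           function(object, value) standardGeneric("area<-"))

setMethod("radius", "Circle", function(object) object@radius)

# The setter guards the slot against negative values.
setReplaceMethod("radius", "Circle", function(object, value) {
  if (value < 0) stop("Negative radius: ", value)
  object@radius <- value
  object
})

# "area" is a redundant field: derived from, and stored as, the radius.
setMethod("area", "Circle", function(object) pi * radius(object)^2)

setReplaceMethod("area", "Circle", function(object, value) {
  if (value < 0) stop("Negative area: ", value)
  radius(object) <- sqrt(value / pi)
  object
})

circle <- new("Circle", radius = 1)
radius(circle) <- 5
area(circle) <- 4
radius(circle)   # now sqrt(4 / pi)
```

If the class later stored the area instead of the radius, only the four
methods would change; code calling radius() and area() would not.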

FYI: I used the above naming convention because I did this long
before the '_' operator was redefined.

Comment: If you don't want the user to access a slot/field directly, I
recommend naming the slot with a period prefix, e.g. '.radius'.  This
at least gives the user a chance to understand your design, although
it does not prevent them from misusing it.  The period prefix is also
"standard" for "private" objects, cf. ls(all.names=FALSE/TRUE).

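A small illustration of the ls() behaviour referred to above, using a
plain environment; the field names are made up for the example:

```r
# Hypothetical object with one "private" and one public field.
circle <- new.env()
assign(".radius", 5, envir = circle)
assign("units", "cm", envir = circle)

ls(circle)                     # period-prefixed names are hidden
ls(circle, all.names = TRUE)   # ... and listed only on request

# The prefix is a convention, not protection: direct access still works.
get(".radius", envir = circle)
```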

> Ross Boylan wrote:
> > On Tue, 2006-09-26 at 10:43 -0700, Seth Falcon wrote:
> >
> >> Ross Boylan <ross at biostat.ucsf.edu> writes:
> >>
> >
> >
> >>>> If anyone else is going to extend your classes, then you are doing
> >>>> them a disservice by not making these proper methods.  It means that
> >>>> you can control what happens when they are called on a subclass.
> >>>>
> >>> My style has been to define a function, and then use setMethod if I want
> >>> to redefine it for an extension.  That way the original version becomes
> >>> the generic.
> >>>
> >>> So I don't see what I'm doing as being a barrier to adding methods.  Am
> >>> I missing something?
> >>>
> >> You are not, but someone else might be: suppose you release your code
> >> and I would like to extend it.  I am stuck until you decide to make
> >> generics.
> >>
> > This may be easier to do concretely.
> > I have an S4 class A.
> > I have defined a function foo that only operates on that class.
> > You make a class B that extends A.
> > You wish to give foo a different implementation for B.
> >
> > Does anything prevent you from doing
> > setMethod("foo", "B", function(x) blah blah)
> > (which is the same thing I do when I make a subclass)?
> > This turns my original foo into the catchall method.
> >
> > Of course, foo is not appropriate for random objects, but that was true
> > even when it was a regular function.
> >
> >
> >>> Originally I tried defining the original using setMethod, but this
> >>> generates a complaint about a missing function; that's one reason I fell
> >>> into this style.
> >>>
> >> You have to create the generic first if it doesn't already exist:
> >>
> >>    setGeneric("foo", function(x) standardGeneric("foo"))
> >>
> > I wonder if it might be worth changing setMethod so that it does this by
> > default when no existing function exists. Personally, that would fit the
> > style I'm using better.
> >
> >>>> For accessors, I like to document them in the methods section of the
> >>>> class documentation.
> >>>>
> >>> This is for accessors that really are methods, not my fake
> >>> function-based accessors, right?
> >>>
> >> Which might be a further argument not to have the distinction in the
> >> first place ;-)
> >>
> >> To me, simple accessors are best documented with the class.  If I have
> >> an instance, I will read help on it and find out what I can do with
> >> it.
> >>
> >>
> >>> If you use foo as an accessor method, where do you define the associated
> >>> function (i.e., \alias{foo})? I believe such a definition is expected by
> >>> R CMD check and is desirable for users looking for help on foo (?foo)
> >>> without paying attention to the fact it's a method.
> >>>
> >> Yes you need an alias for the _generic_ function.  You can either add
> >> the alias to the class man page where one of its methods is documented
> >> or you can have separate man pages for the generics.  This is
> >> painful.  S4 documentation, in general, is rather difficult and IMO
> >> this is in part a consequence of the more general (read more powerful)
> >> generic function based system.
> >>
> > As my message indicates, I too am struggling with an appropriate
> > documentation style for S4 classes and methods.  Since "Writing R
> > Extensions" has said "Structure of and special markup for documenting S4
> > classes and methods are still under development." for as long as I can
> > remember, perhaps I'm not the only one.
> >
> > Some of the problem may reflect the tension between conventional OO and
> > functional languages, since R remains the latter even under S4.  I'm not
> > sure if it's the tools or my approach that is making things awkward; it
> > could be both!
> >
> > Ross
> >
> > ______________________________________________
> > R-devel at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
> >
> >
