[Rd] s4 methods and base
John Chambers
jmc at research.bell-labs.com
Mon Aug 11 18:03:14 MEST 2003
"Marsland, John" wrote:
>
> I'm sure that many people are in the same position as me in that they are
> trying to write packages and code that is vaguely "future proof".
>
> Would it be possible to get some guidance on how the R-core team see the
> evolution of the "base" package with regard to s4 methods.
>
> There seem to be quite a lot of inconsistencies between s3 and s4 methods
> and classes currently and this (I'm sure) is only to be expected in a period
> of transition. eg POSIXlt vs POSIXt and POSIXct. And there seem like dozens
> of print methods to convert - it's not an enviable task and I'm sure it will
> take time! ... if indeed you do see R being purged of s3 by some point in
> the future.
>
> But I think it would be helpful if there was some general guidance in terms
> of the direction R is going?
Your concerns are reasonable, and deserve some thoughtful discussion
from the community. r-devel is a good place to start.
The topic is not really S4 methods and base, but what S4 methods and
classes imply about S3 methods and classes. (The S4 methods and classes
are not in package base, and the overall drift of thinking at the moment
is not towards adding to base, but if anything considering unbundling of
some material from base.)
So your request might be rephrased as asking for advice to owners of
software that currently uses S3 methods and classes.
There is no proposal to make S3 methods defunct in the foreseeable
future. Owners of existing code that uses them should not feel
pressured to rewrite the code, solely to keep the code working.
It's technically true that existing S3 methods software can just ignore
the existence of S4 methods and classes, at least under some
circumstances. But maintainers of software using S3 methods and classes
might want to consider conversions or partial conversions, when and if
they decide to revise the software.
Similarly, the methods package COULD simply have ignored existing S3
classes. There would then be no "inconsistencies" because S3 classes
are not "classes" at all in the S4 sense: there is no definition for
them, only objects that contain the class name in their "class"
attribute. But in fact the methods package provides some heuristic
mapping of "old classes" into new classes. It will be useful to get
feedback if the mapping is not right in specific examples.
To summarize the situation so far:
1. If S3 classes are not defined as S4 classes, no methods can be
written for S4 generic functions for them, and objects from these
classes cannot be slots in S4 classes. So some attempt to link the two
seemed worthwhile.
2. In the methods package, an attempt was made to map known S3 classes
in the R code in base into S4 classes, reflecting S3 inheritance, so
that existing classes could be used for methods and slots.
Here's how that was done:
An object using S3 methods can have one or more strings in its class
attribute. Heuristically, the first of these strings is interpreted as
"the" class of the object. Subsequent strings are classes which the
first class inherits from--"superclasses" of the first class, in common
terminology. (See section A.5 of the "Statistical Models in S" book,
pp. 463-467.)
With S4 classes, every object has a single class, with an explicit
definition. That class can have superclasses (defined as the classes
this class contains). ALL objects from the class have the same single
string as the value of class(x).
The goal was to map each S3 class into an S4 class, and to infer
superclasses from places where two or more strings were used as a class
attribute. Clearly a guessing game, because there really is no
"definition" of an S3 class.
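As a rough illustration of the contrast described above, here is a minimal
sketch. The class names "child", "parent", "Child" and "Parent" are
hypothetical and chosen only for this example; they are not classes from
base or the methods package.

```r
library(methods)

# S3: the "class" is just whatever strings sit in the class attribute.
# (Hypothetical class names, for illustration only.)
x <- structure(list(value = 1), class = c("child", "parent"))
class(x)                # both strings are returned
inherits(x, "parent")   # true by S3 convention: "parent" follows "child"

# S4: an explicit definition; superclasses are the classes it contains.
setClass("Parent", representation(value = "numeric"))
setClass("Child", contains = "Parent")
y <- new("Child", value = 1)
class(y)                # a single string for every object from the class
is(y, "Parent")         # true because the definition says so
```

Nothing in the S3 half records that "child" inherits from "parent"; that
relation exists only in the attribute of this one object, which is why the
mapping has to be inferred.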
So, for example, there appear to be objects with class attribute
c("ordered", "factor"), meaning that the object has main class "ordered"
but inherits from "factor". This gets mapped into S4 classes:
--------------
R> getClass("ordered")
Virtual Class
No Slots, prototype of class "NULL"
Extends:
Class "factor", directly
Class "oldClass", by class "factor"
R> getClass("factor")
Virtual Class
No Slots, prototype of class "NULL"
Extends: "oldClass"
Known Subclasses: "ordered"
----------------
Both classes are "virtual" in the S4 sense, because you can't say
new("ordered"), and you can't say that because we haven't tried to
legislate what "slots" objects from any of these classes have. And an
object from a non-virtual S4 class will have a single string as its
class, meaning that S3 inheritance would cease to work.
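For S3 classes outside base, the same kind of registration can be sketched
with setOldClass() from the methods package. The names "myOrdered",
"myBase", "Holder" and "describeItem" below are hypothetical, and this is
only an illustration of the idea, not the code used to build the mapping
for base.

```r
library(methods)

# Register a hypothetical S3 class and its S3-style superclass.
setOldClass(c("myOrdered", "myBase"))

isVirtualClass("myOrdered")       # registered as a virtual class
extends("myOrdered", "myBase")    # the S3 inheritance is reflected
extends("myOrdered", "oldClass")

# Once registered, the class can be used for a slot ...
setClass("Holder", representation(item = "myBase"))
obj <- structure(list(), class = c("myOrdered", "myBase"))
h <- new("Holder", item = obj)

# ... and in the signature of a method for an S4 generic.
setGeneric("describeItem", function(x) standardGeneric("describeItem"))
setMethod("describeItem", "myBase",
          function(x) "an object inheriting from myBase")
describeItem(obj)
```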
Undoubtedly, some existing classes were missed and/or misinterpreted.
Feedback on these (as in your second e-mail) will be helpful. Yes, it
looks as if POSIXlt slipped through the cracks, because there wasn't any
R code that assigned that as a class.
As for the relation of the POSIX* classes: Your interpretation (in your
second e-mail) sounds reasonable--that POSIXt is a superclass to both
POSIXlt and POSIXct. Unfortunately both the literal interpretation of
the documentation and the code itself seem to contradict that. There
are several instances of expressions such as:
class(z) <- c("POSIXt", "POSIXct")
This says the opposite: that POSIXct is a superclass of POSIXt. There
doesn't seem to be any code that links POSIXlt to POSIXt.
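To see why the order of the strings matters, here is a small sketch using a
hypothetical S3 generic "who" and hypothetical classes "specific" and
"general" (none of these exist in base). The first string in the class
attribute is dispatched on first; later strings are reached only through
NextMethod().

```r
# Hypothetical S3 generic and methods, for illustration only.
who <- function(x, ...) UseMethod("who")
who.general  <- function(x, ...) "general"
who.specific <- function(x, ...) c("specific", NextMethod())

a <- structure(list(), class = c("specific", "general"))
b <- structure(list(), class = c("general", "specific"))

who(a)   # the "specific" method runs first, then passes on to "general"
who(b)   # the "general" method runs first; "specific" is never reached
```

With the order c("POSIXt", "POSIXct"), it is POSIXt that plays the role of
"the" class, and POSIXct that of the class it inherits from.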
3. Anyway, to get back to some general suggestions. The heuristic
mapping in the methods package has to be just a stop-gap. We should try
to fix errors and omissions, but as noted, we can't push it much further
at least for objects with more than one string in their class
attribute. And there are several examples that just can't be included,
because objects start with the same class string, but then follow that
with DIFFERENT superclass strings. (The use of "aov" and "maov" seems
to be of this form.) No single mapping to S4 classes can capture this
behavior explicitly.
There seem to be some examples where the S3 inheritance wasn't
understood or was used in a way inconsistent with ANY S4 class
structure.
A better solution in the long run is to try to convert S3 classes to
non-virtual S4 classes, where possible.
Although owners of S3 class software shouldn't feel threatened, they
might consider conversions, perhaps when revising the software for other
reasons:
- classes that don't have multiple strings in the class attribute can
often just be converted to a non-virtual S4 class, so long as objects
always have the same attributes. Attributes go into slots (the slot
must have some specified class, but there are ways to allow some
variation in the actual type of data in the slot). S3 methods can
generally stay unchanged.
- classes with multiple strings can also be converted, again if they
have consistent attributes. In this case, DIRECT S3 methods (those
dispatching from the first string) are OK, but INHERITED S3 methods are
not, so some conversion to S4 methods may be needed.
- classes that have inconsistent superclasses from one object to
another may just not be convertible, but it's worth looking at examples.
When conversion or re-design is feasible, it's likely worthwhile in the
long run.
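A minimal sketch of the first kind of conversion, assuming a hypothetical
S3 class "measure" whose objects always carry "units" and "label"
attributes. The generic "report", the slot names, and the class union
"characterOrNumeric" are likewise made up for the example.

```r
library(methods)

# Hypothetical S3-style object: numeric data plus two attributes.
old <- structure(c(1.5, 2.5), units = "cm", label = "widths",
                 class = "measure")

# An existing S3 generic with a direct S3 method for "measure".
report <- function(x, ...) UseMethod("report")
report.measure <- function(x, ...) "a measure object"

# A class union is one way to allow some variation in a slot's type.
setClassUnion("characterOrNumeric", c("character", "numeric"))

# The non-virtual S4 version: the data and each attribute become slots.
setClass("measure",
         representation(values = "numeric",
                        units  = "character",
                        label  = "characterOrNumeric"))

new_obj <- new("measure",
               values = as.numeric(old),
               units  = attr(old, "units"),
               label  = attr(old, "label"))

isVirtualClass("measure")   # FALSE: new("measure", ...) is allowed
report(new_obj)             # the direct S3 method still dispatches
```

The S3 method is left exactly as it was; only the way objects are created
changes.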
Regards,
John Chambers
>
> For my own part I am very impressed with the power of the s4 methods and the
> functional nature of the language. But there are some comparisons with (say)
> Python that would make it even better. Many of these relate to the base
> package and the very diverse collection of demos, stats, dates and file
> handling etc... are there plans to break this up so as to make the
> instruction set smaller (an obvious use of s4 methods) and the whole
> language more light-weight and like a scripting language. Others seem to be
> thinking in this direction with the addition of "import" and namespaces, but
> more detail (or a pointer towards it if it exists!) would be helpful.
>
> Regards,
>
> John Marsland
>
> **********************************************************************
> This is a commercial communication from Commerzbank AG.\ \ T...{{dropped}}
>
> ______________________________________________
> R-devel at stat.math.ethz.ch mailing list
> https://www.stat.math.ethz.ch/mailman/listinfo/r-devel
--
John M. Chambers jmc at bell-labs.com
Bell Labs, Lucent Technologies office: (908)582-2681
700 Mountain Avenue, Room 2C-282 fax: (908)582-3340
Murray Hill, NJ 07974 web: http://www.cs.bell-labs.com/~jmc