[R] Google's R Style Guide (has become S3 vs S4, in part)
Martin Maechler
maechler at stat.math.ethz.ch
Tue Sep 8 11:59:15 CEST 2009
>>>>> Martin Morgan <mtmorgan at fhcrc.org>
>>>>> on Tue, 01 Sep 2009 09:07:05 -0700 writes:
> spencerg wrote:
>> Bryan Hanson wrote:
>>> Looks like the discussion is no longer about R Style, but S3 vs S4?
> yes nice topic rename!
>>>
>>> To that end, I asked more or less the same question a few weeks ago,
>>> arising from much the same motivations. The discussion was helpful,
>>> here's the link:
>>> http://www.nabble.com/Need-Advice%3A-Considering-Converting-a-Package-from-S3-to-S4-tc24901482.html#a24904049
>>>
>>> For what it's worth, I decided, but with some ambivalence, to stay
>>> with S3 for now and possibly move to S4 later. In the spirit of S4,
>>> I did write a function that is nearly the equivalent of validObject
>>> for my S3 object of interest.
>>>
>>> Overall, it looked like I would have to spend a lot of time moving to
>>> S4, while staying with S3 would allow me to get the project done and
>>> get results going much faster (see Frank Harrell's comment in the
>>> thread above).
> Bryan's original post started me thinking about this, but I didn't
> respond. I'd classify myself as an 'S4' 'expert', with my ignorance of
> S3 obvious from Duncan's corrections to my earlier post. It's hard for
> me to make a comparative statement about S3 vs. S4, and hard really to
> know what is 'hard' for someone new to S4, to R, to programming, ... I
> would have classified most of the responses in that thread as coming
> from 'S3' 'experts'.
>>> As a concrete example (concrete for us non-programmers,
>>> non-statisticians), I recently decided that I wanted to add a
>>> descriptive piece of text to a number of my plots, and it made sense
>>> to include the text with the object. So I just added a list element
>>> to the existing S3 object, e.g. Myobject$descrip. No further work was
>>> necessary; I could use it right away. If instead I had made Myobject
>>> an S4 object, then I would have to go back, redefine the object,
>>> update validObject, and possibly write some new accessor and
>>> definitely constructor functions. At least, that's how I understand
>>> the way one uses S4 classes.
> This is a variant of Gabor's comment, I guess, that it's easy to modify
> S3 on an as-needed basis. In S3, forgoing any pretext of 'best
> practices', one might
> s3 <- structure(list(x=1:10, y=10:1), class="MyS3Object")
> ## some lines of code...
> if (aTest)
>     s3$discraption <- "A description"
> (either 'description' or 'discraption' is a typo, uncaught by S3).
> In S4 I'd have to change my class definition from
> setClass("MyS4Object", representation(x="numeric", y="numeric"))
> to
> setClass("MyS4Object", representation(x="numeric", y="numeric",
>                                       description="character"))
> but the body of the code would look surprisingly similar
> s4 <- new("MyS4Object", x=1:10, y=10:1)
> ## some lines of code...
> if (aTest)
>     s4@description <- "A description"
> (no typo, because I'd have been told that the slot 'discraption' didn't
> exist). In the S3 case the (implicit) class definition is a single line,
> perhaps nested deep inside a function. In S4 the class definition is in
> a single location.
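
A minimal sketch of the contrast described above, reusing the class names from the message. The misspelled name 'discraption' is deliberate, and the tryCatch() wrapper and the variable typo_result are additions so that the example runs to completion:

```r
library(methods)

# S3: a misspelled element name is silently accepted as a new list element.
s3 <- structure(list(x = 1:10, y = 10:1), class = "MyS3Object")
s3$discraption <- "A description"
names(s3)                 # now includes "discraption"
is.null(s3$description)   # TRUE: the intended element was never set

# S4: slots are fixed by the class definition, so the same typo is an error.
setClass("MyS4Object",
         representation(x = "numeric", y = "numeric",
                        description = "character"))
s4 <- new("MyS4Object", x = 1:10, y = 10:1)
typo_result <- tryCatch({
    s4@discraption <- "A description"   # no such slot
    "assigned"
}, error = function(e) "error")
typo_result               # "error"
```
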
> Best practices might make me want to have a validity method (x and y the
> same dimensions? 'description' of length 1?), to use a constructor and
> accessors (to provide an abstraction to separate the interface from its
> implementation), etc., but those issues are about best practices.
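
The best practices mentioned above could look roughly like the following sketch. The class name MyS4Checked, the constructor of the same name, and the accessor description() are hypothetical names chosen for illustration, and checking lengths is only one possible reading of "the same dimensions":

```r
library(methods)

setClass("MyS4Checked",
         representation(x = "numeric", y = "numeric",
                        description = "character"),
         prototype(description = NA_character_),
         validity = function(object) {
             msgs <- character()
             if (length(object@x) != length(object@y))
                 msgs <- c(msgs, "'x' and 'y' must have the same length")
             if (length(object@description) != 1L)
                 msgs <- c(msgs, "'description' must have length 1")
             # A validity function returns TRUE or a character vector of problems.
             if (length(msgs)) msgs else TRUE
         })

# Constructor: the only place that calls new() directly.
MyS4Checked <- function(x, y, description = NA_character_) {
    new("MyS4Checked", x = x, y = y, description = description)
}

# Accessor: callers use description(obj) rather than obj@description.
setGeneric("description", function(object) standardGeneric("description"))
setMethod("description", "MyS4Checked", function(object) object@description)

ok <- MyS4Checked(1:3, 3:1, "A description")
description(ok)

# new() runs the validity method when slots are supplied, so this fails.
bad <- tryCatch(MyS4Checked(1:3, 1:2), error = function(e) "invalid")
```
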
> A downstream consequence is that s4 always has a 'description' slot
> (perhaps initialized with an appropriate default in the 'prototype'
> argument of setClass, but that's more advanced), whereas s3 only
> sometimes has 'description'. So I'm forced to check
> is.null(s3$description) whenever I'm expecting a character vector.
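
A small sketch of that downstream consequence. The class name MyS4WithDefault and the choice of NA_character_ as the default are assumptions made for illustration:

```r
library(methods)

setClass("MyS4WithDefault",
         representation(x = "numeric", y = "numeric",
                        description = "character"),
         prototype(description = NA_character_))

# The slot always exists and always holds a character vector.
s4 <- new("MyS4WithDefault", x = 1:10, y = 10:1)
is.character(s4@description)   # TRUE

# The S3 object only has 'description' if some earlier code added it.
s3 <- structure(list(x = 1:10, y = 10:1), class = "MyS3Object")
is.null(s3$description)        # TRUE
```
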
>> It doesn't stop there: If you keep the same name for your
>> redefined S4 class, I don't know what happens when you try to access
>> stored objects of that class created before the change, but it might not
>> be pretty. If you give your redefined S4 class a different name, then
> Actually, the old object is loaded in R. It is not valid
> (validObject(originalS4) would complain about 'slots in class definition
> not in object'). One might write an 'updateObject' generic and method
> that detects and corrects this. This contrasts with S3, where there is
> no knowing whether the object is consistent with the current (implicit)
> class definition.
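
One way to sketch this, under stated assumptions: the class MyVersioned is hypothetical, redefining it within one session stands in for loading an object saved before the change, and updateObject is defined here from scratch rather than taken from any package:

```r
library(methods)

# Version 1 of a hypothetical class, and an object created from it.
setClass("MyVersioned", representation(x = "numeric"))
old <- new("MyVersioned", x = 1:3)

# Version 2 adds a slot; 'old' now predates the class definition.
setClass("MyVersioned",
         representation(x = "numeric", description = "character"))

# The old object still exists, but validObject() reports the missing slot.
old_status <- tryCatch(validObject(old),
                       error = function(e) conditionMessage(e))

# A generic and method that detect and correct the outdated representation.
setGeneric("updateObject",
           function(object, ...) standardGeneric("updateObject"))
setMethod("updateObject", "MyVersioned", function(object, ...) {
    if (.hasSlot(object, "description"))
        return(object)                      # already current
    new("MyVersioned", x = object@x, description = NA_character_)
})

updated <- updateObject(old)
validObject(updated)                        # no error
```
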
>> you have a lot more code to change before you can use the redefined
>> class like you want.
> For slot addition, this is not true -- old code works fine. For slot
> removal / renaming, this is analogous to S3 -- code needs reworking; use
> of accessors might help isolate code using the class from the
> implementation of the class.
> A couple of comments on Duncan's
> S3Foo <- function(x=numeric(), y=numeric()) {
>     structure(list(x=as.numeric(x), y=as.numeric(y)), class="S3Foo")
> }
> I used makeS3Foo to emphasize that it was a constructor, but in my own
> code I use S3Foo(). Realizing that, as Henrik has now also pointed out,
> I'm far from perfect, the use of as.numeric() combines validity checking
> and coercion, which I think is not usually a good thing (even when
> efficient). In particular this
> as.numeric(factor(c("one", "two", "three")))
> might unintentionally propagate earlier mistakes, e.g., after read.table
> converts characters to factors behind the unexpecting user's back.
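
To make the concern concrete, here is a short sketch. The constructor S3FooStrict is a hypothetical alternative that checks its inputs instead of coercing them; it is not code from the thread:

```r
# as.numeric() on a factor returns the integer codes of the sorted levels
# ("one", "three", "two"), not anything related to the labels' meaning.
f <- factor(c("one", "two", "three"))
codes <- as.numeric(f)
codes                      # 1 3 2

# Hypothetical stricter constructor: validate, do not coerce.
S3FooStrict <- function(x = numeric(), y = numeric()) {
    stopifnot(is.numeric(x), is.numeric(y))
    structure(list(x = x, y = y), class = "S3FooStrict")
}

# A factor is rejected instead of being silently turned into codes.
strict_result <- tryCatch(S3FooStrict(f, 1:3), error = function(e) "rejected")
strict_result              # "rejected"
```
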
> Martin
Very, very well put, Martin!
As another S4 lover and expert (who still uses S3 for older or very simple
projects), I do wholeheartedly agree with Martin's statements,
notably his points about the partially implied consistency of S4
classes, and the point that adding a slot to an S4 class is very
comparable in work to adding an informal element to an (always only
informal) S3 class object.
Martin Maechler, ETH Zurich.
>> By contrast, with S3, if you have any code that tests the number of
>> components in a list, that will have to be changed.
>>
>> Spencer
>>> Back to trying to get something done! Bryan
>>> *************
>>> Bryan Hanson
>>> Professor of Chemistry & Biochemistry
>>> DePauw University, Greencastle IN USA
>>>
>>>
>>>
>>>
>>>
>>> On 9/1/09 6:16 AM, "Duncan Murdoch" <murdoch at stats.uwo.ca> wrote:
>>>
>>>
>>>> Corrado wrote:
>>>>
>>>>> Thanks Duncan, Spencer,
>>>>>
>>>>> To clarify, the situation is:
>>>>>
>>>>> 1) I have no reason to choose S3 over S4 or vice versa, or any
>>>>> other coding convention
>>>>> 2) Our group has not done any OO development in R and I would be the
>>>>> first, so I can set up the standards
>>>>> 3) I am starting from scratch with a new package, so I do not have
>>>>> any code I need to re-use.
>>>>> 4) I am an R OO newbie, so I can learn from the beginning whatever
>>>>> is better and good for me.
>>>>>
>>>>> So the questions would be two:
>>>>>
>>>>> 1) What coding style guide should we / I follow? Is the google style
>>>>> guide
>>>>> good, or is there something better / more prescriptive which makes our
>>>>> research group life easier?
>>>>>
>>>> I don't think I can answer that. I'd recommend planning to spend some
>>>> serious time on the decision, and then go by your personal impression.
>>>> S4 is definitely harder to learn but richer, so don't make the decision
>>>> too quickly. Take a look at John Chambers's new book, try small projects
>>>> in each style, etc.
>>>>
>>>>
>>>>> 2) What class type should I use? From what you two say, I should use S3
>>>>> because it is easier to use .... what are the disadvantages? Is there a
>>>>> table of advantages / disadvantages for S3 and S4 classes?
>>>>>
>>>> S3 is much more limited than S4. It dispatches on just one argument, S4
>>>> can dispatch on several. S3 allows you to declare things to be of a
>>>> certain class with no checks that anything will actually work; S4 makes
>>>> it easier to be sure that if you say something is of a certain class, it
>>>> really is. S4 hides more under the hood: if you understand how regular
>>>> R functions work, learning S3 is easy, but there's still a lot to learn
>>>> before you'll be able to use S4 properly.
>>>>
>>>> Duncan Murdoch
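
A brief sketch of the two differences described above, dispatch and class checking. All names here (describe, pairUp, S3Thing, S4Thing) are hypothetical and chosen only for illustration:

```r
library(methods)

# S3 dispatches on the class of one argument (the first, by default).
describe <- function(x, ...) UseMethod("describe")
describe.default <- function(x, ...) "something else"
describe.S3Thing <- function(x, ...) "an S3Thing"

# S3 accepts any class label, with no check on the object's contents.
fake <- structure(list(), class = "S3Thing")
s3_label <- describe(fake)               # "an S3Thing"

# S4 checks slot types when the object is created.
setClass("S4Thing", representation(x = "numeric"))
s4_check <- tryCatch(new("S4Thing", x = "not a number"),
                     error = function(e) "rejected")

# S4 can dispatch on several arguments at once.
setGeneric("pairUp", function(a, b) standardGeneric("pairUp"))
setMethod("pairUp", signature("numeric", "character"),
          function(a, b) "numeric, character")
setMethod("pairUp", signature("character", "numeric"),
          function(a, b) "character, numeric")

first  <- pairUp(1, "a")                 # "numeric, character"
second <- pairUp("a", 1)                 # "character, numeric"
```
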