[R] S4 vs Reference Classes
Joseph Park
jpark.us at att.net
Wed Sep 14 21:56:58 CEST 2011
Gentlemen: Steve, Martin & Doug:
Thanks for the insightful comments regarding my query.
Martin and Doug have assessed my position well; both offer
useful advice and have greatly improved my limited
understanding of S4. Thanks!
At this point, I'm well into the app via S4, and so will
probably continue on. If the app finds wings, then I'll
convert it to Reference Classes.
Generally, my problem with S4 in an OO paradigm is that I
need to add (what I consider) extra code in the main app
environment to update object slots. As Doug points out:
"If you try to perform some kind of
update operation on an S4 object and not cheat in some way (i.e.
adhere to strict functional programming semantics) you need to create
a new instance of the object each time you update it."
which is my issue. Without the reference-based approach, an object
that is stored in a slot of another object is a copy, not a reference.
An update to the original object then requires 'extra' code
to update/synchronize the copy held in the slot.
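The synchronization issue described above can be reproduced with a minimal sketch. The classes Part and Holder are hypothetical names chosen for illustration; they are not taken from the app under discussion.

```r
library(methods)

# Hypothetical classes, for illustration only.
setClass("Part", representation(value = "numeric"))
setClass("Holder", representation(part = "Part"))

p <- new("Part", value = 1)
h <- new("Holder", part = p)   # h@part behaves as a copy of p

p@value <- 2                   # changes p only
before <- h@part@value         # still 1: the slot did not follow p

h@part <- p                    # the 'extra' synchronization step
after <- h@part@value          # now 2
```

With Reference Classes the slot would refer to the same object, so the last assignment would not be needed.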
This is not a complaint! I find R quite amazing and powerful.
Next time I'll dive into the Reference Class methods, or perhaps
as suggested, hybridize the current app.
On 9/14/2011 12:02 PM, Martin Morgan wrote:
On 09/14/2011 06:01 AM, Joseph Park wrote:
Thanks Martin.
What i'm hoping to do is have a class object, with a member method
that can change values of slots in the object, without having to
assign values by external assignment to the object. Something like this:
    setClass ( "Element",
               representation ( x = "numeric", y = "numeric" ),
               prototype = list( x = 0, y = 1 )
    )
    setGeneric( name = "ComputeX",
                def = function( self ) standardGeneric("ComputeX") )
    setMethod( "ComputeX", signature = "Element",
               function ( self ) {
                   if ( self @ y > 0 ) {
                       self @ x = pi
                   }
               }
    )
so that a call to the method ComputeX assigns ('internally') a
value to the slot x of the global object.
Hi Joseph --
I understand. In R generally and in S4 in particular self@x = pi triggers
a 'copy-on-change', so self inside the function is now different from self
outside the function.
You either need to change your expectations, or use reference classes (and
change the expectations of your users).
For completeness, in your function above you would return self, and have
    elt = ComputeX(elt)
You'd also likely implement some 'accessor' X (or a better name) so that
    X(elt)
gets x, and there is no direct call to @ in your code.
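A sketch of that functional version, reusing the Element class from the question. The accessor name X is only the placeholder used above, and this is one possible arrangement rather than the definitive one.

```r
library(methods)

setClass("Element",
         representation(x = "numeric", y = "numeric"),
         prototype = list(x = 0, y = 1))

setGeneric("ComputeX", function(self) standardGeneric("ComputeX"))
setMethod("ComputeX", signature = "Element",
          function(self) {
              if (self@y > 0) {
                  self@x <- pi   # modifies the local copy only
              }
              self               # return the modified copy to the caller
          })

# Accessor, so that calling code never touches @ directly.
setGeneric("X", function(self) standardGeneric("X"))
setMethod("X", "Element", function(self) self@x)

elt <- new("Element")
elt <- ComputeX(elt)   # re-assign to keep the update
X(elt)
```

Without the re-assignment in `elt <- ComputeX(elt)`, the global elt would keep x = 0.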
It might help to understand a real use case; if it's just 'that's the way
other programming languages do it' then there isn't much more to discuss.
But maybe, like Doug Bates, you have a particular problem with the
paradigm?
One can do:
    a = new( 'Element' )
    a @ x = 2
but i would prefer to have a class method do the work without
having to explicitly call a @ x = 2. Having to do this means that
i need code in my main processing app that does things on slots
that normally i would do in a class method.
As I understand it, Reference Classes provide this. So i'm
naturally wondering if i should switch my app from S4 to RC.
Fundamentally, I don't clearly understand S4 and what the difference
is between creating a setReplaceMethod vs a setMethod, since it
seems that in either case one has to 'externally' assign the slot
value. My limitation, of course.
At some level they are differences in syntax only, e.g.,
    slt(a) = 2
versus
    setGeneric("updt", function(x, value, ...) standardGeneric("updt"))
    setMethod("updt", c("A", "numeric"), function(x, value, ...) {
        initialize(x, slt=value)
    })
and then
    a = updt(a, 3)
The 'updt' model easily extends to multiple arguments; both represent an
abstraction between the API seen by the user, and the implementation of
the class, so there's no reason to store '3' directly.
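As a rough illustration of how the 'updt' model might extend to multiple arguments, here is a sketch using a hypothetical two-slot class Range (not part of the original thread). The argument names lo and hi are assumptions made for the example.

```r
library(methods)

# Hypothetical class with two slots, for illustration only.
setClass("Range",
         representation(lo = "numeric", hi = "numeric"),
         prototype = list(lo = 0, hi = 1))

setGeneric("updt", function(x, ...) standardGeneric("updt"))

# Arguments default to the current slot values, so callers
# can update one slot or both in a single call.
setMethod("updt", "Range", function(x, lo = x@lo, hi = x@hi, ...) {
    initialize(x, lo = lo, hi = hi)
})

r <- new("Range")
r <- updt(r, lo = 2, hi = 5)   # update both slots at once
r <- updt(r, hi = 10)          # update one slot, keep the other
```

The user-facing arguments do not have to match slot names; the method is free to translate them into whatever the class stores.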
Martin
On 9/14/2011 12:17 AM, Martin Morgan wrote:
On 09/13/2011 10:54 AM, Joseph Park wrote:
Hi, I'm looking for some guidance on whether to use
S4 or Reference Classes for an analysis application
I'm developing.
I'm a C++/Python developer, and like to 'think' in OOD.
I started my app with S4, thinking that was the best
set of OO features in R. However, it appears that one
needs Reference Classes to allow object methods to assign
values (other than the .Object in the initialize method)
to slots of the object.
With
setClass("A", representation=representation(slt="numeric"))
a slot can be updated with @<- and an object updated with a
replacement method
    setGeneric("slt<-", function(x, ..., value) standardGeneric("slt<-"))
    setReplaceMethod("slt", c("A", "numeric"), function(x, ..., value) {
        x@slt <- value
        x
    })
so
> a = new("A", slt=1)
> slt(a) = 2
> a
An object of class "A"
Slot "slt":
[1] 2
The default initialize method also works as a copy constructor with
validity check, e.g., allowing multiple slot updates
    setReplaceMethod("slt", c("A", "ANY"), function(x, ..., value) {
        initialize(x, slt=as.numeric(value))
    })
> slt(a) = "1"
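To make the validity check visible, here is a sketch in which class "A" is given a validity function. That validity rule is an assumption added for illustration; the class as defined above has none.

```r
library(methods)

# Assumption: a validity rule added to class "A" for illustration.
setClass("A", representation(slt = "numeric"),
         validity = function(object) {
             if (length(object@slt) != 1L) "slt must have length 1" else TRUE
         })

setGeneric("slt<-", function(x, ..., value) standardGeneric("slt<-"))
setReplaceMethod("slt", c("A", "ANY"), function(x, ..., value) {
    # initialize() copies x, sets the slot, then runs validObject()
    initialize(x, slt = as.numeric(value))
})

a <- new("A", slt = 1)
slt(a) <- "3"                                  # coerced to numeric, then validated
res <- try(slt(a) <- c(1, 2), silent = TRUE)   # rejected by the validity check
failed <- inherits(res, "try-error")
a@slt                                          # unchanged by the failed update
```

A direct `a@slt <- c(1, 2)` would check only the slot's class and would not run the validity function, which is one reason to route updates through initialize().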
This is typically what I prefer: creating an object, then
operating on the object (reference) calling object methods
to access/modify slots.
So I'm wondering what (dis)advantages there are in
developing with S4 vs Reference Classes.
R's copy-on-change semantics leads me to expect that
b = a
slt(a) = 2
leaves b unchanged, which is what S4 does (necessarily copying, and thus
with a time and memory cost). A reference class might be
appropriate when the entity referred to exists in a single copy,
e.g., an on-disk database or an external pointer to a C++ class.
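For contrast with the copy semantics above, a minimal Reference Class sketch. The class name ElementRC and the method name computeX are hypothetical, loosely modeled on the Element example earlier in the thread.

```r
library(methods)

# Hypothetical Reference Class counterpart of "Element".
ElementRC <- setRefClass("ElementRC",
    fields = list(x = "numeric", y = "numeric"),
    methods = list(
        computeX = function() {
            if (y > 0) {
                x <<- pi          # modifies the field in place
            }
            invisible(.self)
        }
    ))

a <- ElementRC$new(x = 0, y = 1)
b <- a               # b refers to the same object, no copy is made
a$computeX()
b$x                  # changed as well, since a and b are one object

d <- a$copy()        # an explicit copy is needed to decouple
a$x <- 2
d$x                  # keeps the value it had when copied
```

Here `b = a` no longer leaves b unchanged, which is the expectation shift for users mentioned earlier in the thread.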
Martin
Things of interest:
Performance (i.e. memory management)
Integration compatibility with R packages
??? other issues
Thanks!
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.