[Rd] Plans to improve reference classes?
winston at rstudio.com
Tue Jun 23 18:08:08 CEST 2015
I can provide a little background on why particular choices were made for
R6. Generally speaking, speed is a primary consideration in making
decisions about the design of R6. The basic structure of R6 classes is
actually not so different from reference classes: an R6 object is an
environment. But many aspects of R6 objects are simpler.
R6 does support clean cross-package inheritance. The key design feature
that allows this is that methods have one environment that they are bound
in (this is where they can be found), and another environment that they are
enclosed in (roughly, this is where they run). The enclosing environment
points back to the binding environment with a binding named `self`. Methods
must access other members with `self$`, as in `self$foo`. I've found that
this requirement results in clearer code, because it's always clear when
you're accessing something that's part of the object.
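
To make the two environments concrete, here is a minimal sketch built by hand from plain base R environments. It is not R6's actual implementation; the names `obj`, `enclos`, and `add` are made up for illustration, and the parent of the enclosing environment is assumed to be the global environment.

```r
# Binding environment: this is the object itself; members are found here.
obj <- new.env()
obj$x <- 1

# Enclosing environment: its only binding is `self`, which points back
# to the binding environment. (Parent chosen arbitrarily for this sketch.)
enclos <- new.env(parent = globalenv())
enclos$self <- obj

# A method is bound in `obj` but enclosed in `enclos`.
add <- function(n = 1) {
  # Members are reached through `self$`; `x` itself is not a binding
  # in the enclosing environment.
  self$x <- self$x + n
  invisible(self)
}
environment(add) <- enclos
obj$add <- add

obj$add(10)
obj$x  # 11
```

Because `x` lives only in the binding environment, the method has no way to reach it except through `self`.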
When a class inherits from another class, the enclosing environment also
contains a binding named `super`, which points to an environment containing
methods from the superclass. These methods also have their own enclosing
environment, with a `self` that points back to the object's binding
environment.
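
The inheritance layout can be sketched the same way. This is a hand-built illustration under the same assumptions as before, not R6's real code; `super_env`, `super_enclos`, and the `greet` methods are hypothetical names.

```r
obj <- new.env()          # binding environment of the object
obj$name <- "obj"

# Superclass methods live in their own environment. They are enclosed in
# an environment whose `self` points back to the object.
super_enclos <- new.env(parent = globalenv())
super_enclos$self <- obj
super_env <- new.env()
greet_super <- function() paste("Hello from", self$name)
environment(greet_super) <- super_enclos
super_env$greet <- greet_super

# Subclass methods are enclosed in an environment with `self` and `super`.
enclos <- new.env(parent = globalenv())
enclos$self <- obj
enclos$super <- super_env
greet <- function() paste0(super$greet(), " (subclass)")
environment(greet) <- enclos
obj$greet <- greet

obj$greet()  # "Hello from obj (subclass)"
```

Note that the superclass method still sees the same object through its own `self`, which is what lets an overridden method call up the chain without losing track of the instance.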
I know this might be hard to picture from the description; I have some
diagrams drawn up which might help. See pages 1 and 4 from this document:
(The other pages show other features, like private members, and
non-portable R6 objects, which don't support clean cross-package
inheritance, and have a different structure.)
Regarding performance, R6 is fast relative to ref classes because it
doesn't do type checking for fields, and doesn't make use of S4. (There may
be other reasons as well, but I don't know the internals of ref classes
well enough to say much about it.) Accessing a member of an R6 object is
literally just accessing a binding in an environment, and that's a very
fast operation in R.
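
As a rough illustration of the difference, the sketch below contrasts a plain environment binding with a field that validates its type on assignment. The checked field uses base R's `makeActiveBinding()` purely as a stand-in; it does not claim to mirror how reference classes implement their field checks, and the names `plain`, `checked`, and `count` are hypothetical.

```r
# Plain binding, as in an R6 object: get and set are simple lookups,
# and nothing stops a value of a different type being assigned.
plain <- new.env()
plain$count <- 0L
plain$count <- "anything"

# A field with a type check on every assignment (illustrative only).
checked <- new.env()
local({
  value <- 0L
  makeActiveBinding("count", function(v) {
    if (missing(v)) return(value)   # read access
    stopifnot(is.integer(v))        # extra work on every write
    value <<- v
  }, checked)
})

checked$count <- 5L
res <- tryCatch(
  { checked$count <- "bad"; "accepted" },
  error = function(e) "rejected"
)
res            # "rejected"
checked$count  # 5
```

Every read or write of the checked field goes through a function call, whereas the plain binding does not.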
On Tue, Jun 23, 2015 at 10:06 AM, Hadley Wickham <h.wickham at gmail.com> wrote:
> > 1) Is there any example or writeup on the difficulties of extending
> > reference classes across packages? Just so I can fully understand the
> > issues.
> Here's a simple example:
> library(scales)  # DiscreteRange is defined in scales
> MyRange <- setRefClass("MyRange", contains = "DiscreteRange")
> a_range <- MyRange()
> a_range$train(1:10)
> # Error in a_range$train(1:10) : could not find function "train_discrete"
> where train_discrete() is a non-exported function of the scales
> package called by the train() method of DiscreteRange.
> There are also some notes about portable vs. non-portable R6 classes
> at http://cran.r-project.org/web/packages/R6/vignettes/Portable.html
> > 2) In what sorts of situations does the performance of reference
> > classes cause problems? Sure, it's an order of magnitude slower than
> > constructing a simple environment, but those timings are in
> > microseconds, so one would need a thousand objects before it started
> > to be noticeable. Some motivating use cases would help.
> It's a bit of a pathological case, but the switch from RefClasses to
> R6 made a noticeable performance improvement in shiny. It's hard to
> quantify the impact on an app, but the impact on the underlying
> reactive implementation was quite profound: http://rpubs.com/wch/27260
> vs http://rpubs.com/wch/27264
> R6 also includes a vignette with detailed benchmarking:
> I've added Winston to the thread since he's the expert.