[Rd] OOP performance, was: V2.9.0 changes
ggrothendieck at gmail.com
Fri Jul 3 02:29:04 CEST 2009
In terms of performance, if you want the fastest
performance in R, go with S3, and if you want
even faster performance, rewrite your inner loops
in C. All the other approaches will usually be slower.
Also S3 is simple, elegant and will result in less code
and take you much less time to design, program and
For 100% R code, particularly for simulations,
proto can sometimes be even faster than S3 written in pure R,
as proto supports hand optimizations that cannot readily
be done in other systems. (Unoptimized proto code would
be slower.) The key trick relies on its ability
to separate dispatching from calling, so that if method f and
object p are unchanged in a loop,
the loop can be rewritten as
f <- p$f; for(...) f(...)
Note that this still retains dynamic dispatch but
just factors it out of the loop. With S3 the best you could
get would be for(...) f.p(...), where f.p is the method of
generic f for class p, but this is really tantamount to not
using OO at all, since no dispatch is done.
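
As a rough illustration of the difference, here is a minimal sketch in
base R. It does not use proto itself: a plain environment stands in for
a proto object (an assumption made so the example runs without extra
packages), and the names advance, advance.p, obj and rate are
hypothetical. With proto, p$f would additionally bind p as the first
argument of f.

```r
# --- S3: the generic dispatches on every call ---
advance <- function(obj, x) UseMethod("advance")   # hypothetical generic
advance.p <- function(obj, x) x + obj$rate         # its method for class "p"
obj <- structure(list(rate = 0.5), class = "p")

x_s3 <- 0
for (i in 1:1000) x_s3 <- advance(obj, x_s3)       # dispatch on each iteration

x_direct <- 0
for (i in 1:1000) x_direct <- advance.p(obj, x_direct)  # method named by hand, no dispatch

# --- Environment standing in for a proto object (assumption, not proto) ---
p <- new.env()
p$rate <- 0.5
p$f <- function(x) x + p$rate                      # method stored in the object

f <- p$f                                           # look the method up once
x_hoisted <- 0
for (i in 1:1000) x_hoisted <- f(x_hoisted)        # plain function call in the loop

# All three loops compute the same result
stopifnot(identical(x_s3, x_direct), identical(x_s3, x_hoisted))
```

The hoisted version still finds f through the object at run time, so
replacing p$f before the loop changes what the loop calls, whereas the
advance.p version hard-codes the method name.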
On Thu, Jul 2, 2009 at 11:31 AM, Thomas Petzoldt
<Thomas.Petzoldt at tu-dresden.de> wrote:
> Hi Troy,
> first of all a question, what kind of ecosystem models are you
> developing in R? Differential equations or individual-based?
> You write that you are a frustrated Java developer in R. I have had a
> similar experience; however, I still like JAVA, and I'm now happier
> with R as it is much more efficient (i.e. sum(programming + runtime))
> for the things I usually do: ecological data analysis and modelling.
> After using functional R for quite some time, with Java in parallel,
> I had the same idea, to make R more JAVA-like and to model ecosystems in
> an object-oriented manner. At that time I took a look into R.oo (thanks
> Henrik Bengtsson) and was one of the co-authors of proto. I still think
> that R.oo is very good and that proto is a cool idea, but finally I
> switched to the recommended S4 for my ecological simulation package.
> Note also, that my solution was *not* to model the ecosystems as objects
> (habitat - populations- individuals), but instead to model ecological
> models (equations, inputs, parameters, time steps, outputs, ...).
> This works quite well with S4. A speed test (see useR!2006 poster on
> http://simecol.r-forge.r-project.org/) showed that all OOP flavours had
> quite comparable performance.
> The only things I have to keep in mind are a few rules:
> - avoid unnecessary copying of large objects. Sometimes it helps to
> prefer matrices over data frames.
> - use vectorization. For an individual-based model this means that one
> has to re-think how to model an individual: not "many [S4] objects"
> like in JAVA, but R structures (arrays, lists, data frames) that
> vectorized functions (e.g. arithmetic or subset) can work on.
> - avoid interpolation (i.e. approx) and if unavoidable, minimize the tables.
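
The vectorization rule above can be sketched roughly as follows. This
is only an illustration under assumed names: the state variables age
and weight, the growth factor 1.1 and the age limit of 10 are all
hypothetical and not taken from any particular model.

```r
set.seed(1)
n <- 1000
age <- sample(0:10, n, replace = TRUE)   # hypothetical state variable
weight <- runif(n, 1, 5)                 # hypothetical state variable

# Object-per-individual style: one record per individual, updated in a loop
inds <- lapply(seq_len(n), function(i) list(age = age[i], weight = weight[i]))
for (i in seq_along(inds)) {
  inds[[i]]$age <- inds[[i]]$age + 1
  inds[[i]]$weight <- inds[[i]]$weight * 1.1
}

# Vectorized style: the whole population is one matrix, one column per variable
pop <- cbind(age = age, weight = weight)
pop[, "age"] <- pop[, "age"] + 1
pop[, "weight"] <- pop[, "weight"] * 1.1

# Vectorized subsetting: keep individuals up to the assumed age limit
survivors <- pop[pop[, "age"] <= 10, , drop = FALSE]

# Both styles give the same updated state
loop_age <- vapply(inds, function(z) z$age, numeric(1))
loop_weight <- vapply(inds, function(z) z$weight, numeric(1))
stopifnot(isTRUE(all.equal(unname(pop[, "age"]), loop_age)),
          isTRUE(all.equal(unname(pop[, "weight"]), loop_weight)))
```

A numeric matrix is used here rather than a data frame, in line with the
first rule; a data frame would work the same way if the columns had
mixed types.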
> If all these things do not help, I write core functions in C (others use
> Fortran). This can be done in a mixed style, and even full C-to-C
> communication is possible (see the deSolve documentation for how to do
> this with differential equation models).
> Thomas P.
> Thomas Petzoldt
> Technische Universitaet Dresden
> Institut fuer Hydrobiologie thomas.petzoldt at tu-dresden.de
> 01062 Dresden http://tu-dresden.de/hydrobiologie/