[R] pass by reference

Ben Tupper btupper at bigelow.org
Tue Aug 14 17:26:17 CEST 2012


Hi,

On Aug 14, 2012, at 10:07 AM, Bert Gunter wrote:

> (Offlist, as my comments are not worth bothering the list about).
> 

Almost off list!

> I don't understand the purpose of this tirade (whose reasonableness I
> make no judgment of). R is what it is. If you don't like it for
> whatever reason, don't use it.
> 
> As a point of order, there are several packages that "automate" pass
> by reference/pointers in R to some extent: packages ref, R.oo, and
> proto are 3 that I know of, but I think there are others. My
> understanding is that this tends to be computationally inefficient in
> R, but I have no direct knowledge.
> 

I wonder why Reference Classes are not mentioned. I think they are also informally called R5 or RefClass. You can learn about them with ?setRefClass.

I tried this approach quite recently while working with very large data frames. I'm no software engineer, so I am not sure whether the R5 style was a significant help with my problem, but with one exception it was pretty painless to give it a shake.* In fact, inside the methods (and therefore inside the object instance's environment?) I was probably still working in the pass-by-value paradigm.
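
To make the reference behaviour concrete, here is a minimal sketch. The class name Counter, its field n, and the helpers bump_ref() and bump_list() are made up for illustration and are not taken from my real code. The idea is that a function handed a Reference Class object can change it in place, a plain list is only changed locally, and $copy() gives an independent object.

```r
library(methods)

# Hypothetical class used only for illustration
Counter <- setRefClass("Counter", fields = list(n = "numeric"))

# Modifies the object it is handed; nothing needs to be returned
bump_ref <- function(obj) {
   obj$n <- obj$n + 1
   invisible(NULL)
}

# The same idea with a plain list changes only the local copy
bump_list <- function(lst) {
   lst$n <- lst$n + 1
   invisible(NULL)
}

cnt <- Counter$new(n = 0)
bump_ref(cnt)
cnt$n                  # 1: the caller's object was changed in place

lst <- list(n = 0)
bump_list(lst)
lst$n                  # 0: the caller's list is untouched

alias <- cnt           # a second name for the same object
indep <- cnt$copy()    # an independent copy
cnt$n <- 10
alias$n                # 10
indep$n                # 1
```

Plain assignment (alias <- cnt) only creates another name for the same object, so $copy() is the thing to reach for when an independent object is wanted.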

I'm hoping that those in the know could shed some light on the pros and cons of using Reference Classes.

Cheers,
Ben

* There are two ways to add methods to a reference class: (1) in the call to setRefClass(), which creates the generator object, and (2) using the generator's $methods() function (MyRefClassR5$methods() in the example) after the generator object is created. After some puzzle-filled afternoons I settled on the latter as being waaaaay better. I have pasted an example below.

##### START

# generator
MyRefClassR5 <- setRefClass("MyRefClassR5",

   # here are the field properties
   fields = list(
      x = "numeric",
      y = "numeric",
      color = "character",
      flavor = "character"),
      
   # here we can add methods - but there is a handier way 
   # see the length() method added below
   methods = list(
      plot = function(color = .self$color, flavor = .self$flavor, pch = 15, ...){
         graphics::plot(.self$x, .self$y, col = color, main = flavor, pch = pch, ...)
         speak("plotting")
      },
      
      speak = function(message = paste(.self$flavor, "is the color", .self$color)){
         cat("MyRefClass:", message, "\n")
      })
   )

MyRefClass <- function(x = seq(from = 0, to = 10), y = x^2,
   color = "brown", flavor = "chocolate"){

   X <- MyRefClassR5$new(x = x, y = y,
      color = color, flavor = flavor)

   return(X)
}
 
# create an instance, a
a <- MyRefClass()
a$speak()
a$plot()
 
# Now I have an instance, a, of MyRefClassR5.
# Suppose I want to add a new method called length.
# Adding it through the generator's $methods() is handier because the
# method also becomes available to existing instances of the class. If I
# instead added it to the setRefClass() call above and re-ran that call,
# later instances would have a length() method, but object a would be
# orphaned without it.
MyRefClassR5$methods(
   length = function(){
      len <- c(x = base::length(.self$x), y = base::length(.self$y))
      s <- paste("length x =", len["x"], " length y =", len["y"])
      speak(s)
      return(len)
   })
   
# can I use this method on instance a?  Yup!
a$length()


#### END



> As you presumably already know, you can also implement this manually
> through the use of environments and S3 or S4 semantics. You might also
> be interested in Luke Tierney's comments on references in R:
> 
> http://homepage.stat.uiowa.edu/~luke/R/references.html
> 
> Cheers,
> Bert
> 
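
For what it's worth, the manual route Bert mentions might look roughly like the sketch below: an environment holds the state and an S3 class attribute gives it methods. The names new_account(), deposit() and the balance field are invented here for illustration, so treat this as one possible arrangement rather than the canonical one.

```r
# Constructor: the environment holds the mutable state
new_account <- function(balance = 0) {
   self <- new.env(parent = emptyenv())
   self$balance <- balance
   class(self) <- "account"
   self
}

# S3 generic and method; the method changes the environment in place
deposit <- function(acct, amount) UseMethod("deposit")
deposit.account <- function(acct, amount) {
   acct$balance <- acct$balance + amount
   invisible(acct)
}

print.account <- function(x, ...) {
   cat("<account> balance:", x$balance, "\n")
   invisible(x)
}

acct <- new_account(100)
deposit(acct, 50)      # no reassignment needed
acct$balance           # 150
```

Because environments are not copied when they are passed to a function, deposit() updates the caller's object directly, which is the same behaviour Reference Classes wrap up more formally.
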
> On Tue, Aug 14, 2012 at 2:07 AM, Jan T Kim <jttkim at googlemail.com> wrote:
>> On Mon, Aug 13, 2012 at 11:20:26PM -0300, Alexandre Aguiar wrote:
>>> Sachinthaka Abeywardana <sachin.abeywardana at gmail.com> wrote:
>>>> Think you are missing the point,
>>> 
>>> As lover of C-style pointers, I must admit that hiding complexities
>>> (and associated problems) of pointers is a great feature of all successful
>>> high level languages (HLLs). As much as they spare time and can be easily
>>> learned by non-programmers, they impose penalties in performance and
>>> memory consumption.
>>> 
>>> Most drawbacks of HLLs have been effectively and efficiently addressed
>>> by a number of strategies in such a way that currently we have a wide
>>> variety of options. Languages have become tools to solve problems and,
>>> as such, we must pick the proper tool for each problem. That means the
>>> very first step is properly assessing the problem.
>> 
>> yes, I quite agree. In my experience, the root of the trouble leading
>> to "how can I pass things by reference in R" requests is that many problems
>> involve objects that retain their identity while their attributes change in
>> a dynamic way. These are represented more adequately by having multiple
>> functions changing state of the same (identical!) object, rather than
>> approximating this by repeatedly replacing an object with an updated
>> copy of itself, or by using other "hacks".
>> 
>> The overheads / inefficiencies that come with such hacks are really just
>> a consequence of an inadequate representation of the problem (i.e. "as
>> though one had not assessed it properly", if you will). As a further
>> indication that this is a design issue rather than one of optimisation
>> at the implementation level, notice that from a database perspective,
>> holding multiple copies that represent the same thing in memory amounts
>> to a denormalised design.
>> 
>> Personally, I understand functional programming evangelists who object
>> to state and side effects because this "purism" is adequate (and also
>> often very elegant) for enabling parallel and distributed computing. But
>> noticing that this is not much of an issue for R (which e.g. doesn't
>> support multithreading much), I do think on a regular basis that
>> providing a mechanism enabling multiple references to one instance
>> would be an improvement that would not do too much damage.
>> 
>> As it is, I've resigned to using other languages where I need an object
>> graph, and producing a first normal form type of table where I want to
>> do something in R. But I do get the feeling that I'm doing something
>> not quite right, and I frequently reiterate the problem analysis outlined
>> above to myself in order to put that funny feeling behind me.
>> 
>> Just my 2 pence, Jan
>> 
>>> 
>>> My 2 cents.
>>> 
>>> 
>>> --
>>> 
>>> Alexandre Aguiar, MD SCT
>>> SPS Consultoria
>>> 
>>> --
>>> Sent from my tablet. Please, excuse my brevity.
>>> Enviado do tablet. Por favor, perdoe a brevidade.
>>> Publié de le tablet. S'il vous plaît pardonnez la brièveté.
>>> Veröffentlicht aus dem Tablet. Bitte verzeihen Sie die Kürze.
>>> Enviado desde mi tablet. Por favor, disculpen mi brevedad.
>>> Inviato dal mio tablet. Per favore, scusate la mia brevità.
>>> 
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>> 
>> --
>> +- Jan T. Kim -------------------------------------------------------+
>> |             email: jttkim at gmail.com                                |
>> |             WWW:   http://www.jtkim.dreamhosters.com/              |
>> *-----=<  hierarchical systems are for files, not for humans  >=-----*
>> 
> 
> 
> 
> -- 
> 
> Bert Gunter
> Genentech Nonclinical Biostatistics
> 
> Internal Contact Info:
> Phone: 467-7374
> Website:
> http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
> 

Ben Tupper
Bigelow Laboratory for Ocean Sciences
180 McKown Point Rd. P.O. Box 475
West Boothbay Harbor, Maine   04575-0475 
http://www.bigelow.org


