[R] variable scope

R. Michael Weylandt michael.weylandt at gmail.com
Wed Aug 29 00:13:21 CEST 2012


On Tue, Aug 28, 2012 at 4:55 PM, Sam Steingold <sds at gnu.org> wrote:
>> * R. Michael Weylandt <michael.weylandt at gmail.com> [2012-08-28 13:45:35 -0500]:
>>
>>> always you shouldn't need manual garbage collection.
>
> my observation is that gc in R sucks.
> (it cannot release small objects).
> this is not specific to R; ocaml suffers too.

That may be (I don't know enough about garbage collectors to say one way
or another), but if I remember correctly, allocation itself triggers gc,
so manual triggering shouldn't matter much. In my experience, the one
time I've needed it was after freeing multiple very large objects when
hitting memory limits. Rewriting that code to use functions rather than
one long imperative slog was a real performance win (the large
temporaries go out of scope when each function returns).
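
A minimal sketch of that kind of refactoring. The name summarise_big and
the 1e6-element vector are hypothetical stand-ins for the real code and
its very large objects:

```r
# Hypothetical stand-in for a step that needs a large temporary object.
summarise_big <- function(n) {
  big <- rnorm(n)   # large temporary, local to this call
  mean(big)         # only the small summary is returned
}

result <- summarise_big(1e6)

# 'big' was never created in the calling environment, so there is nothing
# to rm(); its memory becomes collectable once the call has returned.
big_is_visible <- exists("big", inherits = FALSE)

# A manual collection is still possible, e.g. right after dropping several
# large objects near a memory limit. gc() returns a matrix of memory usage.
usage <- gc()
```

In the one-long-script version, every intermediate object stays reachable
until it is explicitly removed with rm().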

Note that if you are compiling locally, you can modify the gc
parameters (frequency of various sweeps) for different performance
characteristics -- grep src/main/memory.c for "Tuning Constants" and
the lines that follow.

>
>> since a loop doesn't define its own scope like some languages (a
>> practice that always seemed strange to me),
>
> every level of indentation has its own scope.
> seems reasonable.

I guess I see something like

for(i in 1:2){
   f(i)
}

as little more than shorthand for

f(1)
f(2)

rather than as something "more meaningful". I suppose you're thinking
of something like Ruby blocks here? Scope-wise, those correspond more
closely to anonymous functions in my mind.
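
The difference in scoping can be sketched as follows (loop_tmp and fn_tmp
are hypothetical names used only for illustration):

```r
# A for loop runs in the enclosing environment: the loop variable and
# anything assigned in the body are still there after the loop ends.
for (i in 1:2) {
  loop_tmp <- i * 10
}
after_loop <- c(i = i, loop_tmp = loop_tmp)

# A function body gets a fresh environment on every call, so 'fn_tmp'
# never leaks into the caller.
invisible(lapply(1:2, function(j) {
  fn_tmp <- j * 10
  fn_tmp
}))
fn_tmp_leaked <- exists("fn_tmp", inherits = FALSE)
```

Here after_loop still holds the values from the last iteration, while
fn_tmp_leaked is FALSE.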

Certainly, "to each his own" applies here.

>
>> The other answer is to use functions / apply statements like the good
>> lord and John Chambers intended :-)
>
> so explicit loops are "deprecated" in some sense?

No, not when they are really necessary (truly iterative computation),
but it's generally considered clearer and more idiomatic to use
higher-order functions like *apply (which I suppose is really just Map
by another name) for brevity. Because *apply calls a function for each
element, you get the "new scope" benefits for free. Looking forward,
*apply calls whose function has no side effects are much easier to
parallelize than loops. (In fact, the parallel package available with
R >= 2.14 uses that exact abstraction.)
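
A small sketch of the three forms side by side. The worker function
square is a hypothetical example, and mc.cores = 1L is an assumption
chosen so the snippet also runs where forking is unavailable:

```r
library(parallel)  # ships with R >= 2.14

square <- function(k) k^2   # hypothetical stateless worker

# Explicit loop: preallocate the result, then fill it by index.
out_loop <- numeric(4)
for (k in 1:4) out_loop[k] <- square(k)

# The same computation as a higher-order call.
out_apply <- sapply(1:4, square)

# Parallel counterpart with the same shape. mc.cores = 1L keeps this
# portable (forking is unavailable on Windows); raise it on Unix-alikes.
out_par <- unlist(mclapply(1:4, square, mc.cores = 1L))
```

Since square does not depend on state shared between iterations, moving
from sapply to mclapply only changes the function being called.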

As a note, some folks worry about function calls in R: they do have
some cost, but R uses copy-on-write plus lazy evaluation for function
arguments, so

x <- seq_len(1e7)
y <- 3

f <- function(a,b) print(b)

f(x,y)

doesn't actually copy x.

It's an implementation detail that I'm not sure is documented anywhere
(I may in fact just be making it up), but it makes some folks feel
better about using many small functions.
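
The lazy-evaluation half of this can be checked with a small sketch. The
helper make_x and the flag evaluated are hypothetical, and f returns b
instead of printing it so the result can be inspected:

```r
evaluated <- FALSE

# Hypothetical helper: flips the flag if its result is ever requested.
make_x <- function() {
  evaluated <<- TRUE
  seq_len(1e7)
}

f <- function(a, b) b   # like the f above, it never touches 'a'

res <- f(make_x(), 3)
# 'a' stayed an unforced promise, so make_x() never ran and the large
# vector was never even created.
```

After the call, evaluated is still FALSE. When an argument is used but
only read, R likewise does not duplicate it; a copy is made only if the
function modifies it.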

Cheers,
Michael

>
> thanks for your kind and informative reply!
> --
> Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
> http://www.childpsy.net/ http://ffii.org http://mideasttruth.com
> http://think-israel.org http://pmw.org.il http://honestreporting.com
> Computers are like air conditioners: they don't work with open windows!



