[R] R as a programming language
Duncan Murdoch
murdoch at stats.uwo.ca
Wed Nov 7 14:13:14 CET 2007
On 11/7/2007 7:46 AM, Alexy Khrabrov wrote:
> Greetings -- coming from Python/Ruby perspective, I'm wondering about
> certain features of R as a programming language.
Lots of questions; I'll intersperse some answers.
>
> Say I have a huge table t of the form
>
> run ord unit words new
> 1 1 6939 1013 641
> 1 2 275 1001 518
> 1 3 3314 1008 488
> 1 4 14154 1018 463
> 1 5 2982 1006 421
>
> Alternatively, it may have a part column in front. For each run (in
> a part if present), I select ord and new columns as x and y and plot
> their functions in various ways. t is huge. So I want to select the
> subset to plot, as follows:
>
> t.xy <- function(t,part=NA,run=NA) {
> if (is.na(run)) {
> # TODO does this entail a full copy -- or how do we do references in R?
> r <- t
Semantically it acts as a full copy, though there is some internal
optimization that means the copy won't be made until necessary, i.e.
until one of r or t is modified.
There are some kinds of objects in R that are handled as references:
environments, external pointers, names, NULL. (I may have missed some.)
There are various kludges to expand this list to other kinds of objects,
the most common way being to wrap an object in an environment. But
there is a fond wish that people use R as a functional language and
avoid doing this.
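A minimal sketch of the distinction above, using made-up objects (t1, e1, e2) and only base R. An ordinary object behaves like a copy after assignment, while an environment is shared:

```r
# Ordinary objects: assignment behaves like a copy.
t1 <- data.frame(run = c(1, 1), new = c(641, 518))
r <- t1              # semantically a copy; duplication is deferred
r$new[1] <- 0        # modifying r leaves t1 untouched
t1$new[1]            # still 641

# Environments: assignment shares the same object.
e1 <- new.env()
assign("count", 1, envir = e1)
e2 <- e1             # e2 refers to the same environment as e1
assign("count", 2, envir = e2)
get("count", envir = e1)   # 2: the change is visible through e1
```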
> } else if (is.na(part)) {
> r <- t[t$run == run,]
> } else { # part present too
> r <- t[t$part == part & t$run == run,]
> }
> x <- r$ord
> y <- r$new
> xy.coords(x,y)
> }
>
> What I'm wondering about is whether r <-t will copy the complete t,
> and how do I minimize copying in R. I heard it's a functional
> language -- is there lazy evaluation in place here?
There is lazy evaluation of function arguments, but assignments trigger
evaluation of their RHS.
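To illustrate both halves of that statement, here is a small sketch; the function name pick and its arguments are invented for the example. The unused argument is never evaluated, whereas the right-hand side of an assignment is evaluated immediately:

```r
# Function arguments are promises: evaluated only when first used.
pick <- function(use_default, value) {
  if (use_default) "default" else value
}
res <- pick(TRUE, stop("never evaluated"))   # no error: 'value' is never touched

# An assignment evaluates its right-hand side at once.
err <- tryCatch({
  x <- stop("evaluated at once")
  "no error"
}, error = function(e) conditionMessage(e))
```

Here res ends up as "default", and err holds the error message because stop() ran as soon as the assignment was reached.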
>
> Additionally, tried to use --args command line arguments, and found a
> way only due to David Brahm -- who helped with several important R
> points (thanks Dave!):
>
> #!/bin/sh
> # graph a fertility run
> tail --lines=+4 "$0" | R --vanilla --slave --args $*; exit
> args <- commandArgs()[-(1:4)]
> ...
>
> And, still no option processing as in GNU long options, or python or
> ruby's optparse.
>
> What's the semantics of parameter passing -- by value or by reference?
By value.
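A short sketch of what that means in practice (zero_first is a hypothetical helper): a function that modifies its argument changes only its own local copy, so the caller's object is unaffected unless the result is assigned back.

```r
# The function works on its own copy of v.
zero_first <- function(v) {
  v[1] <- 0
  v
}

x <- c(5, 6, 7)
y <- zero_first(x)   # x is unchanged; y holds the modified copy
```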
> Is there anything less ugly than
>
> print(paste("x=",x,"y=",y))
>
> -- for routine printing? Can [1] be eliminated from such simple
> printing? What about formatted printing?
You can use cat() instead of print(), and avoid the numbering and
quoting. Remember to explicitly specify a "\n" newline at the end.
At first I thought you were complaining about the syntax, which I find
ugly. There was a proposal last year to overload + to do concatenation
of strings, so you'd type cat("x=" + x + "y=" + y + "\n"), but there was
substantial resistance, on the grounds that + should be commutative.
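For the formatted-printing part of the question, one option in base R is sprintf() combined with cat(). The sketch below uses made-up values for x and y:

```r
x <- 3L
y <- 4.5

# cat() prints without the [1] index or quotes; the newline is explicit.
cat("x=", x, " y=", y, "\n", sep = "")

# sprintf() builds a formatted string using C-style conversions.
line <- sprintf("x=%d y=%.2f", x, y)
cat(line, "\n", sep = "")
```

Setting sep = "" stops cat() from inserting spaces between its arguments, which is why the spacing is written out by hand.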
> Is there a way to assign all of
>
> a <- args[1]
> b <- args[2]
> c <- args[3]
>
> in one fell swoop, à la Python's
>
> a,b,c = args
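No, but you can do
abc <- as.list(args[1:3])
names(abc) <- c('a', 'b', 'c')
and refer to the components as abc$a, etc. (The as.list() matters: $
does not work on an atomic vector, where you would need abc[["a"]].)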
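A sketch of how this can look in use, assuming args is a character vector with at least three elements (the values here are stand-ins). The $ operator needs a list, and with() lets you use the names as if they were variables:

```r
args <- c("in.txt", "out.txt", "10")   # stand-in for real command-line arguments

# A named list gives abc$a, abc$b, abc$c.
abc <- setNames(as.list(args[1:3]), c("a", "b", "c"))
abc$a

# with() evaluates an expression with the list's names available as variables.
joined <- with(abc, paste(a, b, c))
```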
> What's the simplest way to check whether a filename ends in ".rda"?
Probably something like
if (regexpr("\\.rda$", filename) > 0) ...
You double the backslash so that a single backslash survives R's string
parsing and reaches the RE, where regexpr uses it to escape the dot.
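As a variation, current versions of R also provide grepl(), which returns TRUE or FALSE directly. The sketch below uses hypothetical file names and shows why escaping the dot matters:

```r
filename <- "results.rda"   # hypothetical file name

# grepl() returns a logical, so no comparison with 0 is needed.
ends_in_rda <- grepl("\\.rda$", filename)

# Escaped dot: only a literal "." matches, so this is FALSE.
escaped <- grepl("\\.rda$", "results_rda")

# Unescaped dot: "." matches any character, so "_rda" also matches.
unescaped <- grepl(".rda$", "results_rda")
```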
Duncan Murdoch
> Will ask more as I go programming...
>
> (Will someone here please write an O'Reilly's "Programming in R"? :)
>
> Cheers,
> Alexy