[Rd] R as unix filter

David A. van Leeuwen david at elseware.nl
Wed Jan 12 00:44:46 CET 2005


R is great, in fact so great that I have wished to use it as a 
proper Unix filter ever since I started using it, just like octave, 
perl, bash, etc.

Even though I could find only one message about this in the R mailing 
lists, I can't imagine that nobody else wants to be able to say on a 
nice bash command line:

$ cat data-file | some-process | R-script | another-process > file.out

specifically for (live generated, multiple) large data files and 
complicated calculations in the R-script.

So I wrote a set of wrapper scripts/programs.  From the terse README:


`Rf' allows unix script programmers to use the statistics program `R'
as a proper unix filter.  `R' is IMHO a great program, which can do
much more than only statistics; for instance, it has great graphics.

A big disadvantage is, however, that it seems inherently an
interactive program, and can therefore not be used as a proper unix
filter.  More specifically, it cannot read data from stdin, because it
wants the script input on stdin.

In order to circumvent this problem, the Rf package comes with two

r-as-filter: a (bash) script doing most of the work
Rf: a C wrapper program that calls r-as-filter in a `sh-bang' context.
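
The workaround described above can be sketched in plain sh.  This is
only an illustration of the general idea, not the actual r-as-filter
script: the function names `build_r_input' and `r_filter_sketch' are
made up here, and the temp-file approach is an assumption about how
stdin can be freed for the R script.

```shell
#!/bin/sh
# Hypothetical sketch of a temp-file workaround; NOT the actual r-as-filter script.

# Print the text that R will read on its stdin: a one-line preamble that
# defines the character variable .stdin, followed by the user's script.
#   $1 = file holding the saved pipeline data, $2 = the user's R script
# (Assumes the data path contains no quotes or backslashes.)
build_r_input() {
    printf '.stdin <- "%s"\n' "$1"
    cat "$2"
}

# Usage: some-process | r_filter_sketch SCRIPT | another-process
r_filter_sketch() {
    data=$(mktemp) || return 1
    cat > "$data"        # drain the pipeline into a file, freeing stdin
    # R takes its program from stdin; --vanilla skips workspace save/restore
    # and profiles, --slave suppresses prompts and command echo.
    build_r_input "$data" "$1" | R --vanilla --slave
    status=$?
    rm -f "$data"
    return "$status"
}
```

With such a function sourced into the shell, something like
`seq 10 | r_filter_sketch my-script.R' would hand the numbers to the R
code in the (hypothetical) file my-script.R through `.stdin'.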

Basically, Rf allows you to write a little sh-bang script (included in
this distribution as file `mean'):

===start of file====

# .stdin is the character variable that Rf provides for the data on stdin
x <- scan(.stdin, quiet = TRUE)
cat(mean(x))

===end of file===

This example script reads numbers from stdin (through the R character
variable `.stdin') and prints the average on standard output.  Thus,
one could say on the command line:

$ seq 10 | mean

and the average of the numbers 1 through 10 will be calculated (5.5). 


The very first version of the package can be found at


Of course, it would be a lot better if R could somehow include this 
functionality itself.  But what do you think of this?

