[Rd] RFC: "loop connections"

Sun Aug 28 01:38:29 CEST 2005

This may not be entirely on topic, but in case it is relevant
I wanted to bring it up.

Just to be concrete, suppose one wants to run the following as
a process concurrent with R.  (It implicitly initializes x to zero;
then, for each line of stdin, it adds the first field of the input
to x and prints x to stdout, unless the first field is "exit",
in which case it exits.  gawk has an implicit read/process loop,
so one does not have to specify the read step.  The fflush()
call just makes sure that output is emitted as it is produced
rather than buffered.)

   gawk -f myexample.awk

where myexample.awk contains the single line:

   { if ($1 == "exit") exit else { x += $1; print x; fflush() } }

This has nothing to do with raw data, but it is prototypical of many
situations in which one controls a remote program from R, sending
input to it and getting output back, where the program keeps state
between inputs (memory/persistence).

This example is actually the same as
   system("gawk -f myexample.awk", intern = TRUE)
except that it also has memory/persistence, whereas the system
call starts up a new instance of gawk each time it is called and
so would always start out with x=0 rather than with
the accumulated sum of past values.
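
The loss of state with one process per call can be illustrated with a
minimal sketch.  It is written in Python rather than R purely for
illustration, and CHILD is a hypothetical stand-in for myexample.awk
(an assumption, so that the sketch does not require gawk); one_shot()
and demo_one_shot() are likewise made-up names.

```python
import subprocess
import sys

# Hypothetical stand-in for myexample.awk, written in Python so that the
# sketch does not depend on gawk being installed: keep a running sum of the
# first field of each input line and stop when that field is "exit".
CHILD = r'''
import sys
x = 0.0
for line in sys.stdin:
    fields = line.split()
    if fields and fields[0] == "exit":
        break
    x += float(fields[0]) if fields else 0.0
    print(f"{x:g}", flush=True)
'''


def one_shot(value):
    """Start a new child, feed it one value, and return what it printed."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        input=f"{value}\nexit\n",
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def demo_one_shot():
    # Every call starts a fresh child with x = 0, so nothing accumulates.
    return [one_shot(v) for v in ("1", "2", "3.5")]


if __name__ == "__main__":
    print(demo_one_shot())
```

Each call simply echoes its own input as the "sum", because the child
that held the previous total has already exited.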

I have not used fifos, which I assume could handle this problem,
since they are not yet provided in the Windows version of R (which
is what I use), but I was wondering whether this application overlaps
in any way with what is being discussed here.  In particular, it would
be nice to have a read/write "connection" that one writes to in order
to supply the next line to the gawk process and reads from to get the
answer.
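
A rough sketch of the kind of read/write "connection" meant here,
again in Python rather than R and only as an illustration of the idea,
not of any existing or proposed R API.  LoopConnection, ask() and
demo() are hypothetical names, and CHILD is an assumed stand-in for
myexample.awk so that the sketch runs without gawk.

```python
import subprocess
import sys

# Hypothetical stand-in for myexample.awk (an assumption, used so the sketch
# runs without gawk): running sum of the first field, stop on "exit".
CHILD = r'''
import sys
x = 0.0
for line in sys.stdin:
    fields = line.split()
    if fields and fields[0] == "exit":
        break
    x += float(fields[0]) if fields else 0.0
    print(f"{x:g}", flush=True)
'''


class LoopConnection:
    """Hypothetical read/write connection to one long-lived child process."""

    def __init__(self, argv):
        self.proc = subprocess.Popen(
            argv,
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
            bufsize=1,  # line-buffered on the parent side
        )

    def ask(self, line):
        """Write one line to the child and read back one line of output."""
        self.proc.stdin.write(line + "\n")
        self.proc.stdin.flush()
        return self.proc.stdout.readline().rstrip("\n")

    def close(self):
        """Ask the child to exit, close both pipes, return its exit status."""
        self.proc.stdin.write("exit\n")
        self.proc.stdin.flush()
        self.proc.stdin.close()
        self.proc.stdout.close()
        return self.proc.wait()


def demo():
    conn = LoopConnection([sys.executable, "-c", CHILD])
    # The same child answers every request, so the sum carries over.
    answers = [conn.ask(v) for v in ("1", "2", "3.5")]
    status = conn.close()
    return answers, status


if __name__ == "__main__":
    print(demo())
```

The one-line-in, one-line-out protocol only works because the child
flushes after every print, which is the role fflush() plays in the
gawk script; with a child that buffers its output, ask() would block.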