[R] Rserve and R to R communication

Ramon Diaz-Uriarte rdiaz02 at gmail.com
Mon Apr 9 23:20:05 CEST 2007


Dear Matthew,

On 4/9/07, Matthew Keller <mckellercran at gmail.com> wrote:
> Hi Ramon,
>
> I've been interested in responses to your question. I have what I
> think is a similar issue - I have a very large simulation script and
> would like to be able to modularize it by having a main script that
> calls lots of subscripts - but I haven't done that yet because the
> only way I could think to do it was to call a subscript, have it run,
> save the objects from the subscript, and then call those objects back
> into the main script, which seems like a very slow and onerous way to
> do it.
>
> Would Rserve do what I'm looking for?
>

Maybe. That is in fact what I am wondering. However, an easier route
might be to try Rmpi with papply. Or snow (with either Rmpi or rpvm).
Or nws (a Linda implementation for R). Using Rmpi with papply, in
particular, is a piece of cake for embarrassingly parallel problems.
papply is like lapply, but parallelized, with built-in load-balancing,
although it will run sequentially when no MPI universe is available;
the latter is very handy for debugging. snow also has parallelized,
load-balanced versions of apply (though I do not think it
automatically switches to running sequentially).
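
To make the "like lapply, but parallelized" idea concrete, here is a
minimal sketch. It does not use papply or Rmpi; it assumes a recent R
in which the parallel package (which took over much of snow's
interface, including clusterApplyLB) is available. The function
slow_square and the two-worker local cluster are made up purely for
illustration.

```r
library(parallel)  # ships with R >= 2.14; derived in part from snow

# Hypothetical job: each element takes a different amount of time,
# which is the situation where load balancing helps
slow_square <- function(x) {
  Sys.sleep(x / 100)
  x^2
}

jobs <- 1:8

# Sequential version: plain lapply, handy for debugging
res_seq <- lapply(jobs, slow_square)

# Parallel, load-balanced version on a local two-worker socket cluster
# (an assumption; a real setup would list remote hosts instead)
cl <- makeCluster(2)
res_par <- clusterApplyLB(cl, jobs, slow_square)
stopCluster(cl)

# Results come back in input order, so both versions agree
stopifnot(identical(res_seq, res_par))
```

Because the function passed to the cluster is the same one passed to
lapply, the code can be debugged sequentially first and only then
handed to the workers.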

All of these (Rmpi, papply, snow, rpvm, nws) are R packages available
from CRAN. You will need some additional software: LAM/MPI for Rmpi
(or MPICH if you run Windows), PVM for rpvm, and Python and Twisted
for nws.

(I asked about Rserve because the lack of fault tolerance of MPI is a
pain to deal with in my applications. Also, with LAM/MPI there are
limits on the number of slaves that can be handled by a lam daemon,
and that is a problem for some of our web-based applications. Thus, I
am looking at alternative approaches that might eliminate some of the
extra layers that MPI or PVM adds.)

HTH,

R.


> On 4/7/07, Ramon Diaz-Uriarte <rdiaz02 at gmail.com> wrote:
> > Dear All,
> >
> > The "clients.txt" file of the latest Rserve package, by Simon Urbanek,
> > says, regarding its R client,
> >
> > "(...) a simple R client, i.e. it allows you to connect to Rserve from
> > R itself. It is very simple and limited,  because Rserve was not
> > primarily meant for R-to-R communication (there are better ways to do
> > that), but it is useful for quick interactive connection to an Rserve
> > farm."
> >
> > Which are those better ways to do it? I am thinking about using Rserve
> > to have an R process send jobs to a bunch of Rserves in different
> > machines. It is like what we could do with Rmpi (or pvm), but without
> > the MPI layer. Therefore, presumably it'd be easier to deal with
> > network problems, machine's failures, using checkpoints, etc. (i.e.,
> > to try to get better fault tolerance).
> >
> > It seems that Rserve would provide the basic infrastructure for doing
> > that and saves me from reinventing the wheel of using sockets, etc,
> > directly from R.
> >
> > However, Simon's comment about better ways of R-to-R communication
> > made me wonder if this idea really makes sense. What is the catch?
> > Have other people tried similar approaches?
> >
> > Thanks,
> >
> > R.
> >
> > --
> > Ramon Diaz-Uriarte
> > Statistical Computing Team
> > Structural Biology and Biocomputing Programme
> > Spanish National Cancer Centre (CNIO)
> > http://ligarto.org/rdiaz
> >
> > ______________________________________________
> > R-help at stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
>
>
> --
> Matthew C Keller
> Postdoctoral Fellow
> Virginia Institute for Psychiatric and Behavioral Genetics
>


-- 
Ramon Diaz-Uriarte
Statistical Computing Team
Structural Biology and Biocomputing Programme
Spanish National Cancer Centre (CNIO)
http://ligarto.org/rdiaz


