[R-sig-hpc] Newbie question to Rmpi

Takatsugu Kobayashi taquito2007 at gmail.com
Mon Aug 22 08:58:44 CEST 2011


Hi

I am a newbie to parallel computing, let alone Rmpi, and would appreciate any
advice you could give me on parallel computing in R in general.

I am a database marketing analyst, and my boss asked me to look into whether we
could set up a parallel computing system on our current server system at work.
Some of our clients have asked us to run statistical analyses on a large
dataset (about 30 million records * 12 columns): a multinomial logit model and
some neural network analysis.
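
For a sense of scale, the in-memory size of such a dataset can be estimated
roughly. The sketch below assumes every one of the 12 columns is stored as an
8-byte double; that is an assumption, not something known about the actual
data, and integer or categorical columns would take less space.

```shell
#!/bin/sh
# Back-of-envelope size of a 30 million x 12 table held in memory.
rows=30000000
cols=12
bytes_per_value=8   # assumption: every value is an 8-byte double

total_bytes=$((rows * cols * bytes_per_value))
total_mib=$((total_bytes / 1024 / 1024))   # integer division, rounds down

echo "bytes: $total_bytes"
echo "MiB:   $total_mib"
```

Under that assumption a single copy is under 3 GiB, so it would fit in the 16GB
of one server, although model fitting in R may need several times that in
working memory.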

Our server spec is:

2* Xeon 5630
16GB Memory
2*500GB

and we are planning to purchase another server of the same spec for small-scale
cluster computing. While I was reading

http://www.bioconductor.org/help/bioconductor-cloud-ami/

the following questions occurred to me and I would like to ask you:

1. Could you recommend websites/books/journals that would deepen my
    understanding of how MPICH2 and R work together on Linux?

2. With our current server system, is this code correct?
    "mpirun -np 1 --hostfile /usr/local/Rmpi/hostfile R --no-save -f /usr/local/Rmpi/xxxx.R --args 8"

3. If I use Amazon EC2 and launch 4 extra large instances, is this code correct?
    "mpiutil -a accessid -s keyid -w 16 -n clustername -t m1.large -v volumeid"
    "mpirun -np 1 --hostfile /usr/local/Rmpi/hostfile R --no-save -f /usr/local/Rmpi/xxxx.R --args 16"


Sorry for such basic and perhaps misguided questions.
Many thanks in advance.

Taka


