[R-sig-hpc] Snow Not Distributing

Paul Johnson pauljohn32 at gmail.com
Thu Jan 26 18:34:54 CET 2012


On Fri, Jan 20, 2012 at 3:53 PM, Jeff Allen <lists at jdadesign.net> wrote:
> I have been able to successfully setup snow (0.3-5) and Rmpi (0.5-9) on my
> RedHat 5 cluster, and have it working perfectly for jobs that don't span
> multiple nodes.
>
> We're using Torque for resource management, so I start a job with access to
> multiple nodes and load Snow. Unfortunately, no matter what size cluster I
> try to make, all of the workers end up running on the same host -- leaving
> the other hosts idle.

Have you solved the problem yet?  If not, I can help. I have exactly
your setup and I have been through EXACTLY the same problems you are
seeing.

I've been developing a collection of Rmpi programs that actually work,
some with Snow, some with parallel.

This is the cluster main page

http://web.ku.edu/~quant/cgi-bin/mw1/index.php?title=Cluster:Main

and about 2/3 down, you see a link to my collection of working programs.

That is an SVN repo that has http access

http://winstat.quant.ku.edu/svn/hpcexample/trunk

In case you are impatient, here is what I suggest.  Your submission
script should look like this one, which works for us.

#!/bin/sh
#
#This is an example script example.sh
#
#These commands set up the Grid Environment for your job:
#PBS -N SnowHelloWorld
#PBS -l nodes=11:ppn=1
#PBS -l walltime=00:50:00
#PBS -M pauljohn at ku.edu
#PBS -m bea

cd $PBS_O_WORKDIR

### This RUNS, and because I give it a machine list, it uses them.
orterun --hostfile $PBS_NODEFILE -n 1 R --no-save --vanilla -f snow-hello.R

###############################

Note that in the orterun command (same as mpirun) I am ONLY LAUNCHING
one process (-n 1). We let R do the spawning of the workers.  The PBS
directive asks for 11 nodes with one process per node (ppn=1).
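
If the workers still pile up on one host, it may help to first confirm
what Torque actually granted the job. The following is only a sketch,
not part of the hpcexample collection: the function names
summarize_nodefile and count_hosts are hypothetical, and it assumes the
node file lists one hostname per line, one line per slot. It could be
pasted into the submission script before the orterun line.

```shell
#!/bin/sh
# Hypothetical helpers (not from the original post) for inspecting
# the node file that Torque hands to a job.

# Print one line per distinct host with the number of slots on it.
# $1: path to a node file, one hostname per line (one line per slot).
summarize_nodefile() {
    sort "$1" | uniq -c | awk '{ print $2 ": " $1 " slot(s)" }'
}

# Print the number of distinct hosts in the node file.
count_hosts() {
    sort -u "$1" | wc -l | tr -d ' '
}

# Inside a job, Torque sets PBS_NODEFILE; outside a job this is skipped.
if [ -n "${PBS_NODEFILE:-}" ] && [ -r "$PBS_NODEFILE" ]; then
    echo "Distinct hosts: $(count_hosts "$PBS_NODEFILE")"
    summarize_nodefile "$PBS_NODEFILE"
fi
```

With nodes=11:ppn=1 one would expect 11 distinct hosts with one slot
each. If only one host shows up here, the problem is in the resource
request rather than in snow or Rmpi.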

Then the job for snow-hello.R creates the cluster.

Why am I pasting this in? I'm crazy. Just go look here for the
submission script, the program, an explanation, and example output.

http://winstat.quant.ku.edu/svn/hpcexample/trunk/Ex60-HelloWorldSnow/

>
> I'm no expert with MPI or snow, so I'm really not sure how to approach
> debugging this.
>
> Any input would be much appreciated!
>
> Jeff



-- 
Paul E. Johnson
Professor, Political Science
1541 Lilac Lane, Room 504
University of Kansas
