[R-sig-hpc] multi-threaded R/MPI jobs using SGE

Renaud Gaujoux renaud at mancala.cbio.uct.ac.za
Wed Sep 22 13:08:52 CEST 2010


I want to run a multi-threaded MPI job on our local cluster (Rocks + SGE).
My R script uses the doMPI/doMC packages to compute, say, 10 tasks.
I'd like to compute each task using as many CPUs as are available on a
worker host (meaning all the slots SGE has assigned to me on that host).
Suppose I know each host has 4 slots.

1. SGE question: ideally I'd like to be able to ask for, say, 9 slots
allocated on 3 hosts (4 + 4 + 1), using the isolated slot to run the
master process.
This way I can spawn one master and two 4-core workers to perform the tasks.
I read that one can configure an SGE parallel environment and pass it to
qsub -pe <pe_name> to ensure the allocation follows this rule.
Does anybody have this kind of environment available on their cluster?
Would this one work?

pe_name           mtmpich
slots             9999
user_lists        NONE
xuser_lists       NONE
start_proc_args   /opt/gridengine/mpi/startmpi.sh -catch_rsh $pe_hostfile
stop_proc_args    /opt/gridengine/mpi/stopmpi.sh
allocation_rule   $pe_slots
control_slaves    TRUE
job_is_first_task FALSE
urgency_slots     min

2. Rmpi/doMPI question: suppose I cannot be sure the slots are allocated
in such a way, and I get, say, (2+2+3+2). This means that other users are
using the other CPUs, which I do not want to oversubscribe. Currently, when
I call registerDoMC(), it registers all 4 CPUs, which interferes with the
other users' jobs.
Is it possible from within R to figure out which worker host is running
the code and the number of CPUs I am allowed to use on it?
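
To make question 2 concrete, here is a minimal sketch of the kind of thing I have in mind, not a tested solution. It assumes SGE exports PE_HOSTFILE for parallel jobs, that each line of that file looks like "<host> <slots> <queue> <processor range>", and that host names can be compared on their short form (everything before the first dot). The function names parse_pe_hostfile, short_name and slots_on_host are hypothetical helpers, not part of Rmpi, doMPI or doMC.

```r
# Parse the lines of an SGE PE hostfile into a named integer vector
# (host name -> number of slots granted on that host).
# Assumed line format: "<host> <slots> <queue> <processor range>".
parse_pe_hostfile <- function(lines) {
  lines <- trimws(lines)
  lines <- lines[nzchar(lines)]            # drop empty lines
  fields <- strsplit(lines, "[[:space:]]+")
  hosts <- vapply(fields, function(f) f[1], character(1))
  slots <- vapply(fields, function(f) as.integer(f[2]), integer(1))
  # Sum per host in case a host is listed on more than one line.
  vapply(split(slots, hosts), sum, integer(1))
}

# Strip the domain part so "node1.local" and "node1" compare equal
# (an assumption about how hosts are named on the cluster).
short_name <- function(host) sub("\\..*$", "", host)

# Number of slots granted on `host`; falls back to 1 if the host is not listed.
slots_on_host <- function(slots, host = Sys.info()[["nodename"]]) {
  idx <- match(short_name(host), short_name(names(slots)))
  if (is.na(idx)) 1L else slots[[idx]]
}

# Only read the hostfile when running inside an SGE parallel job.
hostfile <- Sys.getenv("PE_HOSTFILE")
if (nzchar(hostfile) && file.exists(hostfile)) {
  slots <- parse_pe_hostfile(readLines(hostfile))
  n_cores <- slots_on_host(slots)
  message(Sys.info()[["nodename"]], ": ", n_cores, " slot(s) granted")
}
```

Each worker could then call registerDoMC(cores = n_cores) instead of registerDoMC(). If the hostfile is not readable from the worker hosts, the master could parse it once and send the per-host counts to the workers, which would call slots_on_host() on the received vector.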

3. If anybody has successfully done this kind of thing, please let me
know how you did it.

Thank you.

Renaud Gaujoux
Computational Biology - University of Cape Town
South Africa




