[R-sig-hpc] multi-threaded R/MPI jobs using SGE
Renaud Gaujoux
renaud at mancala.cbio.uct.ac.za
Wed Sep 22 15:36:05 CEST 2010
Hi,
thanks for sharing your experience Kasper.
I had envisaged this solution but I'd like to be able to use more than 4
cores (as our nodes are 'limited' quad-cores), which would mean getting
more than one node assigned to me.
One question about your case, though. Say you asked for 12 cores but got
only 6 (the 6 others being already assigned and used by other users).
Do you know if 'mclapply' forks the 6 processes on the correct CPUs and
not on some of those that are not assigned to you?
Thanks,
Renaud
--
Renaud Gaujoux
Computational Biology - University of Cape Town
South Africa
On 22/09/2010 15:18, Kasper Daniel Hansen wrote:
> Your question is more advanced than my current experience. But I can
> tell you what I do with SGE when I use multicore (where all cores
> should be on the same node).
>
> Our sysadmin has set up a parallel environment called "local" which
> makes sure that all the cores I request will be on the same node. I
> use it like
> qsub -pe local 6-12
> The 6-12 means give me between 6 and 12 cores (SGE will always give me
> the most I request, but that probably depends on the cluster setup).
> The actual number of cores I get is stored in the environment
> variable NSLOTS (or is it N_SLOTS, I don't recall), so in R I do
>
> CORES = as.integer(Sys.getenv("NSLOTS"))
> mclapply(LIST, FUN, mc.cores = CORES)
>
> You clearly have a more advanced use case, but I would guess that
> someone has done it (perhaps not using R). I would furthermore guess
> that the way to do it is the same as above: your allocated resources
> get stored in some environment variable that you then read from R and
> feed into the doMPI setup.
>
> Kasper
>
>
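The NSLOTS approach quoted above can be sketched a little more defensively. This is only a minimal sketch: it assumes the variable is named NSLOTS (as on stock SGE), the helper name slots_from_env and the fallback of 1 core are hypothetical choices, and it loads the 'parallel' package, which ships mclapply in current R (the thread itself used 'multicore').

```r
# Read the slot count granted by the scheduler from an environment variable.
# Falls back to `default` when the variable is unset or not a positive
# integer, e.g. in an interactive session outside SGE (assumed behaviour).
slots_from_env <- function(var = "NSLOTS", default = 1L) {
  raw <- Sys.getenv(var, unset = NA_character_)
  n <- suppressWarnings(as.integer(raw))
  if (is.na(n) || n < 1L) default else n
}

library(parallel)  # provides mclapply (Unix-alikes only for mc.cores > 1)

cores <- slots_from_env()
# Toy workload standing in for mclapply(LIST, FUN, mc.cores = CORES)
res <- mclapply(1:4, function(i) i^2, mc.cores = cores)
```

The fallback keeps the same script usable outside a batch job, where it simply runs on one core instead of failing on an NA core count.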
> On Wed, Sep 22, 2010 at 7:08 AM, Renaud Gaujoux
> <renaud at mancala.cbio.uct.ac.za> wrote:
>
>> Hi,
>>
>> I want to run an MPI-multithread job on our local cluster (Rocks + SGE).
>> My R script uses the doMPI/doMC packages to compute, say 10 tasks.
>> I'd like to compute each task using as many CPUs as are available on a
>> worker host (meaning all the slots SGE has assigned to me on that host).
>> Suppose I know each host has 4 slots.
>>
>> 1. SGE question: ideally I'd like to be able to ask for say 9 slots,
>> allocated on 3 hosts (4 + 4 + 1), using the isolated slot to run the master
>> thread.
>> This way I can spawn one master and two 4-core workers to perform the tasks.
>> I read that one can configure an SGE parallel environment and pass it to
>> qsub -pe <pe_name> to ensure the allocation follows this rule.
>> Does anybody have this kind of environment available on their cluster?
>> Would this one work?
>>
>> pe_name mtmpich
>> slots 9999
>> user_lists NONE
>> xuser_lists NONE
>> start_proc_args /opt/gridengine/mpi/startmpi.sh -catch_rsh $pe_hostfile
>> stop_proc_args /opt/gridengine/mpi/stopmpi.sh
>> allocation_rule $pe_slots
>> control_slaves TRUE
>> job_is_first_task FALSE
>> urgency_slots min
>>
>>
>> 2. Rmpi/doMPI question: suppose I cannot be sure the slots are allocated in
>> such a way, say I get (2+2+3+2). This means that other users are using the
>> other CPUs, which I do not want to over-use. Currently, when I call
>> registerDoMC(), it registers all 4 CPUs, which interferes with the other
>> users' jobs. Is it possible from within R to figure out which worker host
>> is running the code and the number of CPUs I am allowed to use on it?
>>
>> 3. If anybody has successfully done this kind of thing, please let me know
>> how.
>>
>> Thank you.
>> Renaud
>>
>> --
>> Renaud Gaujoux
>> Computational Biology - University of Cape Town
>> South Africa
>>
>>
>>
>>
>> ###
>> UNIVERSITY OF CAPE TOWN
>> This e-mail is subject to the UCT ICT policies and e-mail disclaimer
>> published on our website at
>> http://www.uct.ac.za/about/policies/emaildisclaimer/ or obtainable from +27
>> 21 650 4500. This e-mail is intended only for the person(s) to whom it is
>> addressed. If the e-mail has reached you in error, please notify the author.
>> If you are not the intended recipient of the e-mail you may not use,
>> disclose, copy, redirect or print the content. If this e-mail is not related
>> to the business of UCT it is sent by the sender in the sender's individual
>> capacity.
>>
>> ###
>>
>> _______________________________________________
>> R-sig-hpc mailing list
>> R-sig-hpc at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/r-sig-hpc
>>
>>
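For question 2 above, one possible approach is to read the per-host allocation that SGE writes for parallel jobs. The sketch below assumes the file named by the PE_HOSTFILE environment variable has one whitespace-separated line per host, with the host name in column 1 and the granted slot count in column 2. The function names read_pe_hostfile and local_slots, the short-name host matching, and the example host names are hypothetical and may need adapting to a given cluster.

```r
# Parse an SGE-style PE hostfile: host name in column 1, slots in column 2.
read_pe_hostfile <- function(path = Sys.getenv("PE_HOSTFILE")) {
  tab <- read.table(path, header = FALSE, stringsAsFactors = FALSE, fill = TRUE)
  data.frame(host = tab[[1]], slots = as.integer(tab[[2]]),
             stringsAsFactors = FALSE)
}

# Slots granted on the host this R process runs on. Hosts are compared on
# their short name (text before the first dot), which is an assumption.
local_slots <- function(alloc, host = Sys.info()[["nodename"]], default = 1L) {
  short <- function(x) sub("\\..*$", "", x)
  n <- sum(alloc$slots[short(alloc$host) == short(host)])
  if (n < 1L) default else n
}

# Example with a made-up allocation written to a temporary file
f <- tempfile()
writeLines(c("compute-0-1.local 2 all.q@compute-0-1.local UNDEFINED",
             "compute-0-2.local 3 all.q@compute-0-2.local UNDEFINED"), f)
alloc <- read_pe_hostfile(f)
n_here <- local_slots(alloc, host = "compute-0-2.local")

# Inside a real job, each worker could then register only its own slots:
# doMC::registerDoMC(cores = local_slots(read_pe_hostfile()))
```

Each worker evaluates this on its own host, so a (2+2+3+2) allocation would lead to 2, 2, 3 and 2 cores being registered rather than all 4 on every node. This limits the number of forked processes only; it does not pin them to particular CPUs.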