[R-sig-hpc] parallelising: nodes and cores

Martin Ivanov martin.ivanov at ifg.uni-tuebingen.de
Sat Apr 20 19:19:29 CEST 2013

Dear all,

I have a task that I would like to run in parallel on 3 nodes, each 
with 2 cores.
So I create an MPI cluster, use clusterSplit to split the task into 3 
subtasks, one for each node.
My question is: Will the function in the fun argument to clusterApply be 
run in parallel on the 2 available cores on each node?
Or do I need to explicitly parallelise within fun across the 2 available 
cores on the current node?

To be more specific, this is my current setup (testNodes.R):

cl <- makeCluster(spec=3L, type="MPI", outfile=""); # 3 slave nodes are 
x <- seq_len(20L);
xseq <- clusterSplit(cl=cl, seq=x);
y <- clusterApply(cl=cl, x=xseq, fun=function(x) 
list(sysInfo=Sys.info()[c("nodename","machine")], x=x));
save(x,y, xseq, file="/home-link/epaiv01/test.RData");
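
For reference, a minimal sketch of how the split above divides x, assuming 
clusterSplit behaves like splitIndices from the parallel package. It needs 
no cluster to run; n_workers is a name introduced here for illustration only.

```r
library(parallel)

x <- seq_len(20L)
n_workers <- 3L  # hypothetical name; mirrors spec=3L in the script above

# splitIndices() cuts 1..length(x) into n_workers consecutive index blocks;
# indexing x with each block mimics what clusterSplit(cl, x) is assumed to return.
chunks <- lapply(splitIndices(length(x), n_workers), function(i) x[i])
str(chunks)
```

Each element of chunks goes to exactly one worker, so clusterApply calls fun 
three times in total, once per worker.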

I submit this script (testNodes.R) to torque via:

#PBS -l nodes=1:ppn=1+3:ppn=2
#PBS -l walltime=00:01:00
#PBS -l pmem=100kb
. /$HOME/.bashrc
mpirun -np 1 --hostfile ${PBS_NODEFILE} \
    /home-link/epaiv01/system/usr/bin/Rscript testNodes.R

I have deliberately requested 1 node with 1 core for the master process, 
and 3 nodes with 2 cores each for the slave processes, which are spawned 
from R.
The question is: will clusterApply use both cores on each node? How can 
I make sure that the task is not actually engaging only 1 core on the 
current node? Is there a way to check that?
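
One possible way to check, as a sketch only: ask every worker for its host 
name and process ID. Each worker is a single R process, so the number of 
distinct PIDs per nodename shows how many R processes run on each node. 
This assumes a local PSOCK cluster from the parallel package as a stand-in 
so that it runs on one machine; on the real cluster the makeCluster call 
from testNodes.R would take its place.

```r
library(parallel)

# Assumption: PSOCK workers on the local machine stand in for the MPI workers.
cl <- makeCluster(3L, type = "PSOCK")

# Each worker reports its host name, its process ID, and the number of
# cores the operating system reports on that host.
info <- clusterCall(cl, function() {
  list(nodename = unname(Sys.info()["nodename"]),
       pid      = Sys.getpid(),
       cores    = parallel::detectCores())
})
stopCluster(cl)

# One row per worker; one distinct PID per row means one R process per worker.
workers <- do.call(rbind, lapply(info, as.data.frame))
print(workers)
```

If every node shows a single PID, then fun runs in one R process per node, 
and a second core on that node is only used if fun itself starts further 
processes or threads.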

Any comments will be appreciated.

Best regards,


Dr. Martin Ivanov
Eberhard-Karls-Universität Tübingen
Mathematisch-Naturwissenschaftliche Fakultät
Fachbereich Geowissenschaften
Water & Earth System Science (WESS)
Hölderlinstraße 12, 72074 Tübingen, Deutschland
Tel. +4970712974213
