[R-sig-hpc] doSNOW + foreach = embarrassingly frustrating computation
Marius Hofert
m_hofert at web.de
Tue Dec 21 21:49:11 CET 2010
Okay, I ran (2) and (3) again, now with the lines
mpi.close.Rslaves()
mpi.quit()
(as suggested) at the end. Both programs stopped with:
> mpi.close.Rslaves()
Error in mpi.close.Rslaves() : It seems no slaves running on comm 1
Execution halted
I therefore ran (2) and (3) again, but only with
mpi.quit()
The output of both runs is included below (after my questions).
So it seems to work! I have the following remaining questions:
(1) with "n <- mpi.universe.size()", program (3) elegantly uses all available CPUs (if I understand this command correctly). Can this always be used or should one think like that: "I need 3 workers, that's why I should use makeCluster(3, type = "MPI")". Obviously, with "n <- mpi.universe.size()" one is not required to think about the number of workers.
(2) Brian Peterson pointed out that doMPI might be a better choice. I would like to have a minimal example similar to (3), but with doMPI. Unfortunately, I could not find an equivalent of "clusterSetupRNG()" [from snow] for doMPI. Do you know how to set up rlecuyer with doMPI? I tried the following, but it does not work (of course):
## snippet doMPI start ====
library(doMPI)
library(foreach)
library(rlecuyer)
cl <- startMPIcluster()
clusterSetupRNG(cl, seed = rep(1,6)) # => only works with doSNOW (otherwise, you'll get 'Error: could not find function "clusterSetupRNG"')
registerDoMPI(cl) # register the cluster object with foreach
## start the work
x <- foreach(i = 1:3) %dopar% {
    sqrt(i)
}
x
stopCluster(cl) # properly shut down the cluster
mpi.quit()
## snippet doMPI end ====
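One workaround I have not tried yet (only a sketch, and I am not sure it is the intended doMPI way) would be to skip clusterSetupRNG() entirely and initialize the rlecuyer streams on the workers themselves, inside the foreach() call, roughly along the lines of what snow's RNG setup does internally. Note the assumptions: I use doMPI's closeCluster() as the counterpart of snow's stopCluster(), and I assume each worker executes at most one task, so .lec.CreateStream() is called only once per worker:
## snippet doMPI + rlecuyer sketch start ====
library(doMPI)
library(foreach)
library(Rmpi)
library(rlecuyer)
cl <- startMPIcluster()
registerDoMPI(cl) # register the cluster object with foreach
ntasks <- 3
x <- foreach(i = seq_len(ntasks), .packages = "rlecuyer") %dopar% {
    ## every worker creates the same set of named streams from the same seed,
    ## then switches to the stream belonging to its own task number
    .lec.SetPackageSeed(rep(1, 6))
    .lec.CreateStream(as.character(seq_len(ntasks)))
    .lec.CurrentStream(as.character(i))
    r <- runif(1)           # draws from this task's L'Ecuyer stream
    .lec.CurrentStreamEnd() # restore the worker's default RNG
    r
}
x
closeCluster(cl) # doMPI's way of shutting down the cluster
mpi.quit()
## snippet doMPI + rlecuyer sketch end ====
Whether this reproduces the same streams as clusterSetupRNG() under doSNOW I do not know; it is only meant as a starting point.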
Cheers,
Marius
## === output of (2) start ===
Sender: LSF System <lsfadmin at a6169>
Subject: Job 195661: <mpirun -n 1 R --no-save -q -f m02.R> Done
Job <mpirun -n 1 R --no-save -q -f m02.R> was submitted from host <brutus3> by user <hofertj> in cluster <brutus>.
Job was executed on host(s) <4*a6169>, in queue <pub.1h>, as user <hofertj> in cluster <brutus>.
</cluster/home/math/hofertj> was used as the home directory.
</cluster/home/math/hofertj> was used as the working directory.
Started at Tue Dec 21 21:15:16 2010
Results reported at Tue Dec 21 21:15:27 2010
Your job looked like:
------------------------------------------------------------
# LSBATCH: User input
mpirun -n 1 R --no-save -q -f m02.R
------------------------------------------------------------
Successfully completed.
Resource usage summary:
CPU time : 6.41 sec.
Max Memory : 4 MB
Max Swap : 29 MB
Max Processes : 1
Max Threads : 1
The output (if any) follows:
> library(doSNOW)
Loading required package: foreach
Loading required package: iterators
Loading required package: codetools
Loading required package: snow
> library(Rmpi)
> library(rlecuyer)
>
> cl <- makeCluster(3, type = "MPI") # create cluster object with the given number of slaves
3 slaves are spawned successfully. 0 failed.
> clusterSetupRNG(cl, seed = rep(1,6)) # initialize uniform rng streams in a SNOW cluster (L'Ecuyer)
[1] "RNGstream"
> registerDoSNOW(cl) # register the cluster object with foreach
> ## start the work
> x <- foreach(i = 1:3) %dopar% {
+ sqrt(i)
+ }
> x
[[1]]
[1] 1
[[2]]
[1] 1.414214
[[3]]
[1] 1.732051
> stopCluster(cl) # properly shut down the cluster
[1] 1
> mpi.quit()
## === output of (2) end ===
## === output of (3) start ===
Sender: LSF System <lsfadmin at a6169>
Subject: Job 195663: <mpirun -n 1 R --no-save -q -f m03.R> Done
Job <mpirun -n 1 R --no-save -q -f m03.R> was submitted from host <brutus3> by user <hofertj> in cluster <brutus>.
Job was executed on host(s) <4*a6169>, in queue <pub.1h>, as user <hofertj> in cluster <brutus>.
</cluster/home/math/hofertj> was used as the home directory.
</cluster/home/math/hofertj> was used as the working directory.
Started at Tue Dec 21 21:15:16 2010
Results reported at Tue Dec 21 21:15:31 2010
Your job looked like:
------------------------------------------------------------
# LSBATCH: User input
mpirun -n 1 R --no-save -q -f m03.R
------------------------------------------------------------
Successfully completed.
Resource usage summary:
CPU time : 24.67 sec.
Max Memory : 297 MB
Max Swap : 1934 MB
Max Processes : 8
Max Threads : 19
The output (if any) follows:
> library(doSNOW)
Loading required package: foreach
Loading required package: iterators
Loading required package: codetools
Loading required package: snow
> library(Rmpi)
> library(rlecuyer)
>
> n <- mpi.universe.size()
> cl <- makeCluster(n, type ="MPI") # create cluster object
4 slaves are spawned successfully. 0 failed.
> clusterSetupRNG(cl, seed = rep(1,6)) # initialize uniform rng streams in a SNOW cluster (L'Ecuyer)
[1] "RNGstream"
> registerDoSNOW(cl) # register the cluster object with foreach
> ## start the work
> x <- foreach(i = 1:3) %dopar% {
+ sqrt(i)
+ }
> x
[[1]]
[1] 1
[[2]]
[1] 1.414214
[[3]]
[1] 1.732051
> stopCluster(cl) # properly shut down the cluster
[1] 1
> mpi.quit()
## === output of (3) end ===