[R-sig-hpc] Recommended method for multithreaded, multinode in R?

Ostrouchov, George
Tue Jun 11 07:08:11 CEST 2019


Take a look at a couple of scripts on GitHub that I use to illustrate how to balance pbdMPI ranks against the processes forked by mclapply(): https://github.com/RBigData/mpi_balance. Similar balancing, along with management of potential thread conflicts, is needed for threaded OpenBLAS and other threaded functions.
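
As a rough illustration of the balancing idea (this is not taken from the mpi_balance scripts), here is a minimal sketch in base R. The helper cores_per_rank() and the numbers are hypothetical; the 8 cores per node and 2 ranks per node come from the example in the original question further down.

```r
library(parallel)

# Hypothetical helper (not part of pbdMPI): how many forked workers each
# MPI rank may use so that ranks_per_node * workers <= cores_per_node.
cores_per_rank <- function(cores_per_node, ranks_per_node) {
  stopifnot(cores_per_node >= 1, ranks_per_node >= 1)
  max(1L, as.integer(cores_per_node %/% ranks_per_node))
}

# Assumed layout: 8 cores per node shared by 2 ranks per node.
workers <- cores_per_rank(cores_per_node = 8, ranks_per_node = 2)

# mclapply() relies on fork(), so fall back to one worker on Windows.
mc <- if (.Platform$OS.type == "windows") 1L else workers

# Each rank would fork at most `mc` processes for its own share of tasks.
my_tasks <- 1:6
results <- mclapply(my_tasks, function(i) i^2, mc.cores = mc)
```

The same per-rank count is what you would normally hand to threaded libraries, for example through OMP_NUM_THREADS or OPENBLAS_NUM_THREADS, which are usually set in the job script before R starts.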

With MPI, there is typically no master: each R instance finds its own tasks based on comm.rank() and higher-level helper functions. For example, the RBigData/pbdIO package has a comm.chunk() function that offers various chunking options for assigning different tasks to different ranks.
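
To make the "each rank finds its own tasks" idea concrete, here is a minimal sketch in base R. The function chunk_for_rank() is a hypothetical stand-in and is not the implementation of comm.chunk(); in a real SPMD script the rank and size would come from pbdMPI's comm.rank() and comm.size(), but they are fixed here so the example runs without MPI.

```r
# Hypothetical helper: contiguous, nearly equal chunk of 1:n for one rank.
# Ranks are numbered 0..size-1, following the MPI convention.
chunk_for_rank <- function(n, size, rank) {
  stopifnot(size >= 1, rank >= 0, rank < size)
  base  <- n %/% size              # every rank gets at least this many tasks
  extra <- n %% size               # the first `extra` ranks get one more
  count <- base + (rank < extra)
  start <- rank * base + min(rank, extra) + 1
  if (count == 0) integer(0) else seq.int(start, length.out = count)
}

# Simulate 3 ranks sharing 10 tasks; every rank runs the same code and
# selects its own indices, so no master has to hand out work.
size <- 3
for (rank in 0:(size - 1)) {
  cat("rank", rank, "tasks:", chunk_for_rank(10, size, rank), "\n")
}
```

Because every rank computes its chunk from the same three numbers, the assignment needs no communication at all.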

George Ostrouchov

-----Original Message-----
From: R-sig-hpc <r-sig-hpc-bounces using r-project.org> on behalf of Martin Morgan <mtmorgan.bioc using gmail.com>
Date: Monday, June 10, 2019 at 4:16 PM
To: Bennet Fauber <bennet using umich.edu>, "R-sig-hpc using r-project.org" <R-sig-hpc using r-project.org>
Subject: Re: [R-sig-hpc]  Recommended method for multithreaded, multinode in R?

    The pbdR project https://pbdr.org adopts a more traditional HPC approach, including in the pbdMPI package on CRAN.
    
    ~$ mpiexec -np 2  R --slave -e "cat('hi\n')"
    hi
    hi
    
    
    Martin Morgan
    
    On 6/10/19, 3:54 PM, "R-sig-hpc on behalf of Bennet Fauber" <r-sig-hpc-bounces using r-project.org on behalf of bennet using umich.edu> wrote:
    
        Are there standard combinations of packages that people use when they
        want to use R on multiple nodes and have the R processes on those
        nodes call threaded C++ libraries?
        
        For example, if this were straight MPI, one might use a scheduler to
        get 8 cores per node assigned to a job, but then only start two MPI
        processes per node so that OpenMP for each of those processes would have
        access to 4 cores.
        
        Most of the ways I've seen to get R going seem to assume that Rmpi
        will be used, and that the master R process will be started and it
        will in turn spawn one R process for each core in the job.
        
        Has someone written a way to do this where one might do something like
        this with OpenMPI (or the equivalent with a different MPI)
        
            $ mpirun -pernode Rscript hybrid.R
        
        where the hybrid.R would be able to come up and know how to find its
        tasks and do them?
        
        Thanks in advance for any pointers anyone might have.
        
        _______________________________________________
        R-sig-hpc mailing list
        R-sig-hpc using r-project.org
        https://stat.ethz.ch/mailman/listinfo/r-sig-hpc
        


