[R-sig-hpc] mpirun and R

Jonathan Greenberg jgrn at illinois.edu
Sun Jun 3 17:24:32 CEST 2012


Incidentally, to give the OS the freedom to balance the load itself,
would it make any difference (and be better practice) if I let each
spawn have access to all the processors:

clusterApply(cl, 1:12, function(x) { require(parallel); mcaffinity(1:12) } )

Also, what is the best way to have R figure out how many CPUs are
available (so I can pass that to the function as a variable)?
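
As a sketch of what I'm imagining (untested): parallel provides
detectCores(), and calling it on each worker avoids having to ship a
master-side count over to the spawns:

clusterApply(cl, 1:12, function(x) {
  require(parallel)
  # detectCores() reports the logical CPUs visible on the worker's node
  mcaffinity(seq_len(detectCores()))
})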

--j

On Sat, Jun 2, 2012 at 8:29 PM, Simon Urbanek
<simon.urbanek at r-project.org> wrote:
> On Jun 2, 2012, at 8:41 PM, Jonathan Greenberg wrote:
>
>> Ah HAH.  Yep, clusterApply(cl, 1:12, mcaffinity) did it.
>>
>> One thing I'm noticing (not sure why): when I switch to one of the
>> OpenBLAS libraries via update-alternatives, booting R returns a
>> "segmentation fault" the first few times, but after maybe 2-4 attempts
>> it will eventually start running.  The built-in libRblas.so and the
>> ATLAS versions do not appear to have this problem.
>>
>> Re: the Intel MKL stuff -- since I'm using update-alternatives, is
>> there a .so file hiding someplace that I should be able to link
>> against?
>>
>
> Yes, R will happily create one for you if you simply add --enable-BLAS-shlib. E.g., if you link R against MKL with --enable-BLAS-shlib, the resulting libRblas.so will be a fully functional MKL dynamic library that you can link against directly with no dependencies, and you can use it in the alternatives setup.
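>
> E.g., something like this (the link name and paths below are
> placeholders -- adjust them to match your alternatives layout):
>
> update-alternatives --install /usr/lib/libblas.so.3 libblas.so.3 \
>     /usr/local/lib64/R/lib/libRblas.so 100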
>
> Cheers,
> Simon
>
>
>>
>> On Sat, Jun 2, 2012 at 7:18 PM, Simon Urbanek
>> <simon.urbanek at r-project.org> wrote:
>>>
>>> On Jun 2, 2012, at 8:00 PM, Jonathan Greenberg wrote:
>>>
>>>> Wanted to follow up on this using OpenBLAS 0.1.1 with the affinity
>>>> flag commented out -- I did try:
>>>> require(parallel)
>>>> cl <- makeCluster(12,type="MPI")
>>>> # R spawns are running on a single core
>>>> mcaffinity(1:12)
>>>> # No change, the R spawns are still running on a single core.
>>>>
>>>
>>> But are you running mcaffinity on the *cluster instances*? It won't do you any good to run it in the master instance. E.g., if you want to use a different CPU for each instance, you would run something like clusterApply(cl, 1:12, mcaffinity).
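>>>
>>> Spelled out (an untested sketch, assuming a single 12-core node with
>>> one worker per CPU):
>>>
>>> cl <- makeCluster(12, type = "MPI")
>>> clusterApply(cl, 1:12, function(i) {
>>>   require(parallel)  # mcaffinity lives in parallel; load it on the worker
>>>   mcaffinity(i)      # pin worker i to CPU i
>>> })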
>>>
>>>
>>>> In terms of your comment about the Intel MKL -- is it possible to run this as a shared dynamic library?  If so, I was never able to figure out what specific file to link against with our local install of Intel MKL (i.e. what is the swap-in for libRblas.so?).
>>>>
>>>
>>> Yes (although I prefer to use the static version for speed). If I recall correctly, it's something like
>>> -lmkl_gf_lp64 -lmkl_gnu_thread -lmkl_core -lpthread
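>>>
>>> i.e., at R configure time, something like this (exact library names
>>> vary by MKL version and threading layer, so treat it as a sketch):
>>>
>>> ./configure --enable-BLAS-shlib \
>>>     --with-blas="-lmkl_gf_lp64 -lmkl_gnu_thread -lmkl_core -lpthread"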
>>>
>>> Cheers,
>>> Simon
>>>
>>>
>>>>
>>>> On Sat, Jun 2, 2012 at 6:16 PM, Simon Urbanek
>>>> <simon.urbanek at r-project.org> wrote:
>>>>>
>>>>> On Jun 2, 2012, at 4:55 PM, Jonathan Greenberg wrote:
>>>>>
>>>>>> Sorry, short followup: if I keep NO_AFFINITY=1 commented out, I don't
>>>>>> get the failure below, but I get the aforementioned single-threaded
>>>>>> cluster execution.  I tried this:
>>>>>>
>>>>>> # NO_AFFINITY=1 left commented out in the OpenBLAS Makefile.rule
>>>>>> require(parallel)
>>>>>> mcaffinity()
>>>>>> # It shows, of course, "1"
>>>>>> mcaffinity(1:12)
>>>>>>> cl <- makeCluster(12,type="MPI")
>>>>>> Loading required package: Rmpi
>>>>>> [mynode:00415] [[47733,1],0] routed:binomial: Connection to lifeline
>>>>>> [[47733,0],0] lost
>>>>>>
>>>>>
>>>>> Can you run mcaffinity *after* you make the cluster? (I.e. as a part of your code - you could for example have each instance use a different CPU).
>>>>>
>>>>> Cheers,
>>>>> Simon
>>>>>
>>>>>
>>>>>> So it seems like setting the affinity causes Rmpi to choke...
>>>>>>
>>>>>> --j
>>>>>>
>>>>>> On Sat, Jun 2, 2012 at 3:39 PM, Jonathan Greenberg <jgrn at illinois.edu> wrote:
>>>>>>> Ok, moving along here.  So I got R-devel installed, and by switching
>>>>>>> (via update-alternatives) to a fresh install of OpenBLAS (with the
>>>>>>> NO_AFFINITY = 1 line uncommented) I can get a parallel BLAS, but when
>>>>>>> trying an Rmpi call with that BLAS selected I get (under OpenMPI):
>>>>>>>
>>>>>>>> require("snow")
>>>>>>> Loading required package: snow
>>>>>>>> cl <- makeCluster(12,type="MPI")
>>>>>>> Loading required package: Rmpi
>>>>>>> [mynode:10521] [[37603,1],0] routed:binomial: Connection to lifeline
>>>>>>> [[37603,0],0] lost
>>>>>>>
>>>>>>> Note that I get the same error if I additionally uncomment
>>>>>>> USE_OPENMP=1 when I compile OpenBLAS.
>>>>>>>
>>>>>>> By selecting the libRblas.so that comes with R (I'm using
>>>>>>> update-alternatives to swap back and forth for testing), I can run
>>>>>>> Rmpi correctly, but I am left with a single-threaded BLAS.
>>>>>>>
>>>>>>> Given that I can't install binary packages on the system I'm using
>>>>>>> (everything has to be compiled "by hand" -- this is a system policy,
>>>>>>> and there is nothing I can do to get around it), is there any way to
>>>>>>> get a multithreaded BLAS working AND Rmpi behaving properly?  If I
>>>>>>> need to be using a different BLAS, which is recommended?  ATLAS?  Is
>>>>>>> there a way to get the Intel MKL working with R?
>>>>>>>
>>>>>>> Thanks!
>>>>>>>
>>>>>>> --j
>>>>>>>
>>>>>>>
>>>>>>> On Sat, Jun 2, 2012 at 1:04 PM, Simon Urbanek
>>>>>>> <simon.urbanek at r-project.org> wrote:
>>>>>>>>
>>>>>>>> On Jun 2, 2012, at 1:30 PM, Jonathan Greenberg wrote:
>>>>>>>>
>>>>>>>>> Simon:
>>>>>>>>>
>>>>>>>>> Thanks -- I'm afraid I'm a little unclear on how to go about grabbing the updated parallel package (my version on R 2.15 doesn't have mcaffinity) -- is there a download link for it?
>>>>>>>>>
>>>>>>>>
>>>>>>>> No, the official way is to simply use R-devel. However, you should be able to merge r59189 from R-devel to R-2.15 with no conflicts if you wish.
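>>>>>>>>
>>>>>>>> i.e., from an svn checkout of the R-2.15 branch, something like:
>>>>>>>>
>>>>>>>> svn merge -c 59189 https://svn.r-project.org/R/trunk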
>>>>>>>>
>>>>>>>> FWIW the link to the original thread which includes the other solutions is at
>>>>>>>> https://stat.ethz.ch/pipermail/r-sig-hpc/2012-April/001357.html
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Simon
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Sat, Jun 2, 2012 at 12:20 PM, Simon Urbanek
>>>>>>>>> <simon.urbanek at r-project.org> wrote:
>>>>>>>>>>
>>>>>>>>>> On Jun 2, 2012, at 12:46 PM, Jonathan Greenberg wrote:
>>>>>>>>>>
>>>>>>>>>>> Steve:
>>>>>>>>>>>
>>>>>>>>>>> It was built with OpenBLAS,
>>>>>>>>>>
>>>>>>>>>> That is the problem -- this was previously discussed here; see the archive. OpenBLAS changes the affinity of the process to use only one CPU. You have to reset the affinity either with Linux tools or with mcaffinity (available in R-devel).
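>>>>>>>>>>
>>>>>>>>>> On the Linux side, something like this should do it (mask and PID
>>>>>>>>>> are examples; 0xfff allows CPUs 0-11):
>>>>>>>>>>
>>>>>>>>>> taskset -p 0xfff <PID>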
>>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>> Simon
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> but does that matter for an MPI-based
>>>>>>>>>>> function? (I.e., I thought GotoBLAS was an entirely separate HPC
>>>>>>>>>>> component, used only for linear algebra routines.)  But yes, all
>>>>>>>>>>> the spawned R processes end up running on a single CPU, whereas
>>>>>>>>>>> with mpirun it functions properly.  I had to "roll" OpenBLAS
>>>>>>>>>>> myself on this system, because the admins have only installed
>>>>>>>>>>> Intel MKL, which I have yet to get to play nicely with R.
>>>>>>>>>>> OpenBLAS does work for linear algebra commands, though.
>>>>>>>>>>>
>>>>>>>>>>> # In fact, running this in a normally launched R session uses all cores:
>>>>>>>>>>> a = matrix(rnorm(5000*5000), 5000, 5000)
>>>>>>>>>>> b = matrix(rnorm(5000*5000), 5000, 5000)
>>>>>>>>>>> c = a%*%b
>>>>>>>>>>>
>>>>>>>>>>> # But then, in the same session, running:
>>>>>>>>>>> require(raster)
>>>>>>>>>>> beginCluster()
>>>>>>>>>>> # Only spawns on one core.
>>>>>>>>>>>
>>>>>>>>>>> Are there "better" parameters I might pass to snow to get this
>>>>>>>>>>> working?  I get the same behavior in snowfall and sfInit():
>>>>>>>>>>>
>>>>>>>>>>> require(snowfall)
>>>>>>>>>>> sfInit(parallel=TRUE,cpus=12)
>>>>>>>>>>> sfStop()
>>>>>>>>>>> # All spawns execute on a single CPU
>>>>>>>>>>> sfInit(parallel=TRUE,cpus=12,type="MPI")
>>>>>>>>>>> sfStop()
>>>>>>>>>>> # All spawns execute on a single CPU
>>>>>>>>>>>
>>>>>>>>>>> Incidentally (and I don't consider this a perfectly satisfactory
>>>>>>>>>>> answer, so please continue to suggest things for me to try), this
>>>>>>>>>>> command at least lets me run R in interactive mode without bailing
>>>>>>>>>>> every time I type an incorrect statement:
>>>>>>>>>>>
>>>>>>>>>>> `which mpirun` -n 1 -machinefile $PBS_NODEFILE R --interactive
>>>>>>>>>>> (note the --interactive instead of the --vanilla)
>>>>>>>>>>>
>>>>>>>>>>> With that said, if I need to kill a process with Control-C
>>>>>>>>>>> (which would usually just return me to an R prompt), R does bail
>>>>>>>>>>> back to the bash command line.  The other reason I'd like a
>>>>>>>>>>> within-R solution is that I do my development within the
>>>>>>>>>>> StatET/Eclipse environment, and (at least right now) there is no
>>>>>>>>>>> way to modify how it launches R remotely.
>>>>>>>>>>>
>>>>>>>>>>> --j
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Jun 1, 2012 at 3:12 PM, Stephen Weston
>>>>>>>>>>> <stephen.b.weston at gmail.com> wrote:
>>>>>>>>>>>> So you wanted 12 cpus on a single node, but the 12 spawned
>>>>>>>>>>>> R processes were all scheduled by your OS on a single cpu
>>>>>>>>>>>> rather than multiple cpus/cores on that node?
>>>>>>>>>>>>
>>>>>>>>>>>> If so, that suggests that somehow the cpu affinity has been set.
>>>>>>>>>>>> We've seen this type of problem when using GotoBLAS2/OpenBLAS.
>>>>>>>>>>>> Has your R installation been built with either of them?
>>>>>>>>>>>>
>>>>>>>>>>>> - Steve
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Jun 1, 2012 at 11:52 AM, Jonathan Greenberg <jgrn at illinois.edu> wrote:
>>>>>>>>>>>>> R-sig-hpc'ers:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Our system (running OpenMPI) allows an interactive session to be
>>>>>>>>>>>>> created with N CPUs allotted to it (12 in my case).  Here's
>>>>>>>>>>>>> the qsub command to get the interactive node running:
>>>>>>>>>>>>>
>>>>>>>>>>>>> qsub -X -I -q [mygroup] -l nodes=1:ppn=12,walltime=48:00:00
>>>>>>>>>>>>>
>>>>>>>>>>>>> If I boot R and then try some HPC R commands, e.g.:
>>>>>>>>>>>>>
>>>>>>>>>>>>> require(raster)
>>>>>>>>>>>>> # Note this is just a wrapper for a snow call:
>>>>>>>>>>>>> beginCluster()
>>>>>>>>>>>>>
>>>>>>>>>>>>> I get:
>>>>>>>>>>>>>> beginCluster()
>>>>>>>>>>>>> Loading required package: snow
>>>>>>>>>>>>> 12 cores detected
>>>>>>>>>>>>> cluster type: MPI
>>>>>>>>>>>>> Loading required package: Rmpi
>>>>>>>>>>>>>        12 slaves are spawned successfully. 0 failed.
>>>>>>>>>>>>>
>>>>>>>>>>>>> If I "top" I see that I have 12 (13?) R spawns running.  The problem
>>>>>>>>>>>>> is, they are all running on a SINGLE cpu, not distributed amongst all
>>>>>>>>>>>>> 12 cpus (even though it detected it).  My first question is: why is
>>>>>>>>>>>>> this?  Is there a way to fix this from a standard "R" launch?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Now, I can SOMEWHAT fix this by:
>>>>>>>>>>>>> `which mpirun` -n 1 -machinefile $PBS_NODEFILE R --vanilla
>>>>>>>>>>>>>
>>>>>>>>>>>>> When I run the same commands, they distribute properly to all 12 CPUs,
>>>>>>>>>>>>> BUT ANY typing error will cause the entire session to "die":
>>>>>>>>>>>>>> require(raster)
>>>>>>>>>>>>> require(raster)
>>>>>>>>>>>>> Loading required package: raster
>>>>>>>>>>>>> Loading required package: sp
>>>>>>>>>>>>> raster 1.9-92 (1-May-2012)
>>>>>>>>>>>>>> beginCluster()
>>>>>>>>>>>>> beginCluster()
>>>>>>>>>>>>> Loading required package: snow
>>>>>>>>>>>>> 12 cores detected
>>>>>>>>>>>>> cluster type: MPI
>>>>>>>>>>>>> Loading required package: Rmpi
>>>>>>>>>>>>>        12 slaves are spawned successfully. 0 failed.
>>>>>>>>>>>>>> abc
>>>>>>>>>>>>> Error: object 'abc' not found
>>>>>>>>>>>>> Execution halted
>>>>>>>>>>>>> --------------------------------------------------------------------------
>>>>>>>>>>>>> mpirun has exited due to process rank 0 with PID 28932 on
>>>>>>>>>>>>> node [mynode] exiting without calling "finalize". This may
>>>>>>>>>>>>> have caused other processes in the application to be
>>>>>>>>>>>>> terminated by signals sent by mpirun (as reported here).
>>>>>>>>>>>>> --------------------------------------------------------------------------
>>>>>>>>>>>>>
>>>>>>>>>>>>> Is there a way to get a "safer" mpirun launch that won't die if I
>>>>>>>>>>>>> make a small typo?  It makes it REALLY hard to troubleshoot code
>>>>>>>>>>>>> when any little error kills the session.
>>>>>>>>>>>>>
>>>>>>>>>>>>> --j

-- 
Jonathan A. Greenberg, PhD
Assistant Professor
Department of Geography and Geographic Information Science
University of Illinois at Urbana-Champaign
607 South Mathews Avenue, MC 150
Urbana, IL 61801
Phone: 415-763-5476
AIM: jgrn307, MSN: jgrn307 at hotmail.com, Gchat: jgrn307, Skype: jgrn3007
http://www.geog.illinois.edu/people/JonathanGreenberg.html


