[R-sig-hpc] mpirun and R

Jonathan Greenberg jgrn at illinois.edu
Sat Jun 2 22:39:31 CEST 2012


OK, moving along here.  I now have R-devel installed, and with a fresh
build of OpenBLAS (with the NO_AFFINITY = 1 line uncommented) I can
get a parallel BLAS.  But when I try an Rmpi call with that BLAS
selected, I get (under OpenMPI):

> require("snow")
Loading required package: snow
> cl <- makeCluster(12,type="MPI")
Loading required package: Rmpi
[mynode:10521] [[37603,1],0] routed:binomial: Connection to lifeline
[[37603,0],0] lost

Note that I get the same error if I also uncomment USE_OPENMP=1 when
compiling OpenBLAS.
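
One more thing I plan to try, for what it's worth (a hedged sketch --
I'm assuming an OpenBLAS build that honors these runtime environment
variables, and they presumably have to be set before the BLAS
initializes, so in the shell or ~/.Renviron rather than mid-session):

# Hedged: ask OpenBLAS not to set CPU affinity at runtime, instead of
# rebuilding.  OPENBLAS_MAIN_FREE and OPENBLAS_NUM_THREADS are assumed
# to be read by this build.
Sys.setenv(OPENBLAS_MAIN_FREE = "1")
Sys.setenv(OPENBLAS_NUM_THREADS = "12")  # thread count for this 12-core node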

By switching back to the libRblas.so that ships with R (I'm using
update-alternatives to swap between the two for testing), I can run
Rmpi correctly, but then I'm left with a single-threaded BLAS.
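
For reference, here's the rough check I've been using to confirm
whether the threaded BLAS is active (matrix size is arbitrary; under
the parallel build this should light up all cores in top and run
several times faster than with the reference libRblas.so):

n <- 4000
a <- matrix(rnorm(n * n), n, n)
b <- matrix(rnorm(n * n), n, n)
system.time(a %*% b)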

Given that I can't install binary packages on this system (everything
I use has to be compiled "by hand" -- this is a system policy, and
there is nothing I can do to get around it), is there any way to get a
multithreaded BLAS working AND Rmpi behaving properly?  If I need a
different BLAS, which one is recommended?  ATLAS?  Is there a way to
get the Intel MKL working with R?

Thanks!

--j


On Sat, Jun 2, 2012 at 1:04 PM, Simon Urbanek
<simon.urbanek at r-project.org> wrote:
>
> On Jun 2, 2012, at 1:30 PM, Jonathan Greenberg wrote:
>
>> Simon:
>>
>> Thanks -- I'm afraid I'm a little unclear on how to grab the updated parallel package (my version, running on R 2.15, doesn't have mcaffinity) -- is there a download link for it?
>>
>
> No, the official way is simply to use R-devel. However, you should be able to merge r59189 from R-devel into R-2.15 with no conflicts if you wish.
>
> FWIW the link to the original thread which includes the other solutions is at
> https://stat.ethz.ch/pipermail/r-sig-hpc/2012-April/001357.html
>
> Cheers,
> Simon
>
>
>>
>> On Sat, Jun 2, 2012 at 12:20 PM, Simon Urbanek
>> <simon.urbanek at r-project.org> wrote:
>>>
>>> On Jun 2, 2012, at 12:46 PM, Jonathan Greenberg wrote:
>>>
>>>> Steve:
>>>>
>>>> It was built with OpenBLAS,
>>>
>>> That is the problem -- this was discussed here previously; see the archive. OpenBLAS changes the affinity of the process so that it uses only one CPU. You have to reset the affinity, either with Linux tools or with mcaffinity (available in R-devel).
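>>>
>>> Something along these lines should work (an untested sketch; the
>>> core numbering is illustrative for a 12-core node):
>>>
>>> library(parallel)  # mcaffinity() lives here in R-devel (Unix only)
>>> library(snow)
>>> mcaffinity(1:12)   # undo the single-CPU mask before spawning
>>> cl <- makeCluster(12, type = "MPI")
>>> # the workers load the BLAS too, so reset their masks as well:
>>> clusterCall(cl, function() parallel::mcaffinity(1:12))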
>>>
>>> Cheers,
>>> Simon
>>>
>>>
>>>> but does that matter with an MPI-based
>>>> function? (I.e., I thought GotoBLAS was an entirely separate HPC
>>>> component, used only for linear-algebra routines.) But yes, all the
>>>> spawned R processes end up on a single CPU, whereas if I launch via
>>>> mpirun everything works properly.  I had to "roll" OpenBLAS myself on
>>>> this system, because the admins have installed only the Intel MKL,
>>>> which I have yet to get to play nicely with R.  OpenBLAS does work
>>>> for linear-algebra commands, though.
>>>>
>>>> # In fact, running this in a normally launched R uses all cores:
>>>> a = matrix(rnorm(5000*5000), 5000, 5000)
>>>> b = matrix(rnorm(5000*5000), 5000, 5000)
>>>> c = a%*%b
>>>>
>>>> # But then, in the same session, running:
>>>> require(raster)
>>>> beginCluster()
>>>> # Only spawns on one core.
>>>>
>>>> Are there "better" parameters I could pass to snow to get this
>>>> working?  I get the same behavior with snowfall's sfInit():
>>>>
>>>> require(snowfall)
>>>> sfInit(parallel=TRUE,cpus=12)
>>>> sfStop()
>>>> # All spawns execute on a single CPU
>>>> sfInit(parallel=TRUE,cpus=12,type="MPI")
>>>> sfStop()
>>>> # All spawns execute on a single CPU
>>>>
>>>> Incidentally (I don't consider this a fully satisfactory answer, so
>>>> please keep the suggestions coming), this command at least lets me
>>>> run R in interactive mode without bailing every time I type an
>>>> incorrect statement:
>>>>
>>>> `which mpirun` -n 1 -machinefile $PBS_NODEFILE R --interactive
>>>> (note the --interactive instead of --vanilla)
>>>>
>>>> That said, if I do need to kill a process with Ctrl-C (which would
>>>> usually just return me to an R prompt), R still bails back to the
>>>> bash command line.  The other reason I'd like a within-R solution is
>>>> that I do my development in the StatET/Eclipse environment, and (at
>>>> least right now) there is no way to modify how it launches R
>>>> remotely.
>>>>
>>>> --j
>>>>
>>>> On Fri, Jun 1, 2012 at 3:12 PM, Stephen Weston
>>>> <stephen.b.weston at gmail.com> wrote:
>>>>> So you wanted 12 cpus on a single node, but the 12 spawned
>>>>> R processes were all scheduled by your OS on a single cpu
>>>>> rather than multiple cpus/cores on that node?
>>>>>
>>>>> If so, that suggests that somehow the cpu affinity has been set.
>>>>> We've seen this type of problem when using GotoBLAS2/OpenBLAS.
>>>>> Has your R installation been built with either of them?
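>>>>>
>>>>> One quick way to check (Linux only, and assuming the taskset
>>>>> utility is installed) is to print the affinity mask of the running
>>>>> R process from inside R:
>>>>>
>>>>> # a mask with a single bit set (e.g. "1") means the process is
>>>>> # pinned to one core
>>>>> system(paste("taskset -p", Sys.getpid()))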
>>>>>
>>>>> - Steve
>>>>>
>>>>>
>>>>> On Fri, Jun 1, 2012 at 11:52 AM, Jonathan Greenberg <jgrn at illinois.edu> wrote:
>>>>>> R-sig-hpc'ers:
>>>>>>
>>>>>> Our system (running Open MPI) allows an interactive session to be
>>>>>> created with N CPUs allotted to it (12 in my case).  Here's the
>>>>>> qsub command to get the interactive node running:
>>>>>>
>>>>>> qsub -X -I -q [mygroup] -l nodes=1:ppn=12,walltime=48:00:00
>>>>>>
>>>>>> If I start R and then try some HPC commands, e.g.:
>>>>>>
>>>>>> require(raster)
>>>>>> # Note this is just a wrapper for a snow call:
>>>>>> beginCluster()
>>>>>>
>>>>>> I get:
>>>>>>> beginCluster()
>>>>>> Loading required package: snow
>>>>>> 12 cores detected
>>>>>> cluster type: MPI
>>>>>> Loading required package: Rmpi
>>>>>>        12 slaves are spawned successfully. 0 failed.
>>>>>>
>>>>>> If I "top" I see that I have 12 (13?) R spawns running.  The problem
>>>>>> is, they are all running on a SINGLE cpu, not distributed amongst all
>>>>>> 12 cpus (even though it detected it).  My first question is: why is
>>>>>> this?  Is there a way to fix this from a standard "R" launch?
>>>>>>
>>>>>> Now, I can SOMEWHAT fix this by:
>>>>>> `which mpirun` -n 1 -machinefile $PBS_NODEFILE R --vanilla
>>>>>>
>>>>>> When I run the same commands, they distribute properly across all
>>>>>> 12 CPUs, BUT ANY typing error causes the entire session to "die":
>>>>>>> require(raster)
>>>>>> require(raster)
>>>>>> Loading required package: raster
>>>>>> Loading required package: sp
>>>>>> raster 1.9-92 (1-May-2012)
>>>>>>> beginCluster()
>>>>>> beginCluster()
>>>>>> Loading required package: snow
>>>>>> 12 cores detected
>>>>>> cluster type: MPI
>>>>>> Loading required package: Rmpi
>>>>>>        12 slaves are spawned successfully. 0 failed.
>>>>>>> abc
>>>>>> Error: object 'abc' not found
>>>>>> Execution halted
>>>>>> --------------------------------------------------------------------------
>>>>>> mpirun has exited due to process rank 0 with PID 28932 on
>>>>>> node [mynode] exiting without calling "finalize". This may
>>>>>> have caused other processes in the application to be
>>>>>> terminated by signals sent by mpirun (as reported here).
>>>>>> --------------------------------------------------------------------------
>>>>>>
>>>>>> Is there a "safer" way to launch mpirun that won't die when I make
>>>>>> a small typo?  It makes code REALLY hard to troubleshoot when any
>>>>>> little error kills the session.
>>>>>>
>>>>>> --j
>>>>>>
>>>
>>
>



-- 
Jonathan A. Greenberg, PhD
Assistant Professor
Department of Geography and Geographic Information Science
University of Illinois at Urbana-Champaign
607 South Mathews Avenue, MC 150
Urbana, IL 61801
Phone: 415-763-5476
AIM: jgrn307, MSN: jgrn307 at hotmail.com, Gchat: jgrn307, Skype: jgrn3007
http://www.geog.illinois.edu/people/JonathanGreenberg.html


