[R-sig-hpc] foreach on HPC

Stefan Lüdtke sluedtke at gfz-potsdam.de
Wed Oct 15 13:41:13 CEST 2014


Hi,

thanks for your answer. I am aware of this difference, but in the long
term we want to use multiple nodes, so MPI is necessary.
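
The rough shape of the multi-node setup we have in mind (a sketch only;
the mpirun invocation and worker layout depend on the local scheduler,
and sqrt() stands in for our real per-iteration work):

    # submitted to the scheduler as e.g.: mpirun -n 1 Rscript script.R
    library(doMPI)            # attaches foreach; uses Rmpi underneath

    cl <- startMPIcluster()   # spawns workers across the allocated nodes
    registerDoMPI(cl)

    res <- foreach(i = 1:100, .combine = c) %dopar% {
        sqrt(i)               # placeholder for the real per-iteration work
    }

    closeCluster(cl)          # shut the workers down explicitly
    mpi.quit()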

Cheers,

stefan

On 10/15/2014 11:46 AM, Brian G. Peterson wrote:
> On 10/15/2014 04:11 AM, Stefan Lüdtke wrote:
>> Dear friends,
>>
>> we run an application on an HPC system using the foreach package linked
>> with doMPI. Each iteration is quite RAM hungry, so we have been touching
>> the 128 GB limit. At the moment, the admin is figuring out whether an
>> "out of memory" condition killed the job; in any case, it did not finish
>> successfully.
>>
>> Now to my question: a couple of iterations ran fine, and I was checking
>> the load on the host every now and then with "htop". I noticed that htop
>> listed R processes that used no CPU at all but were each occupying up to
>> 4% of RAM. What are these processes? Am I still seeing finished
>> iterations that hold on to memory, and if so, how do I get rid of them?
> 
> If you're on a single machine, you will get much more efficient RAM
> usage from doMC or doParallel with a fork cluster than from doMPI.
> 
> The workers in an MPI cluster are not torn down and released until
> everything is done, so their memory stays allocated for the whole run.
> A fork cluster is more efficient (see the sketch below the quoted text).
> 
> If you're on multiple machines, then doMPI, doRedis, or a similar
> backend (e.g. one based on zmq) is necessary, but on one machine a fork
> cluster will have far less overhead.
> 
> Regards,
> 
> Brian
> 
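
For readers of the archive, a minimal sketch of the single-machine fork
setup described above (assuming doParallel; the worker count and the
sqrt() body are placeholders, and forking is not available on Windows):

    library(doParallel)   # attaches foreach and parallel as dependencies

    # Fork workers share memory pages with the master via copy-on-write,
    # so large read-only objects are not duplicated per worker.
    cl <- makeCluster(4, type = "FORK")   # worker count is illustrative
    registerDoParallel(cl)

    res <- foreach(i = 1:100, .combine = c) %dopar% {
        sqrt(i)           # placeholder for the real per-iteration work
    }

    stopCluster(cl)       # workers exit and their memory is freed at once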

-- 
Stefan Lüdtke

Section 5.4 - Hydrology
Tel.: +49 331 288 2821
Fax: +49 331 288 1570
Email: sluedtke at gfz-potsdam.de

Helmholtz-Zentrum Potsdam
Deutsches GeoForschungsZentrum GFZ
(GFZ German Research Centre for Geosciences)
Stiftung des öff. Rechts Land Brandenburg
Telegrafenberg, 14473 Potsdam
-------------------

PGP Public Key: http://bit.ly/13d9Sca


