[R-sig-hpc] R, Nomad, HTCondor, etc... and future

David Bellot david.bellot using gmail.com
Fri May 22 03:46:59 CEST 2020


Thanks George.
To give more color to what I'm trying to achieve, let me describe the two
opposite use cases. My use case is in R, obviously: I run one-shot jobs
to explore data sets as fast as possible and run optimization algorithms in
which the objective function is really CPU-intensive to compute.
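
A minimal sketch of that kind of one-shot job, assuming the future and
future.apply packages are installed. The objective function, the candidate
grid and the worker count are hypothetical stand-ins, not my real workload;
the point is only that the backend is chosen in a single plan() call while
the rest of the code stays the same.

```r
library(future)
library(future.apply)

# Backend choice lives in one place; multisession uses local R sessions.
# (Assumption: two local workers, purely for illustration.)
plan(multisession, workers = 2)

# Hypothetical stand-in for a CPU-intensive objective function.
expensive_objective <- function(par) {
  sum((par - c(1, 2))^2)
}

# Candidate points to evaluate independently, one row per candidate.
candidates <- expand.grid(x = seq(-2, 4, by = 0.5),
                          y = seq(-2, 4, by = 0.5))

# Evaluate every candidate in parallel; each call is an independent task.
values <- future_sapply(
  seq_len(nrow(candidates)),
  function(i) expensive_objective(unlist(candidates[i, ])),
  future.seed = TRUE
)

# Keep the candidate with the smallest objective value.
best <- candidates[which.min(values), ]
print(best)

plan(sequential)  # release the workers
```

Pointing plan() at a different backend is what would move the same code
onto a cluster scheduler, provided a future backend exists for it.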

At the same time, other people in the same organization want to run
services, written in other languages, on the same cluster of computers.
Those services are very different in nature but, in general, the idea is to
have a collection of processes always ready to answer a request when
needed. Ideally, the same cluster should be used by everyone so as to
maximize its uptime, not waste expensive resources, etc. And ideally,
I don't want to have many job schedulers/distribution engines to manage at
the same time. Kind of a Holy Grail, I concede.

Hence me looking at things like Nomad, HTCondor, etc...

On Fri, May 22, 2020 at 11:19 AM Ostrouchov, George <georgeost using gmail.com>
wrote:

> Hi David,
>
> I live in a large HPC world, where distributed computing is inherently
> batch, so take my advice with that perspective. Large systems are mostly
> incompatible with the interactive concept of "backend" and instead support
> SPMD-style batch programming. SPMD is mostly MPI+X, meaning that the
> distributed aspect is handled by MPI and within-node aspects can vary among
> several options including MPI, fork, OpenMP, and OpenACC. But even on a
> medium Slurm-managed cluster (possibly in a corporate environment), for R I
> would recommend a combination of pbdR.org distributed packages and the
> parallel package's mclapply components for within-node parallelism.
>
> Best,
> George
>
> -----Original Message-----
> From: R-sig-hpc <r-sig-hpc-bounces using r-project.org> on behalf of David
> Bellot <david.bellot using gmail.com>
> Date: Thursday, May 21, 2020 at 8:26 PM
> To: <r-sig-hpc using r-project.org>
> Subject: [R-sig-hpc] R, Nomad, HTCondor, etc... and future
>
>     Hi R HPC,
>
>     I was wondering if anyone has ever used Nomad from Hashicorp as a backend
>     engine to run R distributed code.
>     Moreover, if you use it with *future*, I'd love to hear about your
>     experience and if you published a package for it, I'd love to use it too.
>
>     If not, which other distribution engine do you use (apart from those
>     supported in *batchtools*)?
>     HTCondor, Docker Swarm, BOINC, etc.?
>
>     I haven't made a final decision yet on which engine to use, but it has
>     to be versatile enough (I know it's not a lot of information, but think
>     about a "corporate environment" with various needs. My R need is just
>     one among many others).
>
>     Thanks for your help.
>     David
>
>         [[alternative HTML version deleted]]
>
>     _______________________________________________
>     R-sig-hpc mailing list
>     R-sig-hpc using r-project.org
>     https://stat.ethz.ch/mailman/listinfo/r-sig-hpc
>
>
>
