[R-sig-hpc] R, Nomad, HTCondor, etc... and future

David Bellot <david.bellot using gmail.com>
Sat Dec 19 03:25:17 CET 2020


Many months later...
Slurm is indeed a great tool and perfectly adapted to what I wanted to do.

My feedback for the community:

- in R, batchtools is great to use with Slurm, even if the documentation
could be improved. Simple things can take time because how to do them is
not always obvious from the documentation. Once you know how things work,
it's very easy to use and the solutions are incredibly simple and short to
write. Like: "Oooooh, I just had to pass this flag to this function. Now it
makes sense!!! :-D"
- I also recommend using batchtools directly rather than always trying to
mix it with `future` or `foreach`, for example. But keep reading...
- I found the combination of batchtools, future, purrr and furrr to be a
great tool too, for slightly more complex jobs. Don't underestimate those
packages.
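
For readers who have not used batchtools directly, here is a minimal sketch
of the basic workflow (not my actual code). It assumes the batchtools package
is installed. `simulate` is a made-up stand-in for one experiment, and the
interactive backend is used only so the sketch runs on a laptop without a
scheduler.

```r
library(batchtools)

# Temporary registry; on a real cluster, point file.dir at a directory on a
# filesystem shared between the login node and the compute nodes.
reg <- makeRegistry(file.dir = NA, seed = 1)

# Assumption for this sketch: run the jobs in the current R session.
# On Slurm, something like
#   reg$cluster.functions <- makeClusterFunctionsSlurm(template = "slurm-simple")
# would go here instead, with a template adapted to the site.
reg$cluster.functions <- makeClusterFunctionsInteractive()

# Hypothetical stand-in for one experiment.
simulate <- function(n) sum(seq_len(n))

# One job per value of n.
ids <- batchMap(fun = simulate, n = c(10, 100, 1000), reg = reg)

# On Slurm, a resources = list(...) argument matching the template's
# placeholders would normally be passed here as well.
submitJobs(ids, reg = reg)
waitForJobs(reg = reg)

# Collect the results as a list, one element per job.
results <- reduceResultsList(reg = reg)
```

For the `future` / furrr route, the future.batchtools package provides a
Slurm backend for `plan()`, so the same cluster can sit behind
`furrr::future_map()`; the template and resources still have to match the
local Slurm setup.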

Cheers,
David

On Fri, May 22, 2020 at 9:54 PM Bennet Fauber <bennet using umich.edu> wrote:

> David,
>
> Slurm is a good tool, especially if you are not doing complicated
> scheduling things with it.  It is really designed to do HPC, so you
> might want to take a quick look at your needs and see whether HPC is
> really the thing you want or whether you might be better off in an
> HTC environment, like HTCondor.  They are really designed to do
> different things in different ways.
>
> Many, if not most, sites seem to end up building HPC clusters, but
> many of the users might be better off with HTC, instead.  I'd counsel
> you to take a scan through the HTCondor documentation, and at Open
> Science Grid, just to get a sense of what the differences are.
>
> For example, with HTCondor, you could configure workstations to be
> part of your available resource pool during off hours, or if they are
> idle, and it's much harder to do that with something like Slurm.
>
> Anyway, you're buying the shoe, I would just make sure it fits well
> before walking a long way with it.
>
> -- bennet
>
> On Thu, May 21, 2020 at 10:45 PM David Bellot <david.bellot using gmail.com>
> wrote:
> >
> > >
> > > > I don't see in the above how your 'one-shot job' is different from your
> > > > colleagues' need to send spot requests.
> > >
> >
> > You're right, I didn't explain correctly. On one hand, I have
> > experiments to run. Think about 'foreach %dopar%' loops and things
> > like that. When it's done, I look at the result, and the work is
> > done. My program has run and I don't need it anymore.
> > On the other hand, they have many small services they want to keep
> > waiting 24/7 and run when called, I mean on-demand. They don't need
> > to be heavy on CPU, except maybe for the few seconds when the
> > services are called. In my use case, I don't need a service to stay
> > up 24/7, but I use the CPU very intensively.
> >
> > And describing it like this now, I simply realized that solving these
> > two different problems with one single solution seems a bit ... huh...
> > silly :-)
> >
> > > I found slurm reasonable in the past, and it has only gotten more
> > > widely used / available since.  It will provide you with access to
> > > the compute resource, will account for 'who does what' and can
> > > schedule / resource (which I never really needed, and sounds like
> > > you don't either). Plus it will give you an easy view on what is
> > > currently up or down, available etc pp.
> > >
> > > The devil is as always in the details. I'd say experiment a little
> > > and take it from there.
> > >
> >
> > I'll give Slurm a try then. You're not the first one to say it's a good
> > tool.
> > Thanks Dirk.
> >
> > _______________________________________________
> > R-sig-hpc mailing list
> > R-sig-hpc using r-project.org
> > https://stat.ethz.ch/mailman/listinfo/r-sig-hpc
>
