On Wed, Aug 15, 2018 at 11:57 AM, Michael Jennings wrote:
> We [...] are planning to investigate clush [...] in the near future.
I can only encourage you to do so, as ClusterShell comes with nice
Slurm bindings out of the box that allow you, among other things, to
execute commands on all the nodes:
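With the sample Slurm bindings from groups.conf.d enabled, clush node groups resolve through Slurm itself. A sketch of what that looks like (the slurmpart/slurmstate group-source names come from the sample bindings file and may differ in your version; "gpu" is a made-up partition name):

```shell
# All nodes of the cluster (-a), gathering identical output (-b)
clush -ab uptime

# All nodes in the "gpu" partition, via the Slurm partition group source
clush -bw @slurmpart:gpu uptime

# Only nodes Slurm currently reports as drained
clush -bw @slurmstate:drained 'systemctl is-active slurmd'
```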
On Wednesday, 15 August 2018, at 10:01:19 (-0400),
Paul Edmon wrote:
> On 08/14/2018 05:16 AM, Pablo Llopis wrote:
> >
> >Integration with a possible built-in healthcheck is also something
> >to consider, as the orchestration logic would need to take care of
> >disabling the healthcheck functionality [...]
On Wed, Aug 15, 2018 at 7:01 AM, Paul Edmon wrote:
So we use NHC for our automatic node closer. For reopening we have a
series of scripts that we use, but they are all ad hoc and not
formalized. Same with closing off subsets of nodes: we just have a
bunch of bash scripts that we have rolled to do that.
-Paul Edmon-
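The kind of ad hoc close/reopen scripts described above could be formalized as a couple of small wrappers around scontrol (the helper names are my own; DRYRUN is a hypothetical switch that prints the command instead of running it, handy for testing off-cluster):

```shell
# Print the command under DRYRUN, otherwise execute it.
run() { if [ -n "$DRYRUN" ]; then echo "$*"; else "$@"; fi; }

# Close (drain) a hostlist with a reason, which shows up in `sinfo -R`.
drain_nodes()  { run scontrol update NodeName="$1" State=DRAIN Reason="${2:-maintenance}"; }

# Reopen previously drained nodes.
resume_nodes() { run scontrol update NodeName="$1" State=RESUME; }

# Example: DRYRUN=1 drain_nodes 'node[01-04]' 'NHC: disk failure'
```

Slurm hostlist expressions like node[01-04] pass straight through, so the same helpers work on arbitrary subsets of nodes.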
Hi,
We intend to oversubscribe our GPU nodes with OverSubscribe=YES,
ExclusiveUser=YES and GRES=GPU:: (and of course with
gres.conf and cgroup.conf properly configured). We're not sure yet how
to approach accounting with this setup. We want to charge users for the
whole node, whether they are running [...]
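For reference, the setup described would look roughly like this in slurm.conf (node names, CPU counts, and the GRES type/count are illustrative assumptions; the original message elides the actual GRES values):

```
# slurm.conf (illustrative sketch)
NodeName=gpu[01-04] Gres=gpu:4 CPUs=32 RealMemory=192000
PartitionName=gpu Nodes=gpu[01-04] OverSubscribe=YES ExclusiveUser=YES
```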
"Tina Fora" writes:
> My guess is that there is something in the database that slurmdbd does not
> like. I'm not sure how to debug it further.
You could turn on logging of the actual SQL statements that slurmdbd
sends to MySQL by adding the appropriate flag to DebugFlags in
slurmdbd.conf (see slurmdbd.conf(5)).
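A minimal slurmdbd.conf fragment; as far as I know, DB_QUERY is the DebugFlags value that logs the SQL queries (the LogFile path is just an example):

```
# slurmdbd.conf (fragment)
DebugLevel=verbose
DebugFlags=DB_QUERY
LogFile=/var/log/slurmdbd.log
```

Restart slurmdbd after changing it, then watch the log file while reproducing the failure.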