On 20/6/19 3:24 am, Brian Andrus wrote:
Can you give the exact command/output you have from this?
I suspect a typo either in the node names in your slurm.conf or in what you are typing.
Brian Andrus
Hi Brian,
I am pretty sure there is no error in my typing of the commands, but
just in case find below t
On 6/18/19 11:29 PM, nathan norton wrote:
Without knowing the internals of Slurm, it feels like nodes that are
turned off and in the cloud state don't exist in the system until they are on?
Not quite; they exist internally but are not exposed until they are in use:
https://slurm.schedmd.com/elastic_computing.html
Can you give the exact command/output you have from this?
I suspect a typo either in the node names in your slurm.conf or in what you are typing.
Brian Andrus
On 6/18/2019 11:29 PM, nathan norton wrote:
Hi,
It just shows
"Node $NODE not found"
Whereas the others all work as expected (i.e., they are running)
W
Hi,
It just shows
"Node $NODE not found"
Whereas the others all work as expected (i.e., they are running)
Without knowing the internals of Slurm, it feels like nodes that are turned
off and in the cloud state don't exist in the system until they are on?
Any other ideas?
Thanks
Nathan
On Wed., 19 Jun. 2019, 4:
On Tuesday, 18 June 2019 9:36:56 PM PDT nathan norton wrote:
> Just tried running that command, but it only shows nodes that are up and
> running; it doesn’t tell me about any nodes that are down and turned off. As
> an example please see below. There is a job running that should be using
> the 100 n
Hi,
Just tried running that command, but it only shows nodes that are up and
running; it doesn’t tell me about any nodes that are down and turned off. As
an example please see below. There is a job running that should be using
the 100 nodes but only 52 are allocated (plus 2 down* (that I know about
Hi Nathan,
The command I use to get the reason for failed nodes is 'sinfo -Ral'. If the
reason column is cut off, widen the output with 'sinfo -Ral -O
reason:35,user,timestamp,statelong,nodelist'.
Then take the timestamp of the failure and look for it in the slurmd or
slurmctld logs.
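
The log lookup can be sketched as a small shell helper. This is only a sketch under assumptions: log_lines_near is a hypothetical name, the timestamp is assumed to look like 2019-06-19T10:15:32 as printed by sinfo, and the log lines are assumed to start with a timestamp in the same YYYY-MM-DDTHH:MM form. The log path is site-specific (see SlurmctldLogFile and SlurmdLogFile in slurm.conf).

```shell
# Hypothetical helper: print log lines written in the same minute as a
# node failure.
#   $1  timestamp as shown by sinfo, e.g. 2019-06-19T10:15:32 (assumed format)
#   $2  path to a slurmctld or slurmd log file (site-specific)
log_lines_near() {
    # Keep only YYYY-MM-DDTHH:MM so that seconds do not have to match exactly.
    ts_minute=$(printf '%s' "$1" | cut -c1-16)
    # -F treats the timestamp as a fixed string rather than a regex.
    grep -F -- "$ts_minute" "$2"
}

# Example call (the log path is a placeholder, adjust for your site):
#   log_lines_near 2019-06-19T10:15:32 /path/to/slurmctld.log
```

Matching on the minute rather than the full timestamp is a deliberate choice here, since the log entries that explain a failure are often a few seconds before or after the time sinfo reports.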
---
Sam Gallop