To whom it may concern,
I have a question about node event records and would like to ask for help.
When a node is taken offline while a job is still running on it, the command
'sacctmgr show event' does not show the node's offline record; the record only
appears after the job has completed. Moreover,
This problem turned out to be that the new node was on a different subnet than
the other nodes. Once our network admin opened up ports 6817, 6818, and 6188
between the subnets, the new node worked.
Thanks for all the responses.
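
For anyone hitting the same symptom, a quick way to confirm whether the ports are reachable across subnets is a plain TCP connect test. The following is only a minimal sketch: the port list is copied from the message above, and the helper name check_ports and the hostname in the usage comment are made up for illustration. Check your own slurm.conf for the ports actually in use.

```python
import socket

# Ports copied from the message above; adjust to match your own slurm.conf.
SLURM_PORTS = (6817, 6818, 6188)


def check_ports(host, ports=SLURM_PORTS, timeout=2.0):
    """Return {port: bool}, True if a TCP connection to host:port succeeds."""
    results = {}
    for port in ports:
        try:
            # create_connection resolves the name and completes a TCP handshake.
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = True
        except OSError:
            # Covers refused connections, timeouts, and unresolvable hosts.
            results[port] = False
    return results


# Usage ("new-node" is a hypothetical hostname):
#   for port, ok in check_ports("new-node").items():
#       print(port, "reachable" if ok else "blocked or closed")
```

A False result only says the connection did not succeed from the machine running the script; it does not distinguish a firewall between subnets from a daemon that is simply not listening.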
From: slurm-users On Behalf Of Riebs, Andy
Sent: Friday, Ap
Well, the documentation is rather clear on this: "SuspendTime: Nodes becomes
eligible for power saving mode after being idle or down for this number of
seconds."
A drained node is neither idle nor down in my mind.
Thanks,
Florian
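
To make that reading of the documentation concrete, here is a toy model of the quoted sentence. It is only a sketch of how the sentence reads, not of what slurmctld actually does; the function name, the plain-string state names, and the choice of ">=" for the comparison are all assumptions made for illustration.

```python
def eligible_for_power_saving(state, seconds_in_state, suspend_time):
    """Literal reading of the quoted SuspendTime sentence.

    A node counts as eligible only if it is idle or down and has been in
    that state for at least suspend_time seconds (">=" is an assumption).
    """
    return state in ("idle", "down") and seconds_in_state >= suspend_time


# Under this reading a drained node is never eligible:
#   eligible_for_power_saving("idle", 600, 300)     -> True
#   eligible_for_power_saving("drained", 600, 300)  -> False
```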
From: slurm-users on behalf of
Try suspending and resuming the user's pending jobs to force a re-evaluation.
If the user is not in the zone of jobs that is evaluated, i.e. if enough
higher-priority jobs have dropped in ahead, then this job may not have been evaluated
for scheduling since a point in time when the user was indeed p
On Thu, 2020-05-14 at 13:10:04 +, Florian Zillner wrote:
> Hi,
>
> I'm experimenting with Slurm's power saving feature. Shutdown of "idle"
> nodes works in general, and power-up also works when "idle~" nodes are
> requested.
> So far so good, but Slurm is also shutting down nodes that are
Hi,
I'm experimenting with Slurm's power saving feature. Shutdown of "idle"
nodes works in general, and power-up also works when "idle~" nodes are
requested.
So far so good, but Slurm is also shutting down nodes that are not explicitly
"idle". Previously I drained a node to debug something o