Forgot to attach the release notes; they are included below for reference:
* Changes in Slurm 18.08.5
==========================
-- Backfill - If a job has a time_limit, make a better guess at the job's end
   time when OverTimeLimit is Unlimited.
-- Fix "sacctmgr show events event=cluster"
-- Fix sac
Slurm versions 17.11.13 and 18.08.5 are now available, and include a
series of recent bug fixes, as well as a fix for a security
vulnerability (CVE-2019-6438) on 32-bit systems. We believe that 64-bit
builds of Slurm, which account for the overwhelming majority of
installations, are not affected by this issue.
Hello,
Memory for one of the jobs is going over the limit, but Slurm lets it run and
does not terminate the job. The job forks multiple processes, but I don't
think we have had problems with Slurm calculating total memory usage for
these kinds of jobs. I've tested it on a single thread and the killing
me
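For reference, whether Slurm kills a job that exceeds its memory request
usually depends on the task and accounting configuration rather than on the
job itself. A minimal sketch of the settings typically involved (parameter
names are real options, but the values are only examples and not the poster's
actual configuration):

# slurm.conf (sketch)
TaskPlugin=task/cgroup
JobAcctGatherType=jobacct_gather/cgroup
JobAcctGatherFrequency=30
# polling-based enforcement instead of (or in addition to) cgroup limits:
# JobAcctGatherParams=OverMemoryKill

# cgroup.conf (sketch)
CgroupAutomount=yes
ConstrainRAMSpace=yes
ConstrainSwapSpace=yes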
In case you haven't already done something similar, I reduced some of the
cumbersomeness of my job_submit.lua by breaking it out into subsidiary
functions and adding some logic to detect whether I was in test mode or not.
Basic structure, with subsidiary functions defined ahead of slurm_job_submit(
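A minimal sketch of that kind of layout, using a hypothetical TEST_MODE flag
and helper name (pick_partition) that are illustrative rather than taken from
the original script:

-- job_submit.lua sketch: subsidiary helpers defined before slurm_job_submit()
TEST_MODE = false  -- set to true to log decisions without modifying jobs

-- hypothetical helper: decide which partition a job should land in
local function pick_partition(job_desc)
   if job_desc.partition == nil then
      return "general"          -- example default partition name
   end
   return job_desc.partition
end

function slurm_job_submit(job_desc, part_list, submit_uid)
   local partition = pick_partition(job_desc)
   if TEST_MODE then
      slurm.log_info("job_submit (test mode): would set partition %s", partition)
   else
      job_desc.partition = partition
   end
   return slurm.SUCCESS
end

function slurm_job_modify(job_desc, job_rec, part_list, modify_uid)
   return slurm.SUCCESS
end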
Miguel,
Thanks for the reply. I've already thought about doing that, but I was
hoping there was an easier, "more universal" way to do it. Right now, I have
a rather long job_submit.lua, which has made making changes in my environment
cumbersome, so I'm trying to minimize my reliance on
j
Hi Prentice,
You could add something like this to your job_submit.lua
QOS_DEBUG = 'system_debug'
PARTITION_DEBUG = 'debug'
[...]
function slurm_job_submit(job_desc, part_list, submit_uid)
-- DEBUG/QOS ---
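   -- (illustrative sketch only, not the original code: one way the DEBUG/QOS
   -- branch might use the constants defined above)
   if job_desc.qos == QOS_DEBUG then
      job_desc.partition = PARTITION_DEBUG
      slurm.log_user("Job routed to partition %s for QOS %s",
                     PARTITION_DEBUG, QOS_DEBUG)
   end
   return slurm.SUCCESS
end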
Hi everyone,
I am scratching my head to find a way to do this in Slurm. We have three
nodes configured as follows:
> # COMPUTE NODES
> NodeName=brassica NodeAddr=10.1.10.83 CPUs=64 RealMemory=95 Sockets=4
> CoresPerSocket=8 ThreadsPerCore=2 State=UNKNOWN
> NodeName=triticum NodeAddr=10.1.10.170