Re: [slurm-users] Slurm versions 17.11.13 and 18.08.5 are now available (CVE-2019-6438)

2019-01-30 Thread Tim Wickberg
Forgot to attach the release notes; they are included below for reference:
* Changes in Slurm 18.08.5
==========================
 -- Backfill - If a job has a time_limit, guess the end time of a job better if OverTimeLimit is Unlimited.
 -- Fix "sacctmgr show events event=cluster"
 -- Fix sac

[slurm-users] Slurm versions 17.11.13 and 18.08.5 are now available (CVE-2019-6438)

2019-01-30 Thread Tim Wickberg
Slurm versions 17.11.13 and 18.08.5 are now available, and include a series of recent bug fixes, as well as a fix for a security vulnerability (CVE-2019-6438) on 32-bit systems. We believe that 64-bit builds of Slurm - the overwhelming majority of installations - are not affected by this issue.

[slurm-users] Job memory not being tracked correctly for some processes

2019-01-30 Thread JinSung Kang
Hello, Memory for one of the jobs is going over the limit, yet Slurm lets it run and the job does not terminate. The job forks multiple jobs, but I don't think we have had problems with Slurm calculating total memory usage for these kinds of jobs. I've tested it on a single thread and the killing me
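Whether forked children are counted against a job's memory limit depends largely on which accounting and task plugins the cluster runs, and the message above does not say which ones are in use. As a point of reference only, a hypothetical cgroup-based setup that accounts for and enforces memory across all of a job's processes might look like this (an assumption for illustration, not a description of this site's configuration):

# slurm.conf (hypothetical excerpt)
JobAcctGatherType=jobacct_gather/cgroup   # gather memory usage per cgroup, so forked children are included
TaskPlugin=task/cgroup                    # place every process of a step inside the job's cgroup

# cgroup.conf (hypothetical excerpt)
ConstrainRAMSpace=yes                     # have the cgroup enforce the job's memory limit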

Re: [slurm-users] Assigning a QOS to a partition?

2019-01-30 Thread Renfro, Michael
In case you haven’t already done something similar, I reduced some of the cumbersomeness of my job_submit.lua by breaking it out into subsidiary functions and adding some logic to detect whether I was in test mode or not. Basic structure, with subsidiary functions defined ahead of slurm_job_submit(
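The preview cuts off before Michael's actual code, so the following is only a rough sketch of the structure he describes - helper functions defined ahead of slurm_job_submit() plus a test-mode switch - with every name below invented for illustration:

TEST_MODE = false  -- set to true to log intended changes instead of applying them

local function log_or_apply(job_desc, field, value)
   if TEST_MODE then
      slurm.log_info("job_submit: would set %s=%s", field, tostring(value))
   else
      job_desc[field] = value
   end
end

local function set_default_partition(job_desc)
   if job_desc.partition == nil then
      log_or_apply(job_desc, "partition", "batch")
   end
end

function slurm_job_submit(job_desc, part_list, submit_uid)
   set_default_partition(job_desc)
   return slurm.SUCCESS
end

function slurm_job_modify(job_desc, job_rec, part_list, modify_uid)
   return slurm.SUCCESS
end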

Re: [slurm-users] Assigning a QOS to a partition?

2019-01-30 Thread Prentice Bisbal
Miguel, Thanks for the reply. I've already thought about doing that, but I was hoping there was an easier, "more universal" way. Right now, I have a rather long job_submit.lua, which has made making changes in my environment cumbersome, so I'm trying to minimize my reliance on j

Re: [slurm-users] Assigning a QOS to a partition?

2019-01-30 Thread Miguel Gila
Hi Prentice, You could add something like this to your job_submit.lua:
QOS_DEBUG = 'system_debug'
PARTITION_DEBUG = 'debug'
[...]
function slurm_job_submit(job_desc, part_list, submit_uid)
-- DEBUG/QOS ---
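The preview is truncated right at the DEBUG/QOS comment. A plausible continuation of the pattern - a guess based on the variable names, not Miguel's actual code - would be to steer jobs in the debug partition onto the debug QOS:

QOS_DEBUG = 'system_debug'
PARTITION_DEBUG = 'debug'

function slurm_job_submit(job_desc, part_list, submit_uid)
   -- DEBUG/QOS: jobs submitted to the debug partition get the debug QOS
   if job_desc.partition == PARTITION_DEBUG then
      job_desc.qos = QOS_DEBUG
   end
   return slurm.SUCCESS
end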

[slurm-users] Specify number of cpus allocated to each partition when nodes are shared

2019-01-30 Thread Bruno Santos
Hi everyone, I am scratching my head to find a way to do this on Slurm. We have three nodes configured as such:
> # COMPUTE NODES
> NodeName=brassica NodeAddr=10.1.10.83 CPUs=64 RealMemory=95 Sockets=4 CoresPerSocket=8 ThreadsPerCore=2 State=UNKNOWN
> NodeName=triticum NodeAddr=10.1.10.170
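The question and the quoted config are cut off above. For reference, one mechanism Slurm provides for capping how many CPUs of a shared node each partition may consume is the partition-level MaxCPUsPerNode option; the sketch below is hypothetical (partition names and CPU counts invented, and only the two node names visible above are reused):

# slurm.conf sketch: two partitions share the same nodes, each limited to a
# maximum number of CPUs on any one node
PartitionName=fast Nodes=brassica,triticum MaxCPUsPerNode=48 State=UP
PartitionName=slow Nodes=brassica,triticum MaxCPUsPerNode=16 State=UP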