Re: [slurm-users] Where to adjust the memory limit from sinfo vs free command?

2019-05-16 Thread Renfro, Michael
Should be set on your NodeName lines in slurm.conf. For a 256 GB node, I’ve got: NodeName=node038 CoresPerSocket=14 RealMemory=254000 Sockets=2 ThreadsPerCore=1 so that users can’t reserve every bit of physical memory, leaving a small amount for OS operation. > On May 16, 2019, at 3:47 PM,
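
A rough sketch of the arithmetic behind a value like the one above, assuming a Linux node where /proc/meminfo reports MemTotal in kB and that RealMemory is given in megabytes. The function names and the 2048 MB default reserve are hypothetical choices for illustration, not values from this thread:

```python
def parse_memtotal_kib(meminfo_text: str) -> int:
    """Return the MemTotal value (in kB) from /proc/meminfo-style text."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            return int(line.split()[1])
    raise ValueError("MemTotal not found")


def suggest_real_memory_mb(memtotal_kib: int, reserve_mb: int = 2048) -> int:
    """Physical memory in MB minus a reserve left for the OS.

    reserve_mb is an assumed site policy, not a Slurm default.
    """
    total_mb = memtotal_kib // 1024
    return max(total_mb - reserve_mb, 1)


def node_line(name: str, sockets: int, cores: int, threads: int,
              real_memory_mb: int) -> str:
    """Format a NodeName line in the same shape as the example above."""
    return (f"NodeName={name} CoresPerSocket={cores} "
            f"RealMemory={real_memory_mb} Sockets={sockets} "
            f"ThreadsPerCore={threads}")


if __name__ == "__main__":
    with open("/proc/meminfo") as fh:
        mem_mb = suggest_real_memory_mb(parse_memtotal_kib(fh.read()))
    # Topology values here are copied from the example; adjust per node.
    print(node_line("node038", sockets=2, cores=14, threads=1,
                    real_memory_mb=mem_mb))
```

The printed line is only a starting point; the topology fields still have to match the actual hardware before it goes into slurm.conf.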

[slurm-users] Where to adjust the memory limit from sinfo vs free command?

2019-05-16 Thread Robert Kudyba
The MEMORY limit here shows 1, which I believe is 1 MB? But the results of the free command clearly show we have more than that. Where is this configured? sinfo -lNe Thu May 16 16:41:23 2019 NODELIST NODES PARTITION STATE CPUS S:C:T MEMORY TMP_DISK WEIGHT AVAIL_FE REASON

Re: [slurm-users] Issue with x11

2019-05-16 Thread Christopher Samuel
On 5/16/19 1:04 AM, Alan Orth wrote: but now we get a handful of nodes drained every day with reason "Kill task failed". In ten years of using SLURM I've never had so many problems as I'm having now. :\ We see "kill task failed" issues but as Marcus says that's not related to X11 support, wh

[slurm-users] Myload script from Slurm Gang Scheduling tutorial

2019-05-16 Thread Robert Kudyba
Hello, Can anyone share the myload script referenced in https://slurm.schedmd.com/gang_scheduling.html? I would like to test this on our Bright Cluster, now running Slurm as the workload manager and allowing multiple jobs to run concurrently. Than

Re: [slurm-users] Issue with x11

2019-05-16 Thread Christopher Samuel
On 5/16/19 8:53 AM, Mahmood Naderan wrote: Can I ask what is the expected release date for 19? It seems that rc1 has been released in May? Sometime in May hopefully! -- Chris Samuel : http://www.csamuel.org/ : Berkeley, CA, USA

Re: [slurm-users] Issue with x11

2019-05-16 Thread Mahmood Naderan
Can I ask what is the expected release date for 19? It seems that rc1 has been released in May? Regards, Mahmood On Thu, May 16, 2019 at 4:48 PM Marcus Wagner wrote: > Hi Alan, > > we are also seeing this, but that has nothing to do with X11 support, > since we compile atm. SLURM without

[slurm-users] Testing/evaluating new versions of slurm (19.05 in this case)

2019-05-16 Thread David Baker
Hello, Following the various postings regarding slurm 19.05 I thought it was an opportune time to send this question to the forum. Like others I'm awaiting 19.05 primarily due to the addition of the XFACTOR priority setting, but due to other new/improved features as well. I'm interested to hea

Re: [slurm-users] User submitted advance reservations? (SGE qrsub equivalent)

2019-05-16 Thread WRIGHT Lawrence
Hi Tina, Good call - I prefer that to the scripting option. Would prefer native support (FEATURE REQUEST!) but sudo is definitely better than my idea! Would probably make it so the user sudos to a service account with Operator permission rather than root as that'll be more tolerable from a pr

Re: [slurm-users] User submitted advance reservations? (SGE qrsub equivalent)

2019-05-16 Thread Tina Friedrich
Hi Lawrence, no, as far as I can tell, SLURM doesn't have any way to allow users to submit/create advance reservations. Could you get around it with sudo? It would be easy to allow a group of users to run 'sudo scontrol create ' (or a suitable wrapper script, to make the syntax easy). It'd
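
One way the wrapper-script idea could look, as a minimal sketch only. It validates a user's request and builds an argument list for scontrol create reservation, to be run via sudo by whatever account the site permits. The function name, the limits (MAX_NODES, MAX_MINUTES) and the accepted StartTime format are assumptions for illustration; the command is built but not executed here:

```python
import re

# Hypothetical site policy limits; not taken from the thread.
MAX_NODES = 4
MAX_MINUTES = 24 * 60

_USER_RE = re.compile(r"[a-z_][a-z0-9_-]*")
_START_RE = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}(:\d{2})?")


def build_reservation_command(user: str, start_time: str,
                              minutes: int, node_count: int) -> list:
    """Validate a request and return an argv list for scontrol.

    Returning a list (rather than a shell string) avoids shell
    interpretation of user-supplied values.
    """
    if not _USER_RE.fullmatch(user):
        raise ValueError("invalid user name")
    if not _START_RE.fullmatch(start_time):
        raise ValueError("start time must look like YYYY-MM-DDTHH:MM[:SS]")
    if not 1 <= minutes <= MAX_MINUTES:
        raise ValueError("duration outside allowed range")
    if not 1 <= node_count <= MAX_NODES:
        raise ValueError("node count outside allowed range")
    return [
        "scontrol", "create", "reservation",
        f"Users={user}",
        f"StartTime={start_time}",
        f"Duration={minutes}",      # minutes
        f"NodeCnt={node_count}",
    ]


if __name__ == "__main__":
    print(" ".join(build_reservation_command(
        "alice", "2019-05-20T09:00", 120, 2)))
```

Keeping the policy checks in the wrapper means the sudoers rule can allow only this script rather than scontrol as a whole.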

[slurm-users] User submitted advance reservations? (SGE qrsub equivalent)

2019-05-16 Thread WRIGHT Lawrence
Wondering if there's a way in SLURM for (appropriately permissioned) end users to submit Advance Reservations, in a similar manner to the "qrsub" command in Grid Engine? As far as I've been able to glean from the docs, the only way to do this would be to make the users in question Operators and

Re: [slurm-users] Issue with x11

2019-05-16 Thread Marcus Wagner
Hi Alan, we are also seeing this, but that has nothing to do with X11 support, since we currently compile SLURM without X11 support. We also sometimes see jobs keep running even if, e.g., MPI rank one got killed by OOM and rank zero is stuck in mpi_finalize. SLURM seems to not always detect if oom ki

Re: [slurm-users] Issue with x11

2019-05-16 Thread Alan Orth
Yes I'm also looking forward to SLURM 19.05. We have had lots of issues with X11 since we upgraded to 18.08 and started using its built-in X11 support. Part of this was resolved by setting "X11Parameters=local_xauthority" in slurm.conf to reduce locking contention on the Xauthority file, but now we