Re: [slurm-users] Decreasing time limit of running jobs (notification)

2023-07-06 Thread Bernstein, Noam CIV USN NRL (6393) Washington DC (USA)
Given that the usual way to kill a job that's running is to use scancel, I would tend to agree that killing by shortening the walltime to below the already used time is likely to be an error, and deserves a warning.

Re: [slurm-users] Decreasing time limit of running jobs (notification)

2023-07-06 Thread Jason Simms
My opinion is no, at least not forced. On Thu, Jul 6, 2023 at 1:40 PM Amjad Syed wrote: > Agreed on the point of greater responsibility, but even rm -r (without -f) gives a warning. In this case, should Slurm have that option (forced), especially if it can immediately kill a running

Re: [slurm-users] Decreasing time limit of running jobs (notification)

2023-07-06 Thread Amjad Syed
Agreed on the point of greater responsibility, but even rm -r (without -f) gives a warning. In this case, should Slurm have that option (forced), especially if it can immediately kill a running job? On Thu, 6 Jul 2023, 18:16 Jason Simms, wrote: > An unfortunate example of the “with

Re: [slurm-users] Decreasing time limit of running jobs (notification)

2023-07-06 Thread Jason Simms
An unfortunate example of the “with great power comes great responsibility” maxim. Linux will gleefully let you rm -fr your entire system, drop production databases, etc., provided you have the right privileges. Ask me how I know… Still, I get the point. Would it be possible to somehow ask for con

Re: [slurm-users] Decreasing time limit of running jobs (notification)

2023-07-06 Thread Amjad Syed
Yes, the initial End Time was 7-00:00:00, but it allowed the typo (16:00:00), which caused the jobs to be killed without warning. Amjad On Thu, Jul 6, 2023 at 5:27 PM Bernstein, Noam CIV USN NRL (6393) Washington DC (USA) wrote: > Is the issue that the error in the time made it shorter than the ti

Re: [slurm-users] Decreasing time limit of running jobs (notification)

2023-07-06 Thread Bernstein, Noam CIV USN NRL (6393) Washington DC (USA)
Is the issue that the error in the time made it shorter than the time the job had already run, so it killed it immediately? On Jul 6, 2023, at 12:04 PM, Jason Simms wrote: No, not a bug, I would say. When the time limit is reached, that's it, job dies. I wouldn'

[slurm-users] Register for SLUG '23!

2023-07-06 Thread Victoria Hobson
Slurm User Group 2023 (SLUG) standard pricing is now available through August 4th. Be sure to get your tickets before prices jump! This year's SLUG event will take place September 12-13 at Brigham Young University in Provo, Utah. A welcome reception is set for the evening of Monday, September 11th

Re: [slurm-users] Decreasing time limit of running jobs (notification)

2023-07-06 Thread Jason Simms
No, not a bug, I would say. When the time limit is reached, that's it, job dies. I wouldn't be aware of a way to manage that. Once the time limit is reached, it wouldn't be a hard limit if you then had to notify the user and then... what? How long would you give them to extend the time? Wouldn't be

[slurm-users] Decreasing time limit of running jobs (notification)

2023-07-06 Thread Amjad Syed
Hello, We were trying to increase the time limit of a running Slurm job: scontrol update job= TimeLimit=16-00:00:00 But we accidentally set it to 16 hours: scontrol update job= TimeLimit=16:00:00 This actually timed out and killed the running job and did not give any notification. Is this a bug, sh
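
For reference, a minimal sketch of the commands being discussed, with 12345 as a placeholder job ID: checking the elapsed run time first makes it obvious when a new limit would fall below the time already used (which ends the job immediately).

    # show how long the job has run and its current limit
    scontrol show job 12345 | grep -E 'RunTime|TimeLimit'
    # equivalent squeue view: time used (%M) and time left (%L)
    squeue -j 12345 -o '%M %L'
    # intended change: raise the limit to 16 days
    scontrol update JobId=12345 TimeLimit=16-00:00:00
    # the accidental form: 16 hours, already below the elapsed run time
    scontrol update JobId=12345 TimeLimit=16:00:00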

Re: [slurm-users] AllowGroups for Partition not working?

2023-07-06 Thread Matthias Leopold
Thanks a lot. That did the trick. Matthias On 05.07.23 at 18:37, Xand Meaden wrote: You should check whether the relevant group's members can be seen using command `getent group `. If not, you probably need to add/change the "winbind expand groups" option in smb.conf. Xand On 05/07/2023 17
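
For reference, a minimal sketch of the checks mentioned here, assuming a placeholder group name hpcusers and Samba/winbind as the group source; the expansion depth below is only an example value.

    # the group and its members must resolve on the controller for AllowGroups to work
    getent group hpcusers
    # if members are missing, let winbind expand nested group membership
    # (in /etc/samba/smb.conf, then restart winbind and re-run getent)
    [global]
        winbind expand groups = 4
    # once getent shows the members, a partition line such as
    #   PartitionName=gpu Nodes=... AllowGroups=hpcusers
    # in slurm.conf should behave as expected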

Re: [slurm-users] Distribute a single node resources across multiple partitons

2023-07-06 Thread Jason Simms
Hello Purvesh, I'm not an expert in this, but I expect a common question would be, why are you wanting to do this? More information would be helpful. On the surface, it seems like you could just allocate two full nodes to each partition. You must have a reason why that is unacceptable, however. M

Re: [slurm-users] Distribute a single node resources across multiple partitons

2023-07-06 Thread Loris Bennett
Hi Purvesh, Purvesh Parmar writes: > Hi, > > Do I need separate slurmctld and slurmd to run for this? I am struggling for > this. Any pointers. > > -- > Purvesh > > On Mon, 26 Jun 2023 at 12:15, Purvesh Parmar wrote: > > Hi, > > I have slurm 20.11 in a cluster of 4 nodes, with each node havi

Re: [slurm-users] Distribute a single node resources across multiple partitons

2023-07-06 Thread Purvesh Parmar
Hi, Do I need separate slurmctld and slurmd to run for this? I am struggling for this. Any pointers. -- Purvesh On Mon, 26 Jun 2023 at 12:15, Purvesh Parmar wrote: > Hi, > > I have slurm 20.11 in a cluster of 4 nodes, with each node having 16 cpus. > I want to create two partitions (ppart and
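
For anyone following this thread: a single slurmctld and one slurmd per node should be enough, since partitions in slurm.conf can overlap on the same nodes. Below is a minimal sketch of one common approach, assuming hypothetical node names node[01-04] with 16 CPUs each and a placeholder name for the second partition (the original message is cut off before it); MaxCPUsPerNode caps how many CPUs all jobs from a partition may use on any one node.

    # nodes are defined once; the two partitions simply share them
    NodeName=node[01-04] CPUs=16 State=UNKNOWN
    # each partition may use at most 8 of the 16 CPUs on any node
    PartitionName=ppart Nodes=node[01-04] MaxCPUsPerNode=8 Default=YES State=UP
    PartitionName=qpart Nodes=node[01-04] MaxCPUsPerNode=8 State=UP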