Re: [slurm-users] EXTERNAL-Re: Block jobs on GPU partition when GPU is not specified

2021-09-27 Thread Renfro, Michael
On a quick read, it did look correct. From: slurm-users on behalf of Ratnasamy, Fritz Date: Monday, September 27, 2021 at 1:59 PM To: Slurm User Community List Subject: Re: [slurm-users] EXTERNAL-Re: Block jobs on GPU partition when GPU is not specified

Re: [slurm-users] EXTERNAL-Re: Block jobs on GPU partition when GPU is not specified

2021-09-27 Thread Ratnasamy, Fritz
Does the script below look correct?
function slurm_job_submit(job_desc, part_list, submit_uid)
    if job_desc.partition == 'gpu' then
        if (job_desc.gres == nil) then
            slurm.log_info("User did not specify gres=gpu: ")
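For readers following this thread, here is a minimal sketch of the kind of check being discussed, not the poster's final script. It assumes the partition is literally named 'gpu' and that the request is visible as job_desc.gres, as in the snippet above; some Slurm releases expose the same information as job_desc.tres_per_node, so check the field name against your version. The message texts are hypothetical.

```lua
-- job_submit.lua (sketch): reject jobs sent to the 'gpu' partition
-- that do not request a GPU. Loaded by slurmctld via the job_submit/lua plugin.

function slurm_job_submit(job_desc, part_list, submit_uid)
    -- Assumption: the GPU partition is named exactly 'gpu'.
    if job_desc.partition == 'gpu' then
        -- Assumption: the GRES request is exposed as job_desc.gres,
        -- as in the thread; newer releases may use tres_per_node instead.
        local gres = job_desc.gres
        -- Plain-text search (no patterns) for "gpu" in the request string.
        if gres == nil or string.find(gres, "gpu", 1, true) == nil then
            -- Written to the slurmctld log.
            slurm.log_info("slurm_job_submit: uid %u used partition gpu without gres=gpu",
                           submit_uid)
            -- Shown to the user by sbatch/srun.
            slurm.log_user("Jobs in the gpu partition must request a GPU, e.g. --gres=gpu:1")
            return slurm.ERROR
        end
    end
    return slurm.SUCCESS
end

-- The plugin also expects a modify hook; this one accepts every change.
function slurm_job_modify(job_desc, job_rec, part_list, modify_uid)
    return slurm.SUCCESS
end
```

Note that this sketch only catches an explicit, single-partition request: job_desc.partition may be nil when the user relies on the default partition, and it may hold a comma-separated list when several partitions are named, so a production script would likely need to handle both cases.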

Re: [slurm-users] EXTERNAL-Re: Block jobs on GPU partition when GPU is not specified

2021-09-27 Thread Renfro, Michael
Might need a restart of slurmctld at most, I expect. From: slurm-users on behalf of Ratnasamy, Fritz Date: Monday, September 27, 2021 at 12:32 PM To: Slurm User Community List Subject: Re: [slurm-users] EXTERNAL-Re: Block jobs on GPU partition when GPU is not specified

Re: [slurm-users] Possible bug with Prologslurmctld and Epilogslurmctld scripts?

2021-09-27 Thread Brian Andrus
Those would be considered separate for each job. You may want to have your prolog check to see if there is an epilogue running and wait for the epilogue to be done before starting its prolog work. Brian Andrus On 9/27/2021 9:15 AM, Joe Teumer wrote: Should the Prologslurmctld script only run

Re: [slurm-users] EXTERNAL-Re: Block jobs on GPU partition when GPU is not specified

2021-09-27 Thread Ratnasamy, Fritz
Hi Michael Renfro, Thanks for your reply. Based on your answers, would this work: 1/ a job_submit.lua file with the following contents (I just need a function that raises an error when gres:gpu is not specified in srun or sbatch): function slurm_job_submit(job_desc, part_list, submit_uid) i

[slurm-users] Possible bug with Prologslurmctld and Epilogslurmctld scripts?

2021-09-27 Thread Joe Teumer
Should the Prologslurmctld script only run after the Epilogslurmctld script finishes? Below you can see JobA runs and completes. While Epilogslurmctld (from JobA Node A) is executing on the Slurm controller the Prologslurmctld script for the next job (from Job B Node A) is also running on the Slur

[slurm-users] Information about lack of resources

2021-09-27 Thread Pavel Vashchenkov
Hi all, There are some jobs in the queue with the message "(Nodes required for job are DOWN, DRAINED or reserved for jobs in higher priority partitions)", but there are free nodes at the same time. My question: Is there a script to show what resources are required and why exactly the queue system