Re: [slurm-users] srun using infiniband
Hello Anne, On 01/09/2022 02:01:53, Anne Hammond wrote: We have a CentOS 8.5 cluster with Slurm 20.11, Mellanox ConnectX-6 HDR IB, and a Mellanox 32-port switch. Our application is not scaling. I
Re: [slurm-users] Jobs cancelled due to job requeue
On 02-09-2022 20:52, Nicolas Sonoda wrote: I'm submitting a job, but after a few seconds it gets cancelled and the Slurm output file shows this message: slurmstepd: error: *** JOB 23883 ON gn01 CANCELLED AT 2022-09-02T14:28:19 DUE TO JOB REQUEUE *** After this, the job goes into the PD state in the queue