Hi
I made some progress in understanding the problem I reported some weeks ago:
https://lists.schedmd.com/pipermail/slurm-users/2023-May/010027.html
I noticed that the intermittent connection timeout I am experiencing occurs only
when using the TCP-based direct connection to establi
Alexander Grund wrote:
> Our first approach with `scancel $SLURM_JOB_ID; exit 1` doesn't seem to
> work as the (sbatch) job still gets re-queued.
Try to exit with 0, because it's not your prolog that failed.
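A minimal sketch of what I mean, assuming this runs as the Slurm Prolog with SLURM_JOB_ID set in its environment. The helper name cancel_job_and_succeed and the SCANCEL override are made up for illustration, and whether the job then stays cancelled instead of being re-queued is something you would need to verify on your cluster:

```shell
#!/bin/bash
# Sketch of a prolog fragment (assumption: runs as the Slurm Prolog,
# where SLURM_JOB_ID is set in the environment).

# Cancel command, overridable so the logic can be dry-run without Slurm.
SCANCEL="${SCANCEL:-scancel}"

# Hypothetical helper: cancel the job, then report success so that
# the prolog itself is not treated as having failed.
cancel_job_and_succeed() {
    "$SCANCEL" "$SLURM_JOB_ID"
    return 0
}

# In the real prolog, call this when your site-specific check fails:
#   cancel_job_and_succeed; exit 0
```

The only change from your approach is the exit status: the scancel call stays, but the prolog returns 0 instead of 1.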
Hi all,
I'm trying to set up GPU Gres Types to correctly identify the installed
hardware (generation and memory size). I'm using a mix of explicit
configuration (to set a friendly type name) and autodetection (to handle the
cores and links detection). I'm seeing two related issues which I don't