On 30/11/2021 16:12, Benjamin Nacar wrote:
However, the version of Slurm in the standard Debian repositories was
apparently not compiled on a system with the necessary Nvidia library installed,
That's not good news :( I have a GPU node arriving by the end of the
year. Does it only impact
Hi! Does anyone know what the cause of such an error could be?
I have a shared home and Slurm 20.11.8, and I am trying a simple script in the
submit directory,
which is in the home directory that is NFS-shared...
I also have job_container.conf defined, but I have no idea whether this is a
problem.
Thank you!
Adrian
Hi,
We're trying to use Slurm's built-in Nvidia GPU detection mechanism to avoid
having to specify GPUs explicitly in slurm.conf and gres.conf. We're running
Debian 11, and the version of Slurm available for Debian 11 is 20.11. However,
the version of Slurm in the standard Debian repositories w
Hi,
Thanks for your feedback.
It seems we are in the first case, but looking deeper: for the SL7 node we
didn’t encounter the problem thanks to this service configuration (*).
So the solution seems to be to configure KillMode=process as mentioned there
(**): we will still have jobs listed when doing a 's
Hi,
I had the same issue with ntpd. My NTP service on the clients did not
synchronize because the drift from the NTP server was too large.
Maybe you can synchronize with ntpdate before starting the NTP service on your
clients.
Regards,
On 30 Nov 2021 12:23, Gestió Servidors wrote:
Hello,
In the last few days,
Hello,
In the last few days, my nodes have been showing the error
"slurm_receive_msg_and_forward: Zero Bytes were transmitted or received". After
reviewing the whole configuration, I have noticed that the problem is the time
difference between the nodes and the server. If nodes
are misconfigured (time in the future or in the pa
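
For anyone chasing the same clock-skew symptom, here is a minimal sketch of how the comparison between node time and server time could be checked. Everything in it is an assumption for illustration: the function name `find_skewed_nodes`, the node names, and the 60-second tolerance are hypothetical and are not taken from Slurm or from this thread. The epoch timestamps would have to be collected separately (for example by running `date +%s` on each node), which this sketch does not do.

```python
from typing import Dict


def find_skewed_nodes(
    node_times: Dict[str, float],
    server_time: float,
    tolerance_s: float = 60.0,  # assumed threshold, not a Slurm setting
) -> Dict[str, float]:
    """Return {node: offset_in_seconds} for every node whose clock differs
    from the server clock by more than tolerance_s.

    A positive offset means the node is ahead of the server (in the future),
    a negative offset means it is behind (in the past).
    """
    skewed = {}
    for node, node_time in node_times.items():
        offset = node_time - server_time
        if abs(offset) > tolerance_s:
            skewed[node] = offset
    return skewed


if __name__ == "__main__":
    # Hypothetical epoch timestamps, as if sampled at the same moment.
    server = 1638272580.0
    nodes = {
        "node01": server + 2.0,    # within tolerance
        "node02": server + 900.0,  # 15 minutes in the future
        "node03": server - 300.0,  # 5 minutes in the past
    }
    for name, offset in sorted(find_skewed_nodes(nodes, server).items()):
        print(f"{name}: {offset:+.0f} s relative to server")
```

Nodes reported by a check like this are the ones worth resynchronizing (for instance with ntpdate, as suggested in the reply above) before looking for other causes.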