[slurm-users] REST API Error Slurmdb

2022-11-16 Thread 김종록
Hello everyone. I installed Slurm 20.11.09 and set up mariadb, slurmdbd, and slurmrestd normally. After issuing a JWT token, I ran API tests against /slurm/v0.0.36/ping, /slurm/v0.0.36/diag, /slurm/v0.0.36/jobs, and /slurm/v0.0.36/job/submit. I tested these APIs and there was no special pr
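
For context, a minimal sketch of this kind of test, assuming slurmrestd is listening on localhost port 6820 (hostname and port are assumptions) and auth/jwt is configured:

    # Issue a JWT for the current user; scontrol prints "SLURM_JWT=..."
    export SLURM_JWT=$(scontrol token | cut -d= -f2)
    # Call the ping endpoint with the standard slurmrestd auth headers
    curl -s -H "X-SLURM-USER-NAME: $USER" \
         -H "X-SLURM-USER-TOKEN: $SLURM_JWT" \
         http://localhost:6820/slurm/v0.0.36/ping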

[slurm-users] Ignore state file recover error at first starting of slurmctld

2022-11-16 Thread 김종록
Hello everyone, When I started slurmctld for the first time, the following error message was displayed:
...
slurmctld: error: Could not open node state file /var/spool/slurm/ctld/node_state: No such file or directory
slurmctld: error: NOTE: Trying backup state save file. Information 
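
For context: this error is expected on a first start, since slurmctld has not yet written any state files. A minimal sketch of preparing the state directory beforehand, assuming StateSaveLocation=/var/spool/slurm/ctld in slurm.conf (the path and the slurm user/group are assumptions):

    # Create the state save directory so slurmctld can write its state files
    mkdir -p /var/spool/slurm/ctld
    # slurmctld runs as the SlurmUser, who must own this directory
    chown slurm:slurm /var/spool/slurm/ctld
    chmod 0700 /var/spool/slurm/ctld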

Re: [slurm-users] NVIDIA MIG question

2022-11-16 Thread Groner, Rob
That does help, thanks for the extra info. If I have two separate GPU cards in the node and I set up 7 MIGs on each card, for a total of 14 MIG "gpus" in the node... then SHOULD I be able to salloc requesting, say, 10 GPUs (7 from one card, 3 from the other)? Because I can't. I can request up to
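
For reference, the kind of request being described might look like this sketch (the counts come from the message; whether it succeeds across two physical cards is exactly the open question):

    # Ask for 10 of the 14 MIG-backed "gpus" in a single allocation
    salloc -N1 --gres=gpu:10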

Re: [slurm-users] SLURM in K8s, any advice?

2022-11-16 Thread Urban Borštnik
Hi Hans, We run Slurm in k8s at ETH Zurich to manage physical compute nodes. The link you included and Nicolas's follow-up already contain the basics. We build several Docker containers based on CentOS 7 (for now), with Slurm compiled from source, for the following services: * slurmdbd
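
As a rough illustration only, a per-service image of the kind described might look like this sketch (the Slurm version, package list, and install paths are assumptions, not the poster's actual build):

    FROM centos:7
    # Build dependencies (assumed list; munge-devel and mariadb-devel are
    # needed for auth/munge and slurmdbd respectively)
    RUN yum install -y gcc make bzip2 munge munge-devel mariadb-devel \
            readline-devel pam-devel && yum clean all
    # Fetch and build Slurm from source (version is an assumption)
    ADD https://download.schedmd.com/slurm/slurm-20.11.9.tar.bz2 /tmp/
    RUN cd /tmp && tar xjf slurm-20.11.9.tar.bz2 && cd slurm-20.11.9 && \
        ./configure --prefix=/usr/local && make -j"$(nproc)" install
    # One image per daemon; this one runs the accounting daemon in the
    # foreground so k8s can supervise it
    CMD ["/usr/local/sbin/slurmdbd", "-D"]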

Re: [slurm-users] NVIDIA MIG question

2022-11-16 Thread Yair Yarom
Hi, From what we observed, Slurm sees each MIG as a distinct gres/gpu. So you can have 14 jobs, each using a different MIG. However (unless something has changed in the past year), due to NVIDIA limitations a single process can't access more than one MIG simultaneously (this is unrelated to
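
To make that observation concrete, a sketch of the pattern that does work, 14 independent jobs each getting one MIG-backed gres/gpu (the command inside --wrap is just an illustration):

    # Submit 14 single-MIG jobs; each array task sees exactly one MIG
    sbatch --array=0-13 --gres=gpu:1 --wrap='nvidia-smi -L'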