Hi,

Thanks for the reply. I already went through this 🙁. I checked all nodes: id works on each of them, as does an ssh login.

[root@node4 ~]# id xxxjone...@xxx.ac.nz
uid=1204805830(xxxjone...@xxx.ac.nz) gid=1204805830(xxxjone...@xxx.ac.nz)
8><---
Connection to node1 closed.
[root@xxxunicobuildt1 warewulf]# ssh xxxjone...@xxx.ac.nz@node4
(xxxjone...@xxx.ac.nz@node4) Password:
[xxxjone...@xxx.ac.nz@node4 ~]$ whoami
xxxjone...@xxx.ac.nz
[xxxjone...@xxx.ac.nz@node4 ~]$

regards

Steven

________________________________
From: Chris Samuel via slurm-users <slurm-users@lists.schedmd.com>
Sent: Monday, 3 February 2025 10:00 am
To: slurm-users@lists.schedmd.com <slurm-users@lists.schedmd.com>
Subject: [slurm-users] Re: RHEL8.10 V slurmctld

On 29/1/25 10:44 am, Steven Jones via slurm-users wrote:

> "2025-01-28T21:48:50.271] sched: Allocate JobId=4 NodeList=node4 #CPUs=1
> Partition=debug
> [2025-01-28T21:48:50.280] Killing non-startable batch JobId=4: Invalid
> user id"

Looking at the source code it looks like that second error is reported
back by slurmctld when it sends the RPC out to the compute node and it
gets a response back, so I would look at what's going on with node4 to
see what's being reported there.

All the best,
Chris

--
slurm-users mailing list -- slurm-users@lists.schedmd.com
To unsubscribe send an email to slurm-users-le...@lists.schedmd.com