[slurm-users] Re: [EXT] Re: slurm_pam_adopt module not working

2025-06-16 Thread William Brown via slurm-users
You say that you modified the file in a different way. It may be worth checking file permissions, since some security functions ignore files that don't have the required permissions. That said, that would show in the journal/logs. William On Tue, 17 Jun 2025, 06:24 Ratnasamy, Fritz v

[slurm-users] Re: Implementing a "soft" wall clock limit

2025-06-16 Thread Loris Bennett via slurm-users
Hi Prentice, Prentice Bisbal via slurm-users writes: > I think the idea of having a generous default timelimit is the wrong way to go. In fact, I think any defaults for jobs are a bad way to go. The majority of your users will just use that default time limit, and backfill scheduling w

[slurm-users] Re: [EXT] Re: slurm_pam_adopt module not working

2025-06-16 Thread Ratnasamy, Fritz via slurm-users
Yes, the file exists in /usr/lib64/security/. Best, Fritz Ratnasamy, Data Scientist, Information Technology On Tue, Jun 17, 2025 at 12:17 AM Sean Crosby wrote: > Hi Fritz, Does pam_slurm_adopt.so exist in the right location on the node? Normally on EL hosts it would be /usr/lib64/securi

[slurm-users] Re: [EXT] Re: slurm_pam_adopt module not working

2025-06-16 Thread Sean Crosby via slurm-users
Hi Fritz, Does pam_slurm_adopt.so exist in the right location on the node? Normally on EL hosts it would be /usr/lib64/security/pam_slurm_adopt.so
# ls /usr/lib64/security/pam_slurm_adopt.so -la
-rwxr-xr-x 1 root root 291936 Mar 4 12:44 /usr/lib64/security/pam_slurm_adopt.so
If the file doesn

[slurm-users] Re: slurm_pam_adopt module not working

2025-06-16 Thread Ratnasamy, Fritz via slurm-users
Thanks. I edited /etc/pam.d/sshd via Ansible, but for some reason that locked all users out of the cluster. The same file works on a different cluster where the files are pushed via Puppet, but with Ansible it appears to lock all users out of the cluster. See the sshd config file below: auth

[slurm-users] Re: MIG H100 with xeon Intel

2025-06-16 Thread Patryk Bełzak via slurm-users
Which hardware platform is this? We had the same issue on Dell with H100, even without a MIG setup; we had to restart the slurmd daemon after boot to make sure that everything was fine. Patryk. On 25/06/12 01:46, Richard Lefebvre via slurm-users wrote:

[slurm-users] Re: Implementing a "soft" wall clock limit

2025-06-16 Thread Prentice Bisbal via slurm-users
I think the idea of having a generous default timelimit is the wrong way to go. In fact, I think any defaults for jobs are a bad way to go. The majority of your users will just use that default time limit, and backfill scheduling will remain useless to you. Instead, I recommend you use your j