[slurm-users] Temporarily bypassing pam_slurm_adopt.so

2024-07-08 Thread Daniel L'Hommedieu via slurm-users
Hi, all. We have a use case where we need to allow a group of users (members of an LDAP group, which I can easily add to a Linux group) to SSH to a compute node, without disabling pam_slurm_adopt.so. Is there a way to do this? We can add users to the sudo group, which will bypass pam_slurm_ad

Re: [slurm-users] Database cluster

2024-01-23 Thread Daniel L'Hommedieu
ieve HA for our Slurm databases.
> There is a single virtual IP that will be kept on one of the cluster's
> servers using keepalived.
>
> Regards,
> Xand
> From: slurm-users <mailto:slurm-users-boun...@lists.schedmd.com>> on behalf of Daniel
> L'Hommedieu

Re: [slurm-users] Database cluster

2024-01-23 Thread Daniel L'Hommedieu
03:23, Diego Zuccato wrote:
>
> IIUC the database is not "critical": if it goes down, you lose access to some
> statistics. But job data gets cached anyway and the db will be updated when
> it comes back online.
>
> Diego
>
> On 22/01/2024 18:23, Daniel

[slurm-users] Database cluster

2024-01-22 Thread Daniel L'Hommedieu
Community: What do you do to ensure database reliability in your SLURM environment? We can have multiple controllers and multiple slurmdbds, but my understanding is that slurmdbd can be configured with a single MySQL server, so what do you do? Do you have that “single MySQL server” be a clust

[slurm-users] Problem with srun on ARM Ubuntu servers

2023-07-21 Thread Daniel L'Hommedieu
Hi, everyone. My team runs a SLURM cluster of about 800 servers, currently on SLURM 17, though we are working to upgrade to 22. We currently have only x64 front-end servers, but we are looking to add some ARM servers. I have deployed some new ARM front-end servers in exactly the same way the x64 on