[slurm-users] Problem building slurm with PMIx

2024-02-14 Thread Patrick Begou via slurm-users
Hi! I manage a small CentOS 8 cluster using Slurm slurm-20.11.7-1 and OpenMPI built from source. - I know this OS is no longer maintained and I need to negotiate downtime to reinstall - I know Slurm 20.11.7 has security issues (I built it from source some years ago with rpmbuild -ta --w

[slurm-users] Re: Question about IB and Ethernet networks

2024-03-03 Thread Patrick Begou via slurm-users
Hi Josef, on a cluster using PXE boot and automatic (re)installation of nodes, I do not think you can do this with IPoIB on an InfiniBand interface. On my cluster nodes I have: - 1Gb Ethernet network for OOB - 10 or 25Gb Ethernet for session, automatic deployment and management - IB HDR100 fo

[slurm-users] First setup of slurm with a GPU node

2024-11-13 Thread Patrick Begou via slurm-users
Hi, I'm using Slurm on a small 8-node cluster. I've recently added one GPU node with two NVIDIA A100s, one with 40 GB of RAM and one with 80 GB. As usage of this GPU resource increases, I would like to manage it with Gres to avoid usage conflicts. But at this time my setup does not work as

[slurm-users] Re: First setup of slurm with a GPU node

2024-11-13 Thread Patrick Begou via slurm-users
On 13/11/2024 at 15:45, Roberto Polverelli Monti via slurm-users wrote: Hello Patrick, On 11/13/24 12:01 PM, Patrick Begou via slurm-users wrote: As usage of this GPU resource increases, I would like to manage it with Gres to avoid usage conflicts. But at this time my setup does not

[slurm-users] Re: First setup of slurm with a GPU node

2024-11-13 Thread Patrick Begou via slurm-users
0-80:1 * Ben On 13/11/2024 16:00, Patrick Begou via slurm-users wrote: On 13/11/2024 at 15:45, Roberto Polverell

[slurm-users] Re: sinfo not listing any partitions

2024-11-28 Thread Patrick Begou via slurm-users
Hi Kent, on your management node could you run: systemctl status slurmctld and check your 'Nodename=' and 'PartitionName=...' in /etc/slurm.conf? In my slurm.conf I have a more detailed description and the Nodename keyword starts with an upper-case letter (I don't know if slurm.conf is case sensit

[slurm-users] slurm releases

2025-04-01 Thread Patrick Begou via slurm-users
Hi Slurm team, I would like some clarification about Slurm releases. Why are two versions of Slurm available? I am referring to 24.05.7 versus 24.11.3 on https://www.schedmd.com/slurm-support/release-announcements and the announcements made on this list. I'm managing small clusters in a French public r

[slurm-users] Re: Setting QoS with slurm 24.05.7

2025-04-22 Thread Patrick Begou via slurm-users
he cluster's limit. To clear a previously set value use the modify command with a new value of -1 for each TRES id. - sacctmgr(1) The "MaxCPUs" is a limit on the number of CPUs the association can use. -- Michael On Fri, Apr 18, 2025 at 8:01 AM Patrick Begou via slurm-us

[slurm-users] Setting QoS with slurm 24.05.7

2025-04-18 Thread Patrick Begou via slurm-users
Hi all, I'm trying to set up a QoS on a small 5-node cluster running Slurm 24.05.7. My goal is to limit resources using a (time x number of cores) strategy, to prevent one large job from requesting all the resources for too long. I've read https://slurm.schedmd.com/qos.html and some discus

[slurm-users] Re: Setting QoS with slurm 24.05.7

2025-04-25 Thread Patrick Begou via slurm-users
Us" is a limit on the number of CPUs the association can use. -- Michael On Fri, Apr 18, 2025 at 8:01 AM Patrick Begou via slurm-users wrote: Hi all, I'm trying to set up a QoS on a small 5-node cluster running Slurm 24.05.7. My goal is to