[slurm-users] associations, limits,qos

2021-01-22 Thread Nizar Abed
Hi list, I’m trying to enforce limits based on associations, but the behavior is not as expected. In slurm.conf: AccountingStorageEnforce=associations,limit,qos. Two partitions: part1.q and part2.q. One user: user1. One QOS: qos1. MaxJobsPU is not set. I’d like to have an association for user1 for each

Re: [slurm-users] Cluster nodes on multiple cluster networks

2021-01-22 Thread William Brown
I think there is no reason why a Slurm node would care about traffic on multiple interfaces, as long as your configuration is set to listen on them and nothing is in the way, e.g. no firewalld rules restricting traffic to the private network. William From: slurm-users On Behalf Of Sajesh Singh Sen

Re: [slurm-users] Cluster nodes on multiple cluster networks

2021-01-22 Thread Sajesh Singh
Thank you for the recommendation. Will try that out. Unfortunately, the on-prem nodes cannot reach the head node via the public IP. -Sajesh- From: slurm-users On Behalf Of Michael Gutteridge Sent: Friday, January 22, 2021 3:18 PM To: Slurm User Community List Subject: Re: [slurm-users] Cluster

Re: [slurm-users] Cluster nodes on multiple cluster networks

2021-01-22 Thread Michael Gutteridge
I don't believe the IP address is required. If you can configure a DNS/hosts entry differently for cloud nodes, you can set: SlurmctldHost=controllername Then have "controllername" resolve to the private IP of the controller for the on-prem cluster, and to the public IP for the nodes in the cloud.
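A minimal sketch of the split name resolution described above, written as a config fragment. The name "controllername" comes from the message; both IP addresses are hypothetical placeholders, and /etc/hosts is only one way to do it (a split-horizon DNS zone would serve the same purpose).

```
# slurm.conf, identical on every node: name only, no address in parentheses
SlurmctldHost=controllername

# /etc/hosts on an on-prem node (10.0.0.10 is a hypothetical private address)
10.0.0.10    controllername

# /etc/hosts on a cloud node (192.0.2.10 is a hypothetical public address)
192.0.2.10   controllername
```

This only helps if the controller is actually reachable at the address each group of nodes resolves.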

Re: [slurm-users] Cluster nodes on multiple cluster networks

2021-01-22 Thread Sajesh Singh
How would I deal with the address of the head node defined in slurm.conf? I have it defined as SlurmctldHost=private-hostname(private.ip.addr) and the private.ip.addr address is not reachable from the cloud nodes. -Sajesh- From: slurm-users On Behalf Of Brian Andrus Sent: Friday, January 22

Re: [slurm-users] Cluster nodes on multiple cluster networks

2021-01-22 Thread Brian Andrus
You would need to have a direct connection or VPN so the cloud nodes can connect to your head node. Brian Andrus On 1/22/2021 10:37 AM, Sajesh Singh wrote: We are looking at rolling out cloud bursting to our on-prem Slurm cluster and I am wondering how to deal with the slurm.conf variable Slurmct

[slurm-users] Cluster nodes on multiple cluster networks

2021-01-22 Thread Sajesh Singh
We are looking at rolling out cloud bursting to our on-prem Slurm cluster and I am wondering how to deal with the slurm.conf variable SlurmctldHost. It is currently configured with the private cluster network address that the on-prem nodes use to contact it. The nodes in the cloud would contact

Re: [slurm-users] Using "Environment Modules" in a SLURM script

2021-01-22 Thread Peter Kjellström
On our Slurm clusters the module system (Lmod) works without extra init in job scripts, thanks to Slurm's environment forwarding. "module" in the submitting context (in bash) on the login node is an "exported" function and as such makes it across. /Peter On Fri, 22 Jan 2021 10:41:06 + Gestió
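The mechanism behind "exported function" can be sketched in plain bash, without Slurm. The function name greet is a hypothetical stand-in for "module"; the sketch assumes bash on both sides, since exported functions are a bash feature and other shells will not pick them up.

```shell
#!/bin/bash
# Hypothetical stand-in for the real "module" shell function.
greet() { echo "hello from $1"; }

# Mark the function for export; bash then passes it to child processes
# through the environment, like an ordinary exported variable.
export -f greet

# A child bash started with this environment can call the function
# without sourcing any init file.
bash -c 'greet child'
```

If the job script runs under a different shell than the one that exported the function, the function is not imported and an explicit init is still needed.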

[slurm-users] Validating SLURM sreport cluster utilization report

2021-01-22 Thread David Simpson
Hi, We've been using the sreport cluster utilization report to report on downtime and thereby produce an uptime figure for the entire cluster, which we hope will be above 99%, or very close to it, for every month of the year. Most of the time the figure that comes back is one that fits the perce

Re: [slurm-users] Using "Environment Modules" in a SLURM script

2021-01-22 Thread Thomas M. Payerle
On our clusters, we typically find that an explicit source of the initialization dot files is needed IF the default shell of the user submitting the job does _not_ match the shell being used to run the script. I.e., for sundry historical and other reasons, the "default" login shell for users on our

[slurm-users] Using "Environment Modules" in a SLURM script

2021-01-22 Thread Gestió Servidors
Hello, I use "Environment Modules" (http://modules.sourceforge.net/) in my SLURM cluster. In my scripts I need to add an explicit "source /soft/modules-3.2.10/Modules/3.2.10/init/bash". However, in the several examples of SLURM scripts I have read, nobody mentions that. So, have I forgotten a

[slurm-users] OpenMP job and not expected results

2021-01-22 Thread Gestió Servidors
Hello, I'm running this script in a cluster composed of 11 nodes, each one with 1 processor with 4 cores and 1 thread per core:
#!/bin/bash
#SBATCH --job-name=hellohybrid
#SBATCH --output=hellohybrid.out
#SBATCH --ntasks=4
#SBATCH --cpus-per-task=3
#SBATCH --partition=nodes
# Load the default