Re: [slurm-users] A Slurm topological scheduling question

2021-12-07 Thread Ole Holm Nielsen
Hi David, The topology.conf file groups nodes into sets such that Slurm will not schedule parallel jobs across disjoint sets. Even though the topology.conf man page refers to network switches, it's really about topology rather than networking. You may use fake (nonexistent) switch name
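
As a rough illustration of the grouping described above, a topology.conf for two racks might look like the sketch below. The switch names (rack1, rack2) and node ranges are hypothetical and do not come from the thread. The sketch also assumes that slurm.conf sets TopologyPlugin=topology/tree; because no higher-level switch joins the two entries, the two node sets are disjoint.

```
# topology.conf (sketch only; switch names and node ranges are made up)
# Each "switch" is just a label for a set of nodes and need not exist as hardware.
SwitchName=rack1 Nodes=node[001-032]
SwitchName=rack2 Nodes=node[033-064]
```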

Re: [slurm-users] A Slurm topological scheduling question

2021-12-07 Thread Bernstein, Noam CIV USN NRL (6393) Washington DC (USA)
You can schedule jobs across the two racks, with any given job using only one rack, by specifying: #SBATCH --partition rack1,rack2. It'll only use 1 partition, in order of priority (not liti I never found a way for topology to do that - all I could get it to do is to prefer to keep things within a
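
A minimal batch script along the lines of the suggestion above might look like this. It assumes partitions named rack1 and rack2 already exist in slurm.conf, one per rack; the node count and the report_partition helper are illustrative only. SLURM_JOB_PARTITION is the environment variable Slurm sets to the partition the job actually runs in.

```shell
#!/bin/bash
#SBATCH --partition=rack1,rack2   # candidate partitions; the job runs in only one of them
#SBATCH --nodes=2                 # illustrative node count (assumption)

# Hypothetical helper: print the partition the job landed in.
# Falls back to "unknown" when run outside of a Slurm job.
report_partition() {
    echo "partition=${SLURM_JOB_PARTITION:-unknown}"
}

report_partition
```

Submitting this with sbatch and checking the job's output shows which rack's partition was chosen.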

Re: [slurm-users] A Slurm topological scheduling question

2021-12-07 Thread Paul Edmon
This should be fine, assuming you don't mind the mismatch in CPU speeds. Unless the codes are super sensitive to topology, things should be okay, as modern IB is wicked fast. In our environment here we have a variety of different hardware types all networked together on the same IB fabric. Tha