> On Jun 12, 2018, at 11:08 AM, Prentice Bisbal <pbis...@pppl.gov> wrote:
>
> On 06/12/2018 12:33 AM, Chris Samuel wrote:
>
>> Hi Prentice!
>>
>> On Tuesday, 12 June 2018 4:11:55 AM AEST Prentice Bisbal wrote:
>>
>>> To make this work, I will be using job_submit.lua to apply this logic
>>> and assign a job to a partition. If a user requests a specific partition
>>> not in line with these specifications, job_submit.lua will reassign the
>>> job to the appropriate QOS.
>>
>> Yeah, that's very much like what we do for GPU jobs (redirect them to the
>> partition with access to all cores, and ensure non-GPU jobs go to the
>> partition with fewer cores) via the submit filter at present.
>>
>> I've already coded up something similar in Lua for our submit filter (one
>> that only affects my jobs, for testing purposes), but I still need to
>> handle memory correctly; in other words, only pack jobs when the per-task
>> memory request * tasks per node < node RAM (for now we'll let jobs where
>> that's not the case go through to the keeper for Slurm to handle as now).
>>
>> However, I do think Scott's approach is potentially very useful: directing
>> jobs < full node to one end of a list of nodes, and jobs that want full
>> nodes to the other end of the list (especially if you use the partition
>> idea to ensure that not all nodes are accessible to small jobs).
>>
> This was something that was very easy to do with SGE. It's been a while
> since I worked with SGE, so I forget the details, but in essence you could
> assign nodes a 'serial number' specifying the preferred order in which
> nodes would be assigned to jobs, and I believe that order was specific to
> each queue. So if you had 64 nodes, one queue could assign jobs starting at
> node 1 and work its way up to node 64, while another queue could start at
> node 64 and work its way down to node 1. This technique was mentioned in
> the SGE documentation as a way to let MPI and shared-memory jobs share the
> cluster.
>
> At the time, I used it for exactly that purpose, but I didn't think it was
> that big a deal. Now that I don't have that capability, I miss it.
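A minimal sketch of the memory check Chris describes, as it might look in a
job_submit.lua filter. This is not a drop-in script: the NODE_RAM_MB value
and the "packed" partition name are placeholders, and the job_desc field
names (pn_min_memory, ntasks_per_node, cpus_per_task) and slurm.NO_VAL*
sentinels should be verified against the job_submit/lua documentation for
your Slurm release.

    -- NODE_RAM_MB and the "packed" partition are assumptions.
    local NODE_RAM_MB      = 192 * 1024  -- assumed RAM per node, in MB
    local MEM_PER_CPU_FLAG = 2^63        -- high bit set => memory is per-CPU

    function slurm_job_submit(job_desc, part_list, submit_uid)
        local mem   = job_desc.pn_min_memory
        local tasks = job_desc.ntasks_per_node

        -- Under-specified requests go through to the keeper, as now.
        if mem == nil or mem == slurm.NO_VAL64 or
           tasks == nil or tasks == 0 or tasks == slurm.NO_VAL16 then
            return slurm.SUCCESS
        end

        -- Reduce the request to a per-node figure in MB.
        local per_node_mb
        if mem >= MEM_PER_CPU_FLAG then
            -- Per-CPU request: strip the flag, scale by CPUs/task, tasks/node.
            local cpus = job_desc.cpus_per_task or 1
            per_node_mb = (mem - MEM_PER_CPU_FLAG) * cpus * tasks
        else
            per_node_mb = mem  -- already a per-node request
        end

        -- The condition above: per-task memory * tasks per node < node RAM.
        if per_node_mb < NODE_RAM_MB then
            job_desc.partition = "packed"  -- hypothetical packing partition
        end
        return slurm.SUCCESS
    end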
SLURM has the ability to do priority "weights" for nodes as well, to somewhat
the same effect, as far as I know. At our site, though, that does not work,
as it apparently conflicts with the topology plugin (which we also use)
rather than layering with it or doing something more useful.

--
 ____
|| \\UTGERS,     |---------------------------*O*---------------------------
||_// the State  |         Ryan Novosielski - novos...@rutgers.edu
|| \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus
||  \\    of NJ  | Office of Advanced Research Computing - MSB C630, Newark
     `'
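For reference, node weights are set per node in slurm.conf, and Slurm prefers
lower-weight nodes when all else is equal. A sketch of the ordering effect
described above, with node names, sizes, and weight values that are purely
illustrative:

    # Small jobs fill node[01-32] first; node[33-64] tend to stay emptier
    # and thus available for full-node work. All values are illustrative.
    NodeName=node[01-32] CPUs=32 RealMemory=192000 Weight=1
    NodeName=node[33-64] CPUs=32 RealMemory=192000 Weight=100

As noted above, sites running the topology plugin should verify that the two
interact as expected.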