Re: [slurm-users] ticking time bomb? launching too many jobs in parallel

Paul Edmon Fri, 30 Aug 2019 12:17:01 -0700

Yes, QoS's are dynamic.

-Paul Edmon-


On 8/30/19 2:58 PM, Guillaume Perrault Archambault wrote:

Hi Paul,

Thanks for your pointers.

I'll look into QOS and MCS after my paper deadline (Sept 5). Re QOS: as expressed to Peter in the reply I just sent, I wonder whether the QOS of a job can be changed while it's pending (submitted but not yet running).


Regards,
Guillaume.

On Fri, Aug 30, 2019 at 10:24 AM Paul Edmon <ped...@cfa.harvard.edu<mailto:ped...@cfa.harvard.edu>> wrote:


    A QoS is probably your best bet.  Another variant might be MCS, which
    you can use to help reduce resource fragmentation.  For limits though
    QoS will be your best bet.

    -Paul Edmon-

    On 8/30/19 7:33 AM, Steven Dick wrote:
    > It would still be possible to use job arrays in this situation, it's
    > just slightly messy.
    > So the way a job array works is that you submit a single script, and
    > that script is provided an integer for each subjob.  The integer is in
    > a range, with a possible step (default=1).
    >
    > To run the situation you describe, you would have to predetermine how
    > many of each test you want to run (i.e., you couldn't dynamically
    > change the number of jobs that run within one array), and a master
    > script would map the integer range to the job that was to be started.
    >
    > The most trivial way to do it would be to put the list of regressions
    > in a text file and the master script would index it by line number and
    > then run the appropriate command.
    > A more complex way would be to do some math (a divide?) to get the
    > script name and subindex (modulus?) for each regression.
    >
    > Both of these would require some semi-advanced scripting, but nothing
    > that couldn't be cut and pasted with some trivial modifications for
    > each job set.
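
Both mappings described above can be sketched in a few lines. This is a minimal illustration, not Steven's actual script: it assumes the array uses 0-based indices and that every script is run the same number of times. The script names, RUNS_PER_SCRIPT, and the function names are hypothetical; SLURM_ARRAY_TASK_ID is the environment variable Slurm sets for each array task.

```python
import os

# Hypothetical regression scripts; in practice these could be read from a text file.
SCRIPTS = [f"regression_{n}.sh" for n in range(1, 10)]  # 9 scripts
RUNS_PER_SCRIPT = 40  # assumed: every script is run the same number of times


def command_from_list(task_id, commands):
    """Text-file approach: the task index is simply a 0-based line number."""
    return commands[task_id]


def script_and_run(task_id, scripts=SCRIPTS, runs_per_script=RUNS_PER_SCRIPT):
    """Divide/modulus approach: the quotient picks the script, the remainder the run."""
    script_idx, run_idx = divmod(task_id, runs_per_script)
    return scripts[script_idx], run_idx


if __name__ == "__main__":
    # Falls back to 0 when run outside a Slurm array job.
    task_id = int(os.environ.get("SLURM_ARRAY_TASK_ID", "0"))
    script, run = script_and_run(task_id)
    print(f"task {task_id}: run #{run} of {script}")
```

With 9 scripts and 40 runs each, the array would span indices 0 to 359. A single array covering all of them could then be throttled with the % suffix of sbatch's --array option, e.g. --array=0-359%90, provided the cluster's maximum array size allows it.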
    >
    > As to the unavailability of the admin ...
    > An alternate approach that would require the admin's help would be to
    > come up with a small set of allocations (e.g., 40 gpus, 80 gpus, 100
    > gpus, etc.) and make a QOS for each one with a gpu limit (e.g.,
    > maxtrespu=gpu=40 ). Then the user would assign that QOS to the job when
    > starting it to set the overall allocation for all the jobs.  The admin
    > wouldn't need to tweak this except once, you just pick which tweak to
    > use.
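
As a rough sketch of the QOS approach, the commands involved might be assembled like this. The QOS naming scheme (gpu40, gpu80, ...) and the helper names are hypothetical, and the TRES name is assumed to be gres/gpu, which is how GPUs are commonly tracked; the exact name depends on the site's configuration. The admin would also have to grant users access to each QOS. Nothing is executed here; the functions only build argument lists.

```python
def create_qos_cmd(gpu_limit):
    """Admin side (one-time): a QOS capping each user's total GPUs."""
    name = f"gpu{gpu_limit}"  # hypothetical naming scheme
    return ["sacctmgr", "add", "qos", name,
            f"MaxTRESPerUser=gres/gpu={gpu_limit}"]


def submit_cmd(gpu_limit, script):
    """User side: choose the agreed allocation by picking the matching QOS."""
    return ["sbatch", f"--qos=gpu{gpu_limit}", script]


if __name__ == "__main__":
    for limit in (40, 80, 100):
        print(" ".join(create_qos_cmd(limit)))
    print(" ".join(submit_cmd(80, "regression_1.sh")))
```

Because the limit is attached to the QOS rather than to one array, it applies across all of a user's jobs submitted under that QOS.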
    >
    > On Fri, Aug 30, 2019 at 2:36 AM Guillaume Perrault Archambault
    > <gperr...@uottawa.ca <mailto:gperr...@uottawa.ca>> wrote:
    >> Hi Steven,
    >>
    >> Thanks for taking the time to reply to my post.
    >>
    >> Setting a limit on the number of jobs for a single array isn't
    >> sufficient because regression-tests need to launch multiple
    >> arrays, and I would need a job limit that would take effect over
    >> all launched jobs.
    >>
    >> It's very possible I'm not understanding something. I'll lay out a
    >> very specific example in the hopes you can correct me if I've gone
    >> wrong somewhere.
    >>
    >> Let's take the small cluster with 140 GPUs and no fairshare as
    >> an example, because it's easier for me to explain.
    >>
    >> The users, who all know each other personally and interact via
    >> chat, decide on a daily basis how many jobs each user can run at a
    >> time.
    >>
    >> Let's say today is Sunday (hypothetically). Nobody is actively
    >> developing today, except that user 1 has 10 jobs running for the
    >> entire weekend. That leaves 130 GPUs unused.
    >>
    >> User 2, whose jobs all run on 1 GPU, decides to run a regression
    >> test. The regression test comprises 9 different scripts, each
    >> run 40 times, for a grand total of 360 jobs. The scripts take
    >> between 1 and 5 hours to complete, and the jobs take on
    >> average 4 hours to complete.
    >>
    >> User 2 gets the user group's approval (via chat) to use 90 GPUs
    >> (so that 40 GPUs will remain for anyone else wanting to work that
    >> day).
    >>
    >> The problem I'm trying to solve is this: how do I ensure that
    >> user 2 launches his 360 jobs in such a way that 90 jobs are in the
    >> run state consistently until the regression test is finished?
    >>
    >> Keep in mind that:
    >>
    >> - limiting each job array to 10 jobs is inefficient: when the
    >>   first job array finishes (long before the last one), only 80 GPUs
    >>   will be used, and so on as other arrays finish
    >> - the admin is not available; he cannot be asked to set a hard
    >>   limit of 90 jobs for user 2 just for today
    >>
    >> I would be happy to use job arrays if they allow me to set an
    >> overarching job limit across multiple arrays. Perhaps this is
    >> doable. Admittedly I'm working on a paper to be submitted in a few
    >> days, so I don't have time to test job arrays thoroughly, but I
    >> will try them out more thoroughly once I've submitted my
    >> paper (i.e., after Sept 5).
    >>
    >> My solution, for now, is to not use job arrays. Instead, I
    >> launch each job individually, and I use singleton (by launching
    >> all jobs with the same 90 unique names) to ensure that exactly 90
    >> jobs are run at a time (in this case, corresponding to 90 GPUs in
    >> use).
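
The singleton trick can be sketched as follows. This is an illustration under assumptions, not Guillaume's actual launcher: the slot name prefix and the helper name are made up, and the commands are only built, not executed. It relies on sbatch's --dependency=singleton, which makes a job wait until any earlier job with the same name and user has finished, so cycling through 90 names caps the user at 90 running jobs.

```python
def singleton_submissions(scripts, max_running):
    """Build one sbatch command per job, cycling through max_running job names."""
    cmds = []
    for i, script in enumerate(scripts):
        name = f"slot{i % max_running:02d}"  # hypothetical naming scheme
        cmds.append(["sbatch", f"--job-name={name}",
                     "--dependency=singleton", script])
    return cmds


if __name__ == "__main__":
    jobs = [f"job_{i}.sh" for i in range(360)]  # placeholder script names
    for cmd in singleton_submissions(jobs, 90)[:3]:
        print(" ".join(cmd))
```

One consequence of this design is that the jobs sharing a name run strictly one after another, so a name that happens to collect the longest jobs finishes later than the rest.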
    >>
    >> Side note: the unavailability of the admin might sound
    >> contrived by picking Sunday as an example, but it's in fact very
    >> typical. The admin is not available:
    >>
    >> - on weekends (the present example)
    >> - at any time outside of 9am to 5pm (keep in mind, this is a
    >>   cluster used by students in different time zones)
    >> - any time he is on vacation
    >> - any time he is looking after his many other
    >>   responsibilities. Constantly setting user limits that change on a
    >>   daily basis would be too much to ask.
    >>
    >>
    >> I'd be happy if you corrected my misunderstandings, especially
    >> if you could show me how to set a job limit that takes effect over
    >> multiple job arrays.
    >>
    >> I may have very glaring oversights as I don't necessarily have
    >> a big picture view of things (I've never been an admin, most
    >> notably), so feel free to poke holes in the way I've constructed
    >> things.
    >>
    >> Regards,
    >> Guillaume.
    >>
    >>
    >> On Fri, Aug 30, 2019 at 1:22 AM Steven Dick <kg4...@gmail.com <mailto:kg4...@gmail.com>> wrote:
    >>> This makes no sense and seems backwards to me.
    >>>
    >>> When you submit an array job, you can specify how many jobs from the
    >>> array you want to run at once.
    >>> So, an administrator can create a QOS that explicitly limits the user.
    >>> However, you keep saying that they probably won't modify the system
    >>> for just you...
    >>>
    >>> That seems to me to be the perfect case to use array jobs and tell it
    >>> how many elements of the array to run at once.
    >>> You're not using array jobs for exactly the wrong reason.
    >>>
    >>> On Tue, Aug 27, 2019 at 1:19 PM Guillaume Perrault Archambault
    >>> <gperr...@uottawa.ca <mailto:gperr...@uottawa.ca>> wrote:
    >>>> The reason I don't use job arrays is to be able to limit the
    >>>> number of jobs per user
