Just to revisit this: jobs that are queued but prevented from running will have a more useful reason in 18.08, which will address one of my issues with reservation collisions.

https://bugs.schedmd.com/show_bug.cgi?id=5138

https://bugs.schedmd.com/show_bug.cgi?id=4987

Prentice Bisbal
Lead Software Engineer
Princeton Plasma Physics Laboratory
http://www.pppl.gov

On 05/11/2018 10:36 AM, Douglas Jacobsen wrote:
A feature that many Slurm users might like is sbatch --time-min.  Using both --time-min and --time, a user can specify the range of acceptable wall time limits.  This can make it much easier to keep jobs running right up to the maintenance reservation, e.g.:

sbatch --time-min=30:00 --time=48:00:00 script.sh

would allow the job to schedule for any time-slot between 30 minutes and 2 days in length.  If the user has some mechanism for job chaining or similar, this can allow them to make the most of backfill opportunities.

-Doug

----
Doug Jacobsen, Ph.D.
NERSC Computer Systems Engineer
National Energy Research Scientific Computing Center <http://www.nersc.gov>
dmjacob...@lbl.gov

------------- __o
---------- _ '\<,_
----------(_)/  (_)__________________________



On Fri, May 11, 2018 at 7:27 AM Paul Edmon <ped...@cfa.harvard.edu> wrote:

    In the past we used the Lua job submit plugin to block jobs that
    would intersect maintenance reservations.  I would look at that.

    -Paul Edmon-


    On 05/11/2018 08:19 AM, Bill Wichser wrote:
    > The problem is that reservations can be in there yet have no effect on
    > the submitted job if it would run before the reservation takes
    > place. One can pull the starting time simply using something like this:
    >
    > scontrol show res -o | awk '{print $2}'
    >
    > with output
    >
    > StartTime=2018-06-12T06:00:00
    > StartTime=2018-06-12T06:00:00
    >
    > You'd need more code around that, obviously, to determine if this
    > starttime might hold up the job.
    >
    > Bill
    >
    >
    > On 05/10/2018 04:23 PM, Prentice Bisbal wrote:
    >> Dear Slurm Users,
    >>
    >> We've started using maintenance reservations. As you would expect,
    >> this caused some confusion for users who were wondering why their
    >> jobs were queuing up and not running. Some of my users provide a
    >> public service of sorts that automatically submits jobs to our
    >> cluster. They would like to have their submission framework
    >> automatically detect if there's a reservation that may interfere with
    >> their jobs, and act accordingly.
    >>
    >> What is the best way to do this? Typically, in my shell scripts, I
    >> have some command that tests something, and then check the exit code
    >> returned by the command. For example, to check if my name is in file
    >> 'foo.txt', I'd do something like this:
    >>
    >> grep -iq prentice foo.txt
    >> retval=$?
    >> if [ $retval -eq 0 ]; then
    >>      echo "Prentice found"
    >> else
    >>      echo "Prentice not found"
    >> fi
    >> unset retval
    >>
    >> Or something like that. I was also thinking this might work, too:
    >>
    >> num_res=$(scontrol -o show res  | wc -l)
    >> if [ $num_res -eq 0 ]; then
    >>      echo "No reservations found"
    >> else
    >>      echo "$num_res reservation(s) found"
    >> fi
    >>
    >> Are there any better or other ways that you would recommend? Also, if
    >> there's more than one, are they listed in any kind of order in the
    >> scontrol or sinfo output (soonest first, soonest last, etc.)? From
    >> the man page, it looks like 'scontrol show reservation' doesn't
    >> provide any sorting.
    >>
    >> Prentice
    >>
    >>
    >>
    >>
    >>
    >
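
Bill's remark that more code is needed around the StartTime output, to decide whether a reservation might hold up a job, could be sketched roughly as below. This is only an illustration under stated assumptions: job_hits_reservation is a hypothetical helper name, it assumes GNU date for parsing timestamps, it looks up the StartTime= field by name rather than by column position, and it ignores which nodes or partitions the reservation covers.

```shell
# Hypothetical helper (not part of Slurm). Reads `scontrol show res -o`
# output on stdin.
#   $1  requested wall time of the job, in seconds
#   $2  "now" as epoch seconds (optional; defaults to the current time)
# Exit status 0: some reservation starts before the job could finish.
# Exit status 1: no reservation starts inside the job's time window.
# The body is a subshell, so its variables do not leak into the caller.
job_hits_reservation() (
    walltime=$1
    now=${2:-$(date +%s)}
    job_end=$(( $now + $walltime ))

    while IFS= read -r line; do
        # Skip lines that carry no StartTime= field.
        case $line in
            *StartTime=*) ;;
            *) continue ;;
        esac

        start=${line#*StartTime=}   # drop everything through "StartTime="
        start=${start%% *}          # keep only the timestamp itself

        # Assumption: GNU date; the timestamp is read as local time.
        start_s=$(date -d "$(printf '%s' "$start" | tr 'T' ' ')" +%s) || continue

        if [ "$start_s" -lt "$job_end" ]; then
            exit 0
        fi
    done
    exit 1
)
```

A submission framework might call it as "scontrol show res -o | job_hits_reservation 172800" for a 48-hour request and branch on the exit code, in the same style as the grep example in the original question. Because it checks every reservation line, it does not depend on the order in which scontrol lists them.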


