A feature that many Slurm users might like is sbatch's --time-min option. Using both --time-min and --time, a user can specify the range of acceptable wall time limits. This can make it much easier to keep jobs running right up to the maintenance reservation, e.g.:

    sbatch --time-min=30:00 --time=48:00:00 script.sh

would allow the job to schedule for any time slot between 30 minutes and 2 days in length. If the user has some mechanism for job chaining or similar, this can allow them to make the most of backfill opportunities.

-Doug

----
Doug Jacobsen, Ph.D.
NERSC Computer Systems Engineer
National Energy Research Scientific Computing Center <http://www.nersc.gov>
dmjacob...@lbl.gov

------------- __o
---------- _ '\<,_
----------(_)/ (_)__________________________

On Fri, May 11, 2018 at 7:27 AM Paul Edmon <ped...@cfa.harvard.edu> wrote:

> In the past we used the LUA job submit plugin to block jobs that would
> intersect maintenance reservations. I would look at that.
>
> -Paul Edmon-
>
>
> On 05/11/2018 08:19 AM, Bill Wichser wrote:
> > The problem is that reservations can be in there yet have no effect on
> > the submitted job if they would run before the reservation takes
> > place. One can pull the starting time simply using something like this
> >
> >     scontrol show res -o | awk '{print $2}'
> >
> > with output
> >
> >     StartTime=2018-06-12T06:00:00
> >     StartTime=2018-06-12T06:00:00
> >
> > You'd need more code around that, obviously, to determine if this
> > starttime might hold up the job.
> >
> > Bill
> >
> >
> > On 05/10/2018 04:23 PM, Prentice Bisbal wrote:
> >> Dear Slurm Users,
> >>
> >> We've started using maintenance reservations. As you would expect,
> >> this caused some confusion for users who were wondering why their
> >> jobs were queuing up and not running. Some of my users provide a
> >> public service of sorts that automatically submits jobs to our
> >> cluster. They would like to have their submission framework
> >> automatically detect if there's a reservation that may interfere with
> >> their jobs, and act accordingly.
> >>
> >> What is the best way to do this? Typically, in my shell scripts, I
> >> have some command that tests something, and then check the exit code
> >> returned by the command. For example, to check if my name is in file
> >> 'foo.txt', I'd do something like this:
> >>
> >>     grep -iq prentice foo.txt
> >>     retval=$?
> >>     if [ $retval -eq 0 ]; then
> >>         echo "Prentice found"
> >>     else
> >>         echo "Prentice not found"
> >>     fi
> >>     unset retval
> >>
> >> Or something like that. I was also thinking this might work, too:
> >>
> >>     num_res=$(scontrol -o show res | wc -l)
> >>     if [ $num_res -eq 0 ]; then
> >>         echo "No reservations found"
> >>     else
> >>         echo "$num_res reservation(s) found"
> >>     fi
> >>
> >> Are there any better or other ways that you would recommend? Also, if
> >> there's more than one, are they listed in any kind of order in the
> >> scontrol or sinfo output (soonest first, soonest last, etc.)? From
> >> the man page, it looks like 'scontrol show reservation' doesn't
> >> provide any sorting.
> >>
> >> Prentice
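
Picking up Bill's point that more code is needed around the StartTime output, here is a minimal bash sketch of such a check. It assumes GNU date and that each reservation line from 'scontrol show res -o' carries StartTime= and EndTime= fields in the format quoted in this thread. The function name overlaps_reservation is made up for illustration. It ignores which nodes or partitions a reservation covers, so it can only flag a possible conflict, not prove one.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of Slurm. Assumes GNU date.
# Reads "scontrol show res -o" lines on stdin. Returns 0 if a job running
# from $1 (epoch seconds) for $2 seconds overlaps any reservation window,
# otherwise returns 1. Order of the reservation lines does not matter.
overlaps_reservation() {
    local now=$1 wall=$2 line field start end start_s end_s
    local -a fields
    while IFS= read -r line; do
        start="" end=""
        # Split the line into KEY=VALUE words without filename globbing.
        read -ra fields <<< "$line"
        for field in "${fields[@]}"; do
            case $field in
                StartTime=*) start=${field#StartTime=} ;;
                EndTime=*)   end=${field#EndTime=} ;;
            esac
        done
        # Skip lines that do not describe a reservation.
        [ -n "$start" ] && [ -n "$end" ] || continue
        # Convert 2018-06-12T06:00:00 style stamps to epoch seconds.
        start_s=$(date -d "${start/T/ }" +%s) || continue
        end_s=$(date -d "${end/T/ }" +%s) || continue
        # Two intervals overlap if each starts before the other ends.
        if [ "$start_s" -lt $((now + wall)) ] && [ "$end_s" -gt "$now" ]; then
            return 0
        fi
    done
    return 1
}

# Example use for a 48-hour request submitted now:
#   if scontrol show res -o | overlaps_reservation "$(date +%s)" $((48 * 3600)); then
#       echo "a reservation may hold up this job"
#   fi
```

Because the function checks every line, it does not depend on the order in which scontrol lists reservations. A submission framework could use the result to decide whether to shorten the request or add --time-min.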