That’s great news; this is a vFAQ at our site.

> On Jun 13, 2018, at 1:37 PM, Prentice Bisbal <pbis...@pppl.gov> wrote:
>
> Just to revisit this: jobs that are queued but prevented from running
> will have a more useful reason in 18.08, which will address one of my
> issues with reservation collisions.
>
> https://bugs.schedmd.com/show_bug.cgi?id=5138
> https://bugs.schedmd.com/show_bug.cgi?id=4987
>
> Prentice Bisbal
> Lead Software Engineer
> Princeton Plasma Physics Laboratory
> http://www.pppl.gov
>
> On 05/11/2018 10:36 AM, Douglas Jacobsen wrote:
>> A feature that many Slurm users might like is sbatch --time-min. Using both
>> --time-min and --time, a user can specify the range of acceptable wall time
>> limits. This can make it much easier to keep jobs running right up to the
>> maintenance reservation. e.g.:
>>
>>     sbatch --time-min=30:00 --time=48:00:00 script.sh
>>
>> would allow the job to schedule for any time-slot between 30 minutes and 2
>> days in length. If the user has some mechanism for job chaining or similar,
>> this can allow them to make the most of backfill opportunities.
>>
>> -Doug
>>
>> ----
>> Doug Jacobsen, Ph.D.
>> NERSC Computer Systems Engineer
>> National Energy Research Scientific Computing Center
>> dmjacob...@lbl.gov
>>
>> ------------- __o
>> ---------- _ '\<,_
>> ----------(_)/ (_)__________________________
>>
>>
>> On Fri, May 11, 2018 at 7:27 AM Paul Edmon <ped...@cfa.harvard.edu> wrote:
>> In the past we used the Lua job submit plugin to block jobs that would
>> intersect maintenance reservations. I would look at that.
>>
>> -Paul Edmon-
>>
>>
>> On 05/11/2018 08:19 AM, Bill Wichser wrote:
>> > The problem is that reservations can be in there yet have no effect on
>> > the submitted job if it would run before the reservation takes
>> > place.
>> > One can pull the starting time simply using something like this
>> >
>> >     scontrol show res -o | awk '{print $2}'
>> >
>> > with output
>> >
>> >     StartTime=2018-06-12T06:00:00
>> >     StartTime=2018-06-12T06:00:00
>> >
>> > You'd need more code around that, obviously, to determine if this
>> > start time might hold up the job.
>> >
>> > Bill
>> >
>> >
>> > On 05/10/2018 04:23 PM, Prentice Bisbal wrote:
>> >> Dear Slurm Users,
>> >>
>> >> We've started using maintenance reservations. As you would expect,
>> >> this caused some confusion for users who were wondering why their
>> >> jobs were queuing up and not running. Some of my users provide a
>> >> public service of sorts that automatically submits jobs to our
>> >> cluster. They would like to have their submission framework
>> >> automatically detect if there's a reservation that may interfere with
>> >> their jobs, and act accordingly.
>> >>
>> >> What is the best way to do this? Typically, in my shell scripts, I
>> >> have some command that tests something, and then check the exit code
>> >> returned by the command. For example, to check if my name is in file
>> >> 'foo.txt', I'd do something like this:
>> >>
>> >>     grep -iq prentice foo.txt
>> >>     retval=$?
>> >>     if [ $retval -eq 0 ]; then
>> >>         echo "Prentice found"
>> >>     else
>> >>         echo "Prentice not found"
>> >>     fi
>> >>     unset retval
>> >>
>> >> Or something like that. I was also thinking this might work, too:
>> >>
>> >>     num_res=$(scontrol -o show res | wc -l)
>> >>     if [ $num_res -eq 0 ]; then
>> >>         echo "No reservations found"
>> >>     else
>> >>         echo "$num_res reservation(s) found"
>> >>     fi
>> >>
>> >> Are there any better or other ways that you would recommend? Also, if
>> >> there's more than one, are they listed in any kind of order in the
>> >> scontrol or sinfo output (soonest first, soonest last, etc.)? From
>> >> the man page, it looks like 'scontrol show reservation' doesn't
>> >> provide any sorting.
>> >>
>> >> Prentice
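
For anyone finding this thread later, here is a minimal sketch of the "more code around that" Bill mentions. It assumes each reservation is one line of `scontrol show res -o` output containing a StartTime=YYYY-MM-DDTHH:MM:SS field, as in his sample. The function names are mine, not anything Slurm provides. Both functions read from stdin so they can be tried on saved output. I believe scontrol prints a "No reservations in the system" message when there are none, which would make a plain `wc -l` report 1 rather than 0, so the count below matches on the field name instead; please check that against your version.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, and the input is assumed to be
# the one-line-per-reservation output of `scontrol show res -o`.

# Print the earliest StartTime value seen on stdin (empty if none).
# The field is matched by name rather than by column position, and
# ISO 8601 timestamps sort chronologically as plain strings.
earliest_res_start() {
    tr ' ' '\n' | sed -n 's/^StartTime=//p' | sort | head -n 1
}

# Count the lines on stdin that describe a reservation, i.e. carry a
# StartTime= field. An informational message does not match, so it counts as 0.
count_reservations() {
    grep -c 'StartTime=' || true
}
```

Typical use would be `scontrol show res -o | earliest_res_start`. Deciding whether that start time actually collides with a job still needs the job's expected end time, which means date arithmetic (for example with GNU date) that I have left out here.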
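
On Doug's --time-min suggestion: if a submission framework would rather cap --time itself at the time left before a maintenance window, it needs that interval in a format sbatch accepts. Below is a rough sketch of a formatter, assuming the caller already knows the number of seconds remaining. `slurm_time` and `secs_left` are hypothetical names of mine. The output uses the days-hours:minutes:seconds and hours:minutes:seconds forms listed in the sbatch man page.

```shell
#!/bin/sh
# Sketch only: slurm_time is a hypothetical helper, not part of Slurm.
# Convert a whole number of seconds into D-HH:MM:SS, or HH:MM:SS when
# the interval is shorter than one day.
slurm_time() {
    secs=$1
    days=$(( secs / 86400 ))
    hours=$(( (secs % 86400) / 3600 ))
    mins=$(( (secs % 3600) / 60 ))
    rest=$(( secs % 60 ))
    if [ "$days" -gt 0 ]; then
        printf '%d-%02d:%02d:%02d\n' "$days" "$hours" "$mins" "$rest"
    else
        printf '%02d:%02d:%02d\n' "$hours" "$mins" "$rest"
    fi
}
```

A wrapper could then run something like `sbatch --time-min=30:00 --time="$(slurm_time "$secs_left")" script.sh`. With --time-min alone the scheduler does the fitting for you, so this is only worth it if you want the cap decided at submit time.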