Just to revisit this: jobs that are queued but prevented from running
will have a more useful reason in 18.08, which will address one of my
issues with reservation collisions.
https://bugs.schedmd.com/show_bug.cgi?id=5138
https://bugs.schedmd.com/show_bug.cgi?id=4987
Prentice Bisbal
Lead Software Engineer
Princeton Plasma Physics Laboratory
http://www.pppl.gov
On 05/11/2018 10:36 AM, Douglas Jacobsen wrote:
A feature that many Slurm users might like is sbatch --time-min.
Using both --time-min and --time, a user can specify the range of
acceptable wall time limits. This can make it much easier to keep
jobs running right up to the maintenance reservation, e.g.:
sbatch --time-min=30:00 --time=48:00:00 script.sh
would allow the job to schedule for any time-slot between 30 minutes
and 2 days in length. If the user has some mechanism for job chaining
or similar, this can allow them to make the most of backfill
opportunities.
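
For a submission framework that tracks limits in seconds, the two values can be rendered in one of the time formats sbatch accepts. This is only a rough sketch: slurm_time is a hypothetical helper (not a Slurm command), and it prints the sbatch line instead of running it.

```shell
#!/bin/sh
# Sketch only: slurm_time is a hypothetical helper, not a Slurm command.

# Convert a number of seconds to the days-hours:minutes:seconds form
# that sbatch accepts for --time and --time-min.
slurm_time() {
    s=$1
    printf '%d-%02d:%02d:%02d\n' \
        $((s / 86400)) $((s % 86400 / 3600)) $((s % 3600 / 60)) $((s % 60))
}

# Print (rather than run) the submission for a 30-minute minimum and a
# 2-day maximum; script.sh is the placeholder job script used above.
min=$(slurm_time 1800)
max=$(slurm_time 172800)
echo "sbatch --time-min=$min --time=$max script.sh"
```

The printed limits are equivalent to the 30:00 and 48:00:00 used in the example above.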
-Doug
----
Doug Jacobsen, Ph.D.
NERSC Computer Systems Engineer
National Energy Research Scientific Computing Center
<http://www.nersc.gov>
dmjacob...@lbl.gov
------------- __o
---------- _ '\<,_
----------(_)/ (_)__________________________
On Fri, May 11, 2018 at 7:27 AM Paul Edmon <ped...@cfa.harvard.edu> wrote:
In the past we used the Lua job submit plugin to block jobs that
would intersect maintenance reservations. I would look at that.
-Paul Edmon-
On 05/11/2018 08:19 AM, Bill Wichser wrote:
> The problem is that reservations can be in there yet have no effect
> on the submitted job if it would run before the reservation takes
> place. One can pull the starting time simply using something like
> this:
>
> scontrol show res -o | awk '{print $2}'
>
> with output
>
> StartTime=2018-06-12T06:00:00
> StartTime=2018-06-12T06:00:00
>
> You'd need more code around that, obviously, to determine if this
> starttime might hold up the job.
>
> Bill
>
>
> On 05/10/2018 04:23 PM, Prentice Bisbal wrote:
>> Dear Slurm Users,
>>
>> We've started using maintenance reservations. As you would expect,
>> this caused some confusion for users who were wondering why their
>> jobs were queuing up and not running. Some of my users provide a
>> public service of sorts that automatically submits jobs to our
>> cluster. They would like to have their submission framework
>> automatically detect if there's a reservation that may interfere
>> with their jobs, and act accordingly.
>>
>> What is the best way to do this? Typically, in my shell scripts, I
>> have some command that tests something, and then check the exit code
>> returned by the command. For example, to check if my name is in the
>> file 'foo.txt', I'd do something like this:
>>
>> grep -iq prentice foo.txt
>> retval=$?
>> if [ $retval -eq 0 ]; then
>>     echo "Prentice found"
>> else
>>     echo "Prentice not found"
>> fi
>> unset retval
>>
>> Or something like that. I was also thinking this might work, too:
>>
>> num_res=$(scontrol -o show res | wc -l)
>> if [ $num_res -eq 0 ]; then
>>     echo "No reservations found"
>> else
>>     echo "$num_res reservation(s) found"
>> fi
>>
>> Are there any better or other ways that you would recommend? Also, if
>> there's more than one, are they listed in any kind of order in the
>> scontrol or sinfo output (soonest first, soonest last, etc.)? From
>> the man page, it looks like 'scontrol show reservation' doesn't
>> provide any sorting.
>>
>> Prentice
>>
>>
>>
>>
>>
>
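
The extra code Bill mentions around the StartTime output could look roughly like the sketch below. The function names are hypothetical, and the commented usage assumes GNU date for parsing the timestamp. Since scontrol is not documented to sort reservations, the sketch sorts the start times itself. It also does not check whether a reservation actually covers the nodes or partition the job would use, so it can report a collision that would never happen.

```shell
#!/bin/sh
# Sketch only: these function names are hypothetical, not part of Slurm.

# Read `scontrol show res -o` output on stdin and print the earliest
# StartTime value. ISO 8601 timestamps sort correctly as plain text.
earliest_res_start() {
    grep -o 'StartTime=[^ ]*' | cut -d= -f2 | sort | head -n 1
}

# Return 0 (true) if a job starting at $1 (epoch seconds) with a wall
# time of $2 seconds would still be running at reservation start $3
# (epoch seconds).
job_hits_reservation() {
    now=$1
    walltime=$2
    res_start=$3
    [ $((now + walltime)) -gt "$res_start" ]
}

# Possible usage on a cluster (assumes GNU date for the -d option):
#   start=$(scontrol show res -o | earliest_res_start)
#   if [ -n "$start" ] &&
#      job_hits_reservation "$(date +%s)" 3600 "$(date -d "$start" +%s)"
#   then
#       echo "A 1-hour job would run into the reservation at $start"
#   fi
```

A reservation that is already active has a StartTime in the past, so the check reports a collision in that case as well.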