In case you haven’t already done something similar: I made my job_submit.lua
less cumbersome by breaking it out into subsidiary functions and adding some
logic to detect whether I'm in test mode. Basic structure, with the subsidiary
functions defined ahead of slurm_job_submit():
=====
function fix_undefined_partition(job_desc)
    if (job_desc.partition == nil) then
        local default_partition = "batch"
        job_desc.partition = default_partition
        slurm.log_info(
            "slurm_job_submit: No partition specified, moved to batch.")
    end
end
…
function slurm_job_submit(job_desc, part_list, submit_uid)
    -- MYUID and MYUSERNAME are placeholders for a real UID and username
    local test_user_table = {}
    test_user_table[MYUID] = 'MYUSERNAME'
    -- local test_enabled = (test_user_table[submit_uid] ~= nil)
    local test_enabled = false
    if (test_enabled) then -- use logic for testing
        slurm.log_info("testing mode enabled")
        fix_undefined_partition(job_desc)
        …
    else -- use default logic for production
        fix_undefined_partition(job_desc)
        …
    end -- detect if testing or production
    return slurm.SUCCESS
end
=====
That way, I can make a table of users in the testing population, switch which
of the two "test_enabled" lines is commented out, and write new logic and call
new functions in the testing half of the "if (test_enabled)" block as needed.
Once everything is tested, I can quickly copy the updated logic into the
production half of the block and roll it out to the rest of the population.
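
A minimal sketch of how that pattern can be exercised outside slurmctld, for
anyone who wants to try it before touching a live controller. Everything that
is not in the snippet above is an assumption made for illustration: the stub
"slurm" table only imitates the two members used here, UID 1234 and the name
"alice" are placeholders, and in_test_population() is a hypothetical helper.
Note also that the real job_desc is not a plain Lua table, so this only checks
the control flow.

```lua
-- Hypothetical stand-in for the "slurm" table that slurmctld normally provides.
local logged = {}
slurm = {
    SUCCESS = 0,
    log_info = function(fmt, ...)
        logged[#logged + 1] = string.format(fmt, ...)
    end,
}

-- Same helper as in the structure above.
function fix_undefined_partition(job_desc)
    if job_desc.partition == nil then
        job_desc.partition = "batch"
        slurm.log_info(
            "slurm_job_submit: No partition specified, moved to batch.")
    end
end

-- Testing population, keyed by UID (placeholder UID and username).
local test_user_table = { [1234] = "alice" }

-- Hypothetical helper: true if the submitting UID is in the testing population.
local function in_test_population(uid)
    return test_user_table[uid] ~= nil
end

function slurm_job_submit(job_desc, part_list, submit_uid)
    if in_test_population(submit_uid) then
        -- logic under test goes here
        slurm.log_info("testing mode enabled")
        fix_undefined_partition(job_desc)
    else
        -- production logic goes here
        fix_undefined_partition(job_desc)
    end
    return slurm.SUCCESS
end

-- Drive it by hand with a plain table standing in for job_desc.
local job = {}
local rc = slurm_job_submit(job, {}, 1234)
print(rc, job.partition)
for _, line in ipairs(logged) do
    print(line)
end
```

Running this with a standalone Lua interpreter is only a sanity check of the
branching; the actual plugin still has to be tested under slurmctld.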
> On Jan 30, 2019, at 9:54 AM, Prentice Bisbal wrote:
>
> Miguel,
> Thanks for the reply. I've already thought about doing that, but I was hoping
> there was an easier, "more universal" way of doing that. Right now, I have a
> rather long job_submit.lua, which has made making changes in my environment
> cumbersome, so I'm trying to minimize my reliance on job_submit.lua as much
> as possible.
>
> It looks like the best way to do this is to use sacctmgr to make "general" the
> default QOS, which I just did.
> Prentice
> On 1/30/19 6:48 AM, Miguel Gila wrote:
>> Hi Prentice,
>>
>> You could add something like this to your job_submit.lua
>>
>> QOS_DEBUG = 'system_debug'
>> PARTITION_DEBUG = 'debug'
>> [...]
>> function slurm_job_submit(job_desc, part_list, submit_uid)
>> -- ------------------------ DEBUG/QOS -------------------------------
>> if (job_desc.partition) and (job_desc.partition == PARTITION_DEBUG) then
>>     -- tostring() guards against job_desc.qos being nil when no QOS was requested
>>     slurm.log_info("::slurm_job_submit partition DEBUG. Original QOS: %s, new QOS: %s",
>>                    tostring(job_desc.qos), QOS_DEBUG)
>>     job_desc.qos=QOS_DEBUG
>>     slurm.log_user("Setting QoS=%s for this job.",QOS_DEBUG)
>> end
>> [...]
>>
>> Hope this helps.
>>
>> Miguel
>>
>>> On 29 Jan 2019, at 16:27, Prentice Bisbal wrote:
>>>
>>> How does one assign a QOS to a partition? This is mentioned several
>>> different places in the Slurm documentation, but nowhere does it explain
>>> exactly how to do this.
>>> You can assign a QOS to a partition in slurm.conf like this:
>>> PartitionName=mypartition Nodes=node[001-100] QOS=myqos
>>> But that doesn't seem to really do much. And the explanation for defining a
>>> QOS in a partition definition, while rather vague, seems to state as much:
>>>
>>>> QOS
>>>> Used to extend the limits available to a QOS on a partition. Jobs will not
>>>> be associated to this QOS outside of being associated to the partition.
>>>> They will still be associated to their requested QOS. By default, no QOS
>>>> is used. NOTE: If a limit is set in both the Partition's QOS and the Job's
>>>> QOS the Partition QOS will be honored unless the Job's QOS has the
>>>> OverPartQOS flag set in which the Job's QOS will have priority.
>>>
>>> If I want to have every job that requests the partition "mypartition" use
>>> the QOS "myqos", how do I do that?
>>> Also, can someone please clarify the description of the QOS field in
>>> the partition definition that I quoted above?
>>> --
>>> Prentice
>>>
>>
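
For the original question (every job that requests "mypartition" should use
"myqos"), the job_submit.lua approach from this thread can be generalised to a
lookup table. This is only a sketch under stated assumptions: PARTITION_QOS
and apply_partition_qos() are hypothetical names, the stub "slurm" table is
there just so the snippet runs standalone, and it assumes the job requests a
single partition rather than a comma-separated list.

```lua
-- Hypothetical stand-in for the "slurm" table that slurmctld normally provides.
slurm = {
    SUCCESS = 0,
    log_info = function(fmt, ...)
        print(string.format(fmt, ...))
    end,
}

-- Partition name -> QOS to force for jobs in that partition.
-- "mypartition"/"myqos" come from the question; extend as needed.
local PARTITION_QOS = {
    mypartition = "myqos",
}

-- Hypothetical helper: override the job's QOS if its partition is in the table.
function apply_partition_qos(job_desc)
    local qos = job_desc.partition and PARTITION_QOS[job_desc.partition]
    if qos and job_desc.qos ~= qos then
        -- tostring() because job_desc.qos may be nil if no QOS was requested
        slurm.log_info("slurm_job_submit: partition %s, QOS %s -> %s",
                       job_desc.partition, tostring(job_desc.qos), qos)
        job_desc.qos = qos
    end
end

function slurm_job_submit(job_desc, part_list, submit_uid)
    apply_partition_qos(job_desc)
    return slurm.SUCCESS
end
```

Jobs in partitions that are not listed in the table keep whatever QOS they
requested, so the default QOS set through sacctmgr still applies to them.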