Hi Loris,
On 9/26/22 12:51, Loris Bennett wrote:
When designing restrictions in job_submit.lua, I found that no member of the
job_desc struct can be used directly to determine the number of nodes that
will finally be allocated to a job. job_desc.min_nodes looks like a close
answer, but it is 0xFFFFFFFE when the user does not specify the --nodes
option. In that case we think we can use job_desc.num_tasks and
job_desc.ntasks_per_node to calculate the node count, but ntasks_per_node
can likewise hold the default value 0xFFFE if the user does not set the
corresponding option.
So what is the complete and elegant way to predict a job's node count in
job_submit.lua in all cases, no matter how the user writes their submit options?
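For illustration, a rough sketch of that num_tasks/ntasks_per_node fallback
(untested; the sentinel values are the ones quoted above and should be checked
against the NO_VAL/NO_VAL16 definitions of your Slurm release, and the estimate
only works when both fields are actually set):

-- Rough sketch (untested): estimate the node count when --nodes is unset.
-- Sentinel values taken from the question above; verify against your release.
local NO_VAL   = 0xFFFFFFFE  -- job_desc.min_nodes when --nodes not given
local NO_VAL16 = 0xFFFE      -- job_desc.ntasks_per_node when not given

function slurm_job_submit(job_desc, part_list, submit_uid)
   local nodes = job_desc.min_nodes
   if nodes == NO_VAL then
      if job_desc.num_tasks ~= NO_VAL and
         job_desc.ntasks_per_node ~= NO_VAL16 then
         nodes = math.ceil(job_desc.num_tasks / job_desc.ntasks_per_node)
      else
         nodes = nil  -- not enough information at submit time
      end
   end
   -- ... apply the intended restriction using 'nodes' here ...
   return slurm.SUCCESS
end

function slurm_job_modify(job_desc, job_rec, part_list, modify_uid)
   return slurm.SUCCESS
end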
I don't think you can expect to know at submission time which node(s) a job
will eventually run on. How would this work? Resources may become available
earlier than Slurm expects if jobs finish before their time limit (or if they
crash). If you are using fairshare, jobs with a higher priority than the
currently waiting jobs can be scheduled.
What is your use case for needing to know which node the job will run on?
I think he meant the *number of nodes*, and not the *hostnames* of the
compute nodes selected by Slurm at a later time.
/Ole