On 9/26/22 08:48, taleinterve...@sjtu.edu.cn wrote:
When designing restrictions in job_submit.lua, I found that no member of the job_desc struct can be used directly to determine the number of nodes finally allocated to a job. job_desc.min_nodes seems to be the closest answer, but it is 0xFFFFFFFE when the user does not specify the --nodes option. In that case we thought we could use job_desc.num_tasks and job_desc.ntasks_per_node to calculate the node count. But again, ntasks_per_node may also hold its default value 0xFFFE if the user does not specify the related option.

The hex values that you quote are defined as symbols in Slurm: slurm.NO_VAL16, slurm.NO_VAL, and slurm.NO_VAL64, which are easier to understand :-) See my notes in https://wiki.fysik.dtu.dk/niflheim/Slurm_configuration#lua-functions-for-the-job-submit-plugin
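
A minimal sketch of how those symbols might be used in job_submit.lua to tell "not set by the user" apart from a real value. The helpers is_set and estimate_nodes are hypothetical names, and the fallback slurm table exists only so the snippet can also be run outside slurmctld. The result is merely a hint derived from the submit options, not the node count the scheduler will finally allocate.

```lua
-- Fallback stub so this sketch also runs outside slurmctld (an assumption
-- for standalone testing only); inside Slurm the real `slurm` table exists.
slurm = slurm or {
   NO_VAL   = 0xFFFFFFFE,  -- "unset" marker for 32-bit fields
   NO_VAL16 = 0xFFFE,      -- "unset" marker for 16-bit fields
   SUCCESS  = 0,
   log_info = function(fmt, ...) print(string.format(fmt, ...)) end,
}

-- Hypothetical helper: true when the user actually set the field.
local function is_set(value, no_val)
   return value ~= nil and value ~= no_val
end

-- Hypothetical helper: node-count hint taken from the submit options,
-- or nil when the options do not determine it.
function estimate_nodes(job_desc)
   if is_set(job_desc.min_nodes, slurm.NO_VAL) then
      return job_desc.min_nodes
   end
   if is_set(job_desc.num_tasks, slurm.NO_VAL)
      and is_set(job_desc.ntasks_per_node, slurm.NO_VAL16)
      and job_desc.ntasks_per_node > 0 then
      return math.ceil(job_desc.num_tasks / job_desc.ntasks_per_node)
   end
   return nil
end

function slurm_job_submit(job_desc, part_list, submit_uid)
   local nodes = estimate_nodes(job_desc)
   if nodes then
      slurm.log_info("job_submit: node hint from options: %d", nodes)
   else
      slurm.log_info("job_submit: node count left to Slurm defaults")
   end
   return slurm.SUCCESS
end

function slurm_job_modify(job_desc, job_rec, part_list, modify_uid)
   return slurm.SUCCESS
end
```

When estimate_nodes returns nil, the job simply relies on the sbatch defaults.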

So what is the complete and elegant way to predict a job's node count in job_submit.lua in all cases, no matter how users write their submit options?

The sbatch command provides defaults for nodes and tasks if these are not defined in the user's job script; see the sbatch manual page:

If -N is not specified, the default behavior is to allocate enough nodes to satisfy the requested resources as expressed by per-job specification options, e.g. -n, -c and --gpus.

and

-n The default is one task per node, but note that the --cpus-per-task option will change this default.

Therefore you do not need to guess the number of nodes and tasks in your job_submit.lua script.

/Ole
