puzzled two ways: why not use the numeric jobid,
I don't know the job id before the actual job submission. Hence I would like to
get some kind of placeholder, and `scommit` the job later with the actual
resource requirements given as comments in a usual jobscript.
OK, in that case, you can make up an arbitrary identifier
(a hash of user and time, etc), and simply pass that into the
job in the environment. (there is a religion of not passing state,
such as environment variables, into jobs, but it's just dogma...)
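something like this on the login node (a rough sketch only; the TICKET name,
the inputs/ directory and the /scratch layout are just assumptions):

    # make up a placeholder id and prepare the work area before submission
    TICKET=$(printf '%s %s' "$USER" "$(date +%s%N)" | sha1sum | cut -c1-12)
    mkdir -p /scratch/$TICKET
    cp inputs/* /scratch/$TICKET/          # stage files from the login node
    # hand the id to the job via its environment
    sbatch --export=ALL,TICKET=$TICKET jobscript.sh
    # jobscript.sh then simply does: cd /scratch/$TICKET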
and why would configuring
the scratch space be too slow to perform in the job prolog?
Access to /home from the nodes is highly discouraged; instead, the users
should prepare an area in /scratch beforehand (copy all the files for the job
there) and submit the job from there. The working directory of the job is then
automatically in the /scratch area (a fast parallel file system), so no further
file staging is needed. Essentially the nodes could work without a mounted /home.
there's no reason the prolog can't call standardized code that looks
for the relevant information and performs any staging (without human
intervention).
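for instance, a node prolog fragment along these lines (rough sketch; the
/staging drop-off convention is made up, while SLURM_JOB_ID and SLURM_JOB_USER
are provided by slurm in the prolog environment):

    # create the per-job scratch area and hand it to the user
    SCRATCH=/scratch/$SLURM_JOB_ID
    mkdir -p "$SCRATCH" && chown "$SLURM_JOB_USER" "$SCRATCH"
    # stage in whatever the user listed beforehand (assumed convention)
    MANIFEST=/staging/$SLURM_JOB_USER/manifest
    [ -f "$MANIFEST" ] && rsync -a --files-from="$MANIFEST" / "$SCRATCH/"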
Sure, `sblank`, which would provide a reserved job id, could run some prolog
that prepares the workspace and tells the user: please put your files in
/scratch/job-id-task-id. For the users this would mean issuing:
sblank
copy the files to the given location(s) from the login node
scommit
I don't see any harm in doing this, and it would require no assistance
from slurm: "sbatch --hold ...", then do your prep, then "scontrol release".
but I'm not sure what it really gets you.
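concretely, something like (jobscript.sh being whatever you would normally
submit anyway):

    JOBID=$(sbatch --hold --parsable jobscript.sh)   # queued, but held
    mkdir -p /scratch/$JOBID
    cp inputs/* /scratch/$JOBID/                     # prep from the login node
    scontrol release $JOBID                          # now eligible to run
    # inside jobscript.sh: cd /scratch/$SLURM_JOB_ID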
another approach would be to submit a dependent pair (or even triplet)
of jobs: data movement on either end and compute in the middle. one
attractive thing about this is that since the data movement would be
in dedicated jobs, you could handle them specially (run them on dedicated
nodes, rate-limit them, etc).
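roughly like this (the script names and an "xfer" partition for the
data-movement nodes are assumptions):

    IN=$(sbatch --parsable --partition=xfer stage_in.sh)
    RUN=$(sbatch --parsable --dependency=afterok:$IN compute.sh)
    # afterany, so stage-out (and cleanup) runs even if the compute job fails
    sbatch --partition=xfer --dependency=afterany:$RUN stage_out.sh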
of course, user pro/epilog is very much like this, but somewhat less
structured (and with potentially less queue time); possibly more wasteful.
and the users can be sure to find the job's files in /scratch/job-id-task-id,
and the admins can be sure that there is no access to /home slowing down the
cluster and interactive work on the login node.
sure, though you could make the name somewhat richer (username, account, etc)
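e.g. something like this inside the job (SLURM_JOB_ACCOUNT and SLURM_JOB_ID
are set by slurm; the layout itself is just a suggestion):

    WORKDIR=/scratch/$USER/$SLURM_JOB_ACCOUNT/$SLURM_JOB_ID
    mkdir -p "$WORKDIR" && cd "$WORKDIR"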
regards, mark hahn.