puzzled two ways: why not use the numeric jobid,

I don't know the job id before the actual job submission. Hence I would like to 
get some kind of placeholder, and `scommit` the job later with the actual 
resource requirements given as comments in a usual jobscript.


OK, in that case, you can make up an arbitrary identifier
(hash of user and time, etc), and simply pass that into the job in the
environment. (there is a religion of not passing state, such as environment
variables, into jobs, but it's just dogma...)
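
for example, a rough sketch (the paths, variable names and file names are just made up):

# invent an identifier, stage into a directory named after it,
# and hand it to the job through the environment
WORKID=$(echo "$USER$(date +%s%N)" | sha1sum | cut -c1-12)
mkdir -p /scratch/$USER/$WORKID
cp input.dat /scratch/$USER/$WORKID/
sbatch --export=ALL,WORKID=$WORKID --chdir=/scratch/$USER/$WORKID job.sh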

and why would configuring
the scratch space be too slow to perform in the job prolog?

Access to /home from the nodes is highly discouraged; instead the users 
should prepare an area in /scratch beforehand (copy all the files for the job 
into it) and submit the job from there. So the working directory of the job is 
automatically in the /scratch area (fast parallel file system) -- no further 
file staging needed. Essentially the nodes could work without a mounted /home.
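
concretely, the current workflow from the login node looks roughly like this 
(the paths and file names are only an example):

mkdir -p /scratch/$USER/myrun
cp ~/input.dat /scratch/$USER/myrun/
cd /scratch/$USER/myrun && sbatch job.sh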


there's no reason the prolog can't call standardized code that looks for the relevant information and performs any staging (without human intervention).
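
as a rough sketch (the /scratch layout is just an assumption), the prolog could contain something like:

SCRATCH=/scratch/job-$SLURM_JOB_ID
mkdir -p "$SCRATCH"
chown "$SLURM_JOB_USER" "$SCRATCH"
# ...plus whatever site-standard staging logic you want to run here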

Sure, `sblank`, which would provide a reserved job id, could have some prolog and 
prepare the workspace, telling the user: please put your files in 
/scratch/job-id-task-id. For the users this would mean issuing:

sblank
copy files to the given location(s) from the login node
scommit

I don't see any harm in doing this, which would require no assistance
from slurm: "sbatch --hold ...", then do your prep, then "scontrol release".
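
i.e. something like this (script name and scratch path invented):

JOBID=$(sbatch --hold --parsable job.sh)   # queued, but held so it won't start
mkdir -p /scratch/job-$JOBID
cp input.dat /scratch/job-$JOBID/          # stage from the login node
scontrol release $JOBID                    # now let it run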

but I'm not sure what it really gets you.

another approach would be to submit a dependent pair (or even triplet) of jobs: data movement on either end and compute in the middle. one attractive thing about this is that since the data movement would be in dedicated jobs, you could handle them specially (run them on dedicated
nodes, rate-limit them, etc).
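
for instance, as a sketch (the script names and the "xfer" partition are invented):

IN=$(sbatch --parsable -p xfer stage-in.sh)
RUN=$(sbatch --parsable --dependency=afterok:$IN compute.sh)
sbatch --dependency=afterok:$RUN -p xfer stage-out.sh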

of course, a user prolog/epilog is very much like this, but somewhat less structured (and with potentially less queue time). possibly more wasteful, since the compute nodes sit allocated while the data moves.

and the users can be sure to find the job's files in /scratch/job-id-task-id, 
and the admins can be sure that there is no access to /home slowing down the 
cluster and interactive work on the login node.

sure, though you could make the name somewhat richer (username, account, etc)
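
e.g. something like this inside the job script (purely illustrative, and it 
assumes slurm exports SLURM_JOB_ACCOUNT into the job environment):

WORKDIR=/scratch/$SLURM_JOB_ACCOUNT/$USER/job-$SLURM_JOB_ID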

regards, mark hahn.
