I'm helping with a workflow manager that needs to submit Slurm jobs. For logging and management reasons, the job (e.g. srun python) needs to behave as though it were a regular subprocess (i.e. just python):
- stdin, stdout and stderr for the command should be connected to the process inside the job
- signals sent to the command should be forwarded to the job process
- We don't want to use the existing job allocation, if this is run from inside a Slurm job
- The command should only terminate when the job is finished, to avoid us needing to poll Slurm

(There is a rough Python sketch of this behaviour at the end of this message.)

We've tried:
- sbatch --wait, but then SIGTERM'ing the process doesn't kill the job
- salloc, but that requires a TTY process to control it (?)
- salloc srun seems to mess with the terminal when it's killed, likely because it is "designed to be executed in the foreground"
- Plain srun re-uses the existing Slurm allocation, and specifying resources like --mem just requests them from the current job rather than submitting a new one

What is the best solution here?
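To make the requirements concrete, here is roughly the behaviour we are after, as a minimal Python sketch (run_job and the example command are placeholders, not our actual code):

    #!/usr/bin/env python3
    """Minimal sketch of the wrapper behaviour we want (illustrative only)."""
    import signal
    import subprocess
    import sys

    def run_job(cmd):
        # Launch the command (e.g. ["srun", "python", "train.py"]) with
        # stdin/stdout/stderr inherited from this wrapper process.
        proc = subprocess.Popen(cmd)

        # Forward termination signals to the child, so that killing the
        # wrapper should also end up killing the job process.
        def forward(signum, _frame):
            proc.send_signal(signum)

        for sig in (signal.SIGTERM, signal.SIGINT):
            signal.signal(sig, forward)

        # Block until the job itself has finished, so we never have to
        # poll Slurm for the job state.
        return proc.wait()

    if __name__ == "__main__":
        sys.exit(run_job(sys.argv[1:]))

The sticking point is which Slurm command to put in cmd so that the job lands in a new allocation and the signal/stdio plumbing actually reaches the process running inside it.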