On Mar 8, 2021, at 1:35 PM, slurm-users-requ...@lists.schedmd.com wrote:

What's happening is that SLURM_JOBID is not set (my speculation, since I don't 
have perms to use --no-alloc), but SLURM_NODELIST may be set, so it's 
confusing ORTE.
Could you list which SLURM env variables are set in the shell in which you're 
running the srun command?

Howard,

I believe you are correct.  Once I set SLURM_JOBID then ORTE starts functioning 
again with the --no-alloc option.  Since you asked (and for completeness) I 
include the list of environment variables that were different with/without 
--no-alloc below, but my tests show that jobid seems to be the magic one, as 
you predicted.

I guess I will manufacture an artificial job id for our “--no-alloc” runs, but 
if anyone is aware of any dangers lurking in the shadows from that approach I 
would be interested.
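
For anyone wanting to try the same workaround, here is a minimal sketch in Python of what "manufacturing" a job id could look like. The helper name env_with_job_id and the fallback value are hypothetical. Mirroring the value into SLURM_JOB_ID is an assumption on my part, since only SLURM_JOBID was tested above.

```python
def env_with_job_id(env, fallback_job_id):
    """Return a copy of env with SLURM_JOBID filled in if it is missing or empty.

    fallback_job_id is a hypothetical placeholder chosen by the caller; an
    existing SLURM_JOBID is never overwritten.
    """
    patched = dict(env)
    if not patched.get("SLURM_JOBID"):
        patched["SLURM_JOBID"] = str(fallback_job_id)
    # Assumption: keep SLURM_JOB_ID consistent with SLURM_JOBID. Only
    # SLURM_JOBID was confirmed to matter in the tests described above.
    patched.setdefault("SLURM_JOB_ID", patched["SLURM_JOBID"])
    return patched
```

The returned mapping could then be passed as env= to subprocess.run when launching the srun command, so the artificial id stays confined to that one child process rather than leaking into the login shell.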

Thanks for the guidance ... impressive that you could identify the issue so 
quickly!

chris

----------------------------------------------------------

SLURM_JOB_CPUS_PER_NODE=1
SLURM_JOB_ID=25300
SLURM_JOBID=25300
SLURM_JOB_NUM_NODES=1
SLURM_JOB_PARTITION=psfehq
SLURM_JOB_QOS=normal
SLURM_CPUS_ON_NODE=1
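
A comparison like the one above can be reproduced by capturing the environment in each case and diffing the two. The following is only a sketch: slurm_env_diff is a hypothetical helper, and it assumes each snapshot has already been loaded into a dict (for example from the output of env).

```python
def slurm_env_diff(env_a, env_b):
    """Return SLURM_* variables whose values differ between two snapshots.

    Each result maps a variable name to a (value_in_a, value_in_b) pair;
    None means the variable was absent from that snapshot.
    """
    names = sorted(set(env_a) | set(env_b))
    return {
        name: (env_a.get(name), env_b.get(name))
        for name in names
        if name.startswith("SLURM_") and env_a.get(name) != env_b.get(name)
    }
```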
