I can confirm the behavior you are reporting; we noticed it a number of 
months ago as well. The HOSTNAME variable is exported on the front-end/login 
node, so when the environment is copied out for the salloc, the variable is 
picked up and replicated on the compute node.  This most likely happens 
because the variable is exported rather than simply set.  Additionally, the 
/etc/profile script does NOT set HOSTNAME on the compute nodes, because the 
variable is already set:

/etc/profile:
…
if test -s /etc/HOSTNAME ; then
    test -z "$HOSTNAME" && HOSTNAME=`cat /etc/HOSTNAME`
else
    test -z "$HOSTNAME" && HOSTNAME=$HOST
fi
…
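
To illustrate why that guard leaves the login node's value in place, here is a 
minimal sketch. The function name apply_profile_guard is my own, and "mc-1" 
simply stands in for whatever /etc/HOSTNAME contains on the compute node; it 
is not taken from the real /etc/profile.

```shell
# Hypothetical stand-in for the guard in /etc/profile: assign HOSTNAME only
# when it is empty. $1 plays the role of the contents of /etc/HOSTNAME.
apply_profile_guard() {
    test -z "$HOSTNAME" && HOSTNAME=$1
    printf '%s\n' "$HOSTNAME"
}

# Case 1: HOSTNAME arrived from the login node, so the guard is skipped.
with_export=$(HOSTNAME=login-3; apply_profile_guard mc-1)

# Case 2: HOSTNAME is not set, so the guard assigns the local name.
without_export=$(unset HOSTNAME; apply_profile_guard mc-1)

echo "inherited from login node: $with_export"
echo "not inherited:             $without_export"
```

Each case runs in a subshell, so neither one changes HOSTNAME in your current 
shell.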

I have not tracked down what changed or when, but both of the conditions 
above must hold for this behavior to appear, so changing either one should 
fix it:
1) Make sure HOSTNAME is not being exported on the login node (probably the 
best way), or
2) Change /etc/profile to stop testing whether HOSTNAME is already set and 
just set it from /etc/HOSTNAME regardless.
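
As a rough sketch of the second option, the guard could be replaced with an 
unconditional assignment. The helper name set_hostname_from_file and the 
temporary file are assumptions for demonstration only; a real change would 
edit /etc/profile and read /etc/HOSTNAME directly. The commented-out line at 
the end is an untested idea for the first option, using env -u to drop the 
variable before salloc copies the environment.

```shell
# Hypothetical helper: set HOSTNAME unconditionally, without a test -z guard.
# $1 stands in for /etc/HOSTNAME.
set_hostname_from_file() {
    if test -s "$1"; then
        HOSTNAME=$(cat "$1")
    else
        HOSTNAME=$HOST
    fi
}

# Demonstration with a temporary file in place of /etc/HOSTNAME.
tmp=$(mktemp)
echo "mc-1" > "$tmp"
HOSTNAME=login-3                  # stale value inherited from the login node
set_hostname_from_file "$tmp"
fixed=$HOSTNAME
rm -f "$tmp"
echo "HOSTNAME after unconditional set: $fixed"

# Option 1 (untested assumption): remove HOSTNAME from the environment that
# salloc copies, e.g.
#   env -u HOSTNAME salloc -N 1 srun --pty bash -i
```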

So this is largely an OS/environment issue and less of a Slurm one.  I am 
curious, though, which OS you experienced this on.

Hope this helps.

Joshi Fullop
HPC-ENV
Los Alamos National Laboratory




From: slurm-users <slurm-users-boun...@lists.schedmd.com> On Behalf Of CB
Sent: Tuesday, July 10, 2018 10:55 AM
To: slurm-users@lists.schedmd.com
Subject: [slurm-users] Fwd: An issue with HOSTNAME env var when using 
salloc/srun for interactive job with Slurm 17.11.7

Hi,

We've recently upgraded to Slurm 17.11.7 from 16.05.8.

We noticed that the HOSTNAME environment variable does not reflect the 
compute node in an interactive job started with the salloc/srun command.
Instead, it still points to the submit host, although SLURMD_NODENAME 
reflects the correct compute node name.

$ salloc --immediate -p manycore --constraint=xeon64c --exclusive -O -N 1 --qos=high srun --pty bash -i
salloc: Granted job allocation 2291315
salloc: Waiting for resource configuration
salloc: Nodes mc-1 are ready for job

[user1@mc-1 test]$ echo $HOSTNAME
login-3

[user1@mc-1 test]$ echo $SLURMD_NODENAME
mc-1

Is this a bug introduced in 17.11.x, or something that has been there all 
along?  According to our user, it used to point to the compute node name.

BTW, if I test with a batch job, the HOSTNAME environment variable reflects 
the compute node name correctly.

Thanks,
- Chansup
