We had something similar happen when we migrated away from a Rocks-based cluster.  We used a script like the one attached, placed in /etc/profile.d, which was modeled heavily on something similar in Rocks.

You might need to adapt it a bit for your situation, but otherwise it's pretty straightforward.
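For the archive, here is a minimal sketch of the general approach (the key type and exact filenames here are illustrative, not necessarily what the attached script uses):

     # /etc/profile.d/ssh-key.sh (sketch): on a user's first login,
     # create a passwordless key pair and authorize it for
     # intra-cluster SSH.
     if [ ! -f "$HOME/.ssh/id_rsa" ]; then
         mkdir -p "$HOME/.ssh"
         chmod 700 "$HOME/.ssh"
         ssh-keygen -q -t rsa -N "" -f "$HOME/.ssh/id_rsa"
         cat "$HOME/.ssh/id_rsa.pub" >> "$HOME/.ssh/authorized_keys"
         chmod 600 "$HOME/.ssh/authorized_keys"
     fi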

Lloyd

--
Lloyd Brown
HPC Systems Administrator
Office of Research Computing
Brigham Young University
http://marylou.byu.edu



On 5/25/21 8:56 AM, Loris Bennett wrote:
Hi Ole,

Thanks for the links.

I have discovered that the users whose /home directories were migrated
from our previous cluster all seem to have a pair of keys which were
created along with files like '~/.bash_profile'.  Users who have been
set up on the new cluster don't have these files.

Is there some /etc/skel-like mechanism that will create passwordless
SSH keys when a user logs into the system for the first time?  It seems
increasingly likely to me that such a mechanism existed on our old
cluster.

Cheers,

Loris

Ole Holm Nielsen <ole.h.niel...@fysik.dtu.dk> writes:

Hi Loris,

I think you need, as others have pointed out, one of the following:

* SSH keys, see
https://wiki.fysik.dtu.dk/niflheim/SLURM#ssh-keys-for-password-less-access-to-cluster-nodes

* SSH host-based authentication, see
https://wiki.fysik.dtu.dk/niflheim/SLURM#host-based-authentication
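For reference, a minimal sketch of the host-based variant, assuming
stock OpenSSH (the wiki page above has the complete recipe, including
distributing the nodes' host keys via ssh_known_hosts):

     # /etc/ssh/sshd_config on the compute nodes (sketch):
     HostbasedAuthentication yes

     # /etc/ssh/shosts.equiv on the compute nodes (sketch):
     # one trusted source host per line, e.g. the login node's FQDN
     login.cluster.example.org

     # /etc/ssh/ssh_config on the login node (sketch):
     Host *
         HostbasedAuthentication yes
         EnableSSHKeysign yes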

/Ole

On 5/25/21 2:09 PM, Loris Bennett wrote:
Hi everyone,

Thanks for all the replies.

I think my main problem is that I expect logging in to a node on which
I have a running job to work with pam_slurm_adopt, but without any SSH
keys.  My assumption was that MUNGE takes care of the authentication,
since users' jobs start on nodes without the need for keys.

Can someone confirm that this expectation is wrong and, if possible,
explain why the analogy with jobs is incorrect?

I have a vague memory that this used to work on our old cluster with an
older version of Slurm, but I could be thinking of a time before we set
up pam_slurm_adopt.

Cheers,

Loris
Brian Andrus <toomuc...@gmail.com> writes:

Oh, you could also use ssh-agent to manage the keys, then use 'ssh-add
~/.ssh/id_rsa' to type the passphrase once for your whole session (from that
system).
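For example, assuming a stock OpenSSH client:

     $ eval "$(ssh-agent -s)"    # start an agent for this shell session
     $ ssh-add ~/.ssh/id_rsa     # enter the passphrase once
     $ ssh g003                  # later logins reuse the cached key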

Brian Andrus


On 5/21/2021 5:53 AM, Loris Bennett wrote:
Hi,

We have set up pam_slurm_adopt using the official Slurm documentation
and Ole's information on the subject.  It works for a user who has SSH
keys set up, albeit the passphrase is needed:

     $ salloc --partition=gpu --gres=gpu:1 --qos=hiprio --ntasks=1 --time=00:30:00 --mem=100
     salloc: Granted job allocation 7202461
     salloc: Waiting for resource configuration
     salloc: Nodes g003 are ready for job

     $ ssh g003
     Warning: Permanently added 'g003' (ECDSA) to the list of known hosts.
     Enter passphrase for key '/home/loris/.ssh/id_rsa':
     Last login: Wed May  5 08:50:00 2021 from login.curta.zedat.fu-berlin.de

     $ ssh g004
     Warning: Permanently added 'g004' (ECDSA) to the list of known hosts.
     Enter passphrase for key '/home/loris/.ssh/id_rsa':
     Access denied: user loris (uid=182317) has no active jobs on this node.
     Access denied by pam_slurm_adopt: you have no active jobs on this node
     Authentication failed.
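For reference, the setup in the official pam_slurm_adopt documentation
essentially amounts to an account line in /etc/pam.d/sshd on each
compute node; a sketch, not necessarily our exact file:

     # /etc/pam.d/sshd on the compute nodes (sketch):
     account    required     pam_slurm_adopt.so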

If SSH keys are not set up, then the user is asked for a password:

     $ squeue --me
                  JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
                7201647      main test_job nokeylee  R    3:45:24      1 c005
                7201646      main test_job nokeylee  R    3:46:09      1 c005
     $ ssh c005
     Warning: Permanently added 'c005' (ECDSA) to the list of known hosts.
     nokeylee@c005's password:

My assumption was that a user should be able to log into a node on which
they have a running job without any further ado, i.e. without having to
set anything else up or enter any credentials.

Is this assumption correct?

If so, how can I best debug what I have done wrong?

Attachment: ssh-key.sh
Description: application/shellscript
