Hi Loris,

I don't know whether this will solve your problem, but I think the nodes' SSH host keys should be gathered and distributed across the cluster. See my notes at
https://wiki.fysik.dtu.dk/niflheim/SLURM#ssh-keys-for-password-less-access-to-cluster-nodes
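
As a rough illustration of the "gathering" step, here is a minimal sketch that assumes ssh-keyscan is available and that the node names are known. The function name gather_host_keys, the KEYSCAN variable and the output path are made up for this example and are not taken from the wiki page. The resulting file would then still have to be distributed to the nodes, for example as /etc/ssh/ssh_known_hosts. This only deals with the host keys (the "Permanently added ... to the list of known hosts" warnings); whether it also removes the passphrase or password prompt depends on the rest of the SSH setup.

```shell
#!/bin/sh
# Sketch only: collect SSH host keys from a list of nodes into one file.
# KEYSCAN may be overridden (e.g. for a dry run); it defaults to ssh-keyscan.
KEYSCAN=${KEYSCAN:-ssh-keyscan}

gather_host_keys() {
    # $1: output file; remaining arguments: node names
    out=$1; shift
    tmp=$(mktemp) || return 1
    for node in "$@"; do
        # -t selects the key types to fetch; unreachable nodes are skipped silently
        $KEYSCAN -t ecdsa,ed25519 "$node" 2>/dev/null >> "$tmp"
    done
    # Remove duplicate entries and write the final file
    sort -u "$tmp" > "$out"
    rm -f "$tmp"
}

# Example call (node names taken from the session quoted below):
# gather_host_keys ./ssh_known_hosts g003 g004 c005
```

On a real cluster the node list could presumably come from Slurm itself (e.g. "sinfo -h -N -o %N") rather than being typed by hand.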

/Ole


On 21-05-2021 14:53, Loris Bennett wrote:
Hi,

We have set up pam_slurm_adopt using the official Slurm documentation
and Ole's information on the subject.  It works for a user who has SSH
keys set up, although the key's passphrase is still requested:

   $ salloc --partition=gpu --gres=gpu:1 --qos=hiprio --ntasks=1 --time=00:30:00 --mem=100
   salloc: Granted job allocation 7202461
   salloc: Waiting for resource configuration
   salloc: Nodes g003 are ready for job

   $ ssh g003
   Warning: Permanently added 'g003' (ECDSA) to the list of known hosts.
   Enter passphrase for key '/home/loris/.ssh/id_rsa':
   Last login: Wed May  5 08:50:00 2021 from login.curta.zedat.fu-berlin.de

   $ ssh g004
   Warning: Permanently added 'g004' (ECDSA) to the list of known hosts.
   Enter passphrase for key '/home/loris/.ssh/id_rsa':
   Access denied: user loris (uid=182317) has no active jobs on this node.
   Access denied by pam_slurm_adopt: you have no active jobs on this node
   Authentication failed.

If SSH keys are not set up, then the user is asked for a password:

   $ squeue --me
                JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
              7201647      main test_job nokeylee  R    3:45:24      1 c005
              7201646      main test_job nokeylee  R    3:46:09      1 c005
   $ ssh c005
   Warning: Permanently added 'c005' (ECDSA) to the list of known hosts.
   nokeylee@c005's password:

My assumption was that a user should be able to log in to a node on which
they have a running job without any further ado, i.e. without having to
set up anything else or enter any credentials.

Is this assumption correct?

If so, how can I best debug what I have done wrong?
