Hello Geert,
Thanks for your answer. Your first question made me realise that
only the SSH connection of the user who launched the job gets closed,
and that it actually closes at the end of the job.
So it is the script /etc/slurm/slurm.epilog.clean that closes the
connection. I will therefore stop it from running while we use the
master as a compute node.
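For reference, here is a minimal sketch of the kind of guard I have in
mind at the top of /etc/slurm/slurm.epilog.clean, rather than removing
the script outright (the hostname "master" is only a placeholder for
the head node's actual name):

    #!/bin/bash
    # Skip the per-user cleanup while the master also serves as a
    # compute node, so the epilog does not kill interactive SSH
    # sessions there.
    if [ "$(hostname -s)" = "master" ]; then
        exit 0
    fi
    # ... remainder of the stock slurm.epilog.clean, which kills the
    # job owner's remaining processes once they have no other jobs
    # running on the node

That way the cleanup still runs normally on any dedicated compute
nodes we add later.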
Thanks for pointing me in the right direction.
Best,
Alexandre
Geert Geurts <geert.geu...@dalco.ch> wrote:
Hi Alexandre,
It would be good to have a bit more information.
I have written down some questions that might be of interest, but you
probably know best what information is relevant to your problem.
Which SSH connection gets dropped exactly?
What OS and Slurm version are you using?
What does your job script look like?
What does your Slurm config look like?
Regards,
Geert
________________________________
From: alexandre.vid...@sichh.ch
Sent: Friday, December 1, 2017 13:22
To: slurm-users@lists.schedmd.com
Subject: [slurm-users] SSH closed by remote host when job starts
Hi everyone,
I currently have a setup consisting of a single node, which will
grow in the future. Everything works fine so far, except that
every time a job starts, the SSH session is closed automatically and
I have to log in again.
Here are the different logs:
Console:
sbatch launch_job.sh
Submitted batch job 42
[user@computer]$ Connection to 192.168.1.1 closed by remote host.
/var/log/secure:
Dec 1 12:30:27 computer systemd-logind: Removed session 114.
Dec 1 12:30:27 computer systemd: Removed slice User Slice of user.
Dec 1 12:30:27 computer systemd: Stopping User Slice of user.
It does not seem to be an error; however, the session is closed and I
have not found any parameter to prevent it.