Re: [slurm-users] Socket Timed Out on Send/Recv Operation

2019-04-18 Thread Janne Blomqvist
On 17/04/2019 18.54, Yang Liu wrote: > We often received errors due to socket time out on send/recv operation: > > slurm_load_jobs error: Socket timed out on send/recv operation > slurm_load_node: Socket timed out on send/recv operation > > > What could cause the errors? How likely job_submit.lu

[slurm-users] Socket Timed Out on Send/Recv Operation

2019-04-17 Thread Yang Liu
We often received errors due to socket time out on send/recv operation: slurm_load_jobs error: Socket timed out on send/recv operation slurm_load_node: Socket timed out on send/recv operation What could cause the errors? How likely job_submit.lua could cause such errors? We have a program runni

Re: [slurm-users] Socket timed out on send/recv operation

2018-10-20 Thread Chris Samuel
On Friday, 19 October 2018 4:58:37 AM AEDT Kirk Main wrote: > I'm a new administrator to Slurm and I've just got my new cluster up and > running. We started getting a lot of "Socket timed out on send/recv > operation" errors when submitting jobs, and also if you try to "squeue" > while others are

Re: [slurm-users] Socket timed out on send/recv operation

2018-10-18 Thread John Hearns
Kirk, MailProg=/usr/bin/sendmail MailProg should be the program used to SEND mail, i.e. /bin/mail, not sendmail. If I am not wrong, in the jargon MailProg is a MUA, not an MTA (sendmail is an MTA). On Thu, 18 Oct 2018 at 19:01, Kirk Main wrote: > Hi all, > > I'm a new administrator to Slurm a

[slurm-users] Socket timed out on send/recv operation

2018-10-18 Thread Kirk Main
Hi all, I'm a new administrator to Slurm and I've just got my new cluster up and running. We started getting a lot of "Socket timed out on send/recv operation" errors when submitting jobs, and also if you try to "squeue" while others are submitting jobs. The job does eventually run after about a m