Re: [slurm-users] backfill scheduler does not work for heterogeneous jobs (version 17.11)

2018-12-03 Thread Kenneth Roberts
after 15 minutes, but the third job requires only two nodes and 2 minutes, thus it can start immediately, but this does not happen. It seems there is a bug here. I also tried with the version 18.03, but it does not work either. Ana On Fri, 30 Nov 2018 at 17:46, Ken

Re: [slurm-users] backfill scheduler does not work for heterogeneous jobs (version 17.11)

2018-11-30 Thread Kenneth Roberts
The heterogeneous job support page has a Limitations section that mentions backfill. https://slurm.schedmd.com/heterogeneous_jobs.html#limitations Maybe there’s some information there to help? Ken From: slurm-users On Behalf Of Ana Jokanovic Sent: Thursday, November 29, 2018 4

Re: [slurm-users] Slurm / OpenHPC socket timeout errors

2018-11-26 Thread Kenneth Roberts
users On Behalf Of Kenneth Roberts Sent: Monday, November 26, 2018 9:38 AM To: slurm-users@lists.schedmd.com Subject: Re: [slurm-users] Slurm / OpenHPC socket timeout errors I wasn't looking closely enough at the times in the log file. c2: [2018-11-26T10:09:40.963] debu

Re: [slurm-users] Slurm / OpenHPC socket timeout errors

2018-11-26 Thread Kenneth Roberts
g out after 20 seconds. Back to finding out why ... From: slurm-users On Behalf Of Kenneth Roberts Sent: Monday, November 26, 2018 8:35 AM To: slurm-users@lists.schedmd.com Subject: Re: [slurm-users] Slurm / OpenHPC socket timeout errors Here is the debug log on a node (c2) when the job fa

Re: [slurm-users] Slurm / OpenHPC socket timeout errors

2018-11-26 Thread Kenneth Roberts
ors reading slurm.conf ... Continuing the search ... From: slurm-users On Behalf Of Kenneth Roberts Sent: Friday, November 23, 2018 4:15 PM To: slurm-users@lists.schedmd.com Subject: [slurm-users] Slurm / OpenHPC socket timeout errors Hi - I have the following on a new cluster

[slurm-users] Slurm / OpenHPC socket timeout errors

2018-11-23 Thread Kenneth Roberts
Hi - I have the following on a new cluster with OpenHPC & Slurm built off the latest recipe and packages from OpenHPC (built this week). One master node and 4 compute nodes. NodeName=c[1-4] Sockets=2 CoresPerSocket=10 ThreadsPerCore=1 State=UNKNOWN With simple test scripts, sbatch prod