Hi –
The time stamps show that your 1st sbatch job components start at the same time
and then run for 1 minute.
30 seconds after the simultaneous end of all three components of the 1st
sbatch, the two components of the 3rd sbatch and the three components of the
2nd all start. The two com
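(If it helps to double-check, the same start/end stamps can be pulled from accounting; the job id below is a placeholder:)

sacct -j 1234 --format=JobID,JobName,Start,End,Elapsed,State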
Made the change to the gres.conf file on the local server and restarted slurmd
and slurmctld on the master. Unfortunately, same error...
Distributed the corrected gres.conf to all k20 servers and restarted slurmd and
slurmctld... Still the same error...
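For reference, the restart sequence was roughly the following (assuming systemd-managed daemons; the hostname is a placeholder):

# on the controller (master)
systemctl restart slurmctld

# on each k20 compute node, after copying out the corrected gres.conf
scp /etc/slurm/gres.conf k20-01:/etc/slurm/gres.conf
ssh k20-01 systemctl restart slurmd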
On Mon, Dec 3, 2018 at 4:04 PM Brian W. Johanson wrote:
Is that a lowercase k in k20 specified in the batch script and NodeName, and an
uppercase K specified in gres.conf?
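Something like this is what I'm getting at: the type string has to agree in case everywhere it appears (the node name, device files and counts below are made-up placeholders):

# slurm.conf (same copy on controller and nodes)
NodeName=k20-01 Gres=gpu:k20:2 ...

# gres.conf on the node
NodeName=k20-01 Name=gpu Type=k20 File=/dev/nvidia[0-1]

# batch script
#SBATCH --gres=gpu:k20:1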
On 12/03/2018 09:13 AM, Lou Nicotra wrote:
Hi All, I have recently set up a slurm cluster with my servers and I'm running
into an issue while submitting GPU jobs. It has something t
Here you go... Thanks for looking into this...
lnicotra@tiger11 run# scontrol show config
Configuration data as of 2018-12-03T15:39:51
AccountingStorageBackupHost = (null)
AccountingStorageEnforce = none
AccountingStorageHost = panther02
AccountingStorageLoc = N/A
AccountingStoragePort = 681
Are you willing to paste an `scontrol show config` from the machine
having trouble?
On Mon, Dec 3, 2018 at 12:10 PM Lou Nicotra wrote:
>
> I'm running slurmd version 18.08.0...
>
> It seems that the system recognizes the GPUs after a slurmd restart. I tuned
> debug to 5, restarted and then submit
I'm running slurmd version 18.08.0...
It seems that the system recognizes the GPUs after a slurmd restart. I
tuned debug up to 5, restarted, and then submitted a job. Nothing gets logged to
the log file on the local server...
[2018-12-03T11:55:18.442] Slurmd shutdown completing
[2018-12-03T11:55:18.484] debug:
Do you get anything additional in the slurm logs? Have you tried
adding gres to the DebugFlags? What version of slurm are you running?
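If it helps, gres debugging can be switched on without a full reconfiguration (flag and level names below are the standard ones; hedging that your build supports them):

# add the gres debug flag on the controller on the fly
scontrol setdebugflags +Gres

# or persistently in slurm.conf, picked up after a restart
DebugFlags=Gres
SlurmdDebug=debug5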
On Mon, Dec 3, 2018 at 9:18 AM Lou Nicotra wrote:
>
> Hi All, I have recently set up a slurm cluster with my servers and I'm
> running into an issue while submi
What does scontrol show partition EMERALD give you? I’m assuming its
AllowAccounts output won’t match your /etc/slurm/parts settings.
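For example (the partition name comes from your mail; the account below is a placeholder):

scontrol show partition EMERALD | grep -i allow

# if AllowAccounts is stale, it can be corrected on the fly
scontrol update PartitionName=EMERALD AllowAccounts=youraccount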
> On Dec 2, 2018, at 12:34 AM, Mahmood Naderan wrote:
>
> Hi
> Although I have created an account and associated it with a partition, the
> submitted job re
Hi All, I have recently set up a slurm cluster with my servers and I'm
running into an issue while submitting GPU jobs. It has something to do
with gres configurations, but I just can't seem to figure out what is
wrong. Non-GPU jobs run fine.
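A stripped-down version of the kind of GPU submission that fails looks roughly like this (the partition name, gres string and time limit are placeholders rather than my exact script):

#!/bin/bash
#SBATCH --partition=k20        # placeholder partition name
#SBATCH --gres=gpu:1           # request one GPU through gres
#SBATCH --time=00:05:00
srun nvidia-smi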
The error is as follows:
sbatch: error: Batch job submi
Hi Ken,
I have read this page and understood that, in the case of my example, the third
job should be backfilled. The second job can start after 15 minutes, but
the third job requires only two nodes and 2 minutes, so it could start
immediately, yet this does not happen.
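(As a quick check, with placeholder job ids, the start times the backfill scheduler projects and the parameters bounding it can be inspected with:)

squeue --start -j 1001,1002,1003
scontrol show config | grep -i SchedulerParameters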
In the page that you referred