Re: [slurm-users] sacct: job state code CANCELLED+

2019-11-16 Thread Uwe Seher
Hello! I thought it had a deeper meaning, because some suffixes exist for node states. Thank you all! Uwe Seher On Sat, 16 Nov 2019 at 07:39, Chris Samuel wrote: > On Friday, 15 November 2019 2:13:15 AM PST Loris Bennett wrote: > > > If the contents of the column are wi
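The quoted explanation is cut off in this listing. As a rough illustration only, assuming the trailing "+" marks a value that was cut to fit a fixed column width (which is how sacct output usually behaves), the effect can be imitated in a few lines of Python. The function name `fit_column` and the uid `1000` are made up for this sketch; this is not Slurm's own code.

```python
def fit_column(value: str, width: int) -> str:
    """Imitate a fixed-width column: cut over-long values and mark them with '+'."""
    if len(value) <= width:
        # Short values are padded to the column width.
        return value.ljust(width)
    # Over-long values keep width-1 characters plus a '+' marker.
    return value[: width - 1] + "+"


# Hypothetical full state string; the uid 1000 is invented for illustration.
full_state = "CANCELLED by 1000"

print(fit_column(full_state, 10))  # CANCELLED+
print(fit_column(full_state, 20))  # full text fits, padded with spaces
```

With sacct itself, a wider column can be requested through the format string, for example `--format=JobID,JobName,State%20`, so that the full state text is shown.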

[slurm-users] sacct: job state code CANCELLED+

2019-11-15 Thread Uwe Seher
+ batch CANCELLED 2019-11-14T11:28:39 2019-11-14T13:12:07 01:43:28 115
277.0 orted FAILED 2019-11-14T11:28:58 2019-11-14T13:12:06 01:43:08 1 1
Best regards Uwe Seher

Re: [slurm-users] Problem with accounting/slurmdbd

2019-11-13 Thread Uwe Seher
mand. > > If the issue persists, restart mysql and slurmdbd. > > Brian Andrus > On 11/11/2019 2:10 AM, Uwe Seher wrote: > > Hello! > I would like to use accounting via slurmdbd/mariadb and have some problems with > the connection to the database. > When I try to connect via sacct

Re: [slurm-users] Problem with accounting/slurmdbd

2019-11-13 Thread Uwe Seher
ommand. > > If the issue persists, restart mysql and slurmdbd. > > Brian Andrus > On 11/11/2019 2:10 AM, Uwe Seher wrote: > > Hello! > I would like to use accounting via slurmdbd/mariadb and have some problems with > the connection to the database. > When I try to connect via sacc

[slurm-users] Problem with accounting/slurmdbd

2019-11-11 Thread Uwe Seher
eems to be established. But I cannot do any configuration, so no further logging is possible. Below is some further information. Thank you in advance for any hints concerning this issue. Regards Uwe Seher The accounting setup in slurm.conf is the following: # ACCOUNTING JobAcct

Re: [slurm-users] Running job is canceled when starting a new job from queue

2019-10-29 Thread Uwe Seher
not know if other distributions do the same or if the script is broken, but removing it solved the problem. Thank you! Uwe Seher On Mon, 28 Oct 2019 at 15:47, Uwe Seher wrote: > Hello! > I cannot find any hints of oom-kills, but it is systemd, so I may need a > little more time

Re: [slurm-users] Running job is canceled when starting a new job from queue

2019-10-28 Thread Uwe Seher
. The tasks are running without a time limit, so this should not be the reason. Thank you for now; when I get some more information I'll get back here. Uwe Seher On Mon, 28 Oct 2019 at 14:06, Lech Nieroda < lech.nier...@uni-koeln.de> wrote: > Hello Uwe, > > when the

[slurm-users] Running job is canceled when starting a new job from queue

2019-10-28 Thread Uwe Seher
. Thank you in advance Uwe Seher
slurmctld.log:
[2019-10-27T06:33:27.735] debug: sched: Running job scheduler
[2019-10-27T06:33:54.970] debug: backfill: beginning
[2019-10-27T06:33:54.970] debug: backfill: 1 jobs to backfill
[2019-10-27T06:34:02.328] _job_complete: JobID=160 State=0x1 NodeCnt=1

[slurm-users] Monitoring slurm with icinga2

2019-10-24 Thread Uwe Seher
know the state of the slurm controller and the nodes and, if possible, check for hanging/pending jobs or jobs running into a time limit. Is there a plugin out there that can handle some of these tasks? An hour of googling around didn't turn up anything. Thank you in advance Uwe Seher
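For the node-state part of this request, a check could be sketched along the following lines. This is a minimal sketch and not an existing plugin: it assumes `sinfo` is on the PATH of the monitoring user, it uses the usual Nagios/Icinga exit-code convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN), and the names `evaluate`, `main` and `BAD_STATES` as well as the choice of which states count as bad are illustrative assumptions. It does not cover the controller or pending-job checks.

```python
import subprocess
from typing import Tuple

# Nagios/Icinga plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# Assumption: which node states count as a problem is a local policy decision.
BAD_STATES = ("down", "drain", "fail")


def evaluate(sinfo_output: str) -> Tuple[int, str]:
    """Turn 'hostname state' lines into a plugin exit code and a status message."""
    states = {}
    for line in sinfo_output.splitlines():
        parts = line.split()
        if len(parts) == 2:
            # A node listed in several partitions appears more than once; keep one entry.
            states[parts[0]] = parts[1]
    if not states:
        return UNKNOWN, "UNKNOWN - no nodes reported"
    bad = sorted(
        f"{host}={state}"
        for host, state in states.items()
        if state.lower().startswith(BAD_STATES)
    )
    if bad:
        return CRITICAL, "CRITICAL - " + ", ".join(bad)
    return OK, f"OK - {len(states)} nodes healthy"


def main() -> int:
    """Query sinfo (one line per node: hostname and state) and report the result."""
    try:
        out = subprocess.run(
            ["sinfo", "-h", "-N", "-o", "%n %T"],
            capture_output=True, text=True, check=True, timeout=30,
        ).stdout
    except (OSError, subprocess.SubprocessError) as exc:
        print(f"UNKNOWN - sinfo failed: {exc}")
        return UNKNOWN
    code, message = evaluate(out)
    print(message)
    return code
```

To use it as a check command, the script would end with `sys.exit(main())` (after `import sys`) so that the monitoring system sees the exit code. Keeping the parsing in `evaluate` separate from the `sinfo` call makes it possible to test the logic on sample text without a running cluster.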