We use `sacctmgr list stats` for our slurmdbd check. Our Nagios check is:

    # Query slurmdbd; a non-zero exit status is treated as slurmdbd being unreachable.
    RESULT=$(/usr/local/slurm/latest/bin/sacctmgr list stats)
    if [ $? -ne 0 ]
    then
        echo "ERROR: cannot connect to database"
        exit 2    # Nagios CRITICAL
    fi
    # Print only the first four lines of the stats output as the status text.
    echo "$RESULT" | head -n 4
    exit 0        # Nagios OK

Sean

________________________________
From: slurm-users <slurm-users-boun...@lists.schedmd.com> on behalf of Christoph Brüning <christoph.bruen...@uni-wuerzburg.de>
Sent: Thursday, 10 June 2021 16:54
To: slurm-users@lists.schedmd.com <slurm-users@lists.schedmd.com>
Subject: [EXT] Re: [slurm-users] Is there a scontrol ping slurmdbd?

Hello,

I'd usually use some simple sacctmgr command. Something like
"sacctmgr list cluster".

That said, we're running a single slurmdbd instance since slurmctld
does some caching etc. Do you have multiple that you need to check
individually?

Cheers,
Christoph

On 09/06/2021 23.23, Heitor wrote:
> Hello,
>
> The docs about scontrol say that the command `scontrol ping` allows
> one to query if slurmctld nodes are up and running.
>
> I'm wondering if there's something analogous for slurmdbd? A command to
> check if slurmdbd nodes are up and running? I couldn't find it in the
> docs.
>
> Kind regards,
> Heitor
>

--
Dr. Christoph Brüning
Universität Würzburg
HPC & DataManagement @ ct.qmat & RZUW
Am Hubland
D-97074 Würzburg
Tel.: +49 931 31-80499