Re: [slurm-users] SlurmDBD losing connection to the backend MariaDB

2022-11-02 Thread Richard Chang
Hello Brian, Thank you for the reply and for sharing your design. Could you please share the details of your MariaDB server HA setup? (This can be offline, by DM to me.) I would like to understand it so that I can replicate it here. Thanks & regards, Richard. On 11/2/2022 8:09 AM, Brian Andrus wrote: RC, In t

Re: [slurm-users] SlurmDBD losing connection to the backend MariaDB

2022-11-01 Thread Brian Andrus
RC, In that scenario, the backup slurmdbd would take over, but then its database would not necessarily be in sync with the 'main' database (hence the warnings/info about it in the documentation). For my setup, I have 2 slurmdbd hosts, but they both connect to the same, separate, MariaDB serv

Re: [slurm-users] SlurmDBD losing connection to the backend MariaDB

2022-11-01 Thread Richard Chang
Does that mean it is best to use a single slurmdbd host in my case? My primary slurmctld is the backup slurmdbd host, and my worry is this: if the primary slurmdbd host (which is also the MariaDB server) goes down, will the backup slurmdbd be able to cache data and wait until MariaDB catches up?

Re: [slurm-users] SlurmDBD losing connection to the backend MariaDB

2022-11-01 Thread Brian Andrus
Ole, Fair enough, it is actually slurmctld that does the caching. Technical typo on my part there. Just trying to let the user know, there is a window that they have to ensure no information is lost during a database outage. Brian Andrus On 11/1/2022 1:43 AM, Ole Holm Nielsen wrote: Hi Br

Re: [slurm-users] SlurmDBD losing connection to the backend MariaDB

2022-11-01 Thread Ole Holm Nielsen
Hi Brian, On 11/1/22 05:28, Brian Andrus wrote: It caches up to a point. As I understand it, that is about an hour (depending on size and how busy the cluster is, as well as available memory, etc). Have you found any documentation of slurmdbd caching? It's well-known that slurmctld caches i

Re: [slurm-users] SlurmDBD losing connection to the backend MariaDB

2022-10-31 Thread Brian Andrus
It caches up to a point. As I understand it, that is about an hour (depending on size and how busy the cluster is, as well as available memory, etc). Brian Andrus On 10/31/2022 9:20 PM, Richard Chang wrote: Hi, Just for my info, I would like to know what happens when SlurmDBD loses connect

[slurm-users] SlurmDBD losing connection to the backend MariaDB

2022-10-31 Thread Richard Chang
Hi, Just for my info, I would like to know what happens when SlurmDBD loses its connection to the backend database, for example MariaDB. Does it cache the accounting info and keep it until the DB comes back up, or does it panic and shut down? Thank you, RC.
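
As a rough aid for reasoning about the caching window discussed in this thread (slurmctld queues accounting messages while the database side is unreachable, "about an hour" depending on how busy the cluster is), here is a minimal back-of-the-envelope sketch. It rests on assumptions that are not stated in the thread: that the queue limit is governed by the MaxDBDMsgs option in slurm.conf, and that its default is the larger of 10000 and MaxJobCount * 2 + node count * 4. Please check both against the slurm.conf man page for your Slurm version. The message rate is a purely hypothetical figure that you would have to measure on your own cluster.

```python
def max_dbd_msgs(max_job_count: int, node_count: int, floor: int = 10_000) -> int:
    """Assumed default queue limit: the larger of `floor` and
    2 * max_job_count + 4 * node_count (verify against your slurm.conf man page)."""
    return max(floor, 2 * max_job_count + 4 * node_count)


def outage_window_seconds(queue_limit: int, msgs_per_second: float) -> float:
    """Estimate how long the queue lasts at a steady accounting-message rate."""
    if msgs_per_second <= 0:
        raise ValueError("msgs_per_second must be positive")
    return queue_limit / msgs_per_second


if __name__ == "__main__":
    # Hypothetical cluster: MaxJobCount=10000, 500 nodes, 5 messages per second.
    limit = max_dbd_msgs(max_job_count=10_000, node_count=500)
    window = outage_window_seconds(limit, msgs_per_second=5.0)
    print(f"queue limit: {limit} messages, roughly {window / 60:.0f} minutes of outage")
```

The point of the estimate is only that the window shrinks as the cluster gets busier, so a fixed figure such as "an hour" should not be relied on without measuring the actual message rate.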