We've settled on using a GlusterFS file system to roll out an HA Slurm controller. Over the last year we've averaged 88,000 job submissions per day, though a typical day is lower than that (10-20K). Disk activity on the existing state database seems to max out around 40-50 IO/s, with peak disk usage under 700MB.
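For scale, those daily counts can be converted to per-second submission rates. This is only a back-of-envelope sketch: the helper name is made up for illustration, and it assumes submissions are spread evenly across the day, which real workloads are not.

```python
# Back-of-envelope conversion of daily job submissions to jobs per second.
# Assumption: submissions are spread evenly over 24 hours (real load is bursty).
SECONDS_PER_DAY = 24 * 60 * 60


def per_second(jobs_per_day):
    """Return the average submission rate in jobs per second."""
    return jobs_per_day / SECONDS_PER_DAY


average_rate = per_second(88_000)   # yearly average quoted above
typical_low = per_second(10_000)    # low end of a typical day
typical_high = per_second(20_000)   # high end of a typical day

print(f"average: {average_rate:.2f} jobs/s")
print(f"typical: {typical_low:.2f}-{typical_high:.2f} jobs/s")
```

So the sustained rate is on the order of one submission per second, and bursts are what would actually stress the shared storage.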
We're replacing that with two controller hosts (eventually configured as an HA pair) and a DBD host. I've spun up a 3-replica GlusterFS mirror across these hosts for the state database. The physical disks backing this storage are all SSDs. Has anyone run into problems with GlusterFS for the state database, or have any hints or tips? Any recommended tunings?

Thanks much - Michael