Hi there,
We've updated to 23.11.6 and replaced MUNGE with SACK.
Performance and stability have both been pretty good, but we're
occasionally seeing this in slurmctld.log:
[2024-05-07T03:50:16.638] error: decode_jwt: token expired at 1715053769
[2024-05-07T03:50:16.638] error: cred_p_unpa
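
For reference, the number in the decode_jwt message looks like a Unix epoch timestamp, so it can be converted and compared with the log line's own timestamp. The sketch below is only an illustration: the helper name expiry_utc is made up, and it assumes the slurmctld log timestamps are in UTC, which the log excerpt does not state. If the controller logs in local time, the gap will differ by the UTC offset.

```python
from datetime import datetime, timezone

def expiry_utc(epoch_seconds: int) -> datetime:
    """Convert a Unix epoch value (seconds) to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)

# Expiry value copied from the decode_jwt log line.
expiry = expiry_utc(1715053769)

# Timestamp of the log line itself. ASSUMPTION: the log is written in UTC.
logged = datetime.fromisoformat("2024-05-07T03:50:16.638").replace(tzinfo=timezone.utc)

# How long after expiry the token was seen, in seconds.
gap_seconds = (logged - expiry).total_seconds()

print(expiry.isoformat())  # 2024-05-07T03:49:29+00:00
print(gap_seconds)
```

Under that UTC assumption the token was presented a little under a minute after it expired. That could point to clock skew between nodes or to a credential that sat in a queue, but the log alone does not say which.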
You can try DRBD
https://linbit.com/drbd/
or a shared-disk (clustered) file system like GFS2, OCFS2, etc.
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/9/html-single/configuring_gfs2_file_systems/index
https://docs.oracle.com/en/operating-systems/oracle-linux/9/shareadmin/shareadm