Thank you to everyone who contributed to CASSANDRA-21293 and CASSANDRA-21290 to incorporate Chris's feedback and improve the reliability of this feature!
Coming back to defaults, Chris mentions using a default minimum value of 3 hours as the amount of time a node is allowed to be down if any tables in the cluster have a very small or 0 gc_grace_seconds. I think this makes sense as a default value for the cassandra_latest.yaml implementation. Are there any concerns with using 3 hours or suggestions for an alternative value?

On Wed, Mar 25, 2026 at 12:42 PM Isaac Reath <[email protected]> wrote:

> Happy to add in the docs into the PR for CASSANDRA-21247 if there's
> nothing already available.
>
> On Wed, Mar 25, 2026 at 12:36 PM Štefan Miklošovič <[email protected]>
> wrote:
>
>> Hi Chris,
>>
>> If you have some time to put a patch together with these improvements
>> that would be great. I can definitely review.
>>
>> Regards
>>
>> On Wed, Mar 25, 2026 at 5:24 PM Chris Lohfink <[email protected]>
>> wrote:
>> >
>> > We enabled this across our fleet. We did make a couple small tweaks
>> > we might wanna consider
>> > 1. (important one) if the process shuts down mid write you can end
>> > up with a corrupt json hint file then the process refuses to start
>> > up. We added fallback to the timestamp of the file and an atomic
>> > write.
>> > 2 is we made it a minimum of 3 hours which was because we do have a
>> > lot of things that are set to 0 (or very short) gc_grace in the
>> > fleet and that we don't care about. There should probably be a
>> > setting for minimum threshold otherwise they can't really do
>> > anything other than delete heartbeat after every restart
>> > 3. add some documentation to evaluate and delete heartbeat if its
>> > blocking startup
>> >
>> > On Wed, Mar 25, 2026 at 10:17 AM Štefan Miklošovič
>> > <[email protected]> wrote:
>> >>
>> >> Hi Isaac,
>> >>
>> >> I am fine with having that property set to true in
>> >> cassandra_latest.yaml only.
>> >>
>> >> Regards
>> >>
>> >> On Tue, Mar 24, 2026 at 10:05 PM Isaac Reath <[email protected]>
>> >> wrote:
>> >> >
>> >> > Hi all,
>> >> >
>> >> > There’s ongoing interest in preventing nodes from starting after
>> >> > being offline longer than gc_grace_seconds, to avoid data
>> >> > resurrection issues.
>> >> >
>> >> > This is already supported via `check_data_resurrection.enabled`
>> >> > (added in 4.1 via CASSANDRA-17180), but it remains disabled by
>> >> > default. Recent discussion in CASSANDRA-21221 suggests that
>> >> > operators may be unaware of this setting and end up
>> >> > reimplementing similar safeguards themselves.
>> >> >
>> >> > Given that this feature has now been available in 4.1 and 5.0,
>> >> > I'd like to propose enabling it by default in
>> >> > cassandra_latest.yaml for 6.0. Are there any concerns with making
>> >> > this change?
>> >> >
>> >> > Isaac
>> >
