Hey folks
Helping nudge this along - any dissent? Anyone with concerns they’d like addressed before we move towards a vote?
- Jeff

On Jul 26, 2025, at 12:35 PM, Ekaterina Dimitrova <e.dimitr...@gmail.com> wrote:
Hi,
Just wanted to mention that we already have some config related to duplicate keys in cassandra.yaml. (That doesn’t mean we cannot extend/improve things further, of course) From our docs:
CASSANDRA-17379 was opened to improve the user experience and deprecate the overloading. By default, we refuse starting Cassandra with a config containing both old and new config keys for the same parameter. Start Cassandra with -Dcassandra.allow_new_old_config_keys=true to override. For historical reasons duplicate config keys in cassandra.yaml are allowed by default, start Cassandra with -Dcassandra.allow_duplicate_config_keys=false to disallow this. Please note that key_cache_save_period, row_cache_save_period, counter_cache_save_period will be affected only by -Dcassandra.allow_duplicate_config_keys.
There are people who actively use the overloading in their environments so backward compatibility is important.
Best regards, Ekaterina

That's a really good idea! I have updated the CEP to include this duplicate_key_policy functionality and corresponding test scenarios.
Another hit from the DevOps request backlog. I'm glad this has finally turned into something formal. This will make CI/CD much easier. One thing I hope this fosters more of is the sharing of configs. For example, "here are my recommended storage settings for EBS."
The CEP aborts on any duplicate key. For the people doing GitOps, there will be a need for layering a read-only baseline `cassandra.yaml` with environment-specific or secret files. This is exactly the way Argo CD/Helm handles value precedence. This is typical in cloud native environments. I propose a modification that allows an *opt-in* policy switch:
duplicate_key_policy: { ABORT (default) | LAST_WINS | WARN }
* ABORT – current behaviour, startup fails on duplicates.
* WARN – duplicates allowed, last key wins, but every override is logged.
* LAST_WINS – same as WARN, but without logging.
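
To make the three policies concrete, here is a rough Python sketch of how layered files might be merged. It is only an illustration, not the CEP's implementation: `merge_layers`, the `DuplicateKeyPolicy` enum, and the file names are hypothetical, and each layer is assumed to be already parsed into a flat dict of top-level keys.

```python
import logging
from enum import Enum

log = logging.getLogger("config_merge")


class DuplicateKeyPolicy(Enum):
    # Names mirror the proposal; the enum itself is hypothetical.
    ABORT = "ABORT"
    WARN = "WARN"
    LAST_WINS = "LAST_WINS"


def merge_layers(layers, policy=DuplicateKeyPolicy.ABORT):
    """Merge (source_name, settings) pairs in order.

    Only top-level keys are compared. Later layers override earlier
    ones unless the policy is ABORT.
    """
    merged = {}
    origin = {}  # key -> name of the layer that last set it
    for source, settings in layers:
        for key, value in settings.items():
            if key in merged:
                if policy is DuplicateKeyPolicy.ABORT:
                    raise ValueError(
                        f"duplicate key '{key}' in {source} "
                        f"(first defined in {origin[key]})"
                    )
                if policy is DuplicateKeyPolicy.WARN:
                    log.warning(
                        "%s overrides '%s' from %s", source, key, origin[key]
                    )
                # LAST_WINS: override silently.
            merged[key] = value
            origin[key] = source
    return merged
```

With a baseline layer followed by an environment-specific layer, `LAST_WINS` and `WARN` both end up with the environment's value, and `ABORT` raises on the first key that appears twice.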
I suppose the only concern would be maintaining this version in alignment with what's going into the main cassandra.yaml as part of the regular development.
Seems like it'd be relatively easy to script something that'll generate modularized config files based on the reference cassandra.yaml by classifying different parameters based on file grouping. Not to scope creep or add on to the CEP or anything, just thinking out loud; as follow up work it could be useful.
From a technical perspective, a bi-directional sync shouldn't be too complex: going from monolithic -> modular, it would just dump anything not classified in the script into an "overflow" file, and going from modular -> monolithic, it would tack those parameters onto the end of cassandra.yaml under an overflow section. If that proved stable, integrating that into the build process and even adding a checkstyle job target warning on non-classified configuration parameters could tighten the whole thing up and give us a minimally invasive way to support both through a transition.
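
As a rough sketch of that round trip in Python: the grouping table, the file names, and the `split_config` / `join_config` helpers below are all hypothetical, and settings are assumed to be already parsed into a flat dict. Anything the table does not classify lands in an overflow file, which is also what a build-time check could warn on.

```python
OVERFLOW = "overflow.yaml"  # hypothetical name for the catch-all file


def split_config(settings, groups):
    """Monolithic -> modular.

    groups maps a file name to the set of keys that belong in it.
    Keys that no group claims go to the overflow file.
    """
    key_to_file = {key: name for name, keys in groups.items() for key in keys}
    files = {}
    for key, value in settings.items():
        target = key_to_file.get(key, OVERFLOW)
        files.setdefault(target, {})[key] = value
    return files


def join_config(files):
    """Modular -> monolithic, with overflow keys appended last."""
    merged = {}
    # False sorts before True, so the overflow file is processed last;
    # the sort is stable, so other files keep their existing order.
    for name in sorted(files, key=lambda n: n == OVERFLOW):
        merged.update(files[name])
    return merged


def unclassified_keys(files):
    """Keys a build-time check could warn about."""
    return sorted(files.get(OVERFLOW, {}))
```

Splitting and then joining should give back the same settings, which is the property a stability test for the sync would want to assert.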
On Mon, Jul 21, 2025, at 9:41 AM, Johnny Miller wrote:
I have added the section "Reference Example Configuration" - will see what the feedback on this is. I suppose the only concern would be maintaining this version in alignment with what's going into the main cassandra.yaml as part of the regular development.
That sounds useful to me.
I'd like to see us move to "modularized by default"; our current config being 2839 lines of .yaml is a bad experience for both new and old users. Starting with examples of the new paradigm and then refactoring config out over time for the default config is a path forward I'd support.
On Mon, Jul 21, 2025, at 9:18 AM, Johnny Miller wrote:
One feature I was thinking of adding to the CEP was to have an example yaml config setup using the includes with the config grouped logically so people have a reference example in the conf? Would this be a good idea?
Hello 👋
We would like to propose CEP-51: Support Include Semantics for cassandra.yaml for adoption by the community:
This CEP proposes adding completely optional include directives to Cassandra's configuration system, allowing users who need it to split their cassandra.yaml into multiple files for better security, organization, and deployment flexibility. No changes are made to the default cassandra.yaml, and this feature is entirely opt-in.

The proposed include directives (include, include_if_exists, and include_dir) enable organizations to:
- Apply the principle of least privilege by separating sensitive security configurations into files with restricted permissions
- Better organize large configuration files by logical subsystems
- Simplify configuration management in environments where different teams manage different aspects of the cluster
- Follow established patterns already present in PostgreSQL, MySQL, Redis, NGINX, and other widely-used systems
Key design principles:
- Zero impact on users who don't use the feature
- No recursive includes (only the main cassandra.yaml can contain include directives)
- No duplicate configuration keys allowed (each setting must appear in exactly one file)
- Clear error messages for troubleshooting
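
The design principles above could be sketched in Python roughly as follows. This is a simplified illustration rather than the proposed implementation. It assumes the three directives appear as list-valued top-level keys in the main file, which is a guess at the syntax, and it works on an in-memory map of already-parsed files instead of reading from disk. The `resolve` helper and all file names are hypothetical.

```python
DIRECTIVES = ("include", "include_if_exists", "include_dir")


def resolve(main_name, files):
    """Resolve includes for the main file.

    files maps a path to its parsed settings dict (a stand-in for the
    filesystem plus a YAML parser).
    """
    main = files[main_name]
    to_load = []
    for path in main.get("include", []):
        if path not in files:
            raise FileNotFoundError(f"{main_name}: included file not found: {path}")
        to_load.append(path)
    for path in main.get("include_if_exists", []):
        if path in files:  # silently skipped when absent
            to_load.append(path)
    for directory in main.get("include_dir", []):
        prefix = directory.rstrip("/") + "/"
        # Sorted so the load order is deterministic.
        to_load.extend(sorted(p for p in files if p.startswith(prefix)))

    merged = {k: v for k, v in main.items() if k not in DIRECTIVES}
    origin = {k: main_name for k in merged}
    for path in to_load:
        for key, value in files[path].items():
            if key in DIRECTIVES:
                # No recursive includes: only the main file may use them.
                raise ValueError(
                    f"{path}: include directives are only allowed in {main_name}"
                )
            if key in merged:
                # Each setting must appear in exactly one file.
                raise ValueError(
                    f"duplicate key '{key}' in {path} (already set in {origin[key]})"
                )
            merged[key] = value
            origin[key] = path
    return merged
```

Both error paths name the offending file and key, which is one way to read the "clear error messages" principle.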
This enhancement addresses real operational challenges faced by organisations with strict security requirements or complex deployment needs, while maintaining complete backward compatibility and requiring no changes to existing deployments.
Thanks in advance for your time and feedback. Please keep the discussion on this mailing list thread.
Johnny