So if you forget to update cassandra.yaml you fail the build? Makes sense to me.
One additional thing I would like to see is the reverse… did you put something in yaml that isn’t in Config? This is a bug I have seen a few times…. Mostly because people don’t know the rules from SnakeYAML so are not aware that their property isn’t exposed (not something I have seen but know about, if you have a getter without a setter its “exposed” to the vtable but not yaml...). > On Jan 24, 2025, at 8:25 AM, Bernardo Botella <conta...@bernardobotella.com> > wrote: > > Love the suggestion of marking the hidden/advanced configuration properties > with annotations. Leaving a “configuration” property out of the main > configuration file should be deliberate and well thought and argued. I highly > doubt we have 112 “advanced” properties that really need to be hidden to > protect users from themselves :-( > > Agreed with Ekaterina that it is worth reviewing the current properties and > come up with the subset that should stay hidden, and expose the rest on the > yaml file. > > >> On Jan 24, 2025, at 8:00 AM, Dmitry Konstantinov <netud...@gmail.com> wrote: >> >> https://issues.apache.org/jira/browse/CASSANDRA-20249 >> >> On Fri, 24 Jan 2025 at 15:40, Dmitry Konstantinov <netud...@gmail.com >> <mailto:netud...@gmail.com>> wrote: >>> Maybe I missed some patterns but it looks like a pretty good estimation, I >>> did like 10 random checks manually to verify :-) >>> I will try to make an ant target with a similar logic (hopefully, during >>> the weekend) >>> I will create a ticket to track this activity (to share attachments there >>> to not overload the thread with such outputs in future). >>> >>> On Fri, 24 Jan 2025 at 15:37, Štefan Miklošovič <smikloso...@apache.org >>> <mailto:smikloso...@apache.org>> wrote: >>>> Oh my god, 112? :DD I was thinking it would be less than 10. >>>> >>>> Anyway, I think we need to integrate this to some ant target. If you >>>> expanded on this, that would be great. >>>> >>>> On Fri, Jan 24, 2025 at 4:31 PM Dmitry Konstantinov <netud...@gmail.com >>>> <mailto:netud...@gmail.com>> wrote: >>>>> A very primitive implementation of the 1st idea below: >>>>> >>>>> String configUrl = >>>>> "file:///Users/dmitry/IdeaProjects/cassandra-trunk/conf/cassandra.yaml"; >>>>> Field[] allFields = Config.class.getFields(); >>>>> List<String> topLevelPropertyNames = new ArrayList<>(); >>>>> for(Field field : allFields) >>>>> { >>>>> if (!Modifier.isStatic(field.getModifiers())) >>>>> { >>>>> topLevelPropertyNames.add(field.getName()); >>>>> } >>>>> } >>>>> >>>>> URL url = new URL(configUrl); >>>>> List<String> lines = Files.readAllLines(Paths.get(url.toURI())); >>>>> >>>>> int missedCount = 0; >>>>> for (String propertyName : topLevelPropertyNames) >>>>> { >>>>> boolean found = false; >>>>> for (String line : lines) >>>>> { >>>>> if (line.startsWith(propertyName + ":") >>>>> || line.startsWith("#" + propertyName + ":") >>>>> || line.startsWith("# " + propertyName + ":")) { >>>>> found = true; >>>>> break; >>>>> } >>>>> } >>>>> if (!found) >>>>> { >>>>> missedCount++; >>>>> System.out.println(propertyName); >>>>> } >>>>> } >>>>> System.out.println("Total missed:" + missedCount); >>>>> >>>>> It prints the following config property names which are defined in >>>>> Config.java but not present as "property" or "# property " in a file: >>>>> permissions_cache_max_entries >>>>> roles_cache_max_entries >>>>> credentials_cache_max_entries >>>>> auto_bootstrap >>>>> force_new_prepared_statement_behaviour >>>>> use_deterministic_table_id >>>>> repair_request_timeout >>>>> stream_transfer_task_timeout >>>>> cms_await_timeout >>>>> cms_default_max_retries >>>>> cms_default_retry_backoff >>>>> epoch_aware_debounce_inflight_tracker_max_size >>>>> metadata_snapshot_frequency >>>>> available_processors >>>>> repair_session_max_tree_depth >>>>> use_offheap_merkle_trees >>>>> internode_max_message_size >>>>> native_transport_max_message_size >>>>> native_transport_max_request_data_in_flight_per_ip >>>>> native_transport_max_request_data_in_flight >>>>> native_transport_receive_queue_capacity >>>>> min_free_space_per_drive >>>>> max_space_usable_for_compactions_in_percentage >>>>> reject_repair_compaction_threshold >>>>> concurrent_index_builders >>>>> max_streaming_retries >>>>> commitlog_max_compression_buffers_in_pool >>>>> max_mutation_size >>>>> dynamic_snitch >>>>> failure_detector >>>>> use_creation_time_for_hint_ttl >>>>> key_cache_migrate_during_compaction >>>>> key_cache_invalidate_after_sstable_deletion >>>>> paxos_cache_size >>>>> file_cache_round_up >>>>> disk_optimization_estimate_percentile >>>>> disk_optimization_page_cross_chance >>>>> purgeable_tobmstones_metric_granularity >>>>> windows_timer_interval >>>>> otc_coalescing_strategy >>>>> otc_coalescing_window_us >>>>> otc_coalescing_enough_coalesced_messages >>>>> otc_backlog_expiration_interval_ms >>>>> scripted_user_defined_functions_enabled >>>>> user_defined_functions_threads_enabled >>>>> allow_insecure_udfs >>>>> allow_extra_insecure_udfs >>>>> user_defined_functions_warn_timeout >>>>> user_defined_functions_fail_timeout >>>>> user_function_timeout_policy >>>>> back_pressure_enabled >>>>> back_pressure_strategy >>>>> repair_command_pool_full_strategy >>>>> repair_command_pool_size >>>>> block_for_peers_timeout_in_secs >>>>> block_for_peers_in_remote_dcs >>>>> skip_stream_disk_space_check >>>>> snapshot_on_repaired_data_mismatch >>>>> validation_preview_purge_head_start >>>>> initial_range_tombstone_list_allocation_size >>>>> range_tombstone_list_growth_factor >>>>> snapshot_on_duplicate_row_detection >>>>> check_for_duplicate_rows_during_reads >>>>> check_for_duplicate_rows_during_compaction >>>>> autocompaction_on_startup_enabled >>>>> auto_optimise_inc_repair_streams >>>>> auto_optimise_full_repair_streams >>>>> auto_optimise_preview_repair_streams >>>>> consecutive_message_errors_threshold >>>>> internode_error_reporting_exclusions >>>>> compact_tables_enabled >>>>> vector_type_enabled >>>>> intersect_filtering_query_warned >>>>> intersect_filtering_query_enabled >>>>> streaming_slow_events_log_timeout >>>>> repair_state_expires >>>>> repair_state_size >>>>> paxos_variant >>>>> skip_paxos_repair_on_topology_change >>>>> paxos_purge_grace_period >>>>> paxos_on_linearizability_violations >>>>> paxos_state_purging >>>>> paxos_repair_enabled >>>>> paxos_topology_repair_no_dc_checks >>>>> paxos_topology_repair_strict_each_quorum >>>>> skip_paxos_repair_on_topology_change_keyspaces >>>>> paxos_contention_wait_randomizer >>>>> paxos_contention_min_wait >>>>> paxos_contention_max_wait >>>>> paxos_contention_min_delta >>>>> paxos_repair_parallelism >>>>> sstable_read_rate_persistence_enabled >>>>> client_request_size_metrics_enabled >>>>> max_top_size_partition_count >>>>> max_top_tombstone_partition_count >>>>> min_tracked_partition_size >>>>> min_tracked_partition_tombstone_count >>>>> top_partitions_enabled >>>>> severity_during_decommission >>>>> progress_barrier_min_consistency_level >>>>> progress_barrier_default_consistency_level >>>>> progress_barrier_timeout >>>>> progress_barrier_backoff >>>>> discovery_timeout >>>>> unsafe_tcm_mode >>>>> cql_start_time >>>>> native_transport_throw_on_overload >>>>> native_transport_queue_max_item_age_threshold >>>>> native_transport_min_backoff_on_queue_overload >>>>> native_transport_max_backoff_on_queue_overload >>>>> native_transport_timeout >>>>> enforce_native_deadline_for_hints >>>>> Total missed:112 >>>>> >>>>> >>>>> On Fri, 24 Jan 2025 at 15:10, Štefan Miklošovič <smikloso...@apache.org >>>>> <mailto:smikloso...@apache.org>> wrote: >>>>>> It should also work the other way around. If there is a property which >>>>>> is commented out in yaml and it is not in Config.java, that should fail >>>>>> as well. If it is not commented out and it is not in Config.java, that >>>>>> will fail in runtime as it fails on unrecognized property. >>>>>> >>>>>> This will be used in practice very rarely as we seldom remove the >>>>>> properties in Config but if we do and a property is commented out, we >>>>>> should not ship a dead property name, even commented out. >>>>>> >>>>>> On Fri, Jan 24, 2025 at 3:51 PM Paulo Motta <pa...@apache.org >>>>>> <mailto:pa...@apache.org>> wrote: >>>>>>> > > If "# my_cool_property: true" is NOT in cassandra.yaml, we might >>>>>>> > indeed add it, also commented out. I think it would be quite easy to >>>>>>> > check against yaml if there is a line starting on "# >>>>>>> > my_cool_property" or just on "my_cool_property". Both cases would >>>>>>> > satisfy the check. >>>>>>> >>>>>>> Makes sense, I think this would be good to have as a lint or test to >>>>>>> easily catch overlooks during review. >>>>>>> >>>>>>> >>>>>>> On Fri, Jan 24, 2025 at 9:44 AM Štefan Miklošovič >>>>>>> <smikloso...@apache.org <mailto:smikloso...@apache.org>> wrote: >>>>>>>> >>>>>>>> >>>>>>>> On Fri, Jan 24, 2025 at 3:27 PM Paulo Motta <pa...@apache.org >>>>>>>> <mailto:pa...@apache.org>> wrote: >>>>>>>>> > from time to time I see configuration properties in Config.java and >>>>>>>>> > they are clearly not in cassandra.yaml. Not every property in >>>>>>>>> > Config is in cassandra.yaml. I would like to know if there is some >>>>>>>>> > specific reason behind that. >>>>>>>>> >>>>>>>>> I think one of the original reasons was to "hide" advanced configs >>>>>>>>> that are not meant to be updated, unless in very niche circumstances. >>>>>>>>> However I think this has been extrapolated to non-advanced settings. >>>>>>>>> >>>>>>>>> > Question related to that is if we could not have a build-time check >>>>>>>>> > that all properties in Config have to be in cassandra.yaml and fail >>>>>>>>> > the build if a property in Config does not have its counterpart in >>>>>>>>> > yaml. >>>>>>>>> >>>>>>>>> Are you saying every configuration property should be commented-out, >>>>>>>>> or do you think that every Config property should be specified in >>>>>>>>> cassandra.yaml with their default uncomented ? One issue with that is >>>>>>>>> that you could cause user confusion if you "reveal" a niche/advanced >>>>>>>>> config that is not meant to be updated. I think this would be >>>>>>>>> addressed by the @HiddenInYaml flag you are proposing in a later post. >>>>>>>> >>>>>>>> Yes, then can stay hidden, but we should annotate it with @Hidden or >>>>>>>> similar. As of now, if that property is not in yaml, we just don't >>>>>>>> know if it was forgotten to be added or if we have not added it on >>>>>>>> purpose. >>>>>>>> >>>>>>>> They can keep being commented out if they currently are. Imagine a >>>>>>>> property in Config.java >>>>>>>> >>>>>>>> public boolean my_cool_property = true; >>>>>>>> >>>>>>>> and then this in cassandra.yaml >>>>>>>> >>>>>>>> # my_cool_property: true >>>>>>>> >>>>>>>> It is completely ok. >>>>>>>> >>>>>>>> If "# my_cool_property: true" is NOT in cassandra.yaml, we might >>>>>>>> indeed add it, also commented out. I think it would be quite easy to >>>>>>>> check against yaml if there is a line starting on "# my_cool_property" >>>>>>>> or just on "my_cool_property". Both cases would satisfy the check. >>>>>>>> >>>>>>>> >>>>>>>>> > There are dozens of properties in Config and I have a strong >>>>>>>>> > suspicion that we missed to publish some to yaml so users do not >>>>>>>>> > even know such a property exists and as of now we do not even know >>>>>>>>> > which they are. >>>>>>>>> >>>>>>>>> I believe this is a problem. I think most properties should be in >>>>>>>>> cassandra.yaml, unless they are very advanced or not meant to be >>>>>>>>> updated. >>>>>>>>> >>>>>>>>> Another tangential issue is that there are features/settings that >>>>>>>>> don't even have a Config entry, but are just controlled by JVM >>>>>>>>> properties. >>>>>>>>> >>>>>>>>> I think that we should attempt to unify Config and jvm properties >>>>>>>>> under a predictable structure. For example, if there is a YAML config >>>>>>>>> enable_user_defined_functions, then there should be a respective JVM >>>>>>>>> flag -Dcassandra.enable_user_defined_functions, and vice versa. >>>>>>>> >>>>>>>> Yeah, good idea. >>>>>>>> >>>>>>>>> >>>>>>>>> On Fri, Jan 24, 2025 at 9:16 AM Štefan Miklošovič >>>>>>>>> <smikloso...@apache.org <mailto:smikloso...@apache.org>> wrote: >>>>>>>>>> Hello, >>>>>>>>>> >>>>>>>>>> from time to time I see configuration properties in Config.java and >>>>>>>>>> they are clearly not in cassandra.yaml. Not every property in Config >>>>>>>>>> is in cassandra.yaml. I would like to know if there is some specific >>>>>>>>>> reason behind that. >>>>>>>>>> >>>>>>>>>> Question related to that is if we could not have a build-time check >>>>>>>>>> that all properties in Config have to be in cassandra.yaml and fail >>>>>>>>>> the build if a property in Config does not have its counterpart in >>>>>>>>>> yaml. >>>>>>>>>> >>>>>>>>>> There are dozens of properties in Config and I have a strong >>>>>>>>>> suspicion that we missed to publish some to yaml so users do not >>>>>>>>>> even know such a property exists and as of now we do not even know >>>>>>>>>> which they are. >>>>> >>>>> >>>>> >>>>> -- >>>>> Dmitry Konstantinov >>> >>> >>> >>> -- >>> Dmitry Konstantinov >> >> >> >> -- >> Dmitry Konstantinov >