Thank you Hong, I think the lenient mode is better.
Best, Jark On Tue, 12 Aug 2025 at 12:08, Hongshun Wang <[email protected]> wrote: > Hi Jark, > > Thanks for the thoughtful feedback. Here are my responses to your > questions: > > > Could you clarify why default was chosen over the parent server node > in the FIP? > This name is inspired by Kafka’s configuration hierarchy. I agree that > global is clearer and more intuitive than default. Thanks for the > suggestion — I’ve updated the naming accordingly, and it does look better > now. > > > > ConfigEntry necessary in **AlterConfigOp**?* > > Naming consistency improvement > Modified it. > > > > It's also not clear to me how to rotate the secrets using > "password.encoder.old.secret”. > > You're right — the term “rotate” might be misleading. Here’s what this > mechanism actually does: When users want to change the encryption key for > security reasons (e.g., key rotation), simply updating > password.encoder.secret would prevent the server from decoding existing > encrypted values in ZooKeeper (since they were encrypted with the old key). > > > How the rotation works internally? How it handles failures during > rotation? > > > We can look at how Kafka handles this: the fallback mechanism only takes > effect during server restart. On startup, the system first attempts to > decode the configuration using password.encoder.old.secret. If that fails, > it assumes the value was encrypted with the current password.encoder.secret > and tries decoding with the new key. > After startup, whenever configuration properties are read from ZooKeeper, > any password that cannot be decoded is silently ignored (i.e., removed from > the config set), and a warning is logged. The relevant logic is shown below: > > java > > ```java > private[server] def fromPersistentProps(persistentProps: Properties, > perBrokerConfig: Boolean): Properties = { > val props = persistentProps.clone().asInstanceOf[Properties] > > def decodePassword(configName: String, value: String): Unit = { > if (value != null) { > try { > props.setProperty(configName, passwordEncoder.decode(value).value) > } catch { > case e: Exception => > error(s"Dynamic password config $configName could not be decoded, > ignoring.", e) > props.remove(configName) > } > } > } > > props.asScala.forKeyValue { (name, value) => > if (isPasswordConfig(name)) > decodePassword(name, value) > } > props > } > > ```` > > Given this behavior, in Fluss we have two possible approaches: > 1. Follow Kafka’s lenient model: During restart, if a config cannot be > decoded by either password.encoder.old.secret or password.encoder.secret, > it is ignored. The server continues to start up, using whatever valid > configurations remain. > 2. Fail-fast on undecodable secrets: During restart, if any password > cannot be decoded by either the old or new secret, the server fails to > start. This ensures configuration integrity but requires manual > intervention. > > Which choice are you preferred? > > > Best > Hongshun > > 2025年8月10日 12:17,Jark Wu <[email protected]> 写道: > > Besides, I noticed that there are many Kafka concepts mentioned in > "Sensitive Configuration Encryption" section, like "broker", > "server.properties", and we should change them to Fluss concepts. > > It's also not clear to me how to rotate the secrets using > "password.encoder.old.secret". Do we need to provide a specific API for the > rotation? How the rotation works internally? How it handles failures during > rotation? If it is complex, we can move it to the future works with another > FIP and keep current FIP simple. > > Best, > Jark > > On Sun, 10 Aug 2025 at 11:20, Jark Wu <[email protected]> wrote: > > Hi Hong, > > Thanks for putting together this design proposal—I appreciate the thorough > work. I have a few comments and suggestions for consideration: > > *1. ZooKeeper path: `/config/server/default`* > Could you clarify why choose `default` over the parent `server` node in > the FIP? If the intention is to support per-server overrides in the future, > using a name like `all` (e.g., `/config/server/all`) or `global` might be > more intuitive than `default`. The term "default" could be misinterpreted > as providing fallback or static default values, rather than active dynamic > overrides. > > *2. Backward compatibility of `password.encoder.secret`* > Marking `password.encoder.secret` as required raises concerns about > backward compatibility. Since this config doesn’t exist in v0.7 > deployments, upgrading to v0.8 could fail if the field is strictly > required. > > *3. Is ConfigEntry necessary in **AlterConfigOp**?* > The ConfigSource in ConfigEntry is meaningless and confusing for > AlterConfigOp, because all the config to alter are dynamic. I prefer that > AlterConfigOp just contains 3 members: configKey, configValue and opType to > make the inerface clean. > > *4. Naming consistency improvements* > To improve consistency across the codebase, I recommend the following > renames: > - `ConfigEntry.name` → `ConfigEntry.key` > - `PbDescribeConfigsResponseInfo.config_name` → `config_key` > - `PbAlterConfigsRequestInfo.config_name` → `config_key` > - `PbAlterConfigsRequestInfo.config_operation` → `op_type` (to align > with `AlterConfigOp.opType`) > > > Let me know your thoughts. > > Best, > Jark > > > > > > > > On Fri, 8 Aug 2025 at 10:39, Wang Cheng <[email protected]> wrote: > > Hi Hongshun, > > > > Since tablet servers lazily apply config changes, could we provide an RPC > call like IsAlterConfigDone to allow users to track the progress? > > Do we have a RESET command to restore the value of a run-time parameter > to the default value? > > I'm still unclear about the need for two distinct implementations for > dynamic server-level and table-level configurations. In modern distributed > databases like PGXL [1] (a distributed PostgreSQL variant), both DDL > operations and SET commands are handled uniformly via a two-phase commit > protocol to avoid any inconsistency problems across all data nodes and > coordinators. > > > > [1] > > https://postgres-x2.github.io/presentation_docs/2014-07-PGXC-Implementation/pgxc.pdf > > > > Regards, > Cheng > > > > > > > > > ------------------ Original ------------------ > From: > "dev" > < > [email protected]>; > Date: Thu, Aug 7, 2025 11:51 AM > To: "dev"<[email protected]>; > > Subject: [DISCUSS] FIP-12: Server Dynamic Config > > > > Hi devs, > > I'd like to start a discussion about FIP-12: Server Dynamic Config[1]. > Currently, any changes to the server.yaml configuration in a Fluss cluster > require a full restart of the cluster, which negatively impacts stability > and availability. To improve operational agility and reduce downtime, we > propose introducing dynamic configuration capabilities, enabling runtime > modification of key parameters—such as enabling/disabling lake-streaming > integration features or managing user accounts—without requiring service > interruption. > > The POC[2] code is provided to enable lake format. You can try and give > some advice. > > Best > Hongshun > > > [1] > > > https://cwiki.apache.org/confluence/display/FLUSS/FIP-12%3A+Server+Dynamic+Config > [2] https://github.com/loserwang1024/fluss/tree/poc-dymanic-config. > > > > >
