Thank you Hong,

I think the lenient mode is better.

Best,
Jark

On Tue, 12 Aug 2025 at 12:08, Hongshun Wang <[email protected]> wrote:

> Hi Jark,
>
> Thanks for the thoughtful feedback. Here are my responses to your
> questions:
>
> >  Could you clarify why default was chosen over the parent server node
> in the FIP?
> This name is inspired by Kafka’s configuration hierarchy. I agree that
> global is clearer and more intuitive than default. Thanks for the
> suggestion — I’ve updated the naming accordingly, and it does look better
> now.
>
>
> >  ConfigEntry necessary in **AlterConfigOp**?*
> > Naming consistency improvement
> Modified it.
>
>
> > It's also not clear to me how to rotate the secrets using
> "password.encoder.old.secret”.
>
>  You're right — the term “rotate” might be misleading. Here’s what this
> mechanism actually does: When users want to change the encryption key for
> security reasons (e.g., key rotation), simply updating
> password.encoder.secret would prevent the server from decoding existing
> encrypted values in ZooKeeper (since they were encrypted with the old key).
>
> >  How the rotation works internally? How it handles failures during
> rotation?
>
>
> We can look at how Kafka handles this: the fallback mechanism only takes
> effect during server restart. On startup, the system first attempts to
> decode the configuration using password.encoder.old.secret. If that fails,
> it assumes the value was encrypted with the current password.encoder.secret
> and tries decoding with the new key.
> After startup, whenever configuration properties are read from ZooKeeper,
> any password that cannot be decoded is silently ignored (i.e., removed from
> the config set), and a warning is logged. The relevant logic is shown below:
>
> java
>
> ```java
> private[server] def fromPersistentProps(persistentProps: Properties,
> perBrokerConfig: Boolean): Properties = {
> val props = persistentProps.clone().asInstanceOf[Properties]
>
> def decodePassword(configName: String, value: String): Unit = {
> if (value != null) {
> try {
> props.setProperty(configName, passwordEncoder.decode(value).value)
> } catch {
> case e: Exception =>
> error(s"Dynamic password config $configName could not be decoded,
> ignoring.", e)
> props.remove(configName)
> }
> }
> }
>
> props.asScala.forKeyValue { (name, value) =>
> if (isPasswordConfig(name))
> decodePassword(name, value)
> }
> props
> }
>
> ````
>
> Given this behavior, in Fluss we have two possible approaches:
> 1. Follow Kafka’s lenient model: During restart, if a config cannot be
> decoded by either password.encoder.old.secret or password.encoder.secret,
> it is ignored. The server continues to start up, using whatever valid
> configurations remain.
> 2. Fail-fast on undecodable secrets: During restart, if any password
> cannot be decoded by either the old or new secret, the server fails to
> start. This ensures configuration integrity but requires manual
> intervention.
>
> Which choice are you preferred?
>
>
> Best
> Hongshun
>
> 2025年8月10日 12:17,Jark Wu <[email protected]> 写道:
>
> Besides, I noticed that there are many Kafka concepts mentioned in
> "Sensitive Configuration Encryption" section, like "broker",
> "server.properties", and we should change them to Fluss concepts.
>
> It's also not clear to me how to rotate the secrets using
> "password.encoder.old.secret". Do we need to provide a specific API for the
> rotation? How the rotation works internally? How it handles failures during
> rotation? If it is complex, we can move it to the future works with another
> FIP and keep current FIP simple.
>
> Best,
> Jark
>
> On Sun, 10 Aug 2025 at 11:20, Jark Wu <[email protected]> wrote:
>
> Hi Hong,
>
> Thanks for putting together this design proposal—I appreciate the thorough
> work. I have a few comments and suggestions for consideration:
>
> *1. ZooKeeper path: `/config/server/default`*
> Could you clarify why choose `default` over the parent `server` node in
> the FIP? If the intention is to support per-server overrides in the future,
> using a name like `all` (e.g., `/config/server/all`) or `global` might be
> more intuitive than `default`. The term "default" could be misinterpreted
> as providing fallback or static default values, rather than active dynamic
> overrides.
>
> *2. Backward compatibility of `password.encoder.secret`*
> Marking `password.encoder.secret` as required raises concerns about
> backward compatibility. Since this config doesn’t exist in v0.7
> deployments, upgrading to v0.8 could fail if the field is strictly
> required.
>
> *3. Is ConfigEntry necessary in **AlterConfigOp**?*
> The ConfigSource in ConfigEntry is meaningless and confusing for
> AlterConfigOp, because all the config to alter are dynamic. I prefer that
> AlterConfigOp just contains 3 members: configKey, configValue and opType to
> make the inerface clean.
>
> *4. Naming consistency improvements*
> To improve consistency across the codebase, I recommend the following
> renames:
>   - `ConfigEntry.name` → `ConfigEntry.key`
>   - `PbDescribeConfigsResponseInfo.config_name` → `config_key`
>   - `PbAlterConfigsRequestInfo.config_name` → `config_key`
>   - `PbAlterConfigsRequestInfo.config_operation` → `op_type` (to align
> with `AlterConfigOp.opType`)
>
>
> Let me know your thoughts.
>
> Best,
> Jark
>
>
>
>
>
>
>
> On Fri, 8 Aug 2025 at 10:39, Wang Cheng <[email protected]> wrote:
>
> Hi Hongshun,
>
>
>
> Since tablet servers lazily apply config changes, could we provide an RPC
> call like IsAlterConfigDone to allow users ​​to track the progress​​?
>
> Do we have a RESET command to restore the value of a run-time parameter
> to the default value?
>
> I'm still unclear about the need for two distinct implementations​​ for
> dynamic server-level and table-level configurations. In modern distributed
> databases like ​​PGXL [1] (a distributed PostgreSQL variant)​​, both DDL
> operations and SET commands ​​are handled uniformly via a two-phase commit
> protocol​​ to avoid any inconsistency problems across all data nodes and
> coordinators.
>
>
>
> [1]
>
> https://postgres-x2.github.io/presentation_docs/2014-07-PGXC-Implementation/pgxc.pdf
>
>
>
> Regards,
> Cheng
>
>
>
> &nbsp;
>
>
>
>
> ------------------ Original ------------------
> From:
>                                                    "dev"
>                                                                  <
> [email protected]&gt;;
> Date:&nbsp;Thu, Aug 7, 2025 11:51 AM
> To:&nbsp;"dev"<[email protected]&gt;;
>
> Subject:&nbsp;[DISCUSS] FIP-12: Server Dynamic Config
>
>
>
> Hi devs,
>
> I'd like to start a discussion about FIP-12: Server Dynamic Config[1].
> Currently, any changes to the server.yaml configuration in a Fluss cluster
> require a full restart of the cluster, which negatively impacts stability
> and availability. To improve operational agility and reduce downtime, we
> propose introducing dynamic configuration capabilities, enabling runtime
> modification of key parameters—such as enabling/disabling lake-streaming
> integration features or managing user accounts—without requiring service
> interruption.
>
> The POC[2] code is provided to enable lake format. You can try and give
> some advice.
>
> Best
> Hongshun
>
>
> [1]
>
>
> https://cwiki.apache.org/confluence/display/FLUSS/FIP-12%3A+Server+Dynamic+Config
> [2] https://github.com/loserwang1024/fluss/tree/poc-dymanic-config.
>
>
>
>
>

Reply via email to