Hi Cheng,
> Since tablet servers lazily apply config changes, could we provide an RPC
call like IsAlterConfigDone to allow users to track the progress?
If the RPC had to contact every server in the cluster, including all tablet
servers, the overhead would be too high for what is meant to be a
lightweight tracking operation.

Alternatively, if it checked only the coordinator, it could not accurately
reflect the actual progress across all servers.

To enable accurate progress tracking, the coordinator would need to
maintain a per-operation state machine, with each operation potentially
involving multiple configuration changes. This introduces significant
complexity, and consecutive alter-config operations would compound it
further, requiring coordination, state cleanup, and failure handling.
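To make the cost concrete, here is a rough sketch of the minimal bookkeeping such a state machine would require (class and method names are hypothetical, not actual Fluss code):

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: the per-operation state a coordinator would need
// in order to answer an IsAlterConfigDone-style RPC accurately.
class AlterConfigTracker {

    // operationId -> tablet servers that have not yet acked the change
    private final Map<Long, Set<Integer>> pendingAcks = new ConcurrentHashMap<>();

    void start(long operationId, Set<Integer> liveServers) {
        pendingAcks.put(operationId, new HashSet<>(liveServers));
    }

    // Called when a tablet server reports it has applied the config.
    void ack(long operationId, int serverId) {
        Set<Integer> pending = pendingAcks.get(operationId);
        if (pending != null) {
            pending.remove(serverId);
        }
    }

    boolean isDone(long operationId) {
        Set<Integer> pending = pendingAcks.get(operationId);
        return pending == null || pending.isEmpty();
    }

    // Membership changes already force a policy decision: should a crashed
    // server's missing ack block completion forever, or be dropped here?
    void onServerFailure(int serverId) {
        pendingAcks.values().forEach(p -> p.remove(serverId));
    }
}
```

Even this toy version leaves open the problems mentioned above: garbage-collecting entries for finished operations, and deciding how overlapping alter-config operations on the same keys interact.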

Given these costs, we’ve chosen to treat the operation as complete once the
coordinator has successfully applied the configuration, consistent with our
guiding principle:
“We Only Validate Dynamic Configs on the Coordinator.”

For reference, Kafka doesn’t even validate that the configuration has been
applied; it returns success as soon as the change is persisted in ZooKeeper.

> two distinct implementations for dynamic server-level and table-level
configurations
That’s not what this FIP proposes. Table-level configurations are already
managed through CREATE TABLE DDL statements, not dynamic configuration
APIs, and are propagated across nodes via TableRegistration. As such, they
are outside the scope of runtime dynamic configuration updates and do not
require a separate dynamic mechanism.
Transactional semantics across two admin operations may be a separate topic
we need to discuss.

> Do we have a RESET command to restore the value of a run-time parameter
to the default value?
I think it’s a good idea; we can consider adding it.
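If we add it, one natural semantic (sketched below with hypothetical names, not actual Fluss code) is to treat dynamic values as an overlay on top of the static server.yaml values, so RESET simply removes the overlay entry and the default shows through again:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of RESET semantics: dynamic overrides are layered
// over the static defaults loaded from server.yaml.
class LayeredConfig {

    private final Map<String, String> defaults;              // from server.yaml
    private final Map<String, String> overrides = new HashMap<>();

    LayeredConfig(Map<String, String> defaults) {
        this.defaults = defaults;
    }

    void set(String key, String value) {   // dynamic SET
        overrides.put(key, value);
    }

    void reset(String key) {               // dynamic RESET: drop the override
        overrides.remove(key);
    }

    String get(String key) {
        return overrides.getOrDefault(key, defaults.get(key));
    }
}
```

A nice property of this layering is that RESET needs no record of the previous value; the default is always recoverable from the static configuration.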

Best,
Hongshun

> On Fri, Aug 8, 2025 at 10:39 AM Wang Cheng <[email protected]>
> wrote:
>
>> Hi Hongshun,
>>
>> Since tablet servers lazily apply config changes, could we provide an RPC
>> call like IsAlterConfigDone to allow users to track the progress?
>>
>> Do we have a RESET command to restore the value of a run-time parameter
>> to the default value?
>>
>> I'm still unclear about the need for two distinct implementations for
>> dynamic server-level and table-level configurations. In modern distributed
>> databases like PGXL [1] (a distributed PostgreSQL variant), both DDL
>> operations and SET commands are handled uniformly via a two-phase commit
>> protocol to avoid any inconsistency problems across all data nodes and
>> coordinators.
>>
>>
>>
>> [1]
>> https://postgres-x2.github.io/presentation_docs/2014-07-PGXC-Implementation/pgxc.pdf
>>
>>
>>
>> Regards,
>> Cheng
>>
>>
>> ------------------ Original ------------------
>> From: "dev" <[email protected]>
>> Date: Thu, Aug 7, 2025 11:51 AM
>> To: "dev" <[email protected]>
>> Subject: [DISCUSS] FIP-12: Server Dynamic Config
>>
>>
>>
>> Hi devs,
>>
>> I'd like to start a discussion about FIP-12: Server Dynamic Config[1].
>> Currently, any changes to the server.yaml configuration in a Fluss cluster
>> require a full restart of the cluster, which negatively impacts stability
>> and availability. To improve operational agility and reduce downtime, we
>> propose introducing dynamic configuration capabilities, enabling runtime
>> modification of key parameters—such as enabling/disabling lake-streaming
>> integration features or managing user accounts—without requiring service
>> interruption.
>>
>> The POC[2] code is provided to enable lake format. You can try and give
>> some advice.
>>
>> Best
>> Hongshun
>>
>>
>> [1]
>>
>> https://cwiki.apache.org/confluence/display/FLUSS/FIP-12%3A+Server+Dynamic+Config
>> [2] https://github.com/loserwang1024/fluss/tree/poc-dymanic-config.