Hi Yunhong,

Thank you for the excellent work. Your design is clear and well thought out.

I have a couple of questions regarding the leader promotion and replication
process you described:
> Upon receiving this request, the Coordinator recognizes that the
> HotStandbyReplica is ready to become the new leader. The Coordinator then
> sends a new NotifyLeaderAndIsrRequest request to promote replica 1 as the
> leader, while simultaneously sending a StopReplicaRequest to replica 0 to
> take it offline. This completes the full rebalance process for the
> PrimaryKey Table.

1. In your example, it appears that the HotStandbyReplica is only selected
and begins syncing during a rebalance event. This differs from the ISR
(In-Sync Replica) model, where replicas maintain sync continuously, even
during normal operation.
Are there any plans to keep the HotStandbyReplica actively synchronized
outside of rebalance scenarios?

2. Suppose replica 0 is still acting as leader and continues to receive
writes from producers. How is data consistency ensured between the time
when AdjustIsrRequest is sent and when NotifyLeaderAndIsrRequest and
StopReplicaRequest are processed?
In log-based tables with ack = -1, a write is only considered successful
once it has been replicated to all ISR members. However, I don’t see an
equivalent durability guarantee in the PrimaryKey Table model.
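
To make the second question concrete, here is a minimal sketch of the
ack = -1 rule as I understand it for log tables. The Bucket class, its
fields, and its method names are hypothetical and only for illustration;
they are not Fluss APIs.

```python
from dataclasses import dataclass, field


@dataclass
class Bucket:
    """Hypothetical model of one bucket's replicas; not a Fluss API."""

    isr: set[int]  # ids of the replicas currently in sync
    # replica id -> log end offset (the next offset that replica will write)
    log_end: dict[int, int] = field(default_factory=dict)

    def high_watermark(self) -> int:
        # An offset is committed once every ISR member has replicated it,
        # so the high watermark is the smallest log end offset in the ISR.
        return min(self.log_end.get(replica, 0) for replica in self.isr)

    def is_acked(self, offset: int) -> bool:
        # With ack = -1, the write at `offset` is acknowledged only when
        # it lies below the high watermark.
        return offset < self.high_watermark()


# Replica 0 (leader) has written offsets 0..4; replica 1 has offsets 0..2.
bucket = Bucket(isr={0, 1}, log_end={0: 5, 1: 3})
print(bucket.is_acked(2))  # True: both ISR members hold offset 2
print(bucket.is_acked(4))  # False: replica 1 has not replicated offset 4
```

What I am unsure about is which replica set plays the role of `isr` for
the PrimaryKey Table while replica 0 is still accepting writes.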

Best,
Hongshun

On Thu, Sep 4, 2025 at 8:54 AM Yunhong Zheng <[email protected]> wrote:

> Hi Jark,
>
> Apologies for the delayed response. Thanks for your suggestions; I've
> implemented the changes.
>
> Yours,
> Yunhong
>
> On 2025/08/26 11:38:52 Jark Wu wrote:
> > Hi Yunhong,
> >
> > Thanks for completing the rebalance story of Fluss. The new design looks
> > good to me in general.
> >
> > Could you please add a comment in the documentation to explain the full
> > form of "issr"? Also, would it be better to use the shorter abbreviation
> > "iss" instead?
> >
> > Best,
> > Jark
> >
> > On Mon, 25 Aug 2025 at 09:44, yunhong Zheng <[email protected]>
> > wrote:
> >
> > > Hi all,
> > >
> > > In FIP-8: Support Cluster Rebalance[1], we introduced cluster rebalance,
> > > but this mechanism was initially designed specifically for Log Tables.
> > > However, PrimaryKey Tables have significant limitations during
> > > rebalance because they need a long time to recover.
> > >
> > > So, I'd like to propose FIP-13: Support rebalance for PrimaryKey
> > > Table[2].
> > >
> > > Any feedback and suggestions on this proposal are welcome!
> > >
> > > [1]:
> > > https://cwiki.apache.org/confluence/display/FLUSS/FIP-8%3A+Support+Cluster+Reblance
> > > [2]:
> > > https://cwiki.apache.org/confluence/display/FLUSS/FlP-13%3A+Support+rebalance+for+PrimaryKey+Table
> > >
> > > Regards,
> > > Yunhong
> > >
> >
>
