Re: [DISCUSS] FlP-13: Support rebalance for PrimaryKey Table

yunhong Zheng Sun, 09 Nov 2025 17:57:22 -0800

Hi yuxia,

Thanks for your feedback.


I think what you said is reasonable. I've changed hot_standby_replicas to
hot_standby_replica (only one potential replica) in the FIP. As I haven't
been able to push this discussion and vote forward for quite some time, I'd
appreciate it if you could participate in the vote. Thank you very much.
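
To illustrate the single-replica field, here is a minimal sketch and not the
FIP's actual code. The class, function, and field layout are hypothetical;
only the names hot_standby_replica, NotifyLeaderAndIsrRequest, and
StopReplicaRequest come from this discussion. Dropping the old leader from
the ISR on promotion is an assumption made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class BucketState:
    """Hypothetical per-bucket view kept by the coordinator (not the Fluss API)."""
    leader: int
    isr: List[int] = field(default_factory=list)
    # A single optional id rather than a list: at most one replica is being
    # prepared to take over leadership in a given rebalance round.
    hot_standby_replica: Optional[int] = None


def start_rebalance(state: BucketState, target: int) -> None:
    """Mark `target` as the replica being prepared to become leader."""
    if state.hot_standby_replica is not None:
        raise ValueError("a hot standby replica is already assigned")
    state.hot_standby_replica = target


def promote_standby(state: BucketState) -> List[str]:
    """Promote the caught-up standby and retire the old leader.

    Returns the requests the coordinator would send, as plain strings.
    """
    standby = state.hot_standby_replica
    if standby is None:
        raise ValueError("no hot standby replica to promote")
    old_leader = state.leader
    state.leader = standby
    # Assumption for this sketch: the old leader leaves the ISR once stopped.
    state.isr = [r for r in state.isr if r != old_leader]
    if standby not in state.isr:
        state.isr.append(standby)
    state.hot_standby_replica = None
    return [
        f"NotifyLeaderAndIsrRequest(leader={standby})",
        f"StopReplicaRequest(replica={old_leader})",
    ]


# Replica 0 leads; replica 1 is prepared as the standby and then promoted.
state = BucketState(leader=0, isr=[0])
start_rebalance(state, target=1)
requests = promote_standby(state)
```

With a single optional field, a second start_rebalance call before promotion
is rejected, which matches the "only one potential replica" reading.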

The vote link:
https://lists.apache.org/thread/locjg4wodxv4z4q3smpowm12vbsnvksq

Yours,
Yunhong Zheng


On Mon, Sep 15, 2025 at 19:25, yuxia <[email protected]> wrote:

> Hi, Yunhong.
>
> Thanks for driving this FIP. This FIP is a great improvement to Fluss
> PrimaryKey tables.
>
> Sorry for jumping in late. I don't mean to block the vote, since it looks
> good to me overall. Just one question:
> Does the hot_standby_replicas field really need to be a list? When would it
> contain multiple replicas?
>
> IIUC, the hot_standby_replica is a replica that is going to become leader
> due to rebalance, but has not yet since it's not in iss.
> So, in this context, how can there be multiple replicas about to become
> leader in one round of rebalance?
> Please correct me if I'm wrong.
>
> Best regards,
> Yuxia
>
> ----- Original Message -----
> From: "Yunhong Zheng" <[email protected]>
> To: "dev" <[email protected]>
> Sent: Monday, September 15, 2025, 5:49:25 PM
> Subject: Re: [DISCUSS] FlP-13: Support rebalance for PrimaryKey Table
>
> Hi Hongshun Wang,
>
> Thanks for your feedback.
>
> Regarding question 1: This FIP solely focuses on adding rebalance support
> for PrimaryKey tables, to make the rebalance story complete. The more
> generalized hotStandby implementation requires detailed consideration
> beyond the scope of this proposal.
>
> Regarding question 2: data consistency is still ensured by the ISR. The iss
> remains non-intrusive to the write path.
>
> On 2025/09/09 06:25:47 Hongshun Wang wrote:
> > Hi Yunhong,
> >
> > Thank you for the excellent work. Your design is clear and well thought
> > out.
> >
> > I have a couple of questions regarding the leader promotion and
> > replication process you described:
> > > Upon receiving this request, the Coordinator recognizes that the
> > > HotStandbyReplica is ready to become the new leader. The Coordinator then
> > > sends a new NotifyLeaderAndIsrRequest to promote replica 1 as the
> > > leader, while simultaneously sending a StopReplicaRequest to replica 0 to
> > > take it offline. This completes the full rebalance process for the
> > > PrimaryKey Table.
> >
> > 1. In your example, it appears that the HotStandbyReplica is only
> > selected and begins syncing during a rebalance event. This differs from the
> > ISR (In-Sync Replica) model, where replicas maintain sync continuously, even
> > during normal operation.
> > Are there any plans to keep the HotStandbyReplica actively synchronized
> > outside of rebalance scenarios?
> >
> > 2. Suppose replica 0 is still acting as leader and continues to receive
> > writes from producers. How is data consistency ensured between the time
> > when AdjustIsrRequest is sent and when NotifyLeaderAndIsrRequest and
> > StopReplicaRequest are processed?
> > In log-based tables with ack = -1, a write is only considered successful
> > once it has been replicated to all ISR members. However, I don’t see an
> > equivalent durability guarantee in the PrimaryKey Table model.
> >
> > Best
> > Hongshun
> >
> > On Thu, Sep 4, 2025 at 8:54 AM Yunhong Zheng <[email protected]> wrote:
> >
> > > Hi Jark,
> > >
> > > Apologies for the delayed response. Thanks for your suggestions, I've
> > > implemented the changes.
> > >
> > > Yours,
> > > Yunhong
> > >
> > > On 2025/08/26 11:38:52 Jark Wu wrote:
> > > > Hi Yunhong,
> > > >
> > > > Thanks for completing the rebalance story of Fluss. The new design
> > > > looks good to me in general.
> > > >
> > > > Could you please add a comment in the documentation to explain the
> > > > full form of "issr"? Also, would it be better to use the shorter
> > > > abbreviation "iss" instead?
> > > >
> > > > Best,
> > > > Jark
> > > >
> > > > > On Mon, 25 Aug 2025 at 09:44, yunhong Zheng <[email protected]> wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > > In FIP-8: Support Cluster Rebalance[1], we introduced cluster
> > > > > > rebalance, but this mechanism was initially designed specifically for
> > > > > > Log Tables. PrimaryKey Tables, however, are significantly limited in
> > > > > > rebalance because they need a lot of time to recover.
> > > > > >
> > > > > > So, I'd like to propose FIP-13: Support rebalance for PrimaryKey
> > > > > > Table[2].
> > > > > >
> > > > > > Any feedback or suggestions on this proposal are welcome!
> > > > >
> > > > > [1]:
> > > > >
> > > > >
> > >
> https://cwiki.apache.org/confluence/display/FLUSS/FIP-8%3A+Support+Cluster+Reblance
> > > > > [1]:
> > > > >
> > > > >
> > >
> https://cwiki.apache.org/confluence/display/FLUSS/FlP-13%3A+Support+rebalance+for+PrimaryKey+Table
> > > > >
> > > > > Regards,
> > > > > Yunhong
> > > > >
> > > >
> > >
> >
>
