Re: [DISCUSS] FIP-17: Streaming KV Scan RPC

Giannis Polyzos Tue, 10 Mar 2026 00:49:11 -0700

Hi devs,
Let me know if there are any comments here; otherwise, I would like to start
a vote thread.


Best,
Giannis

On Thu, 5 Mar 2026 at 3:38 PM, Giannis Polyzos <[email protected]>
wrote:

> Hi devs,
>
> After a long time, I would like to restart the discussion on FIP-17.
>
> I made quite a few updates to the FIP, which you can find here:
>
> https://cwiki.apache.org/confluence/display/FLUSS/FIP-17+Primary+Key+Table+Snapshot+Queries
> and updated the title to better reflect the goal. Let me know if it makes
> sense.
>
> Moreover, at the end of the proposal you will find an *extras* section, which
> suggests a heartbeat mechanism. During my PoC I found that it is not really
> needed, but I would like your thoughts and feedback first.
>
> Best,
> Giannis
>
> On Wed, Oct 29, 2025 at 2:45 PM Giannis Polyzos <[email protected]>
> wrote:
>
>> Yang, thank you for your thoughtful comments.
>>
>> Indeed, we are streaming the results to the client; however, it's still a
>> batch operation. We could use "KV store (or PK table) Snapshot Query" or
>> something similar, since we are querying a RocksDB snapshot. WDYT?
>> The newly introduced KvBatchScanner should be usable both from the client
>> itself (for example, to periodically query the full RocksDB KV store to
>> power real-time dashboards) and from Flink (with more engines to follow
>> later).
>> It issues requests to fetch the results per bucket and transmit them back
>> to the client.
>>
>> > Could you elaborate on why the new KvBatchScanner isn't reusable?
>> I think the reasoning here is that each request creates a new
>> KvBatchScanner, which polls the records and then closes automatically. Is
>> there a reason you see this as a limitation, and should we consider making
>> it reusable?
>>
>> The design aims mainly at the Fluss client API. Should we add an
>> integration design with Flink? Wang Cheng, WDYT?
>>
>> Best,
>> Giannis
>>
>>
>>
>> On Tue, Oct 28, 2025 at 4:44 AM Yang Wang <[email protected]>
>> wrote:
>>
>>> Hi Cheng,
>>>
>>> Thank you for driving this excellent work! Your FIP document shows great
>>> thought and initiative. I've gone through it and have some questions and
>>> suggestions that I hope can further enhance this valuable contribution.
>>>
>>> 1. Regarding the title, I believe we could consider changing it to
>>> "Support full scan in batch mode for PrimaryKey Table". The term
>>> "Streaming" might cause confusion with Flink's streaming/batch modes, and
>>> this revised title would provide better clarity.
>>>
>>> 2. In the Motivation section, I think there are two particularly important
>>> benefits worth highlighting: (1) OLAP engines will be able to perform full
>>> snapshot reads on Fluss primary-key tables. (2) This approach can replace
>>> the current KvSnapshotBatchScanner, allowing the Fluss client to eliminate
>>> its RocksDB dependency entirely.
>>>
>>> 3. Concerning the Proposed Changes, could you clarify when exactly the
>>> client creates a KV snapshot on the server side, and when we send the
>>> bucket_scan_req?
>>>
>>> Let me share my thinking on this: When Flink attempts to read from a
>>> PrimaryKey table, the FlinkSourceEnumerator in the JobMaster generates
>>> HybridSnapshotLogSplits and dispatches them to SplitReaders running on the
>>> TaskManager. The JobMaster doesn't actually read data; it merely defines
>>> and manages the splits. Therefore, we need to ensure the JM has sufficient
>>> information to determine the boundary of the KV snapshot and the
>>> startOffset of the LogSplit.
>>>
>>> I suggest we explicitly create a snapshot (or as you've termed it, a
>>> new_scan_request) on the server side. This way, the FlinkSourceEnumerator
>>> can use it to define a HybridSnapshotLogSplit, and the SplitReaders can
>>> perform pollBatch operations on this snapshot (which would be bound to the
>>> specified scanner_id).
>>>
>>> 4. Could you elaborate on why the new KvBatchScanner isn't reusable? What's
>>> the reasoning behind this limitation? (I believe RocksDB iterators do
>>> support the seekToFirst operation.) If a TaskManager fails over before a
>>> checkpoint, rescanning an existing snapshot seems like a natural
>>> requirement.
>>>
>>> 5. I think it would be beneficial to include some detailed design aspects
>>> regarding Flink's integration with the new BatchScanner.
>>>
>>> Overall, this is a solid foundation for an important enhancement. Looking
>>> forward to discussing these points further!
>>>
>>> Best regards, Yang
>>>
>>> On Wed, Oct 22, 2025 at 5:09 PM Wang Cheng <[email protected]> wrote:
>>>
>>> > Hi all,
>>> >
>>> >
>>> > As of v0.8, Fluss only supports KV snapshot batch scan and limit KV
>>> > batch scan. The former approach is constrained by snapshot availability
>>> > and remote storage performance, while the latter is only applicable to
>>> > queries with a LIMIT clause and risks high memory pressure.
>>> >
>>> >
>>> > To address those limitations, Giannis Polyzos and I are writing to
>>> > propose FIP-17: a general-purpose streaming KV scan for Fluss [1].
>>> >
>>> >
>>> > Any feedback and suggestions on this proposal are welcome!
>>> >
>>> >
>>> > [1]:
>>> > https://cwiki.apache.org/confluence/display/FLUSS/FIP-17+Streaming+KV+Scan+RPC
>>> >
>>> > Regards,
>>> > Cheng
>>> >
>>> >
>>> >
>>>
>>
