Hello Fluss Community,
I would like to start a discussion on FIP-10: Support Log RecordBatch Filter 
Pushdown (FIP Page <https://github.com/platinumhamburg/fluss>). This 
optimization aims to improve the performance of Log table queries and is now 
ready for community feedback.
Core Motivation
Currently, filtering on non-PK/non-partition keys requires:

 * Transferring full RecordBatches from storage,
 * Transmitting irrelevant records over the network,
 * Decompressing non-matching Arrow data.

This results in unnecessary network, memory, and CPU overhead, especially for 
queries whose filter matches only a small fraction of the records.
FIP-10 introduces RecordBatch-level filter pushdown to enable early filtering 
at the storage layer, reducing:

 * Network transfer by skipping non-matching batches,
 * Memory pressure via pre-deserialization filtering,
 * CPU cost from decompression of discarded data.

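To make the batch-skipping idea concrete, here is a minimal sketch in Java. It assumes that each batch carries min/max statistics for the filtered column, which is an assumption of this example and not something stated above. The names BatchFilterSketch, Batch, pruneBatches, and filterRows are hypothetical and are not Fluss APIs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch only: none of these types are Fluss APIs.
class BatchFilterSketch {

    /** A batch of values for one long column, plus assumed min/max statistics. */
    static final class Batch {
        final long min;
        final long max;
        final long[] values;

        Batch(long... values) {
            this.values = values;
            // An empty batch gets min > max, so it can never match a predicate.
            this.min = Arrays.stream(values).min().orElse(Long.MAX_VALUE);
            this.max = Arrays.stream(values).max().orElse(Long.MIN_VALUE);
        }
    }

    /** Storage side: keep only batches that may contain a value > threshold. */
    static List<Batch> pruneBatches(List<Batch> batches, long threshold) {
        List<Batch> kept = new ArrayList<>();
        for (Batch b : batches) {
            // If even the maximum fails the predicate, no record in the batch matches.
            if (b.max > threshold) {
                kept.add(b);
            }
        }
        return kept;
    }

    /** Reader side: surviving batches can still hold non-matching records. */
    static List<Long> filterRows(List<Batch> batches, long threshold) {
        List<Long> out = new ArrayList<>();
        for (Batch b : batches) {
            for (long v : b.values) {
                if (v > threshold) {
                    out.add(v);
                }
            }
        }
        return out;
    }
}
```

In this sketch, pruning is conservative: a batch is dropped only when its statistics prove that no record can match, so the reader still applies the row-level filter to whatever arrives. Skipped batches are never transferred, deserialized, or decompressed.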
Implementation Status
A proof-of-concept (PoC) has been implemented in the logfilter branch 
<https://github.com/platinumhamburg/fluss> and is ready for testing and 
preview.
Any feedback and suggestions on this proposal are welcome! Looking forward to 
your insights.
Best regards,
Yang Wang
