platinumhamburg opened a new issue, #2950: URL: https://github.com/apache/fluss/issues/2950
### Search before asking

- [x] I searched in the [issues](https://github.com/apache/fluss/issues) and found nothing similar.

### Description

Introduce server-side record batch filtering using batch-level statistics (min/max values, null counts) that are already available in the V1 log batch format. When a client sends a fetch request with a filter predicate, the server evaluates the predicate against each batch's statistics and skips batches that cannot contain matching records.

Key points:

- **Batch-level filtering, not row-level**: the server uses batch statistics to skip entire batches. The client still performs row-level filtering on the returned batches.
- **ARROW format only**: only ARROW log format includes batch-level statistics (V1+ magic). COMPACTED/INDEXED formats fall back to unfiltered reads.
- **Schema evolution safe**: a `PredicateSchemaResolver` adapts the predicate when the batch schema differs from the predicate schema, with safe fallback (include the batch) on any failure.
- **Offset advancement**: when all batches in a fetch are filtered out, the server returns a `filteredEndOffset` so the client can advance past the filtered range without re-fetching.

### Willingness to contribute

- [x] I'm willing to submit a PR!

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
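
For illustration only, the batch-skipping and offset-advancement behavior proposed in the issue above might look roughly like the Python sketch below. It is not Fluss code: `BatchStats`, `Batch`, `may_match_greater_than`, and `fetch` are hypothetical names, the predicate is fixed to a single `column > threshold` comparison, and computing `filtered_end_offset` as the last skipped offset plus one is an assumption about what `filteredEndOffset` would carry. Schema adaptation via `PredicateSchemaResolver` is not modeled.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class BatchStats:
    """Hypothetical statistics for one column of one log batch."""
    min_value: Optional[int]
    max_value: Optional[int]
    null_count: int
    row_count: int


@dataclass
class Batch:
    """Hypothetical log batch covering offsets [base_offset, last_offset]."""
    base_offset: int
    last_offset: int
    stats: Optional[BatchStats]  # None models a batch without statistics


def may_match_greater_than(stats: Optional[BatchStats], threshold: int) -> bool:
    """Return False only when no row can satisfy `column > threshold`."""
    if stats is None:
        # No statistics available: safe fallback is to include the batch.
        return True
    if stats.null_count == stats.row_count:
        # Every value is null, so the comparison can never be true.
        return False
    if stats.max_value is None:
        # Incomplete statistics: include the batch rather than risk data loss.
        return True
    return stats.max_value > threshold


def fetch(batches: List[Batch], threshold: int) -> Tuple[List[Batch], Optional[int]]:
    """Skip batches that cannot match; report an end offset if all were skipped."""
    kept = [b for b in batches if may_match_greater_than(b.stats, threshold)]
    filtered_end_offset = None
    if batches and not kept:
        # Assumption: the client resumes at the offset after the last skipped batch.
        filtered_end_offset = batches[-1].last_offset + 1
    return kept, filtered_end_offset
```

A batch that survives this check may still contain non-matching rows, which is why the client keeps its row-level filter.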
