zhuqi-lucas commented on PR #21828: URL: https://github.com/apache/datafusion/pull/21828#issuecomment-4342436907
Updated per review feedback:

**Simplified scope** (per @alamb suggestion):
- Offset pushdown now applies only to **single file + no filter** (Case 1 only)
- Multi-file and filtered queries keep `GlobalLimitExec` (follow-up)
- Removed `offset_fully_handled`: `with_offset` returning `Some` means fully handled
- `with_offset` returns `None` when: not parquet, has filter, or multi-file

**Code changes**:
1. Moved `offset_fully_handled` check into `with_offset` guard (single file + no filter)
2. Removed `offset_fully_handled` from `ExecutionPlan` / `DataSource` traits
3. Optimizer: `with_offset` returns `Some` → eliminate `GlobalLimitExec`, `None` → keep it
4. Updated `supports_offset` docs to not reference `GlobalLimitExec` (per alamb suggestion)
5. Kept `Arc<AtomicUsize>` shared counter in opener for future multi-file support

**Still TODO** (will address in next push):
- Move RowSelection logic into `PreparedAccessPlan` method (per alamb suggestion)
- Mark PR as API change
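To make the `Some` / `None` contract described above concrete, here is a minimal, self-contained sketch. It is not the PR's code: `ScanConfig`, `Plan`, and `push_down_offset` are hypothetical stand-ins invented for illustration, and only the offset (not the fetch limit) is modeled. The real `with_offset` lives on the DataFusion traits and has a different signature.

```rust
/// Hypothetical, simplified stand-in for a file scan's configuration.
/// None of these fields are taken from the actual DataFusion types.
#[derive(Debug, Clone, PartialEq)]
struct ScanConfig {
    is_parquet: bool,
    has_filter: bool,
    file_count: usize,
    /// Number of leading rows the scan itself will skip.
    offset: usize,
}

impl ScanConfig {
    /// Returns `Some(new_config)` only when the scan can fully apply the
    /// offset on its own: Parquet source, no filter, exactly one file.
    /// In every other case it returns `None`, so the caller must keep
    /// applying the offset above the scan.
    fn with_offset(&self, offset: usize) -> Option<ScanConfig> {
        if !self.is_parquet || self.has_filter || self.file_count != 1 {
            return None;
        }
        Some(ScanConfig {
            offset,
            ..self.clone()
        })
    }
}

/// Hypothetical two-node plan: either a bare scan, or a scan wrapped in a
/// node that skips rows (playing the role of `GlobalLimitExec`).
#[derive(Debug, PartialEq)]
enum Plan {
    Scan(ScanConfig),
    GlobalLimit { skip: usize, input: Box<Plan> },
}

/// Sketch of the optimizer decision: `Some` means the offset is fully
/// handled and the limit node is dropped; `None` means the node is kept.
fn push_down_offset(scan: ScanConfig, skip: usize) -> Plan {
    match scan.with_offset(skip) {
        Some(pushed) => Plan::Scan(pushed),
        None => Plan::GlobalLimit {
            skip,
            input: Box::new(Plan::Scan(scan)),
        },
    }
}
```

In this sketch, a single-file, unfiltered Parquet config passed to `push_down_offset` comes back as a bare `Plan::Scan` with `offset` set, while a filtered or multi-file config comes back wrapped in `Plan::GlobalLimit`. Because the `Option` itself carries the "fully handled" signal, no separate flag such as `offset_fully_handled` is needed.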
