yashmayya opened a new pull request, #18482:
URL: https://github.com/apache/pinot/pull/18482

   ## Summary
   
   Adds support for the SQL standard `EXCLUDE` clause on window functions, 
covering all four options:
   - `EXCLUDE NO OTHERS` (default; existing behavior preserved)
   - `EXCLUDE CURRENT ROW`
   - `EXCLUDE GROUP`
   - `EXCLUDE TIES`
   
   The clause is supported for the window functions where it is semantically 
meaningful (`SUM`, `COUNT`, `AVG`, `MIN`, `MAX`, `BOOL_AND`, `BOOL_OR`, 
`FIRST_VALUE`, `LAST_VALUE`) across both `ROWS` and `RANGE` frames. Ranking 
functions and `LAG`/`LEAD` keep their implicit framing per the SQL standard 
(Calcite rejects `EXCLUDE` on these at parse time).
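
To make the four options concrete, here is a minimal reference model of the exclusion semantics. It is a naive illustration and not the Pinot implementation: the class `ExcludeSemanticsSketch`, its `Exclusion` enum, and the `sum` signature are hypothetical, and peers are assumed to be rows with equal `ORDER BY` keys.

```java
/** Hypothetical reference model of frame exclusion; not Pinot code. */
public final class ExcludeSemanticsSketch {
  public enum Exclusion { NO_OTHERS, CURRENT_ROW, GROUP, TIES }

  /**
   * Naive SUM over the frame [frameStart, frameEnd] (inclusive row indices)
   * as seen from row `current`. Rows with equal order keys are peers.
   */
  public static long sum(long[] values, long[] orderKeys, int frameStart, int frameEnd,
      int current, Exclusion exclusion) {
    long total = 0;
    for (int i = frameStart; i <= frameEnd; i++) {
      boolean isCurrent = i == current;
      // A row is always a peer of itself, so isPeer is true for the current row.
      boolean isPeer = orderKeys[i] == orderKeys[current];
      boolean excluded;
      switch (exclusion) {
        case CURRENT_ROW:
          excluded = isCurrent;              // drop only the current row
          break;
        case GROUP:
          excluded = isPeer;                 // drop the current row and all its peers
          break;
        case TIES:
          excluded = isPeer && !isCurrent;   // drop the peers but keep the current row
          break;
        default:
          excluded = false;                  // NO_OTHERS: frame is used as-is
      }
      if (!excluded) {
        total += values[i];
      }
    }
    return total;
  }
}
```

With order keys `{1, 2, 2, 3}`, values `{10, 20, 30, 40}`, a frame covering all four rows, and the current row at index 1, this model yields 100 for `NO_OTHERS`, 80 for `CURRENT_ROW`, 50 for `GROUP`, and 70 for `TIES`.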
   
   ## Implementation
   
   **Plan side**:
   - New `WindowExclusion` proto enum on `WindowNode` (field 8, default 0 = 
`EXCLUDE_NO_OTHERS` so old serialized plans round-trip safely).
   - `RelToPlanNodeConverter` / `PRelToPlanNodeConverter` propagate the 
exclusion through; the previous `Preconditions.checkState` rejecting 
non-default exclusions is removed.
   - `PlanNodeToRelConverter`, both serde sides, and `PlanNodeMerger` 
round-trip the new field.
   
   **Runtime side**:
   - `WindowFrame` carries the exclusion; `WindowFunction` base gains O(n) 
`computePeerBoundaries` + O(1) `firstNonExcluded` / `lastNonExcluded` helpers. 
The default `EXCLUDE NO OTHERS` case is short-circuited up front, so the 
existing hot path is unchanged.
   - `AggregateWindowFunction` handles ROWS and all four supported RANGE shapes 
(UU / UC / CU / CC) using a sliding aggregator with per-row apply / unapply 
correction. Peer-bound computation is skipped for `EXCLUDE CURRENT ROW` when the 
frame shape allows.
   - `FirstValueWindowFunction` / `LastValueWindowFunction` compute the 
effective first / last index in O(1) per row from peer bounds; `IGNORE NULLS` 
continues to work.
   - The existing monotonic-deque MIN/MAX aggregators don't support arbitrary 
removal, so a new `SortedMultisetMinMaxWindowValueAggregator` (TreeMap-backed, 
O(log K) per op) is selected when EXCLUDE forces per-row corrections. SUM / 
COUNT / AVG / BOOL_AND / BOOL_OR can add and remove values in any order, so 
they reuse the existing aggregators.
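
The peer-boundary helpers described in the runtime notes can be sketched roughly as follows. This is an illustration under assumptions, not the code in this PR: the method names mirror the ones mentioned above, but the class `PeerBoundsSketch`, the signatures, the `long[]` key representation, and the `-1` "empty frame" convention are all hypothetical.

```java
/** Hypothetical sketch of peer-boundary helpers; signatures are assumptions, not Pinot's. */
public final class PeerBoundsSketch {
  public enum Exclusion { NO_OTHERS, CURRENT_ROW, GROUP, TIES }

  /**
   * One O(n) pass in each direction over keys already sorted by ORDER BY.
   * Returns {peerStart, peerEnd}, where peerStart[i]..peerEnd[i] (inclusive)
   * is the peer group of row i.
   */
  public static int[][] computePeerBoundaries(long[] sortedKeys) {
    int n = sortedKeys.length;
    int[] start = new int[n];
    int[] end = new int[n];
    int groupStart = 0;
    for (int i = 0; i < n; i++) {
      if (i > 0 && sortedKeys[i] != sortedKeys[i - 1]) {
        groupStart = i;
      }
      start[i] = groupStart;
    }
    int groupEnd = n - 1;
    for (int i = n - 1; i >= 0; i--) {
      if (i < n - 1 && sortedKeys[i] != sortedKeys[i + 1]) {
        groupEnd = i;
      }
      end[i] = groupEnd;
    }
    return new int[][] {start, end};
  }

  /** O(1): first index in [lo, hi] that survives the exclusion for `row`, or -1 if none. */
  public static int firstNonExcluded(int lo, int hi, int row, int peerStart, int peerEnd,
      Exclusion exclusion) {
    if (lo > hi) {
      return -1;
    }
    if (exclusion == Exclusion.NO_OTHERS) {
      return lo;
    }
    // The excluded rows form one contiguous range (with a hole at `row` for TIES).
    int exLo = exclusion == Exclusion.CURRENT_ROW ? row : peerStart;
    int exHi = exclusion == Exclusion.CURRENT_ROW ? row : peerEnd;
    if (lo < exLo || lo > exHi) {
      return lo; // frame start lies outside the excluded range
    }
    if (exclusion == Exclusion.TIES && row >= lo && row <= hi) {
      return row; // everything between lo and row is an excluded peer
    }
    return exHi + 1 <= hi ? exHi + 1 : -1;
  }

  /** O(1): last index in [lo, hi] that survives the exclusion for `row`, or -1 if none. */
  public static int lastNonExcluded(int lo, int hi, int row, int peerStart, int peerEnd,
      Exclusion exclusion) {
    if (lo > hi) {
      return -1;
    }
    if (exclusion == Exclusion.NO_OTHERS) {
      return hi;
    }
    int exLo = exclusion == Exclusion.CURRENT_ROW ? row : peerStart;
    int exHi = exclusion == Exclusion.CURRENT_ROW ? row : peerEnd;
    if (hi < exLo || hi > exHi) {
      return hi; // frame end lies outside the excluded range
    }
    if (exclusion == Exclusion.TIES && row >= lo && row <= hi) {
      return row; // everything between row and hi is an excluded peer
    }
    return exLo - 1 >= lo ? exLo - 1 : -1;
  }
}
```

Because the excluded rows are always one contiguous index range around the current row, `FIRST_VALUE` and `LAST_VALUE` only need to step past that range, which is why a constant-time lookup per row is enough once the peer bounds are known.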
   
   Semantics were cross-verified against PostgreSQL.
   
   ## Test plan
   
   - [x] 11 new EXCLUDE cases in 
`pinot-query-runtime/src/test/resources/queries/WindowFunctions.json` 
exercising each of the four EXCLUDE options across `SUM` / `COUNT` / `AVG` / 
`MIN` / `FIRST_VALUE` / `LAST_VALUE`, plus ROWS / all four RANGE shapes / 
no-`ORDER BY`. Each expected output was generated from PostgreSQL.
   - [x] Unit tests for `SortedMultisetMinMaxWindowValueAggregator` (min / max 
with duplicates, out-of-order removal, no-op removal of an unknown value, null 
handling, `BigDecimal`).
   - [x] Full `ResourceBasedQueriesTest`, `WindowAggregateOperatorTest`, and 
`WindowValueAggregatorTest` suites pass.
   - [x] `spotless:apply` / `checkstyle:check` / `license:check` clean.
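
For readers unfamiliar with the multiset approach exercised by the unit tests above, the following is a minimal sketch of a TreeMap-backed min/max multiset. It is not the `SortedMultisetMinMaxWindowValueAggregator` from this PR: the class `SortedMultisetSketch` and its method names are hypothetical, and skipping nulls is an assumption made for the illustration.

```java
import java.util.TreeMap;

/** Hypothetical TreeMap-backed multiset; illustrates the idea, not Pinot's aggregator. */
public final class SortedMultisetSketch<T extends Comparable<T>> {
  // Maps each distinct value to the number of times it is currently present.
  private final TreeMap<T, Integer> counts = new TreeMap<>();

  /** Adds one occurrence of value. Nulls are skipped (an assumption of this sketch). */
  public void add(T value) {
    if (value == null) {
      return;
    }
    counts.merge(value, 1, Integer::sum);
  }

  /** Removes one occurrence of value; removing an absent value is a no-op. */
  public void remove(T value) {
    if (value == null) {
      return;
    }
    // Returning null from the remapping function deletes the entry.
    counts.computeIfPresent(value, (k, c) -> c == 1 ? null : c - 1);
  }

  /** Smallest value currently present, or null when empty. */
  public T min() {
    return counts.isEmpty() ? null : counts.firstKey();
  }

  /** Largest value currently present, or null when empty. */
  public T max() {
    return counts.isEmpty() ? null : counts.lastKey();
  }
}
```

Unlike a monotonic deque, this structure can remove any value regardless of insertion order, at the cost of a logarithmic lookup per operation. Note that `TreeMap` orders keys by `compareTo`, so `BigDecimal` values that differ only in scale share one entry in this sketch.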
   
   ## Backwards / rolling-upgrade notes
   
   The proto field is additive with the standard proto3 zero-default 
(`EXCLUDE_NO_OTHERS`). Queries without `EXCLUDE` planned by a new broker keep 
the same wire shape as today. If a new broker plans a non-default `EXCLUDE` and 
dispatches to an old server, the old server ignores the unknown field and 
silently evaluates the window as `EXCLUDE_NO_OTHERS`, so results come back 
without the exclusion applied. Servers should therefore be upgraded before 
brokers if operators expect the new SQL syntax to take effect.
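
The upgrade hazard can be illustrated with a toy model of the decode step. This does not use the protobuf library or Pinot's generated classes: `ExclusionWireSketch`, its `decode` method, and the map-based "wire" are hypothetical, and only field number 8 and the zero value `EXCLUDE_NO_OTHERS` come from this PR (the other enum numbers are assumed for illustration).

```java
import java.util.Map;

/** Toy model of proto3 decoding for the exclusion field; not generated protobuf code. */
public final class ExclusionWireSketch {
  /** Field number taken from the PR description. */
  public static final int EXCLUSION_FIELD = 8;

  /**
   * ordinal() stands in for the proto enum number. Only 0 = EXCLUDE_NO_OTHERS is
   * stated in the PR; the remaining numbers are assumptions of this sketch.
   */
  public enum WindowExclusion {
    EXCLUDE_NO_OTHERS, EXCLUDE_CURRENT_ROW, EXCLUDE_GROUP, EXCLUDE_TIES
  }

  /**
   * Decodes the exclusion from a map of field number to enum number.
   * A reader built before the field existed never looks at it, so it
   * behaves as if the zero default had been sent.
   */
  public static WindowExclusion decode(Map<Integer, Integer> wireFields,
      boolean readerKnowsExclusionField) {
    if (!readerKnowsExclusionField) {
      return WindowExclusion.EXCLUDE_NO_OTHERS; // old server: unknown field is ignored
    }
    // proto3 omits zero-valued fields on the wire, so absence means the default.
    int number = wireFields.getOrDefault(EXCLUSION_FIELD, 0);
    return WindowExclusion.values()[number];
  }
}
```

In this model a plan carrying `EXCLUDE_GROUP` decodes correctly on a new server but degrades to `EXCLUDE_NO_OTHERS` on an old one, with no error raised, which is the silent fallback the note above warns about.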



