yashmayya opened a new pull request, #15630: URL: https://github.com/apache/pinot/pull/15630
- Adds support for `ASOF JOIN` (and `LEFT ASOF JOIN`) in the multi-stage engine, with syntax and semantics similar to Snowflake's - https://docs.snowflake.com/en/sql-reference/constructs/asof-join
- Calcite added support for this join type (in the parser, validator, and rel logic) in `1.38.0` - https://github.com/apache/calcite/pull/3883 (we recently upgraded from `1.37.0` to `1.39.0` - https://github.com/apache/pinot/pull/15263).
- Note that Calcite currently doesn't support `ASOF JOIN`s without an `ON` clause, even though Snowflake does - https://docs.snowflake.com/en/sql-reference/constructs/asof-join#parameters. For now, the workaround is to use a clause like `ON TRUE` - https://github.com/apache/calcite/pull/3883#discussion_r2057922563. Calcite also requires the `ON` clause in these joins to be a conjunction of equalities (apart from literal conditions).
- The `MATCH_CONDITION` clause is mandatory (in both Calcite and Snowflake).
- Pinot does not use the `MATCH_CONDITION` to determine the exchange / distribution logic; only the actual join keys (from the `ON` clause) are used, through the usual hash exchange. The reason is that `MATCH_CONDITION` can only be one of `>`, `>=`, `<`, `<=`, and the semantics of the join (finding the closest match) mean that we can't use buckets to distribute the data accurately either. If the clause is `ON TRUE`, we'll use a random + broadcast distribution strategy, similar to regular joins without equality-based join conditions.
- This patch also refactors the physical `BaseJoinOperator` and its existing implementations to share more logic, and adds a new `AsofJoinOperator` implementation that uses a tree map for efficient (binary search based) computation of the closest match as per the `MATCH_CONDITION`.
- Since H2 doesn't support `ASOF JOIN` queries, the test cases added here only have manually validated logical correctness.
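
To illustrate the tree-map lookup described in the description above, here is a minimal, self-contained sketch. It is not the `AsofJoinOperator` code from this patch: the class `AsofJoinSketch`, the `MatchOp` enum, and the `String` / `long` row representation are all hypothetical and chosen only to keep the example short. It assumes the `MATCH_CONDITION` is written as `left OP right`, and it groups right-side rows by the `ON` join key before searching for the closest match.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch, not Pinot's AsofJoinOperator.
class AsofJoinSketch {
  // The four comparisons allowed in MATCH_CONDITION, read as: left OP right.
  enum MatchOp { GT, GTE, LT, LTE }

  // Right side index: ON-clause join key -> (match value -> right row payload).
  private final Map<String, TreeMap<Long, String>> rightIndex = new HashMap<>();

  void addRight(String joinKey, long matchValue, String payload) {
    // Simplification: a later row with the same match value overwrites the earlier one.
    rightIndex.computeIfAbsent(joinKey, k -> new TreeMap<>()).put(matchValue, payload);
  }

  // Returns the payload of the closest right row satisfying the condition, or null
  // if there is none. Whether an unmatched left row is dropped or null-padded
  // (LEFT ASOF JOIN) is left to the caller.
  String closest(String joinKey, long leftValue, MatchOp op) {
    TreeMap<Long, String> tree = rightIndex.get(joinKey);
    if (tree == null) {
      return null;
    }
    Map.Entry<Long, String> entry;
    switch (op) {
      case GT:  // left > right: greatest right value strictly below left
        entry = tree.lowerEntry(leftValue);
        break;
      case GTE: // left >= right: greatest right value at or below left
        entry = tree.floorEntry(leftValue);
        break;
      case LT:  // left < right: smallest right value strictly above left
        entry = tree.higherEntry(leftValue);
        break;
      default:  // LTE, left <= right: smallest right value at or above left
        entry = tree.ceilingEntry(leftValue);
        break;
    }
    return entry == null ? null : entry.getValue();
  }
}
```

Each lookup is a single `O(log n)` navigation call on `java.util.TreeMap`, which is why a sorted structure fits the "closest match" semantics better than a plain hash table keyed on the match value.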