mbutrovich commented on issue #3510: URL: https://github.com/apache/datafusion-comet/issues/3510#issuecomment-4329308481
**AQE DPP for V1 (addressed by #4112):** The main challenge was that Spark's `PlanAdaptiveDynamicPruningFilters` runs as a built-in `queryStageOptimizerRule`, ahead of any custom rules. Because Comet replaces `BroadcastHashJoinExec` with `CometBroadcastHashJoinExec` in `queryStagePreparationRules`, Spark's rule cannot find a matching join and rewrites the DPP filter to `Literal.TrueLiteral`, which disables pruning. Custom `queryStageOptimizerRules` registered via `injectQueryStageOptimizerRule` run after the built-in ones, so the `SubqueryAdaptiveBroadcastExec` nodes (SABs) are already destroyed by the time our rule sees the plan.

The solution is a two-phase approach:

1. During `queryStagePreparationRules`, wrap each SAB in `CometSubqueryAdaptiveBroadcastExec` so that Spark's pattern match does not recognize it.
2. In a custom `queryStageOptimizerRule`, after the broadcast stages are materialized, convert the wrapper to `CometSubqueryBroadcastExec`.

This approach should transfer directly to **AQE DPP for Iceberg** (#4033), since the conversion rule is scan-type-agnostic. #4033's `postColumnarTransitions` approach works for Iceberg because `runtimeFilters` are hidden from Spark's expression tree. For V1 Parquet, where `partitionFilters` are visible to Spark, the wrapping approach is required.
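The rule-ordering problem and the two-phase fix described above can be illustrated with a toy model. This is only a sketch in plain Python, not Spark or Comet code: `Sab`, `WrappedSab`, `BroadcastSubquery`, and the three rule functions are hypothetical stand-ins, and the built-in rule is simplified to the one case discussed here (the expected join has already been replaced, so every SAB it recognizes becomes a true literal).

```python
from dataclasses import dataclass

# Toy stand-ins only; none of these are real Spark or Comet classes.

@dataclass(frozen=True)
class Sab:
    """Plays the role of an adaptive broadcast subquery (SAB) in a DPP filter."""
    key: str

@dataclass(frozen=True)
class WrappedSab:
    """Plays the role of the wrapper that the built-in rule does not recognize."""
    inner: Sab

@dataclass(frozen=True)
class BroadcastSubquery:
    """Plays the role of the final broadcast subquery that keeps DPP alive."""
    key: str

TRUE_LITERAL = "TrueLiteral"  # a filter that prunes nothing

def wrap_sabs(filters):
    """Phase 1 (preparation): hide every SAB behind a wrapper."""
    return [WrappedSab(f) if isinstance(f, Sab) else f for f in filters]

def builtin_rule(filters):
    """Simplified built-in optimizer rule. Assumption: the join it looks for
    was already replaced, so any SAB it recognizes is turned into a true literal."""
    return [TRUE_LITERAL if isinstance(f, Sab) else f for f in filters]

def convert_wrapped(filters):
    """Phase 2 (custom optimizer rule, runs after the built-in one):
    turn each surviving wrapper into a broadcast subquery."""
    return [
        BroadcastSubquery(f.inner.key) if isinstance(f, WrappedSab) else f
        for f in filters
    ]

def run(filters, wrap):
    """Apply the rules in the order described: prep, built-in, custom."""
    if wrap:
        filters = wrap_sabs(filters)
    filters = builtin_rule(filters)
    return convert_wrapped(filters)

print(run([Sab("part_col")], wrap=False))  # ['TrueLiteral']
print(run([Sab("part_col")], wrap=True))   # [BroadcastSubquery(key='part_col')]
```

Without the wrapper, the custom rule has nothing left to convert because the built-in rule has already replaced the SAB. With the wrapper, the SAB survives the built-in pass and the custom rule can convert it afterwards.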
