mbutrovich commented on issue #3510:
URL: 
https://github.com/apache/datafusion-comet/issues/3510#issuecomment-4329308481

   **AQE DPP for V1 (addressed by #4112):**
   
   The main challenge was Spark's `PlanAdaptiveDynamicPruningFilters` running 
as a built-in `queryStageOptimizerRule` before custom rules. Since Comet 
replaces `BroadcastHashJoinExec` with `CometBroadcastHashJoinExec` in 
`queryStagePreparationRules`, Spark's rule can't find a match and converts the 
DPP filter to `Literal.TrueLiteral` (disabling it). Custom 
`queryStageOptimizerRules` registered via `injectQueryStageOptimizerRule` run 
after the built-in ones, so the `SubqueryAdaptiveBroadcastExec` nodes (SABs) 
are already destroyed by the time our rule sees them.
   
   The solution is a two-phase approach: wrap SABs in 
`CometSubqueryAdaptiveBroadcastExec` during `queryStagePreparationRules` (so 
Spark's pattern match doesn't recognize them), then convert to 
`CometSubqueryBroadcastExec` in a custom `queryStageOptimizerRule` after 
broadcast stages are materialized.
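
The rule ordering described above can be illustrated with a small toy model. This is only a sketch, not Comet or Spark code: the `Plan` dataclass, the rule functions, and the `run` helper are all hypothetical stand-ins, and the built-in rule is reduced to the single branch discussed here (the real `PlanAdaptiveDynamicPruningFilters` has more cases). Plan nodes are modeled as plain strings that reuse the class names from this comment.

```python
from dataclasses import dataclass, replace

# Toy model only: node types are represented by their class names as strings.
SAB = "SubqueryAdaptiveBroadcastExec"
WRAPPED_SAB = "CometSubqueryAdaptiveBroadcastExec"
COMET_BROADCAST = "CometSubqueryBroadcastExec"
TRUE_LITERAL = "Literal.TrueLiteral"


@dataclass(frozen=True)
class Plan:
    """Hypothetical, heavily simplified plan: one join and one DPP filter."""
    join: str
    dpp_filter: str


def comet_replace_join(plan: Plan) -> Plan:
    """Preparation phase: Comet swaps in its own broadcast hash join."""
    return replace(plan, join="CometBroadcastHashJoinExec")


def comet_wrap_sab(plan: Plan) -> Plan:
    """Preparation phase (phase one of the fix): hide the SAB in a wrapper."""
    if plan.dpp_filter == SAB:
        return replace(plan, dpp_filter=WRAPPED_SAB)
    return plan


def spark_builtin_dpp_rule(plan: Plan) -> Plan:
    """Simplified stand-in for the built-in rule: it only matches Spark's
    own SAB, and only keeps DPP if it finds Spark's own join."""
    if plan.dpp_filter != SAB:
        return plan  # pattern does not match, so the node is left alone
    if plan.join == "BroadcastHashJoinExec":
        return replace(plan, dpp_filter="SubqueryBroadcastExec")
    return replace(plan, dpp_filter=TRUE_LITERAL)  # DPP is disabled


def comet_convert_wrapped_sab(plan: Plan) -> Plan:
    """Custom optimizer rule (phase two): runs after the built-in rules."""
    if plan.dpp_filter == WRAPPED_SAB:
        return replace(plan, dpp_filter=COMET_BROADCAST)
    return plan


def run(plan: Plan, prep_rules, optimizer_rules) -> Plan:
    """Apply preparation rules first, then optimizer rules, in list order."""
    for rule in list(prep_rules) + list(optimizer_rules):
        plan = rule(plan)
    return plan


initial = Plan(join="BroadcastHashJoinExec", dpp_filter=SAB)
# Built-in rule first, custom rule second, mirroring the ordering constraint.
optimizer_rules = [spark_builtin_dpp_rule, comet_convert_wrapped_sab]

without_wrapping = run(initial, [comet_replace_join], optimizer_rules)
with_wrapping = run(initial, [comet_replace_join, comet_wrap_sab], optimizer_rules)

print(without_wrapping.dpp_filter)
print(with_wrapping.dpp_filter)
```

In this model, the run without wrapping ends with the filter collapsed to the true literal before the custom rule can act, while the run with wrapping lets the SAB survive the built-in rule and reach the custom conversion.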
   
   This approach should transfer directly to **AQE DPP for Iceberg** (#4033), 
since the conversion rule is scan-type-agnostic. #4033's 
`postColumnarTransitions` approach works for Iceberg because `runtimeFilters` 
are hidden from Spark's expression tree. For V1 Parquet, where 
`partitionFilters` are visible to Spark, the wrapping approach is required.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

