parthchandra commented on code in PR #4010: URL: https://github.com/apache/datafusion-comet/pull/4010#discussion_r3139225743
########## spark/src/main/scala/org/apache/comet/rules/CometExecRule.scala: ########## @@ -100,19 +107,70 @@ case class CometExecRule(session: SparkSession) private lazy val showTransformations = CometConf.COMET_EXPLAIN_TRANSFORMATIONS.get() + /** + * Revert any `CometShuffleExchangeExec` with `CometColumnarShuffle` whose parent and child are + * both non-Comet `HashAggregateExec` / `ObjectHashAggregateExec` operators back to the original + * Spark `ShuffleExchangeExec`. This is the partial-final-aggregate pattern where Comet couldn't + * convert either aggregate; keeping a columnar shuffle between them only adds + * row->arrow->shuffle->arrow->row conversion overhead with no Comet consumer on either side. + * See https://github.com/apache/datafusion-comet/issues/4004. + * + * The match is intentionally narrow (both sides must be row-based aggregates that remained JVM + * after the main transform pass). Running the revert post-transform means we only fire when the + * main conversion already decided to keep both aggregates JVM - we never create the dangerous + * mixed mode where a Comet partial feeds a JVM final (see issue #1389). + * + * Also tag the reverted shuffle so AQE stage-isolated re-planning does not convert it back to a + * Comet shuffle when the outer aggregate context is no longer visible. + */ Review Comment: Maybe also add that correctness depends on running as `preColumnarTransitions` (before Spark inserts `ColumnarToRowExec`), and that the `SKIP_COMET_SHUFFLE_TAG` guards against re-conversion during both AQE `QueryStagePrepRule` and `ColumnarRule` passes. ########## spark/src/test/scala/org/apache/comet/rules/CometExecRuleSuite.scala: ########## @@ -229,4 +229,37 @@ class CometExecRuleSuite extends CometTestBase { } } + test("CometExecRule should not wrap shuffle in CometColumnarShuffle when both sides are JVM") { Review Comment: Claude recommends: Two additional tests would round out coverage: (a) assert the revert does NOT fire when `revertRedundantColumnar.enabled=false`, and (b) assert the revert does NOT fire when aggregates are successfully converted to Comet native. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
