andygrove commented on code in PR #4010:
URL: https://github.com/apache/datafusion-comet/pull/4010#discussion_r3139755478


##########
spark/src/main/scala/org/apache/comet/rules/CometExecRule.scala:
##########
@@ -100,19 +107,70 @@ case class CometExecRule(session: SparkSession)
 
   private lazy val showTransformations = 
CometConf.COMET_EXPLAIN_TRANSFORMATIONS.get()
 
+  /**
+   * Revert any `CometShuffleExchangeExec` with `CometColumnarShuffle` whose 
parent and child are
+   * both non-Comet `HashAggregateExec` / `ObjectHashAggregateExec` operators 
back to the original
+   * Spark `ShuffleExchangeExec`. This is the partial-final-aggregate pattern 
where Comet couldn't
+   * convert either aggregate; keeping a columnar shuffle between them only 
adds
+   * row->arrow->shuffle->arrow->row conversion overhead with no Comet 
consumer on either side.
+   * See https://github.com/apache/datafusion-comet/issues/4004.
+   *
+   * The match is intentionally narrow (both sides must be row-based 
aggregates that remained JVM
+   * after the main transform pass). Running the revert post-transform means 
we only fire when the
+   * main conversion already decided to keep both aggregates JVM - we never 
create the dangerous
+   * mixed mode where a Comet partial feeds a JVM final (see issue #1389).
+   *
+   * Also tag the reverted shuffle so AQE stage-isolated re-planning does not 
convert it back to a
+   * Comet shuffle when the outer aggregate context is no longer visible.
+   */

Review Comment:
   Expanded the docstring in 2241ccbe2 to call out the preColumnarTransitions 
correctness dependency and the SKIP_COMET_SHUFFLE_TAG guarding both the AQE 
QueryStagePrepRule and ColumnarRule re-entries.



##########
spark/src/test/scala/org/apache/comet/rules/CometExecRuleSuite.scala:
##########
@@ -229,4 +229,37 @@ class CometExecRuleSuite extends CometTestBase {
     }
   }
 
+  test("CometExecRule should not wrap shuffle in CometColumnarShuffle when 
both sides are JVM") {

Review Comment:
   Added both tests in 2241ccbe2: one asserts the revert does not fire when 
`COMET_EXEC_SHUFFLE_REVERT_REDUNDANT_COLUMNAR_ENABLED=false` (shuffle stays 
CometColumnarShuffle), and one asserts it does not fire when both aggregates 
convert to Comet native (shuffle stays CometColumnarShuffle with 
CometHashAggregate on both sides).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to