andygrove opened a new issue, #4191:
URL: https://github.com/apache/datafusion-comet/issues/4191

   Sub-issue of #4098.
   
   ## Description
   
   Three tests in `DataFrameSetOperationsSuite` (Spark 4.1) fail under Comet:
   
   - `SPARK-52921: union partitioning - reused shuffle`
   - `SPARK-52921: union partitioning - semantic equality`
   - `SPARK-52921: union partitioning - range partitioning`
   
   These tests inspect the executed plan with collectors like:
   
   ```scala
   val unionExec = union.queryExecution.executedPlan.collect {
     case u: UnionExec => u
   }
   assert(unionExec.size == 1)
   ```
   
   Comet replaces `UnionExec` with `CometUnionExec` (which extends `CometExec`, **not** `UnionExec`) and `ShuffleExchangeExec` with a Comet equivalent. The collectors find zero matches, so `unionExec.size == 0` and the assertions fail.
   
   ## Root cause
   
   This is a test-side incompatibility with Comet's operator-replacement strategy. No production code change is needed; the fix is to patch the test matchers in `dev/diffs/4.1.1.diff` so that they also accept Comet's wrapper classes.
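
The mismatch and the proposed matcher change can be illustrated with a small self-contained sketch. The plan classes below are toy stand-ins that only borrow the names `UnionExec` and `CometUnionExec`; they are not Spark's or Comet's real classes, and `Plan`, `Scan`, and the simplified `collect` are hypothetical helpers. The actual patch in `dev/diffs/4.1.1.diff` may look different.

```scala
// Toy stand-ins for illustration only; NOT the real Spark or Comet classes.
sealed trait Plan {
  def children: Seq[Plan]

  // Simplified analogue of a plan tree's collect: pre-order traversal
  // that keeps every node on which the partial function is defined.
  def collect[B](pf: PartialFunction[Plan, B]): Seq[B] =
    pf.lift(this).toSeq ++ children.flatMap(_.collect(pf))
}

case class Scan(name: String) extends Plan { val children: Seq[Plan] = Nil }
case class UnionExec(children: Seq[Plan]) extends Plan

// As described in the issue: a separate type, not a subclass of UnionExec.
case class CometUnionExec(children: Seq[Plan]) extends Plan

// Plan shape after the union operator has been replaced (assumed for this sketch).
val cometPlan: Plan = CometUnionExec(Seq(Scan("a"), Scan("b")))

// Original matcher: only sees UnionExec, so it finds nothing here.
val original = cometPlan.collect { case u: UnionExec => u }

// Widened matcher: accepts either operator.
val patched = cometPlan.collect {
  case u: UnionExec      => u
  case u: CometUnionExec => u
}

assert(original.isEmpty)
assert(patched.size == 1)
```

In this toy model the original matcher collects nothing while the widened one collects one node, which is the `size == 1` the tests assert. Presumably the shuffle-exchange matchers in those tests would need the same kind of widening.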
   
   ## Where
   
   `sql/core/src/test/scala/org/apache/spark/sql/DataFrameSetOperationsSuite.scala`, currently tagged `IgnoreComet("https://github.com/apache/datafusion-comet/issues/4098")`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

