andygrove opened a new pull request, #4044:
URL: https://github.com/apache/datafusion-comet/pull/4044

   ## Which issue does this PR close?
   
   Related to #4042.
   
   ## Rationale for this change
   
   In #4041 the `SubquerySuite` "SPARK-43402: FileSourceScanExec supports push 
down data filter with scalar subquery" test was retagged from the umbrella 
#3321 to the broader #3315 ("plan structure differences"). A closer look shows 
the first failure in this test is just a missed pattern match: the test 
collects `FileSourceScanLike` and `CometScanExec` but not 
`CometNativeScanExec`. After adding `CometNativeScanExec` to the match, that 
assertion passes, but the test still fails on a genuine behavior gap: the 
pushed scalar subquery ends up as a plain `Subquery` rather than a 
`ReusedSubqueryExec`, so subquery reuse does not happen. That narrower problem 
is now tracked in #4042.
   
   ## What changes are included in this PR?
   
   Updates `dev/diffs/4.0.1.diff` to:
   
   - Add a `CometNativeScanExec` branch to both pattern matches in the 
SPARK-43402 test (the `collect` over the executed plan, and the subsequent 
match extracting `dataFilters`). `CometNativeScanExec` already exposes 
`dataFilters`, `numFiles`, and `numOutputRows`, so the rest of the assertions 
work with the added case without further changes.
   - Retag the `IgnoreCometNativeDataFusion` reference from #3315 to the 
narrower #4042.
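
The effect of the added match arm can be illustrated with a minimal, self-contained sketch. This is not the actual diff: `PlanNode`, `FileSourceScan`, `CometScan`, `CometNativeScan`, `Project`, and the `collect` helper are hypothetical stand-ins for Spark's `FileSourceScanLike` and Comet's `CometScanExec` / `CometNativeScanExec`, and filters are modelled as plain strings. It only shows why a scan node with no matching `case` is silently dropped from the collected result.

```scala
// Hypothetical stand-ins for the real plan nodes (assumption: the real
// classes live in Spark and Comet and carry many more fields).
sealed trait PlanNode { def children: Seq[PlanNode] }
final case class FileSourceScan(dataFilters: Seq[String]) extends PlanNode {
  def children: Seq[PlanNode] = Nil
}
final case class CometScan(dataFilters: Seq[String]) extends PlanNode {
  def children: Seq[PlanNode] = Nil
}
final case class CometNativeScan(dataFilters: Seq[String]) extends PlanNode {
  def children: Seq[PlanNode] = Nil
}
final case class Project(child: PlanNode) extends PlanNode {
  def children: Seq[PlanNode] = Seq(child)
}

object ScanMatchSketch {
  // Pre-order traversal that keeps every node the partial function accepts.
  def collect[A](plan: PlanNode)(pf: PartialFunction[PlanNode, A]): Seq[A] =
    pf.lift(plan).toSeq ++ plan.children.flatMap(child => collect(child)(pf))

  // Match without the native scan arm: a CometNativeScan is skipped.
  def scansBefore(plan: PlanNode): Seq[PlanNode] =
    collect[PlanNode](plan) {
      case s: FileSourceScan => s
      case s: CometScan => s
    }

  // Match with the extra arm: the native scan is collected as well.
  def scansAfter(plan: PlanNode): Seq[PlanNode] =
    collect[PlanNode](plan) {
      case s: FileSourceScan => s
      case s: CometScan => s
      case s: CometNativeScan => s
    }

  // Second match: pull the data filters out of whichever scan was found.
  def dataFiltersOf(scan: PlanNode): Seq[String] = scan match {
    case s: FileSourceScan => s.dataFilters
    case s: CometScan => s.dataFilters
    case s: CometNativeScan => s.dataFilters
    case _ => Nil
  }

  def main(args: Array[String]): Unit = {
    val plan = Project(CometNativeScan(Seq("id = scalar-subquery")))
    println(scansBefore(plan).size) // prints 0
    println(scansAfter(plan).size)  // prints 1
  }
}
```

In the sketch, a plan containing only the native scan yields an empty collection under the narrower match, which corresponds to the failing `dataSourceScanExec.size === 1` assertion described in this PR.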
   
   ## How are these changes tested?
   
   Ran the test locally against Spark 4.0.1 with `native_datafusion` in auto 
scan mode via `ENABLE_COMET=true ENABLE_COMET_ONHEAP=true build/sbt 
"sql/testOnly org.apache.spark.sql.SubquerySuite -- -z \"SPARK-43402: 
FileSourceScanExec supports push down data filter with scalar subquery\""`. 
Confirmed the first assertion (`dataSourceScanExec.size === 1`) now passes with 
the extended match, and the remaining failure (`was not instance of 
org.apache.spark.sql.execution.ReusedSubqueryExec`) matches the description in 
#4042.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

