andygrove opened a new issue, #4042:
URL: https://github.com/apache/datafusion-comet/issues/4042
## Summary
With `native_datafusion`, a scalar subquery pushed down as a data filter on
`CometNativeScanExec` is not replaced by a `ReusedSubqueryExec`, unlike with
Spark's vectorized reader and `CometScanExec`. The pushed subquery stays a
plain `Subquery`, so later references to the same subquery do not share its
result.
## Failing Test
`SubquerySuite`: "SPARK-43402: FileSourceScanExec supports push down data
filter with scalar subquery"
## Reproduction
Updating the test's plan-match to include `CometNativeScanExec`:
```scala
val dataSourceScanExec = collect(df.queryExecution.executedPlan) {
  case f: FileSourceScanLike => f
  case c: CometScanExec => c
  case n: CometNativeScanExec => n
}
```
makes the first assertion (`dataSourceScanExec.size == 1`) pass. The next
assertion still fails:
```
was not instance of org.apache.spark.sql.execution.ReusedSubqueryExec
(SubquerySuite.scala:2716)
```
with the plan showing a plain `Subquery` rather than `ReusedSubqueryExec`:
```
Subquery subquery#295, [id=#166]
+- AdaptiveSparkPlan isFinalPlan=true
   +- == Final Plan ==
      ResultQueryStage 2
      +- CometNativeColumnarToRow
         +- CometHashAggregate [min#303], Final, [min(c2#297)]
            +- ShuffleQueryStage 0
               +- CometExchange SinglePartition, ...
                  +- CometHashAggregate [c2#297], Partial, [partial_min(c2#297)]
                     +- CometNativeScan parquet ...
```
The `dataFilters` on the `CometNativeScanExec` carry the subquery reference
but aren't wired into the reused-subquery machinery.
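For readers unfamiliar with subquery reuse, the idea can be sketched with a toy model. This is not Spark's or Comet's implementation: `SubqueryPlan`, `PlainSubquery`, `ReusedSubquery`, and `ReuseSketch` are hypothetical stand-ins with no Spark dependency, and the `canonical` string stands in for a canonicalized plan. The sketch only shows the behavior the failing assertion expects: a second reference to an equivalent subquery is wrapped so that it points at the first one instead of remaining a separate plain subquery.

```scala
// Toy model only: these types are hypothetical stand-ins, not Spark or Comet classes.
sealed trait SubqueryPlan { def canonical: String }

// A subquery that would be planned and executed on its own.
final case class PlainSubquery(name: String, canonical: String) extends SubqueryPlan

// A reference that shares the result of an earlier, equivalent subquery.
final case class ReusedSubquery(child: PlainSubquery) extends SubqueryPlan {
  def canonical: String = child.canonical
}

object ReuseSketch {
  // Walk the subquery references in plan order. The first occurrence of each
  // canonical form is kept as-is; later occurrences are wrapped so they share it.
  def reuse(refs: Seq[PlainSubquery]): Seq[SubqueryPlan] = {
    val seen = scala.collection.mutable.Map.empty[String, PlainSubquery]
    refs.map { ref =>
      seen.get(ref.canonical) match {
        case Some(first) => ReusedSubquery(first)
        case None =>
          seen.update(ref.canonical, ref)
          ref
      }
    }
  }
}
```

In this model, the behavior reported above corresponds to the scan's pushed-down reference never passing through `reuse`, so it stays a `PlainSubquery` even though an equivalent subquery exists elsewhere in the plan.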
## Related
Split from #3315 while triaging the tests previously ignored under #3321.