andygrove opened a new issue, #4080: URL: https://github.com/apache/datafusion-comet/issues/4080
## What is the problem the feature request solves?

Spark 4.0 enables ANSI mode by default. Comet has good ANSI coverage across arithmetic, aggregates (`Sum`, `Average`), and most cast operations, but a small number of `Cast` source/target type combinations are still only marked `Compatible` for `EvalMode.LEGACY`. In ANSI mode they currently fall back to Spark.

This issue tracks the remaining ANSI cast gaps so they can be closed individually.

## Remaining gaps

The following casts are `Compatible` in `EvalMode.LEGACY` but fall back to Spark when `EvalMode.ANSI` (see [`CometCast.scala`](https://github.com/apache/datafusion-comet/blob/main/spark/src/main/scala/org/apache/comet/expressions/CometCast.scala)):

- `ByteType` → `BinaryType`
- `ShortType` → `BinaryType`
- `IntegerType` → `BinaryType`
- `LongType` → `BinaryType`

For each pair, the work is to validate that the existing native cast produces results that match Spark in ANSI mode, then drop the `evalMode == CometEvalMode.LEGACY` guard in `CometCast.canCastFromByte` / `canCastFromShort` / `canCastFromInt` / `canCastFromLong`, with tests in `CometCastSuite`.

## Cases that are intentionally LEGACY-only

For completeness, the following LEGACY-only cases are not gaps. Spark itself disallows them at analysis time when ANSI mode is enabled, so the LEGACY-only marking matches Spark's behavior:

- `BooleanType` → `TimestampType`
- `DateType` → `BooleanType` / `ByteType` / `ShortType` / `IntegerType` / `LongType` / `FloatType` / `DoubleType` / `DecimalType`

## Out of scope

Cases that are `Incompatible` regardless of eval mode (`FloatType`/`DoubleType` → `DecimalType` rounding differences, etc.) are tracked separately and are not part of this issue.

## Describe the potential solution

Per-pair issues or a single PR that lifts the LEGACY guard for the four numeric → binary casts above and adds ANSI-mode tests.

## Additional context

Issue #313 (the original ANSI epic) has been closed.
This issue focuses specifically on the residual cast gaps surfaced by the Spark Version Compatibility documentation in #4079.
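
As a rough aid for the validation step on the four integral → binary pairs, the sketch below models the bytes a test could compare against. It is not taken from Comet or Spark code. It assumes Spark encodes an integral value as its fixed-width, big-endian two's-complement bytes (1, 2, 4, and 8 bytes for `ByteType`, `ShortType`, `IntegerType`, and `LongType`), and whether ANSI mode produces the same bytes is exactly what the work in this issue would need to confirm. The names `WIDTHS` and `expected_binary` are hypothetical.

```python
# Assumed byte widths of Spark's integral types (illustrative, not read from Spark).
WIDTHS = {"ByteType": 1, "ShortType": 2, "IntegerType": 4, "LongType": 8}


def expected_binary(value: int, spark_type: str) -> bytes:
    """Model the bytes expected from casting an integral value to binary.

    Assumption: fixed-width, big-endian, two's-complement encoding.
    Raises OverflowError if the value does not fit in the source type.
    """
    width = WIDTHS[spark_type]
    return value.to_bytes(width, byteorder="big", signed=True)


if __name__ == "__main__":
    # Boundary values are the interesting inputs for a cast test.
    for spark_type, width in WIDTHS.items():
        lo = -(1 << (8 * width - 1))
        hi = (1 << (8 * width - 1)) - 1
        for v in (lo, -1, 0, 1, hi):
            print(spark_type, v, expected_binary(v, spark_type).hex())
```

Minimum, maximum, zero, and ±1 for each source type are the natural inputs to cover, since sign extension and byte order are where a native implementation is most likely to diverge.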
