andygrove opened a new issue, #4080:
URL: https://github.com/apache/datafusion-comet/issues/4080

   ## What is the problem the feature request solves?
   
   Spark 4.0 enables ANSI mode by default. Comet has good ANSI coverage across 
arithmetic, aggregates (`Sum`, `Average`), and most cast operations, but a 
small number of `Cast` source/target type combinations are still only marked 
`Compatible` for `EvalMode.LEGACY`. In ANSI mode they currently fall back to 
Spark.
   
   This issue tracks the remaining ANSI cast gaps so they can be closed 
individually.
   
   ## Remaining gaps
   
   The following casts are `Compatible` in `EvalMode.LEGACY` but fall back to 
Spark in `EvalMode.ANSI` (see 
[`CometCast.scala`](https://github.com/apache/datafusion-comet/blob/main/spark/src/main/scala/org/apache/comet/expressions/CometCast.scala)):
   
   - `ByteType` → `BinaryType`
   - `ShortType` → `BinaryType`
   - `IntegerType` → `BinaryType`
   - `LongType` → `BinaryType`
   
   For each pair, the work is to validate that the existing native cast 
produces results that match Spark in ANSI mode, then drop the `evalMode == 
CometEvalMode.LEGACY` guard in `CometCast.canCastFromByte` / `canCastFromShort` 
/ `canCastFromInt` / `canCastFromLong`, with tests in `CometCastSuite`.
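   
   As a rough illustration of what "dropping the guard" means, the sketch 
below models the support check with stand-in types. Everything in it 
(`CastGuardSketch`, `EvalMode`, `SupportLevel`, and both functions) is 
hypothetical and is not the actual code in `CometCast.scala`; it only shows 
the shape of the change under the assumption that the check is a simple 
eval-mode condition.

```scala
// Hypothetical, self-contained model of an eval-mode guard.
// None of these names are taken from the Comet code base.
object CastGuardSketch {
  sealed trait EvalMode
  case object Legacy extends EvalMode
  case object Ansi extends EvalMode

  sealed trait SupportLevel
  case object Compatible extends SupportLevel
  case object Unsupported extends SupportLevel // i.e. fall back to Spark

  // Current shape: the integral -> binary cast is accepted only in LEGACY mode.
  def integralToBinaryGuarded(mode: EvalMode): SupportLevel = mode match {
    case Legacy => Compatible
    case _      => Unsupported
  }

  // Shape after the guard is dropped: accepted regardless of eval mode.
  // This is only safe once tests confirm native results match Spark in ANSI mode.
  def integralToBinaryUnguarded(mode: EvalMode): SupportLevel = Compatible
}
```

   With the guarded version, `CastGuardSketch.integralToBinaryGuarded(CastGuardSketch.Ansi)` 
yields `Unsupported`; the unguarded version yields `Compatible` for both modes.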
   
   ## Cases that are intentionally LEGACY-only
   
   For completeness, the following LEGACY-only cases are not gaps. Spark itself 
disallows them at analysis time when ANSI mode is enabled, so the LEGACY-only 
marking matches Spark's behavior:
   
   - `BooleanType` → `TimestampType`
   - `DateType` → `BooleanType` / `ByteType` / `ShortType` / `IntegerType` / 
`LongType` / `FloatType` / `DoubleType` / `DecimalType`
   
   ## Out of scope
   
   Cases that are `Incompatible` regardless of eval mode 
(`FloatType`/`DoubleType` → `DecimalType` rounding differences, etc.) are 
tracked separately and are not part of this issue.
   
   ## Describe the potential solution
   
   Either open per-pair issues, or submit a single PR that lifts the LEGACY 
guard for the four numeric → binary casts above and adds ANSI-mode tests.
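   
   When writing the ANSI-mode tests, it may help to have an independent 
reference for the expected bytes. The helper below is a sketch only: 
`ExpectedBinary` and its methods are hypothetical names, and it assumes that 
Spark encodes an integral value as its fixed-width, big-endian two's-complement 
bytes (1, 2, 4, or 8 bytes). That assumption should be checked against Spark 
itself before being relied on in `CometCastSuite`.

```scala
import java.nio.ByteBuffer

// Hypothetical reference encoder for test expectations.
// Assumption: integral -> binary yields fixed-width big-endian bytes.
object ExpectedBinary {
  // A single byte is its own encoding.
  def fromByte(v: Byte): Array[Byte] = Array(v)

  // ByteBuffer uses big-endian byte order by default.
  def fromShort(v: Short): Array[Byte] = ByteBuffer.allocate(2).putShort(v).array()
  def fromInt(v: Int): Array[Byte]     = ByteBuffer.allocate(4).putInt(v).array()
  def fromLong(v: Long): Array[Byte]   = ByteBuffer.allocate(8).putLong(v).array()
}
```

   Boundary values such as `Int.MinValue`, `Int.MaxValue`, `-1`, and `0` are 
natural inputs for such tests, since they exercise sign handling and every 
byte position.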
   
   ## Additional context
   
   Issue #313 (the original ANSI epic) has been closed. This issue focuses 
specifically on the residual cast gaps surfaced by the Spark Version 
Compatibility documentation in #4079.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]