kosiew opened a new pull request, #21493:
URL: https://github.com/apache/datafusion/pull/21493
## Which issue does this PR close?
* Part of #20164
---
## Rationale for this change
The current adapter emits `CastColumnExpr`, duplicating functionality
already provided by `CastExpr`. Maintaining two cast representations introduces
unnecessary complexity, branching logic, and potential inconsistencies in
behavior depending on where casts are constructed.

With recent improvements to `CastExpr` (field-aware casting), it can now
preserve logical field metadata, nullability, and type semantics.
This enables the adapter to emit a single, unified cast representation.

This change simplifies the expression layer, reduces maintenance overhead,
and ensures consistent casting behavior across the execution pipeline.
---
## What changes are included in this PR?
* Replace all usages of `CastColumnExpr` in `schema_rewriter.rs` with
  `CastExpr`.
* Remove the `create_cast_column_expr` helper and inline its logic using
  `CastExpr::new_with_target_field`.
* Add validation via `validate_data_type_compatibility` before constructing
  cast expressions.
* Improve rewrite logic:
  * Avoid unnecessary rewrites when both index and field match.
  * Allow direct column substitution when fields match but index differs.
  * Ensure physical column resolution is based on column name rather than
    index.
* Update tests to:
  * Assert usage of `CastExpr` instead of `CastColumnExpr`.
  * Validate inner column resolution and target field correctness.
  * Verify logical metadata and nullability propagation via `return_field`.
  * Improve robustness by checking expression structure instead of string
    equality.
* Add helper assertions for validating cast expressions in tests.
---
## Are these changes tested?
Yes.
* Existing adapter and schema evolution tests have been updated to use
  `CastExpr`.
* New assertions validate:
  * Correct physical column resolution by name.
  * Proper wrapping of expressions in `CastExpr` when required.
  * Preservation of logical schema metadata and nullability.
  * Correct structure of rewritten expressions.
* Regression coverage added for stale column index scenarios.
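
As an illustration of checking expression structure rather than comparing rendered strings, a test helper might look roughly like the following. The types and the helper name `assert_cast_of_column` are hypothetical and are not the helpers added in this PR; the sketch only shows the pattern of destructuring the expression and asserting on its parts.

```rust
// Toy model only: hypothetical types, not DataFusion's expression classes.
#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    nullable: bool,
}

#[derive(Debug)]
enum Expr {
    Column { name: String, index: usize },
    Cast { inner: Box<Expr>, target: Field },
}

/// Hypothetical test helper: panics unless `expr` is a cast wrapping the
/// expected column, then returns the cast's target field so the caller can
/// make further assertions on it.
fn assert_cast_of_column<'a>(expr: &'a Expr, name: &str, index: usize) -> &'a Field {
    let Expr::Cast { inner, target } = expr else {
        panic!("expected a cast, got {:?}", expr);
    };
    let Expr::Column { name: inner_name, index: inner_index } = inner.as_ref() else {
        panic!("expected a column inside the cast, got {:?}", inner);
    };
    assert_eq!(inner_name.as_str(), name);
    assert_eq!(*inner_index, index);
    target
}

fn main() {
    let expr = Expr::Cast {
        inner: Box::new(Expr::Column { name: "id".to_string(), index: 1 }),
        target: Field { name: "id".to_string(), nullable: true },
    };

    // Structural checks keep passing if the Debug/Display output changes.
    let target = assert_cast_of_column(&expr, "id", 1);
    assert!(target.nullable);
}
```

Compared with asserting on a formatted string, this style fails with a message that names the mismatched part, and it is unaffected by changes to how expressions are printed.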
---
## Are there any user-facing changes?
No direct user-facing changes.
This is an internal refactor that unifies cast expression handling. However,
it improves consistency and correctness of schema evolution and expression
rewriting, which may indirectly benefit users.
---
## LLM-generated code disclosure
This PR includes LLM-generated code and comments. All LLM-generated content
has been manually reviewed and tested.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]