[PR] feat: add MapSort expression support for Spark 4.0 [datafusion-comet]

via GitHub Fri, 24 Apr 2026 16:59:49 -0700


andygrove opened a new pull request, #4076:
URL: https://github.com/apache/datafusion-comet/pull/4076


   ## Which issue does this PR close?
   
   Closes #1941.
   
   ## Rationale for this change
   
   Spark 4.0 introduces `MapSort`, used for normalizing map values when they 
appear in shuffle hash partitioning keys, in `try_element_at`, and in other 
contexts where map ordering must be deterministic. Without native support, 
queries that touch maps in any of these positions fall back to Spark, which 
forces the entire enclosing operator off Comet (e.g. an entire shuffle 
exchange).
   
   ## What changes are included in this PR?
   
   - New native scalar function `map_sort` in 
`native/spark-expr/src/map_funcs/map_sort.rs` that sorts map entries by key in 
ascending order, registered via `comet_scalar_funcs.rs`.
   - `MapSort` wired into the Spark 4.0 `CometExprShim`, so the expression is 
converted to the new `map_sort` scalar function during serde.
   - The `columnar shuffle on map array element` test in 
`CometColumnarShuffleSuite` now expects shuffle fallback on Spark 4.0+: the new 
shuffle-key normalization wraps `mapsort` inside `transform(arr, x -> 
mapsort(x))`, and Comet does not currently support `ArrayTransform` with a 
lambda body. Answer correctness is still verified via `checkSparkAnswer`.
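
As a rough illustration of the per-row behaviour described above (sorting map entries by key in ascending order, with null maps passed through), here is a minimal sketch in plain Rust. It is not the code in `map_sort.rs`: the real function operates on Arrow arrays, whereas this sketch models a map as a `Vec<(K, V)>` and a nullable map column as `Vec<Option<...>>`. Both helper names are hypothetical.

```rust
/// Hypothetical helper: sort one map's entries by key, ascending.
/// A map is modelled here as a plain vector of (key, value) pairs;
/// this is an assumption for illustration, not the Arrow layout.
fn sort_entries_by_key<K: Ord, V>(mut entries: Vec<(K, V)>) -> Vec<(K, V)> {
    // `sort_by` is a stable sort, so entries whose keys compare equal
    // keep their relative order.
    entries.sort_by(|a, b| a.0.cmp(&b.0));
    entries
}

/// Hypothetical helper: apply the sort to every row of a nullable map column.
/// Null rows (`None`) stay null; empty maps stay empty.
fn map_sort_column<K: Ord, V>(
    column: Vec<Option<Vec<(K, V)>>>,
) -> Vec<Option<Vec<(K, V)>>> {
    column
        .into_iter()
        .map(|row| row.map(sort_entries_by_key))
        .collect()
}
```

Under these assumptions, two maps holding the same entries in different insertion orders normalize to the same sequence, which is the property that hash partitioning on a map key relies on.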
   
   ## How are these changes tested?
   
   - New unit tests in `native/spark-expr/src/map_funcs/map_sort.rs` cover 
sorting on each supported key type, null handling, and empty maps.
   - Existing `CometColumnarShuffleSuite` tests for map shuffle keys all pass 
under the Spark 4.0 profile (41/41).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

