andygrove commented on issue #1941:
URL: 
https://github.com/apache/datafusion-comet/issues/1941#issuecomment-4297166706

   Twelve tests in `CometColumnarShuffleSuite` are currently skipped with 
`assume(!isSpark40Plus)` for this issue:
   
   - `columnar shuffle on map [bool]`
   - `columnar shuffle on map [byte]`
   - `columnar shuffle on map [short]`
   - `columnar shuffle on map [int]`
   - `columnar shuffle on map [long]`
   - `columnar shuffle on map [float]`
   - `columnar shuffle on map [double]`
   - `columnar shuffle on map [date]`
   - `columnar shuffle on map [timestamp]`
   - `columnar shuffle on map [decimal]`
   - `columnar shuffle on map [string]`
   - `columnar shuffle on map [binary]`
   
   (12 tests total; the `array element` variant passes.)
   
   On Spark 4, every failing test shows the same plan: 
`CometShuffleExchangeExec` is replaced by a plain `Exchange` with 
`hashpartitioning(mapsort(_2#N), mapsort(_3#N), ...)`. Spark 4 wraps each map 
partitioning key with `MapSort` so that equal maps normalize to the same 
canonical key order before hashing. Comet's shuffle partitioning serde does not 
recognize `MapSort`, so native shuffle is rejected for every map-keyed shuffle.
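
   To illustrate why a normalization step such as `MapSort` matters before 
hashing, here is a minimal Python sketch. It is not Spark or Comet code: 
`hash_entries` and `map_sort` are hypothetical helpers, and SHA-256 stands in 
for whatever hash the partitioner actually uses. It only shows that two maps 
with the same entries in different physical order hash differently unless the 
entries are first put into a canonical key order.

```python
import hashlib


def hash_entries(entries):
    """Hash a sequence of (key, value) pairs in the order they are given."""
    h = hashlib.sha256()
    for key, value in entries:
        h.update(repr((key, value)).encode("utf-8"))
    return h.hexdigest()


def map_sort(entries):
    """Toy stand-in for the normalization step: order the entries by key."""
    return sorted(entries, key=lambda kv: kv[0])


# Two logically equal maps whose entries are stored in different orders.
a = [(1, "x"), (2, "y")]
b = [(2, "y"), (1, "x")]

# Hashing the raw entry order gives different hashes for equal maps.
print(hash_entries(a) == hash_entries(b))  # False

# Sorting by key first makes equal maps hash to the same value.
print(hash_entries(map_sort(a)) == hash_entries(map_sort(b)))  # True
```

   Under this reading, a partitioner that hashed map keys without the sort 
could send equal maps to different partitions, which is presumably why Spark 4 
inserts the wrapper.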


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

