andygrove opened a new issue, #4123:
URL: https://github.com/apache/datafusion-comet/issues/4123

   ## Describe the bug
   
   When the sort key is a struct containing a Map, Comet's native sort fails 
with:
   
   ```
   org.apache.comet.CometNativeException
   Not yet implemented: Row format support not yet implemented for: [SortField {
     options: SortOptions { descending: false, nulls_first: true },
     data_type: Struct([Field {
       name: "data",
       data_type: Map(Field { name: "entries", data_type: Struct([
         Field { name: "key", data_type: Utf8 },
         Field { name: "value", data_type: Utf8 }
       ]) }, false)
     }])
   }]
   ```
   
   This surfaces in Spark 4.1.1's new 
`having-and-order-by-recursive-type-name-resolution.sql` at query #38:
   
   ```sql
   SELECT col1.data['key']
   FROM VALUES (NAMED_STRUCT('data', MAP('key', 'value', 'num', '42'))) t (col1)
   GROUP BY col1
   HAVING col1.data['num'] IS NOT NULL
   ORDER BY col1.data['key'];
   ```
   
   ## Expected behavior
   
   Comet should fall back to Spark when the sort key includes types not 
supported by the Arrow row format (Struct/Map combinations are a known gap 
upstream).
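
A minimal sketch of the kind of check that could drive this fallback, written in Python for illustration only. The type model, `supports_native_sort`, and `plan_sort` are hypothetical names and are not Comet or arrow-row APIs. The sketch assumes that any Map nested anywhere in the sort key makes the key unsupported, which matches the error above but may not match the exact set of types `arrow-row` rejects.

```python
from dataclasses import dataclass
from typing import Tuple, Union


# Hypothetical, simplified model of a data type tree (not a real Arrow/Spark API).
@dataclass(frozen=True)
class Primitive:
    name: str


@dataclass(frozen=True)
class ListType:
    element: "DataType"


@dataclass(frozen=True)
class StructType:
    fields: Tuple[Tuple[str, "DataType"], ...]


@dataclass(frozen=True)
class MapType:
    key: "DataType"
    value: "DataType"


DataType = Union[Primitive, ListType, StructType, MapType]


def supports_native_sort(dt: DataType) -> bool:
    """Return False if a Map appears anywhere in the type tree.

    Assumption: Map is the only unsupported type; everything else is
    accepted as long as its children are accepted.
    """
    if isinstance(dt, MapType):
        return False
    if isinstance(dt, StructType):
        return all(supports_native_sort(t) for _, t in dt.fields)
    if isinstance(dt, ListType):
        return supports_native_sort(dt.element)
    return True


def plan_sort(sort_key_types) -> str:
    """Pick the native sort only when every sort key type is supported."""
    if all(supports_native_sort(t) for t in sort_key_types):
        return "native"
    return "spark_fallback"


# The sort key from the failing query: a struct with a single Map field.
failing_key = StructType(
    (("data", MapType(Primitive("utf8"), Primitive("utf8"))),)
)
print(plan_sort([failing_key]))
print(plan_sort([Primitive("utf8")]))
```

The point of the recursion is that the check has to walk the whole type tree at planning time, since the Map here is only visible one level below the top-level Struct.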
   
   ## Workaround
   
   Comet is currently disabled for this file: `dev/diffs/4.1.1.diff` adds 
`--SET spark.comet.enabled = false` at the top of the file.
   
   ## Additional context
   
   PR #4093 enables Spark 4.1.1 in the `Spark SQL Tests` workflow. The 
underlying limitation lives in `arrow-row` in DataFusion / Arrow.



