robhod opened a new issue, #44417:
URL: https://github.com/apache/arrow/issues/44417

   ### Describe the enhancement requested
   
   Can support for struct-type columns be added to the `hash_list` aggregation function?
   For my particular use case, I would use it to create nested structures for JSON output.
   
   ```python
   import pyarrow as pa
   
   # source data
   table = pa.table(
       {
           "col1": [1, 1, 2, 2, 3],
           "struct_col": [
               {"a": 1, "b": "testa"},
               {"a": 1, "b": "testb"},
               {"a": 2, "b": "testc"},
               {"a": 2, "b": "testd"},
               {"a": 3, "b": "teste"},
           ],
       }
   )
   
   # desired output
   grouped_table = pa.table(
       {
           "grouped": [1, 2, 3],
           "agg_struct_col": [
               [{"a": 1, "b": "testa"}, {"a": 1, "b": "testb"}],
               [{"a": 2, "b": "testc"}, {"a": 2, "b": "testd"}],
               [{"a": 3, "b": "teste"}],
           ],
       }
   )
   
   # requested behaviour: collect the struct column into one list per group
   # (can this be supported?)
   grouped = table.group_by("col1").aggregate([("struct_col", "list")])
   ```
   
   
   The equivalent aggregation is already supported in Polars, DuckDB, and similar libraries.
   
   ### Component(s)
   
   Python


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
