robhod opened a new issue, #44383:
URL: https://github.com/apache/arrow/issues/44383

   ### Describe the usage question you have. Please include as many useful details as possible.
   
   
   I'm trying to create a list of structs per group, based on a grouping column (see the example below).
   The call fails with a not-implemented error, but the `hash_list` docs (https://arrow.apache.org/docs/cpp/compute.html#aggregations) suggest it supports any input type. Is there an issue in how I set this up, or should I raise a feature request/bug?
   
   ```python
   import pyarrow as pa
   
   # source data
   table = pa.table(
       {
           "col1": [1, 1, 2, 2, 3],
           "struct_col": [
               {"a": 1, "b": "testa"},
               {"a": 1, "b": "testb"},
               {"a": 2, "b": "testc"},
               {"a": 2, "b": "testd"},
               {"a": 3, "b": "teste"},
           ],
       }
   )
   
   # required output
   grouped_table = pa.table(
       {
           "grouped": [1, 2, 3],
           "agg_struct_col": [
               [{"a": 1, "b": "testa"}, {"a": 1, "b": "testb"}],
               [{"a": 2, "b": "testc"}, {"a": 2, "b": "testd"}],
               [{"a": 3, "b": "teste"}],
           ],
       }
   )
   
   # using group_by (this is the call that fails)
   grouped = table.group_by("col1").aggregate([("struct_col", "hash_list")])
   ```
     File "scratch/pyarrowexample.py", line 30, in <module>
       grouped = table.group_by("col1").aggregate([("struct_col", "hash_list")])
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
     File "pyarrow/table.pxi", line 6359, in pyarrow.lib.TableGroupBy.aggregate
     File "l/.venv/lib/python3.11/site-packages/pyarrow/acero.py", line 403, in _group_by
       return decl.to_table(use_threads=use_threads)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
     File "pyarrow/_acero.pyx", line 590, in pyarrow._acero.Declaration.to_table
     File "pyarrow/error.pxi", line 155, in pyarrow.lib.pyarrow_internal_check_status
     File "pyarrow/error.pxi", line 92, in pyarrow.lib.check_status
   pyarrow.lib.ArrowNotImplementedError: Function 'hash_list' has no kernel matching input types (struct<a: int64, b: string>, uint32)
   
   
   ### Component(s)
   
   Python

