WillAyd opened a new issue, #2504:
URL: https://github.com/apache/arrow-adbc/issues/2504

   ### What happened?
   
   When trying to use the `first`/`last` aggregations with `TableGroupBy`, PyArrow raises `ArrowNotImplementedError`:
   
   ```python
   In [48]: import pyarrow as pa
   
   In [49]: tbl = pa.Table.from_pydict({"key": range(3), "val": ["foo", "bar", "baz"]})
   
   In [50]: pa.TableGroupBy(tbl, "key").aggregate([("val", "first")])
   ArrowNotImplementedError                  Traceback (most recent call last)
   Cell In[50], line 1
   ----> 1 pa.TableGroupBy(tbl, "key").aggregate([("val", "first")])
   
   File 
~/miniforge3/envs/scratchpad/lib/python3.13/site-packages/pyarrow/table.pxi:6560,
 in pyarrow.lib.TableGroupBy.aggregate()
   
   File 
~/miniforge3/envs/scratchpad/lib/python3.13/site-packages/pyarrow/acero.py:410, 
in _group_by(table, aggregates, keys, use_threads)
       404 def _group_by(table, aggregates, keys, use_threads=True):
       406     decl = Declaration.from_sequence([
       407         Declaration("table_source", TableSourceNodeOptions(table)),
       408         Declaration("aggregate", AggregateNodeOptions(aggregates, 
keys=keys))
       409     ])
   --> 410     return decl.to_table(use_threads=use_threads)
   
   File 
~/miniforge3/envs/scratchpad/lib/python3.13/site-packages/pyarrow/_acero.pyx:590,
 in pyarrow._acero.Declaration.to_table()
   
   File 
~/miniforge3/envs/scratchpad/lib/python3.13/site-packages/pyarrow/error.pxi:155,
 in pyarrow.lib.pyarrow_internal_check_status()
   
   File 
~/miniforge3/envs/scratchpad/lib/python3.13/site-packages/pyarrow/error.pxi:92, 
in pyarrow.lib.check_status()
   
   ArrowNotImplementedError: Using ordered aggregator in multiple threaded 
execution is not supported
   ```
   
   Interestingly, the `first_last` aggregation returns without error:
   
   ```python
   In [52]: pa.TableGroupBy(tbl, "key").aggregate([("val", "first_last")])
   Out[52]: 
   pyarrow.Table
   key: int64
   val_first_last: struct<first: string, last: string>
     child 0, first: string
     child 1, last: string
   ----
   key: [[0,1,2]]
   val_first_last: [
     -- is_valid: all not null
     -- child 0 type: string
   ["foo","bar","baz"]
     -- child 1 type: string
   ["foo","bar","baz"]]
   ```
   
   ### Stack Trace
   
   ```
   ----> 1 pa.TableGroupBy(tbl, "key").aggregate([("val", "first")])
   
   File 
~/miniforge3/envs/scratchpad/lib/python3.13/site-packages/pyarrow/table.pxi:6560,
 in pyarrow.lib.TableGroupBy.aggregate()
   
   File 
~/miniforge3/envs/scratchpad/lib/python3.13/site-packages/pyarrow/acero.py:410, 
in _group_by(table, aggregates, keys, use_threads)
       404 def _group_by(table, aggregates, keys, use_threads=True):
       406     decl = Declaration.from_sequence([
       407         Declaration("table_source", TableSourceNodeOptions(table)),
       408         Declaration("aggregate", AggregateNodeOptions(aggregates, 
keys=keys))
       409     ])
   --> 410     return decl.to_table(use_threads=use_threads)
   
   File 
~/miniforge3/envs/scratchpad/lib/python3.13/site-packages/pyarrow/_acero.pyx:590,
 in pyarrow._acero.Declaration.to_table()
   
   File 
~/miniforge3/envs/scratchpad/lib/python3.13/site-packages/pyarrow/error.pxi:155,
 in pyarrow.lib.pyarrow_internal_check_status()
   
   File 
~/miniforge3/envs/scratchpad/lib/python3.13/site-packages/pyarrow/error.pxi:92, 
in pyarrow.lib.check_status()
   
   ArrowNotImplementedError: Using ordered aggregator in multiple threaded 
execution is not supported
   ```
   
   ### How can we reproduce the bug?
   
   See the code sample above.
   
   ### Environment/Setup
   
   PyArrow 19.0.0

