[I] [Python] Missing `utf8_zfill` in pyarrow.compute to support `str.zfill` behavior [arrow]

via GitHub Mon, 02 Jun 2025 17:31:53 -0700


iabhi4 opened a new issue, #46683:
URL: https://github.com/apache/arrow/issues/46683


   ### Describe the enhancement requested
   
   ### Feature Request
   There’s currently no `utf8_zfill` kernel in `pyarrow.compute`, so Python’s 
`str.zfill()` behavior can't be reproduced efficiently with Arrow arrays.
   While fixing 
[pandas-dev/pandas#61485](https://github.com/pandas-dev/pandas/issues/61485), I 
noticed that `Series.str.zfill()` breaks on `ArrowDtype(pa.string())` because 
the Arrow backend needs a string-padding kernel comparable to `utf8_lpad`, and 
no such kernel exists for zfill. For now, it has to fall back to element-wise 
Python operations, which is not ideal.
   
   ### Reproduction
   
   ```python
   import pandas as pd
   import pyarrow as pa
   
   # Arrow-backed string Series
   s = pd.Series(["A", "AB", "ABC"], dtype=pd.ArrowDtype(pa.string()))
   s.str.zfill(3)  # Works today, but only via the slow element-wise Python fallback
   ```
   
   ### Expected behavior
   - `'A' → '00A'`
   - `'AB' → '0AB'`
   - `'ABC' → 'ABC'` (no change since it's already 3 chars)
   
   ### What we need
   A kernel like `pc.utf8_zfill(array, width)` that mimics Python’s 
`str.zfill()`:
   
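   - Pad strings on the left with `'0'` until they reach `width`; strings 
already at least `width` characters long are left unchanged
   
   - Optional enhancement: handle a leading sign (`+` or `-`) as Python does, 
placing the zeros after the sign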
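
For reference, the semantics requested above can be restated in plain Python. This is only an illustrative sketch of what a kernel would need to match, not a proposed implementation; `zfill_reference` is a hypothetical helper name and is not part of pyarrow or pandas.

```python
def zfill_reference(s: str, width: int) -> str:
    """Pure-Python restatement of str.zfill, for illustration only."""
    pad = width - len(s)
    if pad <= 0:
        # Already at least `width` characters long: return unchanged.
        return s
    if s[:1] in ("+", "-"):
        # A leading sign stays in front; zeros go between the sign and the rest.
        return s[0] + "0" * pad + s[1:]
    # No sign: plain left padding with '0'.
    return "0" * pad + s


samples = ["A", "AB", "ABC", "-1", "+7", ""]
results = [zfill_reference(s, 3) for s in samples]
print(results)
```

The sign branch is the part that plain left padding with `'0'` does not cover: padding `"-1"` to width 3 that way would give `"0-1"`, whereas `str.zfill` gives `"-01"`.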
   
   ### Why it matters
   This will help pandas fully support `.str.zfill()` for Arrow-backed string 
arrays, similar to how `utf8_lpad`, `binary_join`, etc., already work 
natively. It will avoid falling back to slower Python paths and ensure parity 
with standard Python string behavior.
   
   ### Notes
   I’ve temporarily added a `TODO` in the pandas code to switch over once this 
is available.
   
   
   ### Component(s)
   
   Python


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
