kazeno opened a new issue, #49826:
URL: https://github.com/apache/arrow/issues/49826

   ### Describe the bug, including details regarding any error messages, 
version, and platform.
   
   Version: pyarrow 24.0.0 (regression vs. 23.0.1)
   Platform: Linux (python 3.12), but not platform-specific
   
   In pyarrow 24.0.0, `pyarrow.lib.Scalar` gained arithmetic dunder methods 
(`__add__`, `__sub__`, `__mul__`, `__truediv__`, `__pow__`, `__neg__`, and 
bitwise ops) in `python/pyarrow/scalar.pxi`:
   
   ```cython
   def __add__(self, object other):
       return _pc().call_function('add_checked', [self, other])
   ```
   
   These implementations unconditionally dispatch to 
`pyarrow.compute.call_function`, which raises `TypeError` via 
`_pack_compute_args` when `other` is not a recognized pyarrow / list / tuple / 
ndarray type:
   
   ```
   TypeError: Got unexpected argument type <class 'MyCustomColumn'> for compute 
function
   ```
   
   Because a raised `TypeError` does NOT trigger Python's reflected-operator 
fallback (only a returned `NotImplemented` does, as you can see in [Python data 
model, 
§3.3.8](https://docs.python.org/3/reference/datamodel.html#object.__radd__)), 
any custom class that previously relied on its own `__radd__` / `__rmul__` / 
`__rsub__` / `__rtruediv__` to handle `pyarrow.Scalar + my_obj` is now broken. 
The user has no workaround from their side, as `pyarrow.lib.Scalar` is an 
immutable extension type and cannot be monkey-patched, and virtual subclass 
registration is not honored by CPython's binary-op dispatch (which uses 
`PyType_IsSubtype` at the C level).
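
   The difference between raising and returning `NotImplemented` can be shown without pyarrow at all. The sketch below uses two hypothetical stand-in classes (`RaisesOnUnknown` and `ReturnsNotImplemented` are illustrative names, not pyarrow types) plus the `MyCol` class from the reproducer; it only illustrates the Python data-model rule described above.

```python
class RaisesOnUnknown:
    """Hypothetical stand-in: __add__ raises on operands it does not know."""

    def __add__(self, other):
        if not isinstance(other, (int, float)):
            raise TypeError(f"Got unexpected argument type {type(other)}")
        return other


class ReturnsNotImplemented:
    """Hypothetical stand-in: __add__ defers to the other operand instead."""

    def __add__(self, other):
        if not isinstance(other, (int, float)):
            return NotImplemented  # lets Python try other.__radd__(self)
        return other


class MyCol:
    def __radd__(self, other):
        return "MyCol.__radd__ called"


def try_add(left, right):
    """Return left + right, or a description of the TypeError it raised."""
    try:
        return left + right
    except TypeError as exc:
        return f"TypeError: {exc}"


# Python falls back to MyCol.__radd__ because __add__ returned NotImplemented.
cooperative_result = try_add(ReturnsNotImplemented(), MyCol())

# The TypeError propagates immediately; MyCol.__radd__ is never consulted.
raising_result = try_add(RaisesOnUnknown(), MyCol())

print(cooperative_result)
print(raising_result)
```

   Only the first call reaches `MyCol.__radd__`; the second one surfaces the `TypeError` raised inside `__add__`, which mirrors what happens with `Scalar.__add__` in 24.0.0.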
   
   ## Reproducer
   
   ```python
   import pyarrow
   
   class MyCol:
       def __radd__(self, other):
           return "MyCol.__radd__ called"
   
   s = pyarrow.scalar(5)
   c = MyCol()
   
   # Works on pyarrow <= 23 (Scalar had no __add__, so Python dispatches to
   # MyCol.__radd__).
   # Fails on pyarrow >= 24 with:
   #   TypeError: Got unexpected argument type <class '__main__.MyCol'> for
   #   compute function
   print(s + c)
   ```
   
   Expected: `"MyCol.__radd__ called"` (or at least a `NotImplemented` return 
from `Scalar.__add__` so Python can fall back).
   Actual: `TypeError` from `_pack_compute_args`.
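
   One possible shape for the expected behavior is sketched below in plain Python. This is not pyarrow's code: `FakeScalar`, `fake_call_function`, and `SUPPORTED_OPERANDS` are hypothetical stand-ins, and the set of accepted operand types is an assumption made only for illustration. The idea is that the dunder method checks the operand first and returns `NotImplemented` for types the compute layer would reject.

```python
# Assumption: a simplified stand-in for the operand types the compute layer accepts.
SUPPORTED_OPERANDS = (int, float)


def fake_call_function(name, args):
    """Hypothetical stand-in for a compute dispatch that rejects unknown types."""
    for arg in args:
        if not isinstance(arg, (FakeScalar,) + SUPPORTED_OPERANDS):
            raise TypeError(
                f"Got unexpected argument type {type(arg)} for compute function"
            )
    left, right = (a.value if isinstance(a, FakeScalar) else a for a in args)
    return FakeScalar(left + right)


class FakeScalar:
    """Hypothetical scalar wrapper; not pyarrow.lib.Scalar."""

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        # Defer to the other operand instead of letting the dispatch raise.
        if not isinstance(other, (FakeScalar,) + SUPPORTED_OPERANDS):
            return NotImplemented
        return fake_call_function("add_checked", [self, other])


class MyCol:
    def __radd__(self, other):
        return "MyCol.__radd__ called"


fallback_result = FakeScalar(5) + MyCol()  # handled by MyCol.__radd__
native_result = FakeScalar(5) + 2          # handled by the compute stand-in

print(fallback_result)
print(native_result.value)
```

   With this shape, operands that implement no reflected method still fail with a `TypeError`, but it is the one Python raises itself after both operands have declined the operation.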
   
   ## Why this matters
   
   Libraries that wrap pyarrow arrays with a richer Python class (like our 
[Data Curator](https://github.com/KaxaNuk/Data-Curator) library with its 
`DataColumn` class, but also other downstream projects) have historically 
been able to make `pyarrow.Scalar + custom_column` work by implementing 
`__radd__` on their class (and the same for the other reflected operators). 
This pattern is now silently broken by an upgrade to 24.0.0, with no opt-out 
and no Python-level workaround.
   
   
   ### Component(s)
   
   Python


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.