kazeno opened a new issue, #49826:
URL: https://github.com/apache/arrow/issues/49826
### Describe the bug, including details regarding any error messages, version, and platform.
Version: pyarrow 24.0.0 (regression vs. 23.0.1)
Platform: Linux (python 3.12), but not platform-specific
In pyarrow 24.0.0, `pyarrow.lib.Scalar` gained arithmetic dunder methods
(`__add__`, `__sub__`, `__mul__`, `__truediv__`, `__pow__`, `__neg__`, and
bitwise ops) in `python/pyarrow/scalar.pxi`:
```cython
def __add__(self, object other):
    return _pc().call_function('add_checked', [self, other])
```
These implementations unconditionally dispatch to
`pyarrow.compute.call_function`, which raises `TypeError` via
`_pack_compute_args` when `other` is not a recognized pyarrow / list / tuple /
ndarray type:
```
TypeError: Got unexpected argument type <class 'MyCustomColumn'> for compute function
```
Because a raised `TypeError` does NOT trigger Python's reflected-operator
fallback (only a returned `NotImplemented` does; see the
[Python data model, §3.3.8](https://docs.python.org/3/reference/datamodel.html#object.__radd__)),
any custom class that previously relied on its own `__radd__` / `__rmul__` /
`__rsub__` / `__rtruediv__` to handle `pyarrow.Scalar + my_obj` is now broken.
The user has no workaround from their side, as `pyarrow.lib.Scalar` is an
immutable extension type and cannot be monkey-patched, and virtual subclass
registration is not honored by CPython's binary-op dispatch (which uses
`PyType_IsSubtype` at the C level).
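
The dispatch rule described above can be seen without pyarrow at all. The following is a minimal pure-Python sketch; `RaisingScalar`, `CooperativeScalar`, and `try_add` are hypothetical stand-ins written for illustration and are not pyarrow classes or APIs:

```python
class RaisingScalar:
    """Hypothetical stand-in: __add__ raises for operands it does not know."""

    def __add__(self, other):
        # Raising aborts the whole operation; Python never tries other.__radd__.
        raise TypeError(f"Got unexpected argument type {type(other)}")


class CooperativeScalar:
    """Hypothetical stand-in: __add__ declines operands it does not know."""

    def __add__(self, other):
        # Returning NotImplemented tells Python to try other.__radd__ next.
        return NotImplemented


class MyCol:
    def __radd__(self, other):
        return "MyCol.__radd__ called"


def try_add(left, right):
    """Evaluate left + right, reporting a TypeError as a string instead of raising."""
    try:
        return left + right
    except TypeError as exc:
        return f"TypeError: {exc}"


print(try_add(CooperativeScalar(), MyCol()))  # reaches MyCol.__radd__
print(try_add(RaisingScalar(), MyCol()))      # MyCol.__radd__ is never consulted
```

Only the second class reproduces the behavior reported here: the exception raised from `__add__` propagates before the right-hand operand gets a chance to handle the operation.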
## Reproducer
```python
import pyarrow


class MyCol:
    def __radd__(self, other):
        return "MyCol.__radd__ called"


s = pyarrow.scalar(5)
c = MyCol()

# Works on pyarrow <= 23 (Scalar had no __add__, so Python dispatches to
# MyCol.__radd__).
# Fails on pyarrow >= 24 with:
# TypeError: Got unexpected argument type <class '__main__.MyCol'> for
# compute function
print(s + c)
```
Expected: `"MyCol.__radd__ called"` (or at least a `NotImplemented` return
from `Scalar.__add__` so Python can fall back).
Actual: `TypeError` from `_pack_compute_args`.
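
One possible shape for the expected behavior is sketched below in pure Python. This is only an illustration under stated assumptions, not the actual pyarrow code or a proposed patch: `FakeScalar` and `_is_supported` are hypothetical names, and the set of accepted operand types is a placeholder for whatever the compute layer really accepts.

```python
class FakeScalar:
    """Hypothetical stand-in for pyarrow.lib.Scalar (not the real class)."""

    def __init__(self, value):
        self.value = value

    @staticmethod
    def _is_supported(other):
        # Assumption: a real check would mirror the operand types that the
        # compute function's argument packing accepts.
        return isinstance(other, (FakeScalar, int, float))

    def __add__(self, other):
        if not self._is_supported(other):
            # Decline instead of raising, so Python can try other.__radd__.
            return NotImplemented
        rhs = other.value if isinstance(other, FakeScalar) else other
        return FakeScalar(self.value + rhs)


class MyCol:
    def __radd__(self, other):
        return "MyCol.__radd__ called"


print(FakeScalar(5) + MyCol())    # falls back to MyCol.__radd__
print((FakeScalar(5) + 2).value)  # supported operands still add normally
```

With this shape, an operand that neither side can handle still ends in a `TypeError`, but one raised by Python itself after both `__add__` and `__radd__` have been given a chance.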
## Why this matters
Libraries that wrap pyarrow arrays with a richer Python class (like our
[Data Curator](https://github.com/KaxaNuk/Data-Curator) library with its
`DataColumn` class, but also other downstream projects) have historically
been able to make `pyarrow.Scalar + custom_column` work by implementing
`__radd__` on their class (and likewise for the other reflected operators).
This pattern now breaks on upgrade to 24.0.0, with no opt-out and no
Python-level workaround.
### Component(s)
Python