ben-freist opened a new issue, #45312: URL: https://github.com/apache/arrow/issues/45312
### Describe the enhancement requested

The type inference for schema detection that's implemented here

https://github.com/apache/arrow/blob/9801801df3f75339f705264252c0eac189f5f2a3/python/pyarrow/src/arrow/python/inference.cc#L493

does not distinguish between signed and unsigned integer types. This leads to the following behaviour; I think it would be nice if it were more consistent.

```
import pyarrow as pa
import pandas as pd

data_1 = [{"a": pow(2, 63) - 1}]
schema_1 = pa.Schema.from_pandas(pd.DataFrame(data_1))
print(schema_1)  # takes a different codepath, correctly infers uint64

data_2 = [{"a": [pow(2, 63) - 1]}]
schema_2 = pa.Schema.from_pandas(pd.DataFrame(data_2))  # crashes
```

Here's the backtrace that you get when trying to compute `schema_2`.

```
Traceback (most recent call last):
  File "/work/arrow/foo.py", line 5, in <module>
    schema = pa.Schema.from_pandas(pd.DataFrame(data))
  File "pyarrow/types.pxi", line 3104, in pyarrow.lib.Schema.from_pandas
  File "/work/arrow/pyarrow-dev/lib/python3.10/site-packages/pyarrow/pandas_compat.py", line 562, in dataframe_to_types
    type_ = pa.array(c, from_pandas=True).type
  File "pyarrow/array.pxi", line 360, in pyarrow.lib.array
  File "pyarrow/array.pxi", line 87, in pyarrow.lib._ndarray_to_array
  File "pyarrow/error.pxi", line 89, in pyarrow.lib.check_status
OverflowError: Python int too large to convert to C long
```

Is that something that can be changed, or would that likely have too many unintended consequences?

I've tested this with pyarrow version 19.0.0 on Ubuntu 24.04.

### Component(s)

C++
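
For illustration, here is a minimal pure-Python sketch of what sign-aware inference could look like. It is not pyarrow's implementation: `flatten` and `infer_int_type` are hypothetical helpers, and defaulting to `"int64"` for empty input is an assumption. Note that `2**63 - 1` is the largest value that fits in a signed 64-bit integer, so only values from `2**63` upwards actually require `uint64`.

```python
# Hypothetical helpers, not part of pyarrow: pick a 64-bit integer type
# name that can hold every value in a (possibly nested) list of ints.
INT64_MIN, INT64_MAX = -(2**63), 2**63 - 1
UINT64_MAX = 2**64 - 1


def flatten(values):
    """Yield the integers found in arbitrarily nested lists."""
    for v in values:
        if isinstance(v, list):
            yield from flatten(v)
        else:
            yield v


def infer_int_type(values):
    """Return "int64" or "uint64"; raise OverflowError if neither fits."""
    ints = list(flatten(values))
    if not ints:
        return "int64"  # assumption: default to signed when there is no data
    lo, hi = min(ints), max(ints)
    if INT64_MIN <= lo and hi <= INT64_MAX:
        return "int64"
    if lo >= 0 and hi <= UINT64_MAX:
        return "uint64"
    raise OverflowError("values do not fit in a single 64-bit integer type")


print(infer_int_type([2**63 - 1]))  # flat values, fits in int64
print(infer_int_type([[2**63]]))    # nested values, needs uint64
```

The point of the sketch is that the same rule is applied to flat and nested input, so both code paths would agree on the resulting type.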