ben-freist opened a new issue, #45312:
URL: https://github.com/apache/arrow/issues/45312

   ### Describe the enhancement requested
   
   The type inference for schema detection that's implemented here 
https://github.com/apache/arrow/blob/9801801df3f75339f705264252c0eac189f5f2a3/python/pyarrow/src/arrow/python/inference.cc#L493
 does not distinguish between signed and unsigned integer types.
   
   This leads to the following behaviour. I think it would be nice if it were 
more consistent.
   ```
   import pyarrow as pa
   import pandas as pd
   
   # pow(2, 63) is the smallest value that no longer fits in int64
   data_1 = [{"a": pow(2, 63)}]
   schema_1 = pa.Schema.from_pandas(pd.DataFrame(data_1))
   print(schema_1) # takes a different codepath, correctly infers uint64
   data_2 = [{"a": [pow(2, 63)]}]
   schema_2 = pa.Schema.from_pandas(pd.DataFrame(data_2)) # raises OverflowError
   ```
   
   Here's the backtrace that you get when trying to compute `schema_2`.
   ```
   Traceback (most recent call last):
     File "/work/arrow/foo.py", line 5, in <module>
       schema = pa.Schema.from_pandas(pd.DataFrame(data))
     File "pyarrow/types.pxi", line 3104, in pyarrow.lib.Schema.from_pandas
     File "/work/arrow/pyarrow-dev/lib/python3.10/site-packages/pyarrow/pandas_compat.py", line 562, in dataframe_to_types
       type_ = pa.array(c, from_pandas=True).type
     File "pyarrow/array.pxi", line 360, in pyarrow.lib.array
     File "pyarrow/array.pxi", line 87, in pyarrow.lib._ndarray_to_array
     File "pyarrow/error.pxi", line 89, in pyarrow.lib.check_status
   OverflowError: Python int too large to convert to C long
   ```
   Is that something that can be changed or would that likely have too many 
unintended consequences?
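
To make the request concrete, here is a minimal sketch in plain Python of range-aware integer inference. It is not Arrow's C++ inference code. The helper name `infer_int_type` and the string type names are hypothetical, chosen only for illustration, and defaulting to `int64` for empty input is an assumption.

```python
# Hypothetical sketch: pick a 64-bit integer type from the observed value range.
# This does not mirror Arrow's implementation; all names are illustrative.

INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1
UINT64_MAX = 2**64 - 1


def infer_int_type(values):
    """Return "int64" or "uint64" for an iterable of Python ints."""
    values = list(values)
    if not values:
        # Assumption: fall back to int64 when there is nothing to inspect.
        return "int64"
    lo, hi = min(values), max(values)
    if INT64_MIN <= lo and hi <= INT64_MAX:
        # Everything fits in the signed range, so prefer int64.
        return "int64"
    if lo >= 0 and hi <= UINT64_MAX:
        # No negatives and the maximum fits in the unsigned range.
        return "uint64"
    raise OverflowError("values do not fit in a single 64-bit integer type")


print(infer_int_type([2**63 - 1]))  # int64
print(infer_int_type([2**63]))      # uint64
```

With a rule like this, a nested value such as `[pow(2, 63)]` would be inferred as a list of `uint64`, consistent with the flat column case. Mixing negative values with values above the `int64` maximum would still raise.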
   
   I've tested this with pyarrow version 19.0.0 on Ubuntu 24.04.
   
   ### Component(s)
   
   C++

