aiudirog opened a new issue, #47966:
URL: https://github.com/apache/arrow/issues/47966

   ### Describe the bug, including details regarding any error messages, version, and platform.
   
   
   Commit https://github.com/apache/arrow/commit/0d72f7e8f33a79ced28f42017bf07910c85c6d74 automatically includes the pandas DataFrame `attrs` in the common metadata. However, it doesn't handle the case where the values are not JSON serializable.
   
   Reproducing example:
   ```python
   from datetime import datetime
   
   import pandas as pd
   import pyarrow as pa
   
   df = pd.DataFrame({'x': [1, 2, 3]})
   df.attrs['timestamp'] = datetime.fromisoformat('2025-10-27T11:12:13')
   
   pa.table(df)
   ```
   
   Output:
   ```python-traceback
   Traceback (most recent call last):
     File "/venv/python312/Lib/site-packages/IPython/core/interactiveshell.py", line 3577, in run_code
       exec(code_obj, self.user_global_ns, self.user_ns)
     File "<ipython-input-11-5bf84f5fe292>", line 9, in <module>
       pa.table(df)
     File "pyarrow/table.pxi", line 6216, in pyarrow.lib.table
     File "pyarrow/table.pxi", line 4795, in pyarrow.lib.Table.from_pandas
     File "/venv/python312/Lib/site-packages/pyarrow/pandas_compat.py", line 663, in dataframe_to_arrays
       pandas_metadata = construct_metadata(
                         ^^^^^^^^^^^^^^^^^^^
     File "/venv/python312/Lib/site-packages/pyarrow/pandas_compat.py", line 281, in construct_metadata
       b'pandas': json.dumps({
                  ^^^^^^^^^^^^
     File "/venv/python312/Lib/json/__init__.py", line 231, in dumps
       return _default_encoder.encode(obj)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
     File "/venv/python312/Lib/json/encoder.py", line 200, in encode
       chunks = self.iterencode(o, _one_shot=True)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
     File "/venv/python312/Lib/json/encoder.py", line 258, in iterencode
       return _iterencode(o, 0)
              ^^^^^^^^^^^^^^^^^
     File "/venv/python312/Lib/json/encoder.py", line 180, in default
       raise TypeError(f'Object of type {o.__class__.__name__} '
   TypeError: Object of type datetime is not JSON serializable
   ```
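
The traceback ends in `json.dumps`, so the failure can be shown with the standard library alone. Below is a minimal sketch of a possible user-side workaround, assuming it is acceptable to store non-serializable values as strings. `jsonable_attrs` is a hypothetical helper written for this illustration, not part of pandas or PyArrow, and the `'source'` entry is made-up sample data.

```python
import json
from datetime import datetime


def jsonable_attrs(attrs):
    """Copy attrs, converting values that json.dumps rejects.

    Hypothetical helper for illustration; not a pandas or PyArrow API.
    """
    cleaned = {}
    for key, value in attrs.items():
        try:
            json.dumps(value)
            cleaned[key] = value
        except TypeError:
            # datetime becomes an ISO 8601 string; anything else falls back to str()
            if isinstance(value, datetime):
                cleaned[key] = value.isoformat()
            else:
                cleaned[key] = str(value)
    return cleaned


# Same kind of value as in the reproducing example, plus one plain string
attrs = {
    'timestamp': datetime.fromisoformat('2025-10-27T11:12:13'),
    'source': 'sensor-a',
}

# json.dumps rejects the raw dict, as in the traceback
try:
    json.dumps(attrs)
    raised = False
except TypeError:
    raised = True

# After sanitizing, the dict encodes cleanly
safe_attrs = jsonable_attrs(attrs)
encoded = json.dumps(safe_attrs)
```

With pandas, one could presumably assign `df.attrs = jsonable_attrs(df.attrs)` before calling `pa.table(df)`. This has not been verified against PyArrow 22.0.0, and the original `datetime` type is not restored on the way back.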
   
   Versions Used:
   - Python: 3.12.7
   - PyArrow: 22.0.0
   - Pandas: 2.3.3
   
   ### Component(s)
   
   Python

