johnasiano opened a new issue, #45153:
URL: https://github.com/apache/arrow/issues/45153

   ### Describe the bug, including details regarding any error messages, version, and platform.
   
   https://pandas.pydata.org/pandas-docs/stable/reference/arrays.html#pyarrow
   
   
   This issue was first raised in https://github.com/pandas-dev/pandas/issues/50074
   
   The documentation linked above mentions two different approaches for keeping pyarrow-backed string dtypes through a round trip via `pa.Table`.
   
   The first uses `StringDtype`:
   
   ```
   import pandas as pd
   import pyarrow as pa
   df = pd.DataFrame({"x": ["foo", "bar", "baz"]}, dtype=pd.StringDtype("pyarrow"))
   df_pa = pa.Table.from_pandas(df).to_pandas()
   pd.testing.assert_frame_equal(df, df_pa)
   ```
   
   The second uses `ArrowDtype`:
   
   ```
   import pandas as pd
   import pyarrow as pa
   df = pd.DataFrame({"x": ["foo", "bar", "baz"]}, dtype=pd.ArrowDtype(pa.string()))
   df_pa = pa.Table.from_pandas(df).to_pandas()
   pd.testing.assert_frame_equal(df, df_pa)
   ```
   
   However, both of these raise an `AssertionError` at `assert_frame_equal`.
   
   Casting the result back with `astype`, as shown below, avoids the assertion error.
   
   ```
   import pandas as pd
   import pyarrow as pa
   df = pd.DataFrame({"x": ["foo", "bar", "baz"]}, dtype="string[pyarrow]")
   df_pa = pa.Table.from_pandas(df).to_pandas().astype("string[pyarrow]")
   pd.testing.assert_frame_equal(df, df_pa)
   ```
   
   The two approaches in the documentation are also given in the 2022 issue as working fixes. However, I think they may no longer work with the current version of pandas.
   
   ### Suggested fix for documentation
   
   The documentation should be updated to present `.astype("string[pyarrow]")` as the likely best-practice approach for this situation.
   
   ### Component(s)
   
   Python

