[I] Confusion about the function signature of python from_pandas [arrow]

via GitHub Mon, 23 Dec 2024 23:32:24 -0800


zhuwenxing opened a new issue, #45105:
URL: https://github.com/apache/arrow/issues/45105


   ### Describe the bug, including details regarding any error messages, version, and platform.
   
   ```
   
       @classmethod
       def from_pandas(cls, cls_1, df, Schema_schema=None, preserve_index=None, nthreads=None, columns=None, bool_safe=True): # real signature unknown; restored from __doc__
           """
           Table.from_pandas(cls, df, Schema schema=None, preserve_index=None, nthreads=None, columns=None, bool safe=True)
           
                   Convert pandas.DataFrame to an Arrow Table.
           
                   The column types in the resulting Arrow Table are inferred from the
                   dtypes of the pandas.Series in the DataFrame. In the case of non-object
                   Series, the NumPy dtype is translated to its Arrow equivalent. In the
                   case of `object`, we need to guess the datatype by looking at the
                   Python objects in this Series.
           
                   Be aware that Series of the `object` dtype don't carry enough
                   information to always lead to a meaningful Arrow type. In the case that
                   we cannot infer a type, e.g. because the DataFrame is of length 0 or
                   the Series only contains None/nan objects, the type is set to
                   null. This behavior can be avoided by constructing an explicit schema
                   and passing it to this function.
           
                   Parameters
                   ----------
                   df : pandas.DataFrame
                   schema : pyarrow.Schema, optional
                       The expected schema of the Arrow Table. This can be used to
                       indicate the type of columns if we cannot infer it automatically.
                       If passed, the output will have exactly this schema. Columns
                       specified in the schema that are not found in the DataFrame columns
                       or its index will raise an error. Additional columns or index
                       levels in the DataFrame which are not specified in the schema will
                       be ignored.
                   preserve_index : bool, optional
                       Whether to store the index as an additional column in the resulting
                       ``Table``. The default of None will store the index as a column,
                       except for RangeIndex which is stored as metadata only. Use
                       ``preserve_index=True`` to force it to be stored as a column.
                   nthreads : int, default None
                       If greater than 1, convert columns to Arrow in parallel using
                       indicated number of threads. By default, this follows
                       :func:`pyarrow.cpu_count` (may use up to system CPU count threads).
                   columns : list, optional
                      List of column to be converted. If None, use all columns.
                   safe : bool, default True
                      Check for overflows or other unsafe conversions.
           
                   Returns
                   -------
                   Table
           
                   Examples
                   --------
                   >>> import pyarrow as pa
                   >>> import pandas as pd
                   >>> df = pd.DataFrame({'n_legs': [2, 4, 5, 100],
                   ...                    'animals': ["Flamingo", "Horse", "Brittle stars", "Centipede"]})
                   >>> pa.Table.from_pandas(df)
                   pyarrow.Table
                   n_legs: int64
                   animals: string
                   ----
                   n_legs: [[2,4,5,100]]
                   animals: [["Flamingo","Horse","Brittle stars","Centipede"]]
           """
           pass
   ```
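
One possible reading of the stub above: the first line of the docstring looks like a Cython-style signature, where `Schema schema=None` would mean a parameter named `schema` with type `Schema`, and `bool safe=True` a parameter named `safe` with type `bool`. Under that assumption, names such as `Schema_schema`, `bool_safe`, and the extra `cls_1` would come from the tool that restored the signature from `__doc__`, not from the function itself. The sketch below only parses the docstring line quoted above with the standard library; `parameter_names` is a hypothetical helper written for this illustration and is not part of pyarrow.

```python
# Signature line copied from the docstring quoted in the stub above.
doc_signature = (
    "Table.from_pandas(cls, df, Schema schema=None, preserve_index=None, "
    "nthreads=None, columns=None, bool safe=True)"
)


def parameter_names(signature_line):
    """Return parameter names, assuming each entry is '[type] name[=default]'.

    Hypothetical helper for illustration only. It splits naively on commas,
    which is sufficient for this line because no default contains a comma.
    """
    inner = signature_line[signature_line.index("(") + 1 : signature_line.rindex(")")]
    names = []
    for part in inner.split(","):
        declaration = part.split("=")[0].strip()  # drop the default value
        names.append(declaration.split()[-1])     # last token is the name
    return names


names = parameter_names(doc_signature)
print(names)
# ['cls', 'df', 'schema', 'preserve_index', 'nthreads', 'columns', 'safe']
```

If this reading is right, the names match the `Parameters` section of the docstring (`df`, `schema`, `preserve_index`, `nthreads`, `columns`, `safe`), so a call would presumably pass `schema=` and `safe=` as keywords.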
   
   
![image](https://github.com/user-attachments/assets/99e0485b-0bc3-406a-905b-c850d00562b2)
   
   
   ### Component(s)
   
   Python


