(spark) branch master updated: [SPARK-50717][PS][DOCS] Update doc and add examples for `from_pandas`

haejoon Thu, 02 Jan 2025 23:53:45 -0800

This is an automated email from the ASF dual-hosted git repository.

haejoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git



The following commit(s) were added to refs/heads/master by this push:
     new 2acc969a4891 [SPARK-50717][PS][DOCS] Update doc and add examples for `from_pandas`
2acc969a4891 is described below

commit 2acc969a489134d00b8179c3a1ae2f5fa2dc0417
Author: Haejoon Lee <[email protected]>
AuthorDate: Fri Jan 3 16:53:22 2025 +0900

    [SPARK-50717][PS][DOCS] Update doc and add examples for `from_pandas`
    
    ### What changes were proposed in this pull request?
    
    This PR proposes to update doc and add examples for `from_pandas`
    
    ### Why are the changes needed?
    
    Improve documentation
    
    ### Does this PR introduce _any_ user-facing change?
    
    No API changes
    
    ### How was this patch tested?
    
    The existing CI should pass
    
    ### Was this patch authored or co-authored using generative AI tooling?
    
    No
    
    Closes #49349 from itholic/SPARK-50717.
    
    Authored-by: Haejoon Lee <[email protected]>
    Signed-off-by: Haejoon Lee <[email protected]>
---
 python/pyspark/pandas/namespace.py | 38 ++++++++++++++++++++++++++++++++++----
 1 file changed, 34 insertions(+), 4 deletions(-)

diff --git a/python/pyspark/pandas/namespace.py b/python/pyspark/pandas/namespace.py
index c77cdf51a2f6..d31bc1f48d11 100644
--- a/python/pyspark/pandas/namespace.py
+++ b/python/pyspark/pandas/namespace.py
@@ -138,14 +138,44 @@ def from_pandas(pobj: Union[pd.DataFrame, pd.Series, pd.Index]) -> Union[Series,
 
     Parameters
     ----------
-    pobj : pandas.DataFrame or pandas.Series
-        pandas DataFrame or Series to read.
+    pobj : pandas.DataFrame, pandas.Series or pandas.Index
+        pandas DataFrame, Series or Index to read.
 
     Returns
     -------
-    Series or DataFrame
-        If a pandas Series is passed in, this function returns a pandas-on-Spark Series.
+    DataFrame, Series or Index
         If a pandas DataFrame is passed in, this function returns a pandas-on-Spark DataFrame.
+        If a pandas Series is passed in, this function returns a pandas-on-Spark Series.
+        If a pandas Index is passed in, this function returns a pandas-on-Spark Index.
+
+    Examples
+    --------
+    >>> import pandas as pd
+    >>> import pyspark.pandas as ps
+
+    Convert a pandas DataFrame:
+    >>> pdf = pd.DataFrame({'a': [1, 2, 3]})
+    >>> psdf = ps.from_pandas(pdf)
+    >>> psdf
+       a
+    0  1
+    1  2
+    2  3
+
+    Convert a pandas Series:
+    >>> pser = pd.Series([1, 2, 3])
+    >>> psser = ps.from_pandas(pser)
+    >>> psser
+    0    1
+    1    2
+    2    3
+    dtype: int64
+
+    Convert a pandas Index:
+    >>> pidx = pd.Index([1, 2, 3])
+    >>> psidx = ps.from_pandas(pidx)
+    >>> psidx
+    Index([1, 2, 3], dtype='int64')
     """
     if isinstance(pobj, pd.Series):
         return Series(pobj)


