This is an automated email from the ASF dual-hosted git repository.
haejoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
new 2acc969a4891 [SPARK-50717][PS][DOCS] Update doc and add examples for `from_pandas`
2acc969a4891 is described below
commit 2acc969a489134d00b8179c3a1ae2f5fa2dc0417
Author: Haejoon Lee <[email protected]>
AuthorDate: Fri Jan 3 16:53:22 2025 +0900
[SPARK-50717][PS][DOCS] Update doc and add examples for `from_pandas`
### What changes were proposed in this pull request?
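This PR updates the `from_pandas` docstring to list `pandas.Index` as an accepted input and return type, and adds examples for converting a pandas DataFrame, Series, and Index.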
### Why are the changes needed?
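To improve the documentation: the docstring described only DataFrame and Series, although the signature also accepts `pd.Index`, and it had no examples.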
### Does this PR introduce _any_ user-facing change?
No API changes
### How was this patch tested?
The existing CI should pass
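The examples added to the docstring use the `>>>` doctest format. As a rough sketch of how examples in that format can be checked mechanically, the standard-library `doctest` module can run them directly from a docstring. This illustrates the general technique only and is not a description of how Spark's CI is configured; `double_all` and `run_docstring_examples` are hypothetical names introduced for illustration.

```python
import doctest


def double_all(values):
    """Return a new list with every element doubled.

    Hypothetical stand-in for a documented function; not part of Spark.

    Examples
    --------
    >>> double_all([1, 2, 3])
    [2, 4, 6]
    """
    return [v * 2 for v in values]


def run_docstring_examples(func):
    """Run the ``>>>`` examples in ``func``'s docstring.

    Returns a ``(failed, attempted)`` tuple.
    """
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    # Make the function visible by name to the examples in its own docstring.
    globs = {func.__name__: func}
    for test in finder.find(func, name=func.__name__, globs=globs):
        runner.run(test)
    results = runner.summarize(verbose=False)
    return results.failed, results.attempted


if __name__ == "__main__":
    failed, attempted = run_docstring_examples(double_all)
    print(f"{attempted} example(s) run, {failed} failed")
```

If the expected output in a docstring drifts from what the function actually returns, the `failed` count becomes non-zero, which is what makes doctest-style examples useful as a lightweight regression check.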
### Was this patch authored or co-authored using generative AI tooling?
No
Closes #49349 from itholic/SPARK-50717.
Authored-by: Haejoon Lee <[email protected]>
Signed-off-by: Haejoon Lee <[email protected]>
---
python/pyspark/pandas/namespace.py | 38 ++++++++++++++++++++++++++++++++++----
1 file changed, 34 insertions(+), 4 deletions(-)
diff --git a/python/pyspark/pandas/namespace.py b/python/pyspark/pandas/namespace.py
index c77cdf51a2f6..d31bc1f48d11 100644
--- a/python/pyspark/pandas/namespace.py
+++ b/python/pyspark/pandas/namespace.py
@@ -138,14 +138,44 @@ def from_pandas(pobj: Union[pd.DataFrame, pd.Series, pd.Index]) -> Union[Series,
Parameters
----------
- pobj : pandas.DataFrame or pandas.Series
- pandas DataFrame or Series to read.
+ pobj : pandas.DataFrame, pandas.Series or pandas.Index
+ pandas DataFrame, Series or Index to read.
Returns
-------
- Series or DataFrame
- If a pandas Series is passed in, this function returns a pandas-on-Spark Series.
+ DataFrame, Series or Index
If a pandas DataFrame is passed in, this function returns a pandas-on-Spark DataFrame.
+ If a pandas Series is passed in, this function returns a pandas-on-Spark Series.
+ If a pandas Index is passed in, this function returns a pandas-on-Spark Index.
+
+ Examples
+ --------
+ >>> import pandas as pd
+ >>> import pyspark.pandas as ps
+
+ Convert a pandas DataFrame:
+ >>> pdf = pd.DataFrame({'a': [1, 2, 3]})
+ >>> psdf = ps.from_pandas(pdf)
+ >>> psdf
+ a
+ 0 1
+ 1 2
+ 2 3
+
+ Convert a pandas Series:
+ >>> pser = pd.Series([1, 2, 3])
+ >>> psser = ps.from_pandas(pser)
+ >>> psser
+ 0 1
+ 1 2
+ 2 3
+ dtype: int64
+
+ Convert a pandas Index:
+ >>> pidx = pd.Index([1, 2, 3])
+ >>> psidx = ps.from_pandas(pidx)
+ >>> psidx
+ Index([1, 2, 3], dtype='int64')
"""
if isinstance(pobj, pd.Series):
return Series(pobj)