(spark) branch master updated: [MINOR][PYTHON][DOCS] Fix a pandas UDF example

gurwls223 Wed, 30 Jul 2025 22:31:23 -0700

This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git



The following commit(s) were added to refs/heads/master by this push:
     new 388335d0d72c [MINOR][PYTHON][DOCS] Fix a pandas UDF example
388335d0d72c is described below

commit 388335d0d72c01e74ef887a89906f4ec735fedea
Author: Ruifeng Zheng <[email protected]>
AuthorDate: Thu Jul 31 14:31:05 2025 +0900

    [MINOR][PYTHON][DOCS] Fix a pandas UDF example
    
    ### What changes were proposed in this pull request?
    Fix a pandas UDF example
    
    ### Why are the changes needed?
    the original output is not correct
    
    ### Does this PR introduce _any_ user-facing change?
    yes, doc-only
    
    ### How was this patch tested?
    manually check
    
    ### Was this patch authored or co-authored using generative AI tooling?
    no
    
    Closes #51738 from zhengruifeng/minor_pandas_example.
    
    Authored-by: Ruifeng Zheng <[email protected]>
    Signed-off-by: Hyukjin Kwon <[email protected]>
---
 python/pyspark/sql/pandas/functions.py | 19 ++++++++++++-------
 1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/python/pyspark/sql/pandas/functions.py b/python/pyspark/sql/pandas/functions.py
index 4a2e6db3b99f..1a07ea0deac3 100644
--- a/python/pyspark/sql/pandas/functions.py
+++ b/python/pyspark/sql/pandas/functions.py
@@ -388,21 +388,26 @@ def pandas_udf(f=None, returnType=None, functionType=None):
     `pandas.DataFrame` as below:
 
     >>> @pandas_udf("col1 string, col2 long")
-    >>> def func(s1: pd.Series, s2: pd.Series, s3: pd.DataFrame) -> pd.DataFrame:
+    ... def func(s1: pd.Series, s2: pd.Series, s3: pd.DataFrame) -> pd.DataFrame:
     ...     s3['col2'] = s1 + s2.str.len()
     ...     return s3
-    ...
-    >>> # Create a Spark DataFrame that has three columns including a struct column.
-    ... df = spark.createDataFrame(
+
+
+    Create a Spark DataFrame that has three columns including a struct column.
+
+    >>> df = spark.createDataFrame(
     ...     [[1, "a string", ("a nested string",)]],
     ...     "long_col long, string_col string, struct_col struct<col1:string>")
+
     >>> df.printSchema()
     root
-    |-- long_column: long (nullable = true)
-    |-- string_column: string (nullable = true)
-    |-- struct_column: struct (nullable = true)
+    |-- long_col: long (nullable = true)
+    |-- string_col: string (nullable = true)
+    |-- struct_col: struct (nullable = true)
     |    |-- col1: string (nullable = true)
+
     >>> df.select(func("long_col", "string_col", "struct_col")).printSchema()
+    root
     |-- func(long_col, string_col, struct_col): struct (nullable = true)
     |    |-- col1: string (nullable = true)
     |    |-- col2: long (nullable = true)
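
For readers who want to check the corrected schema by hand, the body of `func` can be exercised with plain pandas, without a Spark session. This is only a sketch under assumptions: the single-row inputs below mirror the row in the docstring, the `.copy()` call is added here to avoid mutating the input and is not part of the documented example, and a real pandas UDF receives its batches from Spark rather than from hand-built objects.

```python
import pandas as pd


def func(s1: pd.Series, s2: pd.Series, s3: pd.DataFrame) -> pd.DataFrame:
    # Same logic as the docstring example: add the long column to the
    # length of the string column and store the sum in a new "col2" field.
    s3 = s3.copy()  # assumption: copy so the caller's frame is untouched
    s3["col2"] = s1 + s2.str.len()
    return s3


# Hand-built stand-ins for one batch of the three input columns.
long_col = pd.Series([1])
string_col = pd.Series(["a string"])
struct_col = pd.DataFrame({"col1": ["a nested string"]})

result = func(long_col, string_col, struct_col)
print(result.dtypes)
```

The returned frame has a string-like `col1` and an integer `col2`, which matches the `col1 string, col2 long` return type declared in the decorator.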


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
