This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new ef1eec4c6d9f [SPARK-53858][PYTHON][TESTS] Skip doctests in `pyspark.sql.functions.builtin` if pyarrow is not installed
ef1eec4c6d9f is described below

commit ef1eec4c6d9f1b8e71e581833f3c43a03a8586b1
Author: Ruifeng Zheng <[email protected]>
AuthorDate: Thu Oct 9 22:11:54 2025 -0700

    [SPARK-53858][PYTHON][TESTS] Skip doctests in `pyspark.sql.functions.builtin` if pyarrow is not installed
    
    ### What changes were proposed in this pull request?
    Skip the `udf` and `arrow_udtf` doctests when pandas or pyarrow is not installed, by deleting their docstrings before the doctest run.
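
    doctest collects examples from `__doc__`, so once a function's docstring is removed there is nothing left for doctest to run, and the missing dependency never causes a failure. A minimal standalone sketch of the pattern (illustration only, not the PySpark code; the inline availability check stands in for `have_pyarrow` from `pyspark.testing.utils`):

    ```
    import doctest

    # Hypothetical inline availability check; the patch itself imports
    # have_pandas / have_pyarrow from pyspark.testing.utils instead.
    try:
        import pyarrow  # noqa: F401
        have_pyarrow = True
    except ImportError:
        have_pyarrow = False


    def add_one(x):
        """Return x + 1.

        >>> add_one(1)
        2
        """
        return x + 1


    if not have_pyarrow:
        # With the docstring gone, doctest finds no examples for add_one,
        # so nothing runs (and nothing fails) when the dependency is missing.
        del add_one.__doc__

    doctest.testmod()
    ```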
    
    ### Why are the changes needed?
    To make the Python 3.14 scheduled workflow pass.
    
    ### Does this PR introduce _any_ user-facing change?
    no
    
    ### How was this patch tested?
    Manually checked by running the doctests without pyarrow installed:
    
    ```
    (spark_dev_313) ➜  spark git:(py_314_udf) pip uninstall pyarrow
    Found existing installation: pyarrow 21.0.0
    Uninstalling pyarrow-21.0.0:
      Would remove:
        /Users/ruifeng.zheng/.dev/miniconda3/envs/spark_dev_313/lib/python3.13/site-packages/pyarrow-21.0.0.dist-info/*
        /Users/ruifeng.zheng/.dev/miniconda3/envs/spark_dev_313/lib/python3.13/site-packages/pyarrow/*
    Proceed (Y/n)? y
      Successfully uninstalled pyarrow-21.0.0
    
    (spark_dev_313) ➜  spark git:(py_314_udf) ✗ python/run-tests -k --testnames 'pyspark.sql.functions.builtin'
    Running PySpark tests. Output is in /Users/ruifeng.zheng/spark/python/unit-tests.log
    Will test against the following Python executables: ['/Users/ruifeng.zheng/.dev/miniconda3/envs/spark_dev_313/bin/python3']
    Will test the following Python tests: ['pyspark.sql.functions.builtin']
    /Users/ruifeng.zheng/.dev/miniconda3/envs/spark_dev_313/bin/python3 python_implementation is CPython
    /Users/ruifeng.zheng/.dev/miniconda3/envs/spark_dev_313/bin/python3 version is: Python 3.13.5
    Starting test(/Users/ruifeng.zheng/.dev/miniconda3/envs/spark_dev_313/bin/python3): pyspark.sql.functions.builtin (temp output: /Users/ruifeng.zheng/spark/python/target/cff4b76c-ff9c-4226-89dd-e1eabe4ebbad/Users_ruifeng.zheng_.dev_miniconda3_envs_spark_dev_313_bin_python3__pyspark.sql.functions.builtin__unaf2g6y.log)
    Finished test(/Users/ruifeng.zheng/.dev/miniconda3/envs/spark_dev_313/bin/python3): pyspark.sql.functions.builtin (64s)
    Tests passed in 64 seconds
    ```
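
    An equivalent quick check of the skip behavior using only the standard library (a standalone sketch, not part of the patch) is to ask `doctest.DocTestFinder` what it would collect before and after a docstring is dropped:

    ```
    import doctest


    def add_one(x):
        """Return x + 1.

        >>> add_one(1)
        2
        """
        return x + 1


    finder = doctest.DocTestFinder()
    print(len(finder.find(add_one)[0].examples))  # 1 example collected

    del add_one.__doc__
    print(finder.find(add_one))  # [] -- nothing left for doctest to run
    ```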
    
    ### Was this patch authored or co-authored using generative AI tooling?
    no
    
    Closes #52569 from zhengruifeng/py_314_udf.
    
    Authored-by: Ruifeng Zheng <[email protected]>
    Signed-off-by: Dongjoon Hyun <[email protected]>
---
 python/pyspark/sql/connect/functions/builtin.py | 5 +++++
 python/pyspark/sql/functions/builtin.py         | 6 ++++++
 2 files changed, 11 insertions(+)

diff --git a/python/pyspark/sql/connect/functions/builtin.py b/python/pyspark/sql/connect/functions/builtin.py
index 71865816b49a..127e1d74dbba 100644
--- a/python/pyspark/sql/connect/functions/builtin.py
+++ b/python/pyspark/sql/connect/functions/builtin.py
@@ -4703,9 +4703,14 @@ def _test() -> None:
     import doctest
     from pyspark.sql import SparkSession as PySparkSession
     import pyspark.sql.connect.functions.builtin
+    from pyspark.testing.utils import have_pandas, have_pyarrow
 
     globs = pyspark.sql.connect.functions.builtin.__dict__.copy()
 
+    if not have_pandas or not have_pyarrow:
+        del pyspark.sql.connect.functions.builtin.udf.__doc__
+        del pyspark.sql.connect.functions.builtin.arrow_udtf.__doc__
+
     globs["spark"] = (
         PySparkSession.builder.appName("sql.connect.functions tests")
         .remote(os.environ.get("SPARK_CONNECT_TESTING_REMOTE", "local[4]"))
diff --git a/python/pyspark/sql/functions/builtin.py b/python/pyspark/sql/functions/builtin.py
index cf54fd23e818..a0813e0fc2cb 100644
--- a/python/pyspark/sql/functions/builtin.py
+++ b/python/pyspark/sql/functions/builtin.py
@@ -27895,8 +27895,14 @@ def _test() -> None:
     import doctest
     from pyspark.sql import SparkSession
     import pyspark.sql.functions.builtin
+    from pyspark.testing.utils import have_pandas, have_pyarrow
 
     globs = pyspark.sql.functions.builtin.__dict__.copy()
+
+    if not have_pandas or not have_pyarrow:
+        del pyspark.sql.functions.builtin.udf.__doc__
+        del pyspark.sql.functions.builtin.arrow_udtf.__doc__
+
     spark = (
         SparkSession.builder.master("local[4]").appName("sql.functions.builtin tests").getOrCreate()
     )


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
