This is an automated email from the ASF dual-hosted git repository.
dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
new ef1eec4c6d9f [SPARK-53858][PYTHON][TESTS] Skip doctests in `pyspark.sql.functions.builtin` if pyarrow is not installed
ef1eec4c6d9f is described below
commit ef1eec4c6d9f1b8e71e581833f3c43a03a8586b1
Author: Ruifeng Zheng <[email protected]>
AuthorDate: Thu Oct 9 22:11:54 2025 -0700
[SPARK-53858][PYTHON][TESTS] Skip doctests in `pyspark.sql.functions.builtin` if pyarrow is not installed
### What changes were proposed in this pull request?
Skip the `udf` and `arrow_udtf` doctests when pyarrow (or pandas) is not installed.
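For context, here is a minimal, self-contained sketch of the skip pattern this PR relies on. The `needs_pyarrow` function below is hypothetical and not part of this change; the point is that doctest only collects examples from `__doc__`, so deleting the docstring of a function whose examples require pyarrow makes `doctest.testmod()` skip them instead of failing when the dependency is missing.
```
import doctest


def needs_pyarrow():
    """
    >>> needs_pyarrow()
    'ok'
    """
    import pyarrow  # noqa: F401  # this doctest would fail without pyarrow
    return "ok"


# Availability probe, playing the role of pyspark.testing.utils.have_pyarrow
try:
    import pyarrow  # noqa: F401
    have_pyarrow = True
except ImportError:
    have_pyarrow = False

if not have_pyarrow:
    # With the docstring removed, doctest finds no examples for this function
    # and simply skips it rather than erroring on the missing import.
    del needs_pyarrow.__doc__

if __name__ == "__main__":
    doctest.testmod()
```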
### Why are the changes needed?
To make the Python 3.14 scheduled workflow work.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Manually checked without pyarrow installed:
```
(spark_dev_313) ➜ spark git:(py_314_udf) pip uninstall pyarrow
Found existing installation: pyarrow 21.0.0
Uninstalling pyarrow-21.0.0:
Would remove:
/Users/ruifeng.zheng/.dev/miniconda3/envs/spark_dev_313/lib/python3.13/site-packages/pyarrow-21.0.0.dist-info/*
/Users/ruifeng.zheng/.dev/miniconda3/envs/spark_dev_313/lib/python3.13/site-packages/pyarrow/*
Proceed (Y/n)? y
Successfully uninstalled pyarrow-21.0.0
(spark_dev_313) ➜ spark git:(py_314_udf) ✗ python/run-tests -k --testnames 'pyspark.sql.functions.builtin'
Running PySpark tests. Output is in /Users/ruifeng.zheng/spark/python/unit-tests.log
Will test against the following Python executables: ['/Users/ruifeng.zheng/.dev/miniconda3/envs/spark_dev_313/bin/python3']
Will test the following Python tests: ['pyspark.sql.functions.builtin']
/Users/ruifeng.zheng/.dev/miniconda3/envs/spark_dev_313/bin/python3 python_implementation is CPython
/Users/ruifeng.zheng/.dev/miniconda3/envs/spark_dev_313/bin/python3 version is: Python 3.13.5
Starting test(/Users/ruifeng.zheng/.dev/miniconda3/envs/spark_dev_313/bin/python3): pyspark.sql.functions.builtin (temp output: /Users/ruifeng.zheng/spark/python/target/cff4b76c-ff9c-4226-89dd-e1eabe4ebbad/Users_ruifeng.zheng_.dev_miniconda3_envs_spark_dev_313_bin_python3__pyspark.sql.functions.builtin__unaf2g6y.log)
Finished test(/Users/ruifeng.zheng/.dev/miniconda3/envs/spark_dev_313/bin/python3): pyspark.sql.functions.builtin (64s)
Tests passed in 64 seconds
```
### Was this patch authored or co-authored using generative AI tooling?
No.
Closes #52569 from zhengruifeng/py_314_udf.
Authored-by: Ruifeng Zheng <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
---
python/pyspark/sql/connect/functions/builtin.py | 5 +++++
python/pyspark/sql/functions/builtin.py | 6 ++++++
2 files changed, 11 insertions(+)
diff --git a/python/pyspark/sql/connect/functions/builtin.py b/python/pyspark/sql/connect/functions/builtin.py
index 71865816b49a..127e1d74dbba 100644
--- a/python/pyspark/sql/connect/functions/builtin.py
+++ b/python/pyspark/sql/connect/functions/builtin.py
@@ -4703,9 +4703,14 @@ def _test() -> None:
     import doctest
     from pyspark.sql import SparkSession as PySparkSession
     import pyspark.sql.connect.functions.builtin
+    from pyspark.testing.utils import have_pandas, have_pyarrow
     globs = pyspark.sql.connect.functions.builtin.__dict__.copy()
+    if not have_pandas or not have_pyarrow:
+        del pyspark.sql.connect.functions.builtin.udf.__doc__
+        del pyspark.sql.connect.functions.builtin.arrow_udtf.__doc__
+
     globs["spark"] = (
         PySparkSession.builder.appName("sql.connect.functions tests")
         .remote(os.environ.get("SPARK_CONNECT_TESTING_REMOTE", "local[4]"))
diff --git a/python/pyspark/sql/functions/builtin.py b/python/pyspark/sql/functions/builtin.py
index cf54fd23e818..a0813e0fc2cb 100644
--- a/python/pyspark/sql/functions/builtin.py
+++ b/python/pyspark/sql/functions/builtin.py
@@ -27895,8 +27895,14 @@ def _test() -> None:
     import doctest
     from pyspark.sql import SparkSession
     import pyspark.sql.functions.builtin
+    from pyspark.testing.utils import have_pandas, have_pyarrow
     globs = pyspark.sql.functions.builtin.__dict__.copy()
+
+    if not have_pandas or not have_pyarrow:
+        del pyspark.sql.functions.builtin.udf.__doc__
+        del pyspark.sql.functions.builtin.arrow_udtf.__doc__
+
     spark = (
         SparkSession.builder.master("local[4]").appName("sql.functions.builtin tests").getOrCreate()
     )
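Note on the `have_pandas` / `have_pyarrow` flags imported above: they come from `pyspark.testing.utils`. The sketch below is only an assumption about the usual pattern behind such flags (a try/except import probe), not the exact implementation in that module.
```
# Hypothetical illustration of dependency-availability flags; the real
# definitions live in pyspark.testing.utils and may differ in detail.
try:
    import pandas  # noqa: F401
    have_pandas = True
except ImportError:
    have_pandas = False

try:
    import pyarrow  # noqa: F401
    have_pyarrow = True
except ImportError:
    have_pyarrow = False
```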
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]