This is an automated email from the ASF dual-hosted git repository.
ruifengz pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
new 49dc0a7ab72a [SPARK-53059][PYTHON] Arrow UDF no need to depend on pandas
49dc0a7ab72a is described below
commit 49dc0a7ab72ad4e94b53d92e79fc66ada06dd120
Author: Ruifeng Zheng <[email protected]>
AuthorDate: Fri Aug 1 23:24:25 2025 +0800
[SPARK-53059][PYTHON] Arrow UDF no need to depend on pandas
### What changes were proposed in this pull request?
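Move the `require_minimum_pandas_version` and `require_minimum_pyarrow_version` checks out of the shared `vectorized_udf` helper and into its callers, so that `arrow_udf` only checks for PyArrow while `pandas_udf` still checks for both pandas and PyArrow.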
### Why are the changes needed?
Arrow UDFs do not need pandas, so `arrow_udf` should not have to call `require_minimum_pandas_version`.
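
As a rough illustration of the reasoning above, the following standalone sketch mimics the shape of the change without importing PySpark. The `INSTALLED` set and the `require` helper are hypothetical stand-ins for the real `require_minimum_pandas_version` / `require_minimum_pyarrow_version` checks, and the three functions are simplified to the point of only returning the UDF kind; this is not the actual PySpark implementation.

```python
# Hypothetical environment for illustration: pyarrow is present, pandas is not.
INSTALLED = {"pyarrow"}


def require(package: str) -> None:
    """Stand-in for require_minimum_<package>_version(); not the PySpark helper."""
    if package not in INSTALLED:
        raise ImportError(f"{package} must be installed")


def vectorized_udf(f, kind: str = "pandas"):
    # Shared code path: after the change it performs no dependency checks itself.
    assert kind in ["pandas", "arrow"], "kind should be either 'pandas' or 'arrow'"
    return kind, f


def arrow_udf(f):
    # Arrow entry point: only PyArrow is checked.
    require("pyarrow")
    return vectorized_udf(f, "arrow")


def pandas_udf(f):
    # pandas entry point: both pandas and PyArrow are checked.
    require("pandas")
    require("pyarrow")
    return vectorized_udf(f, "pandas")


def identity(x):
    return x


kind, _ = arrow_udf(identity)
print(kind)  # arrow

try:
    pandas_udf(identity)
except ImportError as exc:
    print(exc)  # pandas must be installed
```

With the checks living in the shared helper instead, the `arrow_udf` call in this simulated environment would fail on the missing pandas even though it never uses it.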
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
CI.
### Was this patch authored or co-authored using generative AI tooling?
No.
Closes #51767 from zhengruifeng/arrow_udf_dep.
Authored-by: Ruifeng Zheng <[email protected]>
Signed-off-by: Ruifeng Zheng <[email protected]>
---
python/pyspark/sql/pandas/functions.py | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/python/pyspark/sql/pandas/functions.py b/python/pyspark/sql/pandas/functions.py
index e45ef049f9a9..09e283ba21da 100644
--- a/python/pyspark/sql/pandas/functions.py
+++ b/python/pyspark/sql/pandas/functions.py
@@ -322,6 +322,8 @@ def arrow_udf(f=None, returnType=None, functionType=None):
pyspark.sql.PandasCogroupedOps.applyInArrow
pyspark.sql.UDFRegistration.register
"""
+ require_minimum_pyarrow_version()
+
return vectorized_udf(f, returnType, functionType, "arrow")
@@ -660,6 +662,9 @@ def pandas_udf(f=None, returnType=None, functionType=None):
# Note: Python 3.11.9, Pandas 2.2.3 and PyArrow 17.0.0 are used.
# Note: Timezone is KST.
# Note: 'X' means it throws an exception during the conversion.
+ require_minimum_pandas_version()
+ require_minimum_pyarrow_version()
+
return vectorized_udf(f, returnType, functionType, "pandas")
@@ -669,9 +674,6 @@ def vectorized_udf(
functionType=None,
kind: str = "pandas",
):
- require_minimum_pandas_version()
- require_minimum_pyarrow_version()
-
assert kind in ["pandas", "arrow"], "kind should be either 'pandas' or 'arrow'"
# decorator @pandas_udf(returnType, functionType)