This is an automated email from the ASF dual-hosted git repository.
gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
new 21fad7756b98 [SPARK-40353][SPARK-51599][PS][FOLLOW-UP] Fix failure in Python PS with old dependencies
21fad7756b98 is described below
commit 21fad7756b98d13fc0550f079bb57ad9c3fe058c
Author: Ruifeng Zheng <[email protected]>
AuthorDate: Mon Mar 31 07:52:20 2025 +0900
[SPARK-40353][SPARK-51599][PS][FOLLOW-UP] Fix failure in Python PS with old dependencies
### What changes were proposed in this pull request?
Fix a failure in the Python PS (pandas API on Spark) tests when running with old dependencies:
https://github.com/apache/spark/actions/runs/14103817104/job/39506718320
### Why are the changes needed?
The Excel tests require `openpyxl`; when it is not installed, the tests should be skipped instead of failing.
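As a rough illustration of the skip pattern, here is a minimal standalone sketch. It assumes the flag and message are computed with `importlib.util.find_spec`; the real `have_openpyxl` and `openpyxl_requirement_message` in `pyspark.testing.utils` may be defined differently, and `ExcelReadExample` and `run_example` are hypothetical names used only for this example.

```python
import importlib.util
import unittest

# Hypothetical stand-ins for the names imported from pyspark.testing.utils;
# the real definitions in Spark may differ.
have_openpyxl = importlib.util.find_spec("openpyxl") is not None
openpyxl_requirement_message = (
    None if have_openpyxl else "No module named 'openpyxl'"
)


class ExcelReadExample(unittest.TestCase):
    # Skip (rather than fail) when the optional dependency is missing.
    @unittest.skipIf(not have_openpyxl, openpyxl_requirement_message)
    def test_needs_openpyxl(self):
        import openpyxl  # only reached when the package is installed

        self.assertTrue(hasattr(openpyxl, "Workbook"))


def run_example():
    """Run the example test case and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExcelReadExample)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

With this guard, an environment without `openpyxl` reports the test as skipped with the requirement message, while an environment that has it runs the test normally.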
### Does this PR introduce _any_ user-facing change?
No, this is a test-only change.
### How was this patch tested?
CI
### Was this patch authored or co-authored using generative AI tooling?
no
Closes #50448 from zhengruifeng/fix_test_read_excel.
Authored-by: Ruifeng Zheng <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>
---
python/pyspark/pandas/tests/io/test_dataframe_spark_io.py | 3 +++
1 file changed, 3 insertions(+)
diff --git a/python/pyspark/pandas/tests/io/test_dataframe_spark_io.py b/python/pyspark/pandas/tests/io/test_dataframe_spark_io.py
index 0308d22b6a5c..af77ea8aa64f 100644
--- a/python/pyspark/pandas/tests/io/test_dataframe_spark_io.py
+++ b/python/pyspark/pandas/tests/io/test_dataframe_spark_io.py
@@ -23,6 +23,7 @@ import pandas as pd
from pyspark import pandas as ps
from pyspark.loose_version import LooseVersion
from pyspark.testing.pandasutils import PandasOnSparkTestCase, TestUtils
+from pyspark.testing.utils import have_openpyxl, openpyxl_requirement_message
class DataFrameSparkIOTestsMixin:
@@ -253,6 +254,7 @@ class DataFrameSparkIOTestsMixin:
expected_idx.sort_values(by="f").to_spark().toPandas(),
)
+ @unittest.skipIf(not have_openpyxl, openpyxl_requirement_message)
def test_read_excel(self):
with self.temp_dir() as tmp:
path1 = "{}/file1.xlsx".format(tmp)
@@ -344,6 +346,7 @@ class DataFrameSparkIOTestsMixin:
pd.concat([pdfs1["Sheet_name_2"], pdfs2["Sheet_name_2"]]).sort_index(),
)
+ @unittest.skipIf(not have_openpyxl, openpyxl_requirement_message)
def test_read_large_excel(self):
n = 20000
pdf = pd.DataFrame(