This is an automated email from the ASF dual-hosted git repository.
ruifengz pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
     new 8fe006b20877 [SPARK-54882][PYTHON] Remove legacy PYARROW_IGNORE_TIMEZONE
8fe006b20877 is described below
commit 8fe006b20877671c75e4650a27d268b496294299
Author: Ruifeng Zheng <[email protected]>
AuthorDate: Fri Jan 2 17:06:48 2026 +0800
[SPARK-54882][PYTHON] Remove legacy PYARROW_IGNORE_TIMEZONE
### What changes were proposed in this pull request?
Remove legacy PYARROW_IGNORE_TIMEZONE
### Why are the changes needed?
It was added in Spark 3.0 for integration with PyArrow 2.0:
https://github.com/apache/spark/commit/5e331553726f838f2f14788c135f3497319b4714
It no longer seems to be needed for the PySpark tests.
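For users who still rely on the old suppression behavior, a minimal sketch of how to set the variable manually (this is illustrative, not part of the change; after this patch pandas-on-Spark simply no longer sets or warns about it automatically):

```python
import os

# Set the variable before any Spark context is launched; environment
# variables set after JVM startup do not reach the driver/executors.
os.environ["PYARROW_IGNORE_TIMEZONE"] = "1"

# Verify it is visible to the current process.
print(os.environ["PYARROW_IGNORE_TIMEZONE"])  # → 1
```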
### Does this PR introduce _any_ user-facing change?
no
### How was this patch tested?
CI
### Was this patch authored or co-authored using generative AI tooling?
no
Closes #53660 from zhengruifeng/test_without_PYARROW_IGNORE_TIMEZONE.
Authored-by: Ruifeng Zheng <[email protected]>
Signed-off-by: Ruifeng Zheng <[email protected]>
---
binder/postBuild | 3 ---
python/pyspark/pandas/__init__.py | 10 ----------
python/run-tests.py | 2 --
3 files changed, 15 deletions(-)
diff --git a/binder/postBuild b/binder/postBuild
index 92f4c70ffff5..224ed723c766 100755
--- a/binder/postBuild
+++ b/binder/postBuild
@@ -39,9 +39,6 @@ fi
pip install plotly "pandas<2.0.0" "pyspark[sql,ml,mllib,pandas_on_spark,connect]$SPECIFIER$VERSION"
-# Set 'PYARROW_IGNORE_TIMEZONE' to suppress warnings from PyArrow.
-echo "export PYARROW_IGNORE_TIMEZONE=1" >> ~/.profile
-
# Add sbin to PATH to run `start-connect-server.sh`.
SPARK_HOME=$(python -c "from pyspark.find_spark_home import _find_spark_home; print(_find_spark_home())")
echo "export PATH=${PATH}:${SPARK_HOME}/sbin" >> ~/.profile
diff --git a/python/pyspark/pandas/__init__.py
b/python/pyspark/pandas/__init__.py
index 65366f544092..ac749c195a1e 100644
--- a/python/pyspark/pandas/__init__.py
+++ b/python/pyspark/pandas/__init__.py
@@ -39,16 +39,6 @@ except ImportError as e:
else:
raise
-if "PYARROW_IGNORE_TIMEZONE" not in os.environ:
-    warnings.warn(
-        "'PYARROW_IGNORE_TIMEZONE' environment variable was not set. It is required to "
-        "set this environment variable to '1' in both driver and executor sides if you use "
-        "pyarrow>=2.0.0. "
-        "pandas-on-Spark will set it for you but it does not work if there is a Spark context "
-        "already launched."
-    )
-    os.environ["PYARROW_IGNORE_TIMEZONE"] = "1"
-
from pyspark.pandas.frame import DataFrame
from pyspark.pandas.indexes.base import Index
from pyspark.pandas.indexes.category import CategoricalIndex
diff --git a/python/run-tests.py b/python/run-tests.py
index 9c58f1dcda5f..8bed5c7ff106 100755
--- a/python/run-tests.py
+++ b/python/run-tests.py
@@ -196,8 +196,6 @@ def run_individual_python_test(target_dir, test_name, pyspark_python, keep_test_
'SPARK_PREPEND_CLASSES': '1',
'PYSPARK_PYTHON': which(pyspark_python),
'PYSPARK_DRIVER_PYTHON': which(pyspark_python),
-# Preserve legacy nested timezone behavior for pyarrow>=2, remove after SPARK-32285
-'PYARROW_IGNORE_TIMEZONE': '1',
})
if "SPARK_CONNECT_TESTING_REMOTE" in os.environ:
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]