This is an automated email from the ASF dual-hosted git repository.
gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
new 5a5308695304 [SPARK-54930][PYTHON] Remove redundant
_accumulatorRegistry.clear() call in worker.py
5a5308695304 is described below
commit 5a530869530471cc3b10b1a42e539a287425f9e0
Author: Yicong-Huang <[email protected]>
AuthorDate: Thu Jan 8 08:18:21 2026 +0900
[SPARK-54930][PYTHON] Remove redundant _accumulatorRegistry.clear() call in
worker.py
### What changes were proposed in this pull request?
Remove a redundant `_accumulatorRegistry.clear()` call in `worker.py`.
Currently there are two consecutive `clear()` calls with no
accumulator-modifying code in between:
```python
shuffle.MemoryBytesSpilled = 0
shuffle.DiskBytesSpilled = 0
_accumulatorRegistry.clear() # first call
setup_spark_files(infile)
setup_broadcasts(infile)
_accumulatorRegistry.clear() # second call
```
Neither `setup_spark_files` nor `setup_broadcasts` adds anything to
`_accumulatorRegistry`, so the first `clear()` is redundant.
### Why are the changes needed?
This is dead code cleanup. The redundancy built up over three changes:
- SPARK-3463 (2014) added the first `clear()` after shuffle initialization
- SPARK-3030 (2014) added the second `clear()` after broadcasts setup
- SPARK-44533 (2023) refactored to extract `setup_spark_files` and
`setup_broadcasts`, but preserved both `clear()` calls
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Existing tests. This is a simple dead code removal with no functional
change.
### Was this patch authored or co-authored using generative AI tooling?
No.
Closes #53708 from Yicong-Huang/SPARK-54930/refactor/remove-redundant-clear.
Authored-by: Yicong-Huang <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>
---
python/pyspark/worker.py | 1 -
1 file changed, 1 deletion(-)
diff --git a/python/pyspark/worker.py b/python/pyspark/worker.py
index 7326f24718e0..5b0b8e5883fc 100644
--- a/python/pyspark/worker.py
+++ b/python/pyspark/worker.py
@@ -3474,7 +3474,6 @@ def main(infile, outfile):
shuffle.MemoryBytesSpilled = 0
shuffle.DiskBytesSpilled = 0
- _accumulatorRegistry.clear()
setup_spark_files(infile)
setup_broadcasts(infile)