This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new 5a5308695304 [SPARK-54930][PYTHON] Remove redundant _accumulatorRegistry.clear() call in worker.py
5a5308695304 is described below

commit 5a530869530471cc3b10b1a42e539a287425f9e0
Author: Yicong-Huang <[email protected]>
AuthorDate: Thu Jan 8 08:18:21 2026 +0900

    [SPARK-54930][PYTHON] Remove redundant _accumulatorRegistry.clear() call in worker.py
    
    ### What changes were proposed in this pull request?
    
    Remove a redundant `_accumulatorRegistry.clear()` call in `worker.py`.
    
    Currently there are two consecutive `clear()` calls with no accumulator-modifying code in between:
    
    ```python
    shuffle.MemoryBytesSpilled = 0
    shuffle.DiskBytesSpilled = 0
    _accumulatorRegistry.clear()  # first call
    
    setup_spark_files(infile)
    setup_broadcasts(infile)
    
    _accumulatorRegistry.clear()  # second call
    ```
    
    Neither `setup_spark_files` nor `setup_broadcasts` adds anything to `_accumulatorRegistry`, so the first `clear()` is redundant: the second call leaves the registry in the same empty state either way.
    
    ### Why are the changes needed?
    
    This is dead code cleanup. The redundancy accumulated across three changes:
    - SPARK-3463 (2014) added the first `clear()` after shuffle initialization
    - SPARK-3030 (2014) added the second `clear()` after broadcasts setup
    - SPARK-44533 (2023) extracted `setup_spark_files` and `setup_broadcasts` in a refactor, but preserved both `clear()` calls
    
    ### Does this PR introduce _any_ user-facing change?
    
    No.
    
    ### How was this patch tested?
    
    Existing tests. This is a simple dead code removal with no functional change.
    
    ### Was this patch authored or co-authored using generative AI tooling?
    
    No.
    
    Closes #53708 from Yicong-Huang/SPARK-54930/refactor/remove-redundant-clear.
    
    Authored-by: Yicong-Huang <[email protected]>
    Signed-off-by: Hyukjin Kwon <[email protected]>
---
 python/pyspark/worker.py | 1 -
 1 file changed, 1 deletion(-)

diff --git a/python/pyspark/worker.py b/python/pyspark/worker.py
index 7326f24718e0..5b0b8e5883fc 100644
--- a/python/pyspark/worker.py
+++ b/python/pyspark/worker.py
@@ -3474,7 +3474,6 @@ def main(infile, outfile):
 
         shuffle.MemoryBytesSpilled = 0
         shuffle.DiskBytesSpilled = 0
-        _accumulatorRegistry.clear()
 
         setup_spark_files(infile)
         setup_broadcasts(infile)


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
