This is an automated email from the ASF dual-hosted git repository.
ruifengz pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
new 735eda8a5b6b [SPARK-54874][TESTS][INFRA] Avoid interleave failed test logs with test outputs
735eda8a5b6b is described below
commit 735eda8a5b6b9394686ec97a586e6aa25d0d8771
Author: Tian Gao <[email protected]>
AuthorDate: Sun Jan 4 08:33:54 2026 +0800
[SPARK-54874][TESTS][INFRA] Avoid interleave failed test logs with test outputs
### What changes were proposed in this pull request?
1. `FAILURE_REPORTING_LOCK` is only for the unified log file; we don't need it for `per_test_output`.
2. Use `LOGGER` instead of `print` to emit the data, because `LOGGER` has an internal lock that prevents interleaving.
3. Collect all the lines and emit them in a single call to avoid interleaving.
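The pattern behind points 2 and 3 can be shown with a minimal sketch (this is not the actual `run-tests.py` code; `report_failure` is a made-up name for illustration). The whole failure report is assembled first and then emitted with a single logger call; `logging` handlers serialize each record with an internal lock, so concurrent reports cannot interleave line by line:

```python
import logging

LOGGER = logging.getLogger("run-tests")

def report_failure(test_name, output_lines):
    # Build the entire message first, then emit it once. Each call to
    # LOGGER.error() produces a single record, and logging.Handler
    # holds a lock around emit(), so the block stays contiguous even
    # when many worker threads report failures at the same time.
    message = f"{test_name} failed:\n" + "".join(output_lines)
    LOGGER.error(message)
```

By contrast, calling `print` once per line gives other threads a chance to write between lines.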
### Why are the changes needed?
A thread pool runs the individual tests, so the output of one test can be interleaved with error messages from another.
https://github.com/apache/spark/actions/runs/20594052053/job/59144974177
This makes it difficult to tell which test a debugging message belongs to.
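The failure mode can be illustrated with a small hypothetical sketch (`run_fake_test` and the in-memory buffer are made up for illustration; the real tests write to the process's stdout). Each per-line write is an independent operation on the shared stream, so lines from concurrently running tests may land interleaved:

```python
import io
from concurrent.futures import ThreadPoolExecutor

# Shared output stream standing in for stdout.
buf = io.StringIO()

def run_fake_test(tag):
    for i in range(3):
        # One write per line: another worker may write between
        # these calls, interleaving its output with ours.
        buf.write(f"[{tag}] line {i}\n")

with ThreadPoolExecutor(max_workers=2) as pool:
    list(pool.map(run_fake_test, ["test_a", "test_b"]))
```

Whether interleaving actually occurs depends on scheduling, which is exactly why it is hard to debug from CI logs.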
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Tested locally.
### Was this patch authored or co-authored using generative AI tooling?
No
Closes #53648 from gaogaotiantian/avoid-interleave-logs.
Authored-by: Tian Gao <[email protected]>
Signed-off-by: Ruifeng Zheng <[email protected]>
---
python/run-tests.py | 19 ++++++++++++-------
1 file changed, 12 insertions(+), 7 deletions(-)
diff --git a/python/run-tests.py b/python/run-tests.py
index c7348ec34e86..b3522a13df4a 100755
--- a/python/run-tests.py
+++ b/python/run-tests.py
@@ -301,16 +301,21 @@ def run_individual_python_test(target_dir, test_name, pyspark_python, keep_test_
     # Exit on the first failure but exclude the code 5 for no test ran, see SPARK-46801.
     if retcode != 0 and retcode != 5:
         try:
+            per_test_output.seek(0)
             with FAILURE_REPORTING_LOCK:
                 with open(LOG_FILE, 'ab') as log_file:
-                    per_test_output.seek(0)
                     log_file.writelines(per_test_output)
-            per_test_output.seek(0)
-            for line in per_test_output:
-                decoded_line = line.decode("utf-8", "replace")
-                if not re.match('[0-9]+', decoded_line):
-                    print(decoded_line, end='')
-            per_test_output.close()
+
+            # We don't want the logging lines interleave with the test output, so we read the
+            # full file and output with LOGGER which has internal locking.
+            per_test_output.seek(0)
+            lines = []
+            for line in per_test_output:
+                line = line.decode("utf-8", "replace")
+                if not re.match('[0-9]+', line):
+                    lines.append(line)
+            LOGGER.error(f"{test_name} with {pyspark_python} failed:\n{''.join(lines)}")
+            per_test_output.close()
         except BaseException:
             LOGGER.exception("Got an exception while trying to print failed test output")
         finally:
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]