This is an automated email from the ASF dual-hosted git repository.

yangjie01 pushed a commit to branch branch-4.0
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/branch-4.0 by this push:
     new 508edd4c4232 [SPARK-51630][CORE][TESTS] Remove `pids` size check from "SPARK-45907: Use ProcessHandle APIs to computeProcessTree in ProcfsMetricsGetter"
508edd4c4232 is described below

commit 508edd4c4232e811efa4924068246dde95d565e4
Author: yangjie01 <yangji...@baidu.com>
AuthorDate: Thu Apr 10 15:24:53 2025 +0800

[SPARK-51630][CORE][TESTS] Remove `pids` size check from "SPARK-45907: Use ProcessHandle APIs to computeProcessTree in ProcfsMetricsGetter"

### What changes were proposed in this pull request?

This PR removes the size check for `pids` from the test case titled "SPARK-45907: Use ProcessHandle APIs to computeProcessTree in ProcfsMetricsGetter".

### Why are the changes needed?

To avoid potential test instability: the test case "SPARK-45907: Use ProcessHandle APIs to computeProcessTree in ProcfsMetricsGetter" may fail in the following environment:

```
Apple M3
macOS 15.4
zulu 17.0.14
```

Running `build/sbt "core/testOnly org.apache.spark.ui.UISeleniumSuite org.apache.spark.executor.ProcfsMetricsGetterSuite"` produces:

```
[info] UISeleniumSuite:
[info] - all jobs page should be rendered even though we configure the scheduling mode to fair (4 seconds, 202 milliseconds)
[info] - effects of unpersist() / persist() should be reflected (2 seconds, 845 milliseconds)
[info] - failed stages should not appear to be active (2 seconds, 455 milliseconds)
[info] - spark.ui.killEnabled should properly control kill button display (8 seconds, 610 milliseconds)
[info] - jobs page should not display job group name unless some job was submitted in a job group (2 seconds, 546 milliseconds)
[info] - job progress bars should handle stage / task failures (2 seconds, 610 milliseconds)
[info] - job details page should display useful information for stages that haven't started (2 seconds, 292 milliseconds)
[info] - job progress bars / cells reflect skipped stages / tasks (2 seconds, 304 milliseconds)
[info] - stages that aren't run appear as 'skipped stages' after a job finishes (2 seconds, 201 milliseconds)
[info] - jobs with stages that are skipped should show correct link descriptions on all jobs page (2 seconds, 188 milliseconds)
[info] - attaching and detaching a new tab (2 seconds, 268 milliseconds)
[info] - kill stage POST/GET response is correct (173 milliseconds)
[info] - kill job POST/GET response is correct (141 milliseconds)
[info] - stage & job retention (2 seconds, 661 milliseconds)
[info] - live UI json application list (2 seconds, 187 milliseconds)
[info] - job stages should have expected dotfile under DAG visualization (2 seconds, 126 milliseconds)
[info] - stages page should show skipped stages (2 seconds, 651 milliseconds)
[info] - Staleness of Spark UI should not last minutes or hours (2 seconds, 167 milliseconds)
[info] - description for empty jobs (2 seconds, 242 milliseconds)
[info] - Support disable event timeline (4 seconds, 585 milliseconds)
[info] - SPARK-41365: Stage page can be accessed if URI was encoded twice (2 seconds, 306 milliseconds)
[info] - SPARK-44895: Add 'daemon', 'priority' for ThreadStackTrace (2 seconds, 219 milliseconds)
[info] ProcfsMetricsGetterSuite:
[info] - testGetProcessInfo (1 millisecond)
OpenJDK 64-Bit Server VM warning: Sharing is only supported for boot loader classes because bootstrap classpath has been appended
[info] - SPARK-34845: partial metrics shouldn't be returned (493 milliseconds)
[info] - SPARK-45907: Use ProcessHandle APIs to computeProcessTree in ProcfsMetricsGetter *** FAILED *** (10 seconds, 149 milliseconds)
[info] The code passed to eventually never returned normally. Attempted 102 times over 10.036665625 seconds. Last failure message: 1 did not equal 3. (ProcfsMetricsGetterSuite.scala:87)
[info] org.scalatest.exceptions.TestFailedDueToTimeoutException:
[info] at org.scalatest.enablers.Retrying$$anon$4.tryTryAgain$2(Retrying.scala:219)
[info] at org.scalatest.enablers.Retrying$$anon$4.retry(Retrying.scala:226)
[info] at org.scalatest.concurrent.Eventually.eventually(Eventually.scala:313)
[info] at org.scalatest.concurrent.Eventually.eventually$(Eventually.scala:312)
[info] at org.scalatest.concurrent.Eventually$.eventually(Eventually.scala:457)
[info] at org.apache.spark.executor.ProcfsMetricsGetterSuite.$anonfun$new$3(ProcfsMetricsGetterSuite.scala:87)
```

After investigation, I found that the `eventually` block does not always observe the state in which `pids.size` is 3; due to timing, it may only ever observe a state in which `pids.size` is 4. Since the checks `pids.contains(currentPid)` and `pids.contains(child)` are the crucial ones, this PR removes the assertion on the size of `pids`.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

- Pass GitHub Actions

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #50545 from LuciferYang/ProcfsMetricsGetterSuite.

Lead-authored-by: yangjie01 <yangji...@baidu.com>
Co-authored-by: YangJie <yangji...@baidu.com>
Signed-off-by: yangjie01 <yangji...@baidu.com>
(cherry picked from commit 6cdf54b341ee63039cb71734daafce7e628793e2)
Signed-off-by: yangjie01 <yangji...@baidu.com>
---
 .../test/scala/org/apache/spark/executor/ProcfsMetricsGetterSuite.scala | 1 -
 1 file changed, 1 deletion(-)

diff --git a/core/src/test/scala/org/apache/spark/executor/ProcfsMetricsGetterSuite.scala b/core/src/test/scala/org/apache/spark/executor/ProcfsMetricsGetterSuite.scala
index 573540180e6c..77d782461a2e 100644
--- a/core/src/test/scala/org/apache/spark/executor/ProcfsMetricsGetterSuite.scala
+++ b/core/src/test/scala/org/apache/spark/executor/ProcfsMetricsGetterSuite.scala
@@ -86,7 +86,6 @@ class ProcfsMetricsGetterSuite extends SparkFunSuite {
     val child = process.toHandle.pid()
     eventually(timeout(10.seconds), interval(100.milliseconds)) {
       val pids = p.computeProcessTree()
-      assert(pids.size === 3)
       assert(pids.contains(currentPid))
       assert(pids.contains(child))
     }
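For context on why an exact process-tree size is timing-dependent, here is a minimal standalone sketch, not part of this commit, of enumerating a process tree with the JDK `ProcessHandle` API. The `sleep` child, the object name, and the timing assumptions are illustrative only. The set of live descendants can briefly include short-lived helper processes, so an exact-count assertion such as `pids.size === 3` is racy, while membership checks for known pids remain stable.

```
import scala.jdk.StreamConverters._

// Illustrative sketch only (assumes a Unix-like system with a `sleep` binary):
// enumerate the current process tree via ProcessHandle.
object ProcessTreeSketch {
  def main(args: Array[String]): Unit = {
    val self = ProcessHandle.current()
    // Spawn a child that stays alive long enough to be observed.
    val child = new ProcessBuilder("sleep", "5").start().toHandle

    // Current pid plus all live descendants. Short-lived helper processes can
    // appear here transiently, so the total count is not deterministic.
    val descendants = self.descendants().toScala(Seq)
    val pids = (self +: descendants).map(_.pid())

    // Membership checks are stable; an exact-size assertion would be racy.
    assert(pids.contains(self.pid()))
    assert(pids.contains(child.pid()))
    println(s"observed ${pids.size} pids: ${pids.mkString(", ")}")
  }
}
```

This mirrors the reasoning above: keeping the `pids.contains(currentPid)` and `pids.contains(child)` assertions preserves the intent of the test, while the exact-size check is the part that can flake.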