[spark] branch master updated: [SPARK-31379][CORE][TEST] Fix flaky o.a.s.scheduler.CoarseGrainedSchedulerBackendSuite.extra resources from executor

gurwls223 Wed, 08 Apr 2020 01:56:46 -0700

This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git



The following commit(s) were added to refs/heads/master by this push:
     new a2789c2  [SPARK-31379][CORE][TEST] Fix flaky o.a.s.scheduler.CoarseGrainedSchedulerBackendSuite.extra resources from executor
a2789c2 is described below

commit a2789c2a5179f6ed5613c24130bc80bf1bbfd753
Author: yi.wu <[email protected]>
AuthorDate: Wed Apr 8 17:54:28 2020 +0900

    [SPARK-31379][CORE][TEST] Fix flaky o.a.s.scheduler.CoarseGrainedSchedulerBackendSuite.extra resources from executor
    
    ### What changes were proposed in this pull request?
    
    This PR (SPARK-31379) adds one line, `when(ts.resourceOffers(any[IndexedSeq[WorkerOffer]])).thenReturn(Seq.empty)`, to the test so that no resources are allocated again once the finished task releases them.
    
    ### Why are the changes needed?
    
    The test is flaky. Here is part of the error stack:
    
    ```
    sbt.ForkMain$ForkError: org.scalatest.exceptions.TestFailedDueToTimeoutException:
    The code passed to eventually never returned normally. Attempted 325 times over 5.01070979 seconds.
    Last failure message: ArrayBuffer("1", "3") did not equal Array("0", "1", "3").
    ...
    org.apache.spark.scheduler.CoarseGrainedSchedulerBackendSuite.eventually(CoarseGrainedSchedulerBackendSuite.scala:45)
    ```
    
    You can check [here](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120786/testReport/org.apache.spark.scheduler/CoarseGrainedSchedulerBackendSuite/extra_resources_from_executor/) for details.
    
    The test is flaky because, after `StatusUpdate` is sent to `CoarseGrainedSchedulerBackend`, the backend releases the task's resources and then immediately calls `makeOffer`. So it is possible that the addresses in `availableAddrs` have been allocated again before we assert `execResources(GPU).availableAddrs.sorted === Array("0", "1", "3")`.
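
    The race can be illustrated with a toy model. This is only a sketch, not Spark code: `ToyBackend`, `Scheduler`, `greedy`, and `stubbed` are hypothetical names, and the model is synchronous, so it shows the two possible outcomes rather than the timing itself. It assumes the scheduler may take a freed address as soon as it is offered.

    ```scala
    // Toy model of the race; every name here is hypothetical, not a Spark class.
    import scala.collection.mutable

    object OfferRaceSketch {
      // Stand-in for the task scheduler: given the free addresses,
      // it returns the ones it wants to assign to tasks.
      trait Scheduler {
        def resourceOffers(free: Seq[String]): Seq[String]
      }

      // Stand-in for the backend: frees a task's addresses, then re-offers at once.
      final class ToyBackend(scheduler: Scheduler, initiallyFree: Seq[String]) {
        private val available = mutable.ArrayBuffer(initiallyFree: _*)

        def availableAddrs: Seq[String] = available.sorted.toSeq

        def statusUpdateFinished(released: Seq[String]): Unit = {
          available ++= released
          // Models "call makeOffer immediately once releasing the resources".
          val taken = scheduler.resourceOffers(available.toSeq)
          available --= taken
        }
      }

      // A scheduler that grabs the lowest free address (assumed behaviour).
      val greedy: Scheduler = (free: Seq[String]) => free.sorted.take(1)

      // Same idea as stubbing resourceOffers to return Seq.empty.
      val stubbed: Scheduler = (_: Seq[String]) => Seq.empty

      // Addresses "1" and "3" start free; the finished task releases "0".
      def run(scheduler: Scheduler): Seq[String] = {
        val backend = new ToyBackend(scheduler, Seq("1", "3"))
        backend.statusUpdateFinished(Seq("0"))
        backend.availableAddrs
      }

      def main(args: Array[String]): Unit = {
        println(run(greedy))   // "0" is taken again before it can be observed
        println(run(stubbed))  // "0" stays available
      }
    }
    ```

    With `greedy`, the released address is gone again by the time `availableAddrs` is read, which matches the shape of the failure message above. With `stubbed`, all three addresses stay available.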
    
    ### Does this PR introduce any user-facing change?
    
    No.
    
    ### How was this patch tested?
    
    The issue can be reproduced reliably by inserting `Thread.sleep(3000)` after the line that sends `StatusUpdate`. With this fix applied, the issue no longer occurs.
    
    Closes #28145 from Ngone51/fix_flaky.
    
    Authored-by: yi.wu <[email protected]>
    Signed-off-by: HyukjinKwon <[email protected]>
---
 .../apache/spark/scheduler/CoarseGrainedSchedulerBackendSuite.scala    | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/core/src/test/scala/org/apache/spark/scheduler/CoarseGrainedSchedulerBackendSuite.scala b/core/src/test/scala/org/apache/spark/scheduler/CoarseGrainedSchedulerBackendSuite.scala
index f4745db..adfcbb8 100644
--- a/core/src/test/scala/org/apache/spark/scheduler/CoarseGrainedSchedulerBackendSuite.scala
+++ b/core/src/test/scala/org/apache/spark/scheduler/CoarseGrainedSchedulerBackendSuite.scala
@@ -258,6 +258,9 @@ class CoarseGrainedSchedulerBackendSuite extends SparkFunSuite with LocalSparkCo
       assert(execResources(GPU).assignedAddrs === Array("0"))
     }
 
+    // To avoid allocating any resources immediately after releasing the resource from the task to
+    // make sure that `availableAddrs` below won't change
+    when(ts.resourceOffers(any[IndexedSeq[WorkerOffer]])).thenReturn(Seq.empty)
     backend.driverEndpoint.send(
       StatusUpdate("1", 1, TaskState.FINISHED, buffer, taskResources))
 


