This is an automated email from the ASF dual-hosted git repository.

dataroaring pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris.git


The following commit(s) were added to refs/heads/master by this push:
     new d65264275aa [fix](be) keep FlushToken running count symmetric on cancel (#61684)
d65264275aa is described below

commit d65264275aa740a4baffd67c5157aa5bd6725aef
Author: Xin Liao <[email protected]>
AuthorDate: Fri Mar 27 11:12:29 2026 +0800

    [fix](be) keep FlushToken running count symmetric on cancel (#61684)
    
    ## Problem
    When a queued memtable flush task starts after `FlushToken::cancel()`
    has already marked the token as shut down, `_flush_memtable()` can
    return before reaching `flush_running_count++` yet still run the
    deferred `flush_running_count--`. This can drive the counter below zero
    and make `cancel()` wait forever for `flush_running_count == 0`.
    
    Related PR: #53481
    
    ## Fix
    Move `flush_running_count++` to the top of `_flush_memtable()` before
    registering the deferred cleanup so the running-count accounting stays
    symmetric on every exit path.
    
    ## Validation
    - Reasoned from production stack and gdb evidence showing
    `flush_running_count = -1` while cancel was blocked in
    `_wait_running_task_finish()`
    - Not run locally: the BE UT environment in this workspace would require
    a full initial `ut_build_ASAN` build
---
 be/src/load/memtable/memtable_flush_executor.cpp | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/be/src/load/memtable/memtable_flush_executor.cpp b/be/src/load/memtable/memtable_flush_executor.cpp
index 382235d5942..b36d602c903 100644
--- a/be/src/load/memtable/memtable_flush_executor.cpp
+++ b/be/src/load/memtable/memtable_flush_executor.cpp
@@ -222,6 +222,9 @@ void FlushToken::_flush_memtable(std::shared_ptr<MemTable> memtable_ptr, int32_t
                                  int64_t submit_task_time) {
     signal::set_signal_task_id(_rowset_writer->load_id());
     signal::tablet_id = memtable_ptr->tablet_id();
+    // Count the task as running before registering the deferred cleanup so
+    // cancel/shutdown paths keep flush_running_count symmetric on every exit.
+    _stats.flush_running_count++;
     Defer defer {[&]() {
         std::lock_guard<std::mutex> lock(_mutex);
         _stats.flush_submit_count--;
@@ -240,7 +243,6 @@ void FlushToken::_flush_memtable(std::shared_ptr<MemTable> memtable_ptr, int32_t
     }
     DBUG_EXECUTE_IF("FlushToken.flush_memtable.wait_after_first_shutdown",
                     { std::this_thread::sleep_for(std::chrono::milliseconds(10 * 1000)); });
-    _stats.flush_running_count++;
     // double check if shutdown to avoid wait running task finish count not accurate
     if (_is_shutdown()) {
         return;
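
For readers who want to see the accounting issue in isolation, here is a minimal
standalone sketch of the ordering change in this commit. It is not Doris code:
`ScopeGuard` is a hypothetical stand-in for the `Defer` helper, the counter is
assumed to be a plain `std::atomic<int>`, and `flush_before_fix`,
`flush_after_fix` and `count_after` are illustrative names. The real method
also takes `_mutex` and updates other stats, which is omitted here.

```cpp
#include <atomic>
#include <functional>
#include <utility>

// Hypothetical stand-in for a Defer-style helper: runs fn when the scope exits.
class ScopeGuard {
public:
    explicit ScopeGuard(std::function<void()> fn) : _fn(std::move(fn)) {}
    ~ScopeGuard() { _fn(); }
    ScopeGuard(const ScopeGuard&) = delete;
    ScopeGuard& operator=(const ScopeGuard&) = delete;

private:
    std::function<void()> _fn;
};

// Old ordering: the cleanup is registered first and the increment comes only
// after the shutdown check, so an early return decrements without incrementing.
void flush_before_fix(std::atomic<int>& running, bool already_shutdown) {
    ScopeGuard guard([&]() { running--; });
    if (already_shutdown) {
        return; // guard still fires: counter goes from 0 to -1
    }
    running++;
    // ... flush work would happen here ...
}

// New ordering: increment first, then register the cleanup, so every exit
// path performs exactly one increment and one decrement.
void flush_after_fix(std::atomic<int>& running, bool already_shutdown) {
    running++;
    ScopeGuard guard([&]() { running--; });
    if (already_shutdown) {
        return; // guard fires: counter goes from 1 back to 0
    }
    // ... flush work would happen here ...
}

// Runs one task against a fresh counter and reports the final value.
int count_after(void (*flush)(std::atomic<int>&, bool), bool already_shutdown) {
    std::atomic<int> running{0};
    flush(running, already_shutdown);
    return running.load();
}
```

With the old ordering, `count_after(flush_before_fix, true)` ends at -1, which
is the state a waiter polling for zero can never leave. With the new ordering
the result is 0 whether or not the token was already shut down.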

