[GitHub] [doris] xinyiZzz commented on a diff in pull request #12716: [Enhancement](load) Refine the load channel flush policy on mem limit

GitBox Wed, 21 Sep 2022 01:00:07 -0700


xinyiZzz commented on code in PR #12716:
URL: https://github.com/apache/doris/pull/12716#discussion_r976170463



##########
be/src/runtime/load_channel_mgr.h:
##########
@@ -143,36 +151,75 @@ Status LoadChannelMgr::add_batch(const TabletWriterAddRequest& request,
 
 template <typename TabletWriterAddResult>
 Status LoadChannelMgr::_handle_mem_exceed_limit(TabletWriterAddResult* response) {
-    // lock so that only one thread can check mem limit
-    std::lock_guard<std::mutex> l(_lock);
-    if (!_mem_tracker->limit_exceeded()) {
-        return Status::OK();
-    }
-
-    int64_t max_consume = 0;
+    _pending_if_hard_limit_exceeded();
+    // Check limit and pick load channel to reduce memory.
     std::shared_ptr<LoadChannel> channel;
-    for (auto& kv : _load_channels) {
-        if (kv.second->is_high_priority()) {
-            // do not select high priority channel to reduce memory
-            // to avoid blocking them.
-            continue;
+    {
+        std::lock_guard<std::mutex> l(_lock);
+        // Check the soft limit.
+        if (_mem_tracker->consumption() < _load_process_soft_limit) {
+            return Status::OK();
         }
-        if (kv.second->mem_consumption() > max_consume) {
-            max_consume = kv.second->mem_consumption();
-            channel = kv.second;
+
+        // Some other thread is flushing data, and not reached hard limit now,
+        // we don't need to handle mem limit in current thread.
+        if (_reduce_memory_channel != nullptr && !_mem_tracker->limit_exceeded()) {
+            return Status::OK();
         }
+
+        // We need to pick a LoadChannel to reduce memory usage.
+        // If `_reduce_memory_channel` is not null, it means the hard limit is
+        // exceed now, we still need to pick a load channel again. Because
+        // `_reduce_memory_channel` might not be the largest consumer now.
+        int64_t max_consume = 0;
+        for (auto& kv : _load_channels) {
+            if (kv.second->is_high_priority()) {
+                // do not select high priority channel to reduce memory
+                // to avoid blocking them.
+                continue;
+            }
+            if (kv.second->mem_consumption() > max_consume) {
+                max_consume = kv.second->mem_consumption();
+                channel = kv.second;
+            }
+        }
+        if (max_consume == 0) {
+            // should not happen, add log to observe
+            LOG(WARNING) << "failed to find suitable load channel when total load mem limit exceed";
+            return Status::OK();
+        }
+        DCHECK(channel.get() != nullptr);
+        _reduce_memory_channel = channel;

Review Comment:
   Once the hard limit is exceeded, all load channel threads wait, which is
very costly. If only the single largest load channel is selected to reduce
memory each time, the hard limit may be exceeded too frequently.
   
   Therefore, when the hard limit is exceeded, multiple load channels could be
selected to reduce memory. Selecting only the largest channel is fine when only
the soft limit is exceeded.
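
   A minimal sketch of this suggestion, not the PR's implementation.
`ChannelInfo` and `pick_channels_to_reduce` are hypothetical stand-ins for
`LoadChannel` and the selection loop above. The stopping rule under the hard
limit (keep picking until the projected consumption drops below the soft limit)
is an assumption; another threshold may suit better.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for LoadChannel: only the fields the selection needs.
struct ChannelInfo {
    int id;
    int64_t mem_consumption;
    bool high_priority;
};

// Returns the ids of the channels that should flush to reduce memory.
// - below the soft limit: nothing to do
// - between soft and hard limit: pick only the largest channel
// - at or above the hard limit: pick channels from largest to smallest until
//   the projected consumption falls below the soft limit (assumed policy)
std::vector<int> pick_channels_to_reduce(std::vector<ChannelInfo> channels,
                                         int64_t consumption,
                                         int64_t soft_limit,
                                         int64_t hard_limit) {
    assert(soft_limit <= hard_limit);
    std::vector<int> picked;
    if (consumption < soft_limit) {
        return picked;
    }

    // Never pick high priority channels, and skip channels holding no memory.
    channels.erase(std::remove_if(channels.begin(), channels.end(),
                                  [](const ChannelInfo& c) {
                                      return c.high_priority || c.mem_consumption <= 0;
                                  }),
                   channels.end());
    if (channels.empty()) {
        return picked;
    }

    // Largest consumer first.
    std::sort(channels.begin(), channels.end(),
              [](const ChannelInfo& a, const ChannelInfo& b) {
                  return a.mem_consumption > b.mem_consumption;
              });

    if (consumption < hard_limit) {
        // Soft limit only: one channel is enough.
        picked.push_back(channels.front().id);
        return picked;
    }

    // Hard limit: select several channels in one pass.
    int64_t projected = consumption;
    for (const ChannelInfo& c : channels) {
        if (projected < soft_limit) {
            break;
        }
        picked.push_back(c.id);
        projected -= c.mem_consumption;
    }
    return picked;
}
```

   In the real code the selected channels would have to be recorded while
`_lock` is held and flushed after it is released, as is done for the single
`_reduce_memory_channel` today.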



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

