Copilot commented on code in PR #12261:
URL: https://github.com/apache/apisix/pull/12261#discussion_r2917013026


##########
apisix/balancer.lua:
##########
@@ -196,11 +196,21 @@ local function pick_server(route, ctx)
 
     local nodes_count = #up_conf.nodes
     if nodes_count == 1 then
-        local node = up_conf.nodes[1]
-        ctx.balancer_ip = node.host
-        ctx.balancer_port = node.port
-        node.upstream_host = parse_server_for_upstream_host(node, ctx.upstream_scheme)
-        return node
+        -- For least_conn balancer, we still need to use the balancer even with single node
+        -- to track connection counts for future load balancing decisions
+        if up_conf.type == "least_conn" then
+            core.log.debug(
+                    "single node with least_conn balancer",
+                    "still using balancer for connection tracking"
+            )
+        else
+            core.log.info("single node with ", up_conf.type, " balancer - skipping balancer")

Review Comment:
   This introduces an `info`-level log on every request for any upstream with a 
single node (for all balancer types except `least_conn`). That can become very 
noisy in production and materially increase log volume/costs. Consider removing 
this log or lowering it to `debug` (or only logging once per upstream/picker 
creation).
   ```suggestion
               core.log.debug("single node with ", up_conf.type, " balancer - skipping balancer")
   ```



##########
docs/zh/latest/balancer-least-conn.md:
##########
@@ -0,0 +1,520 @@
+---
+title: Least Connection Load Balancer
+keywords:
+  - APISIX
+  - API Gateway
+  - Routing
+  - Least Connection
+  - Upstream
+description: This document introduces the Least Connection Load Balancer (`least_conn`) in Apache APISIX, including its working principle, configuration methods, and use cases.
+---
+
+<!--
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+-->
+
+## Overview
+
+The `least_conn` load balancer in Apache APISIX provides two modes of operation:
+
+1. **Traditional Mode** (default): A weighted round-robin algorithm optimized for performance
+2. **Persistent Connection Counting Mode**: A true least-connection algorithm that maintains accurate connection counts across balancer recreations
+
+This algorithm is particularly effective for scenarios where request processing times vary significantly, or when dealing with long-lived connections such as WebSocket connections, where the second mode provides significant benefits for load distribution after upstream scaling.
+
+## Algorithm Details
+
+### Core Principle
+
+#### Traditional Mode (Default)
+
+In traditional mode, the algorithm uses a weighted round-robin approach with dynamic scoring:
+
+- Initialize each server with score = `1 / weight`
+- On connection: increment score by `1 / weight`
+- On completion: decrement score by `1 / weight`
+
+This provides good performance for most use cases while maintaining backward compatibility.
+
+#### Persistent Connection Counting Mode
+
+When enabled, the algorithm maintains accurate connection counts for each upstream server in shared memory:
+
+- Tracks real connection counts across configuration reloads
+- Survives upstream node scaling operations
+- Provides true least-connection behavior for long-lived connections
+
+The algorithm uses a binary min-heap data structure to efficiently track and select servers with the lowest scores.
+
+### Score Calculation
+
+#### Traditional Mode
+
+```lua
+-- Initialization
+score = 1 / weight
+
+-- On connection
+score = score + (1 / weight)
+
+-- On completion
+score = score - (1 / weight)
+```
+
+#### Persistent Connection Counting Mode
+
+```lua
+-- Initialization and updates
+score = (connection_count + 1) / weight
+```
+
+Where:
+
+- `connection_count` - Current number of active connections to the server (persisted)
+- `weight` - Server weight configuration value
+
+Servers with lower scores are preferred for new connections. In persistent mode, the `+1` represents the potential new connection being considered.
+
+### Connection State Management
+
+#### Traditional Mode
+
+- **Connection Start**: Score updated to `score + (1 / weight)`
+- **Connection End**: Score updated to `score - (1 / weight)`
+- **State**: Maintained only within the current balancer instance
+- **Heap Maintenance**: Binary heap automatically reorders servers by score
+
+#### Persistent Connection Counting Mode
+
+- **Connection Start**: Connection count incremented, score updated to `(new_count + 1) / weight`
+- **Connection End**: Connection count decremented, score updated to `(new_count - 1) / weight`

Review Comment:
   The connection-end score formula here (`(new_count - 1) / weight`) is inconsistent with the earlier definition `score = (connection_count + 1) / weight` and with the current implementation (which updates the score as `(current_count + 1) / weight` after decrementing). Please align the score definition in the docs with the implementation details to avoid misleading users about the algorithm's behavior.
   ```suggestion
   - **Connection End**: Connection count decremented, score updated to `(new_count + 1) / weight`
   ```



##########
t/node/least_conn_persistent.t:
##########
@@ -0,0 +1,583 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+use t::APISIX 'no_plan';
+
+repeat_each(1);
+log_level('info');
+no_long_string();
+no_root_location();
+
+run_tests();
+
+__DATA__
+
+=== TEST 1: Test upstream schema with persistent_conn_counting
+--- yaml_config
+upstreams:
+    -
+        id: 1

Review Comment:
   PR description mentions adding a WebSocket-focused test file 
(`t/node/least_conn_websocket.t`), but this PR adds 
`t/node/least_conn_persistent.t` instead. Either update the PR description to 
match the actual test file name/purpose, or rename/adjust the test to align 
with the WebSocket scaling scenario being fixed.



##########
apisix/balancer/least_conn.lua:
##########
@@ -19,31 +19,154 @@ local core = require("apisix.core")
 local binaryHeap = require("binaryheap")
 local ipairs = ipairs
 local pairs = pairs
-
+local ngx = ngx
+local ngx_shared = ngx.shared
+local tostring = tostring
 
 local _M = {}
 
+-- Shared dictionary to store connection counts across balancer recreations
+local CONN_COUNT_DICT_NAME = "balancer-least-conn"
+local conn_count_dict
 
 local function least_score(a, b)
     return a.score < b.score
 end
 
+-- Get the connection count key for a specific upstream and server
+local function get_conn_count_key(upstream, server)
+    local upstream_id = upstream.id
+    if not upstream_id then
+        -- Fallback to a hash of the upstream configuration using stable encoding
+        upstream_id = ngx.crc32_short(core.json.stably_encode(upstream))
+        core.log.debug("generated upstream_id from hash: ", upstream_id)

Review Comment:
   The fallback upstream identifier is derived from 
`crc32(stably_encode(upstream))`, which will change when upstream configuration 
changes (including node scaling). For upstreams without an explicit 
`upstream.id`, this undermines the goal of preserving counts across upstream 
updates. Consider requiring a stable identifier (e.g., upstream.id / 
upstream.name) for `persistent_conn_counting`, or computing the hash from 
fields that remain stable across node list changes.
   ```suggestion
        -- Prefer a stable identifier for the upstream to preserve counts across updates
        local upstream_id = upstream.id or upstream.name
        if not upstream_id then
            -- No stable identifier available: warn and fall back to a non-persistent identifier
            core.log.warn("upstream has no 'id' or 'name'; persistent_conn_counting may not " ..
                          "preserve counts across upstream updates")
            upstream_id = tostring(upstream)
   ```
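To see why the crc32-of-config fallback is unstable across node scaling, here is a small Python simulation (not APISIX code): `json.dumps(sort_keys=True)` and `zlib.crc32` stand in for `core.json.stably_encode` and `ngx.crc32_short`, and the upstream shape is a simplified assumption:

```python
import copy
import json
import zlib

def fallback_upstream_id(upstream: dict) -> int:
    # Stand-in for ngx.crc32_short(core.json.stably_encode(upstream)):
    # sort_keys=True gives a deterministic encoding, like stably_encode.
    return zlib.crc32(json.dumps(upstream, sort_keys=True).encode())

before = {"type": "least_conn", "nodes": {"127.0.0.1:8080": 1, "127.0.0.1:8081": 1}}

# Scale out: add one node; everything else is unchanged.
after = copy.deepcopy(before)
after["nodes"]["127.0.0.1:8082"] = 1

# The hash-based fallback changes as soon as the node list changes, so
# connection counts persisted under the old id are orphaned after scaling.
print(fallback_upstream_id(before) == fallback_upstream_id(after))  # False
```

An explicit `id` or `name` field, by contrast, is unaffected by node-list edits, which is the stability property the review asks for.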



##########
apisix/balancer/least_conn.lua:
##########
@@ -75,14 +198,40 @@ function _M.new(up_nodes, upstream)
                 return nil, err
             end
 
-            info.score = info.score + info.effect_weight
-            servers_heap:update(server, info)
+            if info.use_persistent_counting then
+                -- True least connection mode: update based on persistent connection counts
+                local current_conn_count = get_server_conn_count(upstream, server)
+                info.score = (current_conn_count + 1) / info.weight
+                servers_heap:update(server, info)
+                incr_server_conn_count(upstream, server, 1)
+            else
+                -- Traditional mode: use original weighted round-robin logic
+                info.score = info.score + info.effect_weight
+                servers_heap:update(server, info)
+            end
             return server
         end,
-        after_balance = function (ctx, before_retry)
+        after_balance = function(ctx, before_retry)
             local server = ctx.balancer_server
             local info = servers_heap:valueByPayload(server)
-            info.score = info.score - info.effect_weight
+            if not info then
+                core.log.error("server info not found for: ", server)
+                return
+            end
+
+            if info.use_persistent_counting then
+                -- True least connection mode: update based on persistent connection counts
+                incr_server_conn_count(upstream, server, -1)
+                local current_conn_count = get_server_conn_count(upstream, server)

Review Comment:
   `incr_server_conn_count(..., -1)` uses `dict:incr(key, delta, 0)`. If the 
key was evicted/cleared (or never created) and `delta` is negative, OpenResty 
will initialize to 0 and then apply `-1`, resulting in a stored negative 
connection count. Consider guarding against missing keys on decrement and 
clamping the counter at 0 to avoid negative counts influencing subsequent 
balancing decisions.
   ```suggestion
                local current_conn_count = get_server_conn_count(upstream, server)
                if current_conn_count < 0 then
                    -- Clamp negative connection counts that can occur if the key was
                    -- missing/evicted when we decremented. Bring the stored counter
                    -- back to zero to avoid skewing future balancing decisions.
                    incr_server_conn_count(upstream, server, -current_conn_count)
                    current_conn_count = 0
                end
   ```
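The `incr` semantics the comment describes can be reproduced with a minimal stand-in for `ngx.shared.DICT` (a Python sketch under that assumption; `FakeSharedDict` is hypothetical, not OpenResty):

```python
class FakeSharedDict:
    """Minimal stand-in for ngx.shared.DICT, modelling incr(key, delta, init)."""

    def __init__(self):
        self.store = {}

    def incr(self, key, delta, init=None):
        # OpenResty semantics: a missing key is initialized to `init`
        # (when given) and then `delta` is applied -- even if delta < 0.
        if key not in self.store:
            if init is None:
                return None, "not found"
            self.store[key] = init
        self.store[key] += delta
        return self.store[key], None

d = FakeSharedDict()

# Key was evicted (or never created); a connection finishes and we decrement:
count, _ = d.incr("conn_count:ups1:127.0.0.1:8080", -1, 0)
print(count)  # -1: a stored negative count now skews future scoring

# Guard as the review suggests: clamp the counter back to zero.
if count < 0:
    count, _ = d.incr("conn_count:ups1:127.0.0.1:8080", -count)
print(count)  # 0
```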



##########
docs/en/latest/balancer-least-conn.md:
##########
@@ -0,0 +1,521 @@
+---
+title: Least Connection Load Balancer
+keywords:
+  - APISIX
+  - API Gateway
+  - Routing
+  - Least Connection
+  - Upstream
+description: This document introduces the Least Connection Load Balancer (`least_conn`) in Apache APISIX, including its working principle, configuration methods, and use cases.
+---
+
+<!--
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+-->
+
+## Overview
+
+The `least_conn` load balancer in Apache APISIX provides two modes of operation:
+
+1. **Traditional Mode** (default): A weighted round-robin algorithm optimized for performance
+2. **Persistent Connection Counting Mode**: True least-connection algorithm that maintains accurate connection counts across balancer recreations
+
+This algorithm is particularly effective for scenarios where request processing times vary significantly or when dealing with long-lived connections such as WebSocket connections, where the second mode provides significant benefits for load distribution after upstream scaling.
+
+## Algorithm Details
+
+### Core Principle
+
+#### Traditional Mode (Default)
+
+In traditional mode, the algorithm uses a weighted round-robin approach with dynamic scoring:
+
+- Initialize each server with score = `1 / weight`
+- On connection: increment score by `1 / weight`
+- On completion: decrement score by `1 / weight`
+
+This provides good performance for most use cases while maintaining backward compatibility.
+
+#### Persistent Connection Counting Mode
+
+When enabled, the algorithm maintains accurate connection counts for each upstream server in shared memory:
+
+- Tracks real connection counts across configuration reloads
+- Survives upstream node scaling operations
+- Provides true least-connection behavior for long-lived connections
+
+The algorithm uses a binary min-heap data structure to efficiently track and select servers with the lowest scores.
+
+### Score Calculation
+
+#### Traditional Mode
+
+```lua
+-- Initialization
+score = 1 / weight
+
+-- On connection
+score = score + (1 / weight)
+
+-- On completion
+score = score - (1 / weight)
+```
+
+#### Persistent Connection Counting Mode
+
+```lua
+-- Initialization and updates
+score = (connection_count + 1) / weight
+```
+
+Where:
+
+- `connection_count` - Current number of active connections to the server (persisted)
+- `weight` - Server weight configuration value
+
+Servers with lower scores are preferred for new connections. In persistent mode, the `+1` represents the potential new connection being considered.
+
+### Connection State Management
+
+#### Traditional Mode
+
+- **Connection Start**: Score updated to `score + (1 / weight)`
+- **Connection End**: Score updated to `score - (1 / weight)`
+- **State**: Maintained only within current balancer instance
+- **Heap Maintenance**: Binary heap automatically reorders servers by score
+
+#### Persistent Connection Counting Mode
+
+- **Connection Start**: Connection count incremented, score updated to `(new_count + 1) / weight`
+- **Connection End**: Connection count decremented, score updated to `(new_count - 1) / weight`
+- **Score Protection**: Prevents negative scores by setting minimum score to 0
+- **Heap Maintenance**: Binary heap automatically reorders servers by score
+
+##### Persistence Strategy
+
+Connection counts are stored in an nginx shared dictionary with structured keys:
+
+```
+conn_count:{upstream_id}:{server_address}
+```
+
+This ensures connection state survives:
+
+- Upstream configuration changes
+- Balancer instance recreation
+- Worker process restarts
+- Upstream node scaling operations
+- Node additions/removals
+
+### Connection Tracking
+
+#### Persistent State Management
+
+The balancer uses the nginx shared dictionary (`balancer-least-conn`) to maintain connection counts across:
+
+- Balancer instance recreations
+- Upstream configuration changes
+- Worker process restarts
+- Node additions/removals
+
+#### Connection Count Keys
+
+Connection counts are stored using structured keys:
+
+```
+conn_count:{upstream_id}:{server_address}
+```
+
+Where:
+
+- `upstream_id` - Unique identifier for the upstream configuration
+- `server_address` - Server address (e.g., "127.0.0.1:8080")
+
+#### Upstream ID Generation
+
+1. **Primary**: Uses `upstream.id` if available
+2. **Fallback**: Generates CRC32 hash of stable JSON encoding of upstream configuration
+
+```lua
+local upstream_id = upstream.id
+if not upstream_id then
+    upstream_id = ngx.crc32_short(core.json.stably_encode(upstream))
+end
+```
+
+The implementation uses `core.json.stably_encode` to ensure deterministic JSON serialization, which is crucial for generating consistent upstream IDs across different worker processes and configuration reloads. This is APISIX's recommended method for stable JSON encoding.
+
+### Connection Lifecycle
+
+#### 1. Connection Establishment
+
+When a new request is routed:
+
+1. Select server with lowest score from the heap
+2. Update server score to `(current_count + 1) / weight`
+3. Increment connection count in shared dictionary
+4. Update server position in the heap
+
+#### 2. Connection Completion
+
+When a request completes:
+
+1. Calculate new score as `(current_count - 1) / weight`
+2. Ensure score is not negative (minimum 0)
+3. Decrement connection count in shared dictionary
+4. Update server position in the heap
+
+#### 3. Cleanup Process
+
+The balancer implements a two-tier cleanup strategy for optimal performance:
+
+##### Lightweight Cleanup (During Balancer Recreation)
+
+- **Zero blocking**: Uses O(n) complexity where n = current servers count
+- **Smart cleanup**: Only removes zero-count entries to free memory
+- **No scanning**: Avoids expensive `get_keys()` operations completely
+- **Strategy**: Process only known current servers, ignore stale entries
+
+##### Global Cleanup (Manual/Periodic)
+
+- **Batched processing**: Processes keys in batches of 100 to limit memory usage
+- **Non-blocking**: Includes periodic yields (1ms every 1000 keys processed)
+- **Comprehensive**: Removes all connection count entries across all upstreams
+- **Usage**: Manual cleanup via `balancer.cleanup_all()` or periodic maintenance
+
+### Data Structures
+
+#### Binary Heap
+
+- **Type**: Min-heap based on server scores
+- **Purpose**: Efficient selection of server with lowest score
+- **Operations**: O(log n) insertion, deletion, and updates
+
+#### Shared Dictionary
+
+- **Name**: `balancer-least-conn`
+- **Size**: 10MB (configurable)
+- **Scope**: Shared across all worker processes
+- **Persistence**: Survives configuration reloads
+
+## Configuration
+
+### Automatic Setup
+
+The `balancer-least-conn` shared dictionary is automatically configured by APISIX with a default size of 10MB. No manual configuration is required.
+
+### Custom Configuration
+
+To customize the shared dictionary size, modify the `nginx_config.http.lua_shared_dict` section in your `conf/config.yaml`:
+
+```yaml
+nginx_config:
+  http:
+    lua_shared_dict:
+      balancer-least-conn: 20m  # Custom size (default: 10m)
+```
+
+### Upstream Configuration
+
+#### Traditional Mode (Default)
+
+```yaml
+upstreams:
+  - id: 1
+    type: least_conn
+    nodes:
+      "127.0.0.1:8080": 1
+      "127.0.0.1:8081": 2
+      "127.0.0.1:8082": 1
+```
+
+#### Persistent Connection Counting Mode
+
+##### WebSocket Load Balancing
+
+For WebSocket and other long-lived connection scenarios, it's recommended to enable persistent connection counting for better load distribution:
+
+```yaml
+upstreams:
+  - id: websocket_upstream
+    type: least_conn
+    scheme: websocket
+    persistent_conn_counting: true  # Explicitly enable persistent counting
+    nodes:
+      "127.0.0.1:8080": 1
+      "127.0.0.1:8081": 1
+      "127.0.0.1:8082": 1
+```
+
+##### Manual Activation
+
+```yaml
+upstreams:
+  - id: custom_upstream
+    type: least_conn
+    persistent_conn_counting: true  # Explicitly enable persistent counting
+    nodes:
+      "127.0.0.1:8080": 1
+      "127.0.0.1:8081": 1
+      "127.0.0.1:8082": 1
+```
+
+## Performance Characteristics
+
+### Time Complexity
+
+- **Server Selection**: O(1) - heap peek operation
+- **Connection Update**: O(log n) - heap update operation
+- **Lightweight Cleanup**: O(n) where n = current servers per upstream
+- **Global Cleanup**: O(k) but batched, where k = total keys across all upstreams
+
+### Memory Usage
+
+- **Per Server**: ~100 bytes (key + value + overhead)
+- **Total**: Scales linearly with active servers across all upstreams
+- **Optimization**: Zero-count entries automatically removed to minimize memory
+
+### Scalability
+
+- **Servers**: Efficiently handles hundreds of servers per upstream
+- **Upstreams**: Supports multiple upstreams with isolated connection tracking
+- **Requests**: Minimal per-request overhead
+- **Performance**: Predictable scaling regardless of shared dictionary size
+
+### Performance Optimizations
+
+#### Lightweight Cleanup Strategy
+
+```lua
+-- New approach: Only process known servers (O(n) complexity)
+for server, _ in pairs(current_servers) do
+    local key = get_conn_count_key(upstream, server)
+    local count = conn_count_dict:get(key)
+    if count == 0 then
+        conn_count_dict:delete(key)  -- Memory cleanup
+    end
+end
+```
+
+#### Batched Global Cleanup
+
+```lua
+-- Global cleanup in batches to prevent blocking
+while has_more do
+    local keys = conn_count_dict:get_keys(100)  -- Small batches
+    -- Process batch...
+    if processed_count % 1000 == 0 then
+        ngx.sleep(0.001)  -- Periodic yielding
+    end
+end
+```
+
+## Use Cases
+
+### Traditional Mode
+
+#### Optimal Scenarios
+
+1. **High-throughput HTTP APIs**: Fast, short-lived connections
+2. **Microservices**: Request/response patterns
+3. **Standard web applications**: Regular HTTP traffic
+
+#### Advantages
+
+- Lower memory usage
+- Better performance for short connections
+- Simple configuration
+
+### Persistent Connection Counting Mode
+
+#### Optimal Scenarios
+
+1. **WebSocket Applications**: Long-lived connections benefit from accurate load distribution across scaling operations
+2. **Server-Sent Events (SSE)**: Persistent streaming connections
+3. **Long-polling**: Extended HTTP connections
+4. **Variable Processing Times**: Requests with unpredictable duration
+5. **Database Connection Pools**: Connection-oriented services
+
+#### Use After Node Scaling
+
+Particularly beneficial when:
+
+- Adding new upstream nodes to existing deployments
+- Existing long connections remain on original nodes
+- Need to balance load across all available nodes
+
+### Considerations
+
+1. **Short-lived Connections**: Traditional mode has lower overhead for very short requests
+2. **Memory Usage**: Persistent mode requires shared memory for connection state
+3. **Backward Compatibility**: Traditional mode maintains existing behavior
+
+## WebSocket Load Balancing Improvements
+
+### Problem Addressed
+
+Prior to this enhancement, when upstream nodes were scaled out (e.g., from 2 to 3 nodes), WebSocket load balancing experienced imbalanced distribution:
+
+- Existing WebSocket long connections remained on original nodes
+- New connections were distributed across all nodes
+- Result: Original nodes overloaded, new nodes underutilized
+
+### Solution
+
+The persistent connection counting mode specifically addresses this by:
+
+1. **Tracking Real Connections**: Maintains accurate connection counts in shared memory
+2. **Surviving Scaling Events**: Connection counts persist through upstream configuration changes
+3. **Balancing New Connections**: New connections automatically route to less loaded nodes
+4. **Gradual Rebalancing**: As connections naturally terminate and reconnect, load evens out
+
+### Example Scenario
+
+**Before Enhancement:**
+
+```
+Initial: Node1(50 conn), Node2(50 conn)
+After scaling to 3 nodes: Node1(50 conn), Node2(50 conn), Node3(0 conn)
+New connections distributed: Node1(60 conn), Node2(60 conn), Node3(40 conn)
+```
+
+**With Persistent Counting:**
+
+```
+Initial: Node1(50 conn), Node2(50 conn)
+After scaling to 3 nodes: Node1(50 conn), Node2(50 conn), Node3(0 conn)
+New connections route to Node3 until balanced: Node1(50 conn), Node2(50 conn), Node3(50 conn)
+```
+
+## Monitoring and Debugging
+
+### Log Messages
+
+#### Debug Logs
+
+Enable debug logging to monitor balancer behavior:
+
+**Balancer Creation**
+
+```
+creating new least_conn balancer for upstream: upstream_123
+```
+
+**Connection Count Operations**
+
+```
+generated connection count key: conn_count:upstream_123:127.0.0.1:8080
+retrieved connection count for 127.0.0.1:8080: 5
+setting connection count for 127.0.0.1:8080 to 6
+incrementing connection count for 127.0.0.1:8080 by 1, new count: 6
+```
+
+**Server Selection**
+
+```
+selected server: 127.0.0.1:8080 with current score: 1.2
+after_balance for server: 127.0.0.1:8080, before_retry: false
+```
+
+**Cleanup Operations**
+
+```
+cleaning up stale connection counts for upstream: upstream_123
+cleaned up stale connection count for server: 127.0.0.1:8082
+```
+
+#### Initialization
+
+```
+initializing server 127.0.0.1:8080 with weight 1, base_score 1, conn_count 0, final_score 1
+```
+
+#### Errors
+
+```
+failed to set connection count for 127.0.0.1:8080: no memory
+failed to increment connection count for 127.0.0.1:8080: no memory
+```
+
+### Shared Dictionary Monitoring
+
+Check shared dictionary usage:
+
+```lua
+local dict = ngx.shared["balancer-least-conn"]
+local free_space = dict:free_space()
+local capacity = dict:capacity()
+```
+
+## Error Handling
+
+### Missing Shared Dictionary
+
+If the shared dictionary is not available (which should not happen with the default configuration), the balancer will fail to initialize with:
+
+```
+shared dict 'balancer-least-conn' not found
+```

Review Comment:
   This section says the balancer will “fail to initialize” when the shared 
dict is missing, but the implementation in `apisix/balancer/least_conn.lua` 
currently falls back to traditional mode (and, as written, may not even emit a 
warning). Please reconcile the docs with the actual behavior (fail hard vs 
graceful degradation) and ensure the implementation logs clearly when a 
requested feature is unavailable.
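One way to reconcile the documented and actual behaviors is graceful degradation with a loud warning. The following Python sketch is illustrative only; `resolve_mode` is a hypothetical helper, not the Lua module's API:

```python
import logging

def resolve_mode(persistent_requested: bool, shared_dict) -> str:
    """Pick the balancer mode, warning when a requested feature is unavailable."""
    if persistent_requested and shared_dict is None:
        # Graceful degradation, logged loudly so operators can see that
        # persistent counting was requested but cannot be provided.
        logging.warning("shared dict 'balancer-least-conn' not found; "
                        "falling back to traditional least_conn mode")
        return "traditional"
    return "persistent" if persistent_requested else "traditional"

print(resolve_mode(True, None))  # traditional (with a warning logged)
print(resolve_mode(True, {}))    # persistent
```

Whichever behavior is chosen (fail hard or degrade), the docs and the implementation should state the same one.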



##########
docs/en/latest/balancer-least-conn.md:
##########
@@ -0,0 +1,521 @@
+#### Persistent Connection Counting Mode
+
+- **Connection Start**: Connection count incremented, score updated to `(new_count + 1) / weight`
+- **Connection End**: Connection count decremented, score updated to `(new_count - 1) / weight`

Review Comment:
   The persistent-mode connection end formula here (`(new_count - 1) / weight`) 
doesn’t match the earlier definition `score = (connection_count + 1) / weight`, 
nor the implementation in `apisix/balancer/least_conn.lua` (which updates score 
as `(current_count + 1) / weight` after decrement). Please align this 
documentation with the actual score definition/implementation to avoid 
confusing operators.
   ```suggestion
   - **Connection End**: Connection count decremented, score updated to `(new_count + 1) / weight`
   ```
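The arithmetic behind the suggested fix can be checked in a few lines of Python (a sketch of the doc's score definition, not APISIX code):

```python
def score(conn_count: int, weight: int) -> float:
    # The doc's single score definition: (connection_count + 1) / weight
    return (conn_count + 1) / weight

weight = 2
count = 1          # one active connection
count -= 1         # connection ends -> new_count = 0

# Consistent with the definition: (0 + 1) / 2
print(score(count, weight))   # 0.5

# The "(new_count - 1) / weight" variant would go negative here: (0 - 1) / 2
print((count - 1) / weight)   # -0.5
```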



##########
docs/en/latest/balancer-least-conn.md:
##########
@@ -0,0 +1,521 @@
+---
+title: Least Connection Load Balancer
+keywords:
+  - APISIX
+  - API Gateway
+  - Routing
+  - Least Connection
+  - Upstream
+description: This document introduces the Least Connection Load Balancer 
(`least_conn`) in Apache APISIX, including its working principle, configuration 
methods, and use cases.
+---
+
+<!--
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+-->
+
+## Overview
+
+The `least_conn` load balancer in Apache APISIX provides two modes of 
operation:
+
+1. **Traditional Mode** (default): A weighted round-robin algorithm optimized 
for performance
+2. **Persistent Connection Counting Mode**: True least-connection algorithm 
that maintains accurate connection counts across balancer recreations
+
+This algorithm is particularly effective for scenarios where request 
processing times vary significantly or when dealing with long-lived connections 
such as WebSocket connections, where the second mode provides significant 
benefits for load distribution after upstream scaling.
+
+## Algorithm Details
+
+### Core Principle
+
+#### Traditional Mode (Default)
+
+In traditional mode, the algorithm uses a weighted round-robin approach with 
dynamic scoring:
+
+- Initialize each server with score = `1 / weight`
+- On connection: increment score by `1 / weight`
+- On completion: decrement score by `1 / weight`
+
+This provides good performance for most use cases while maintaining backward 
compatibility.
+
+#### Persistent Connection Counting Mode
+
+When enabled, the algorithm maintains accurate connection counts for each upstream server in shared memory:
+
+- Tracks real connection counts across configuration reloads
+- Survives upstream node scaling operations
+- Provides true least-connection behavior for long-lived connections
+
+The algorithm uses a binary min-heap data structure to efficiently track and select servers with the lowest scores.
+
+### Score Calculation
+
+#### Traditional Mode
+
+```lua
+-- Initialization
+score = 1 / weight
+
+-- On connection
+score = score + (1 / weight)
+
+-- On completion
+score = score - (1 / weight)
+```
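To make the arithmetic concrete, here is a hypothetical walk-through (Python and the addresses are illustrative only; APISIX implements this in Lua):

```python
# Hypothetical sketch (not APISIX source) of traditional-mode scoring.
weights = {"127.0.0.1:8080": 1, "127.0.0.1:8081": 2}

# Initialization: score = 1 / weight
scores = {s: 1 / w for s, w in weights.items()}  # 8080 -> 1.0, 8081 -> 0.5

# The server with the lowest score receives the next connection
picked = min(scores, key=scores.get)             # the weight-2 server, 0.5
scores[picked] += 1 / weights[picked]            # on connection: +1/weight, now 1.0
```

After one connection both servers sit at score 1.0, so the heavier-weighted server has effectively been granted one "free" pick, which is the weighted round-robin behavior described above.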
+
+#### Persistent Connection Counting Mode
+
+```lua
+-- Initialization and updates
+score = (connection_count + 1) / weight
+```
+
+Where:
+
+- `connection_count` - Current number of active connections to the server (persisted)
+- `weight` - Server weight configuration value
+
+Servers with lower scores are preferred for new connections. In persistent mode, the `+1` represents the potential new connection being considered.
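A small Python sketch (illustrative, not APISIX code) of how these scores decide the pick:

```python
# Persistent-mode score: (connection_count + 1) / weight.
servers = {
    "127.0.0.1:8080": {"conn_count": 3, "weight": 2},  # score (3+1)/2 = 2.0
    "127.0.0.1:8081": {"conn_count": 0, "weight": 1},  # score (0+1)/1 = 1.0
}

def score(s):
    return (servers[s]["conn_count"] + 1) / servers[s]["weight"]

# The idle node wins despite its lower weight, because real connection
# counts dominate the score in this mode.
preferred = min(servers, key=score)
```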
+
+### Connection State Management
+
+#### Traditional Mode
+
+- **Connection Start**: Score updated to `score + (1 / weight)`
+- **Connection End**: Score updated to `score - (1 / weight)`
+- **State**: Maintained only within current balancer instance
+- **Heap Maintenance**: Binary heap automatically reorders servers by score
+
+#### Persistent Connection Counting Mode
+
+- **Connection Start**: Connection count incremented, score updated to `(new_count + 1) / weight`
+- **Connection End**: Connection count decremented, score updated to `(new_count - 1) / weight`
+- **Score Protection**: Prevents negative scores by setting minimum score to 0
+- **Heap Maintenance**: Binary heap automatically reorders servers by score
+
+##### Persistence Strategy
+
+Connection counts are stored in nginx shared dictionary with structured keys:
+
+```
+conn_count:{upstream_id}:{server_address}
+```
+
+This ensures connection state survives:
+
+- Upstream configuration changes
+- Balancer instance recreation
+- Worker process restarts
+- Upstream node scaling operations
+- Node additions/removals
+
+### Connection Tracking
+
+#### Persistent State Management
+
+The balancer uses nginx shared dictionary (`balancer-least-conn`) to maintain connection counts across:
+
+- Balancer instance recreations
+- Upstream configuration changes
+- Worker process restarts
+- Node additions/removals
+
+#### Connection Count Keys
+
+Connection counts are stored using structured keys:
+
+```
+conn_count:{upstream_id}:{server_address}
+```
+
+Where:
+
+- `upstream_id` - Unique identifier for the upstream configuration
+- `server_address` - Server address (e.g., "127.0.0.1:8080")
+
+#### Upstream ID Generation
+
+1. **Primary**: Uses `upstream.id` if available
+2. **Fallback**: Generates CRC32 hash of stable JSON encoding of upstream configuration
+
+```lua
+local upstream_id = upstream.id
+if not upstream_id then
+    upstream_id = ngx.crc32_short(core.json.stably_encode(upstream))
+end
+```
+
+The implementation uses `core.json.stably_encode` to ensure deterministic JSON serialization, which is crucial for generating consistent upstream IDs across different worker processes and configuration reloads. This is APISIX's recommended method for stable JSON encoding.
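The effect can be demonstrated with Python stand-ins, where `json.dumps(..., sort_keys=True)` plays the role of `core.json.stably_encode` and `zlib.crc32` the role of `ngx.crc32_short` (an analogy, not the actual APISIX call chain):

```python
import json
import zlib

# Two logically identical upstream configs, built in different key orders.
up_a = {"type": "least_conn", "nodes": {"127.0.0.1:8080": 1}}
up_b = {"nodes": {"127.0.0.1:8080": 1}, "type": "least_conn"}

def stable_encode(u):
    # Sorted keys + fixed separators -> the same bytes every time
    return json.dumps(u, sort_keys=True, separators=(",", ":"))

id_a = zlib.crc32(stable_encode(up_a).encode())
id_b = zlib.crc32(stable_encode(up_b).encode())
assert id_a == id_b  # stable encoding -> identical upstream_id in every worker

key = "conn_count:%s:%s" % (id_a, "127.0.0.1:8080")
```

Without the sorted, canonical serialization, two workers could hash different byte strings for the same upstream and track its connection counts under different keys.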
+
+### Connection Lifecycle
+
+#### 1. Connection Establishment
+
+When a new request is routed:
+
+1. Select server with lowest score from the heap
+2. Update server score to `(current_count + 1) / weight`
+3. Increment connection count in shared dictionary
+4. Update server position in the heap
+
+#### 2. Connection Completion
+
+When a request completes:
+
+1. Calculate new score as `(current_count - 1) / weight`
+2. Ensure score is not negative (minimum 0)
+3. Decrement connection count in shared dictionary
+4. Update server position in the heap
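The two lifecycles above can be sketched together in Python (plain dicts stand in for the shared dictionary and the heap; `connect`/`release` are illustrative names, not APISIX APIs):

```python
# Sketch of the connect/release lifecycle under persistent counting.
counts = {"127.0.0.1:8080": 0, "127.0.0.1:8081": 0}
weights = {"127.0.0.1:8080": 1, "127.0.0.1:8081": 1}

def connect():
    # Select the server with the lowest (count + 1) / weight score
    server = min(counts, key=lambda s: (counts[s] + 1) / weights[s])
    counts[server] += 1                          # increment shared count
    return server

def release(server):
    counts[server] = max(0, counts[server] - 1)  # never goes negative

first = connect()    # both idle: tie broken by iteration order
second = connect()   # the other node now has the lower score
release(first)
release(first)       # a double release is clamped at zero
```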
+
+#### 3. Cleanup Process
+
+The balancer implements a two-tier cleanup strategy for optimal performance:
+
+##### Lightweight Cleanup (During Balancer Recreation)
+
+- **Zero blocking**: Uses O(n) complexity where n = current servers count
+- **Smart cleanup**: Only removes zero-count entries to free memory
+- **No scanning**: Avoids expensive `get_keys()` operations completely
+- **Strategy**: Process only known current servers, ignore stale entries
+
+##### Global Cleanup (Manual/Periodic)
+
+- **Batched processing**: Processes keys in batches of 100 to limit memory usage
+- **Non-blocking**: Includes periodic yields (1ms every 1000 keys processed)
+- **Comprehensive**: Removes all connection count entries across all upstreams
+- **Usage**: Manual cleanup via `balancer.cleanup_all()` or periodic maintenance
+
+### Data Structures
+
+#### Binary Heap
+
+- **Type**: Min-heap based on server scores
+- **Purpose**: Efficient selection of server with lowest score
+- **Operations**: O(log n) insertion, deletion, and updates
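For illustration, Python's `heapq` shows the same min-heap operations (note: unlike the `binaryheap` library APISIX uses, `heapq` has no in-place update, so this only demonstrates push/peek/pop):

```python
import heapq

# Min-heap of (score, server) pairs: the lowest score pops first.
heap = []
for server, score in [("a", 2.0), ("b", 0.5), ("c", 1.0)]:
    heapq.heappush(heap, (score, server))  # O(log n) insertion

best_score, best = heap[0]                 # O(1) peek at the minimum
popped = heapq.heappop(heap)               # O(log n) removal
```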
+
+#### Shared Dictionary
+
+- **Name**: `balancer-least-conn`
+- **Size**: 10MB (configurable)
+- **Scope**: Shared across all worker processes
+- **Persistence**: Survives configuration reloads
+
+## Configuration
+
+### Automatic Setup
+
+The `balancer-least-conn` shared dictionary is automatically configured by APISIX with a default size of 10MB. No manual configuration is required.
+
+### Custom Configuration
+
+To customize the shared dictionary size, modify the `nginx_config.http.lua_shared_dict` section in your `conf/config.yaml`:
+
+```yaml
+nginx_config:
+  http:
+    lua_shared_dict:
+      balancer-least-conn: 20m  # Custom size (default: 10m)
+```
+
+### Upstream Configuration
+
+#### Traditional Mode (Default)
+
+```yaml
+upstreams:
+  - id: 1
+    type: least_conn
+    nodes:
+      "127.0.0.1:8080": 1
+      "127.0.0.1:8081": 2
+      "127.0.0.1:8082": 1
+```
+
+#### Persistent Connection Counting Mode
+
+##### WebSocket Load Balancing
+
+For WebSocket and other long-lived connection scenarios, it's recommended to enable persistent connection counting for better load distribution:
+
+```yaml
+upstreams:
+  - id: websocket_upstream
+    type: least_conn
+    scheme: websocket
+    persistent_conn_counting: true  # Explicitly enable persistent counting
+    nodes:
+      "127.0.0.1:8080": 1
+      "127.0.0.1:8081": 1
+      "127.0.0.1:8082": 1
+```
+
+##### Manual Activation
+
+```yaml
+upstreams:
+  - id: custom_upstream
+    type: least_conn
+    persistent_conn_counting: true  # Explicitly enable persistent counting
+    nodes:
+      "127.0.0.1:8080": 1
+      "127.0.0.1:8081": 1
+      "127.0.0.1:8082": 1
+```
+
+## Performance Characteristics
+
+### Time Complexity
+
+- **Server Selection**: O(1) - heap peek operation
+- **Connection Update**: O(log n) - heap update operation
+- **Lightweight Cleanup**: O(n) where n = current servers per upstream
+- **Global Cleanup**: O(k) but batched, where k = total keys across all upstreams
+
+### Memory Usage
+
+- **Per Server**: ~100 bytes (key + value + overhead)
+- **Total**: Scales linearly with active servers across all upstreams
+- **Optimization**: Zero-count entries automatically removed to minimize memory
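A quick back-of-envelope check of these numbers (the ~100 bytes/entry figure is the estimate above, not a measured value):

```python
# Rough capacity of the default 10 MB dict at ~100 bytes per entry.
dict_bytes = 10 * 1024 * 1024
entry_bytes = 100                        # key + value + overhead estimate
max_entries = dict_bytes // entry_bytes  # ~100k tracked server entries
```

Even the default size therefore comfortably covers hundreds of servers across many upstreams.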
+
+### Scalability
+
+- **Servers**: Efficiently handles hundreds of servers per upstream
+- **Upstreams**: Supports multiple upstreams with isolated connection tracking
+- **Requests**: Minimal per-request overhead
+- **Performance**: Predictable scaling regardless of shared dictionary size
+
+### Performance Optimizations
+
+#### Lightweight Cleanup Strategy
+
+```lua
+-- New approach: Only process known servers (O(n) complexity)
+for server, _ in pairs(current_servers) do
+    local key = prefix .. server  -- prefix = "conn_count:{upstream_id}:"
+    local count = conn_count_dict:get(key)
+    if count == 0 then
+        conn_count_dict:delete(key)  -- Memory cleanup
+    end
+end
+```
+
+#### Batched Global Cleanup
+
+```lua
+-- Global cleanup in batches to prevent blocking
+while has_more do
+    local keys = conn_count_dict:get_keys(100)  -- Small batches
+    -- Process batch...
+    if processed_count % 1000 == 0 then
+        ngx.sleep(0.001)  -- Periodic yielding
+    end
+end
+```
+
+## Use Cases
+
+### Traditional Mode
+
+#### Optimal Scenarios
+
+1. **High-throughput HTTP APIs**: Fast, short-lived connections
+2. **Microservices**: Request/response patterns
+3. **Standard web applications**: Regular HTTP traffic
+
+#### Advantages
+
+- Lower memory usage
+- Better performance for short connections
+- Simple configuration
+
+### Persistent Connection Counting Mode
+
+#### Optimal Scenarios
+
+1. **WebSocket Applications**: Long-lived connections benefit from accurate load distribution across scaling operations
+2. **Server-Sent Events (SSE)**: Persistent streaming connections
+3. **Long-polling**: Extended HTTP connections
+4. **Variable Processing Times**: Requests with unpredictable duration
+5. **Database Connection Pools**: Connection-oriented services
+
+#### Use After Node Scaling
+
+Particularly beneficial when:
+
+- Adding new upstream nodes to existing deployments
+- Existing long connections remain on original nodes
+- Need to balance load across all available nodes
+
+### Considerations
+
+1. **Short-lived Connections**: Traditional mode has lower overhead for very short requests
+2. **Memory Usage**: Persistent mode requires shared memory for connection state
+3. **Backward Compatibility**: Traditional mode maintains existing behavior
+
+## WebSocket Load Balancing Improvements
+
+### Problem Addressed
+
+Prior to this enhancement, when upstream nodes were scaled out (e.g., from 2 to 3 nodes), WebSocket load balancing experienced imbalanced distribution:
+
+- Existing WebSocket long connections remained on original nodes
+- New connections were distributed across all nodes
+- Result: Original nodes overloaded, new nodes underutilized
+
+### Solution
+
+The persistent connection counting mode specifically addresses this by:
+
+1. **Tracking Real Connections**: Maintains accurate connection counts in shared memory
+2. **Surviving Scaling Events**: Connection counts persist through upstream configuration changes
+3. **Balancing New Connections**: New connections automatically route to less loaded nodes
+4. **Gradual Rebalancing**: As connections naturally terminate and reconnect, load evens out
+
+### Example Scenario
+
+**Before Enhancement:**
+
+```
+Initial: Node1(50 conn), Node2(50 conn)
+After scaling to 3 nodes: Node1(50 conn), Node2(50 conn), Node3(0 conn)
+New connections distributed: Node1(60 conn), Node2(60 conn), Node3(40 conn)
+```
+
+**With Persistent Counting:**
+
+```
+Initial: Node1(50 conn), Node2(50 conn)
+After scaling to 3 nodes: Node1(50 conn), Node2(50 conn), Node3(0 conn)
+New connections route to Node3 until balanced: Node1(50 conn), Node2(50 conn), Node3(50 conn)
+```
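The persistent-counting outcome can be reproduced with a tiny simulation (illustrative Python; assumes equal weights and no disconnects during the window):

```python
# Scale-out scenario: with persistent counts, every new connection
# goes to the currently least-loaded node.
counts = {"Node1": 50, "Node2": 50, "Node3": 0}

for _ in range(50):                      # 50 new long-lived connections
    target = min(counts, key=counts.get)
    counts[target] += 1

# Node3 absorbs all 50 new connections before any node is revisited.
```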
+
+## Monitoring and Debugging
+
+### Log Messages
+
+#### Debug Logs
+
+Enable debug logging to monitor balancer behavior:
+
+**Balancer Creation**
+
+```
+creating new least_conn balancer for upstream: upstream_123
+```
+
+**Connection Count Operations**
+
+```
+generated connection count key: conn_count:upstream_123:127.0.0.1:8080
+retrieved connection count for 127.0.0.1:8080: 5
+setting connection count for 127.0.0.1:8080 to 6
+incrementing connection count for 127.0.0.1:8080 by 1, new count: 6
+```
+
+**Server Selection**
+
+```
+selected server: 127.0.0.1:8080 with current score: 1.2
+after_balance for server: 127.0.0.1:8080, before_retry: false
+```
+
+**Cleanup Operations**
+
+```
+cleaning up stale connection counts for upstream: upstream_123
+cleaned up stale connection count for server: 127.0.0.1:8082
+```

Review Comment:
   This section lists example debug log messages (e.g., “setting connection count…”, “cleaning up stale connection counts…”, “cleaned up stale connection count…”) that don’t appear to be emitted by the current implementation in `apisix/balancer/least_conn.lua` (which logs different strings like “generated connection count key…”, “incremented connection count…”, and “removed zero-count entry…”). Please update the documented log examples to match the actual log messages, or adjust the implementation if these logs are expected for debugging/monitoring.



##########
apisix/balancer/least_conn.lua:
##########
@@ -75,14 +198,40 @@ function _M.new(up_nodes, upstream)
                 return nil, err
             end
 
-            info.score = info.score + info.effect_weight
-            servers_heap:update(server, info)
+            if info.use_persistent_counting then
+                -- True least connection mode: update based on persistent connection counts
+                local current_conn_count = get_server_conn_count(upstream, server)
+                info.score = (current_conn_count + 1) / info.weight
+                servers_heap:update(server, info)
+                incr_server_conn_count(upstream, server, 1)

Review Comment:
   In persistent mode, `get()` updates `info.score` using the pre-increment connection count (`current_conn_count + 1`) and only then increments the shared counter. This means the heap score does not reflect the newly-created active connection immediately (especially when current count is 0), which can cause repeated selection of the same server and defeats least-conn behavior for long-lived connections. Update the shared dict first (or compute using `current_conn_count + 2`), then set `info.score` from the post-increment count so the heap reflects the active connection right away.



##########
docs/zh/latest/balancer-least-conn.md:
##########
@@ -0,0 +1,520 @@
+---
+title: 最少连接负载均衡器
+keywords:
+  - APISIX
+  - API 网关
+  - 路由
+  - 最小连接
+  - 上游
+description: 本文介绍了 Apache APISIX 中的最少连接负载均衡器(`least_conn`),包括其工作原理、配置方法和使用场景。
+---
+
+<!--
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+-->
+
+## 概述
+
+Apache APISIX 中的 `least_conn` 负载均衡器提供两种操作模式:
+
+1. **传统模式**(默认):性能优化的加权轮询算法
+2. **持久化连接计数模式**:真正的最少连接算法,在负载均衡器重建过程中保持准确的连接计数
+
+该算法特别适用于请求处理时间差异较大的场景,或处理长连接(如 WebSocket 连接)的情况,其中第二种模式在上游扩容后为负载分布提供显著优势。
+
+## 算法详情
+
+### 核心原理
+
+#### 传统模式(默认)
+
+在传统模式下,算法使用带有动态评分的加权轮询方法:
+
+- 初始化每个服务器的分数 = `1 / weight`
+- 连接时:分数增加 `1 / weight`
+- 完成时:分数减少 `1 / weight`
+
+这为大多数用例提供了良好的性能,同时保持向后兼容性。
+
+#### 持久化连接计数模式
+
+当启用时,算法在共享内存中为每个上游服务器维护准确的连接计数:
+
+- 跨配置重载跟踪真实连接计数
+- 在上游节点扩容操作中保持状态
+- 为长连接提供真正的最少连接行为
+
+该算法使用二进制最小堆数据结构来高效跟踪和选择得分最低的服务器。
+
+### 得分计算
+
+#### 传统模式
+
+```lua
+-- 初始化
+score = 1 / weight
+
+-- 连接时
+score = score + (1 / weight)
+
+-- 完成时
+score = score - (1 / weight)
+```
+
+#### 持久化连接计数模式
+
+```lua
+-- 初始化和更新
+score = (connection_count + 1) / weight
+```
+
+其中:
+
+- `connection_count` - 服务器当前活跃连接数(持久化)
+- `weight` - 服务器权重配置值
+
+得分较低的服务器优先获得新连接。在持久化模式下,`+1` 代表正在考虑的潜在新连接。
+
+### 连接状态管理
+
+#### 传统模式
+
+- **连接开始**:分数更新为 `score + (1 / weight)`
+- **连接结束**:分数更新为 `score - (1 / weight)`
+- **状态**:仅在当前负载均衡器实例内维护
+- **堆维护**:二进制堆自动按得分重新排序服务器
+
+#### 持久化连接计数模式
+
+- **连接开始**:连接计数递增,得分更新为 `(new_count + 1) / weight`
+- **连接结束**:连接计数递减,得分更新为 `(new_count - 1) / weight`
+- **得分保护**:通过设置最小得分为 0 防止出现负分
+- **堆维护**:二进制堆自动按得分重新排序服务器
+
+##### 持久化策略
+
+连接计数存储在 nginx 共享字典中,使用结构化键:
+
+```
+conn_count:{upstream_id}:{server_address}
+```
+
+这确保连接状态在以下情况下保持:
+
+- 上游配置变更
+- 负载均衡器实例重建
+- 工作进程重启
+- 上游节点扩容操作
+
+### 连接跟踪
+
+#### 持久状态管理
+
+负载均衡器使用 nginx 共享字典(`balancer-least-conn`)在以下情况下维护连接计数:
+
+- 负载均衡器实例重建
+- 上游配置变更
+- 工作进程重启
+- 节点添加/移除
+
+#### 连接计数键
+
+连接计数使用结构化键存储:
+
+```
+conn_count:{upstream_id}:{server_address}
+```
+
+其中:
+
+- `upstream_id` - 上游配置的唯一标识符
+- `server_address` - 服务器地址(例如:"127.0.0.1:8080")
+
+#### 上游 ID 生成
+
+1. **主要方式**:如果可用,使用 `upstream.id`
+2. **备用方式**:生成上游配置稳定 JSON 编码的 CRC32 哈希
+
+```lua
+local upstream_id = upstream.id
+if not upstream_id then
+    upstream_id = ngx.crc32_short(core.json.stably_encode(upstream))
+end
+```
+
+实现使用 `core.json.stably_encode` 来确保确定性的 JSON 序列化,这对于在不同工作进程和配置重载之间生成一致的上游 ID 至关重要。这是 APISIX 推荐的稳定 JSON 编码方法。
+
+### 连接生命周期
+
+#### 1. 连接建立
+
+当路由新请求时:
+
+1. 从堆中选择得分最低的服务器
+2. 将服务器得分更新为 `(current_count + 1) / weight`
+3. 在共享字典中递增连接计数
+4. 更新堆中服务器的位置
+
+#### 2. 连接完成
+
+当请求完成时:
+
+1. 计算新得分为 `(current_count - 1) / weight`
+2. 保证得分不为负(最小为 0)
+3. 在共享字典中递减连接计数
+4. 更新堆中服务器的位置
+
+#### 3. 清理过程
+
+负载均衡器采用双层清理策略以获得最佳性能:
+
+##### 轻量级清理(负载均衡器重建期间)
+
+- **零阻塞**:使用 O(n) 复杂度,其中 n=当前服务器数量
+- **智能清理**:仅移除零计数条目以释放内存
+- **无扫描**:完全避免昂贵的`get_keys()`操作
+- **策略**:仅处理已知当前服务器,忽略过期条目
+
+##### 全局清理(手动/定期)
+
+- **批处理**:以 100 个键的批次处理以限制内存使用
+- **非阻塞**:包含周期性让步(每处理 1000 个键让步 1ms)
+- **全面性**:移除所有上游的所有连接计数条目
+- **使用场景**:通过`balancer.cleanup_all()`手动清理或定期维护
+
+### 数据结构
+
+#### 二进制堆
+
+- **类型**:基于服务器得分的最小堆
+- **目的**:高效选择得分最低的服务器
+- **操作**:O(log n) 插入、删除和更新
+
+#### 共享字典
+
+- **名称**:`balancer-least-conn`
+- **大小**:10MB(可配置)
+- **范围**:在所有工作进程间共享
+- **持久性**:在配置重载后保持
+
+## 配置
+
+### 自动设置
+
+`balancer-least-conn` 共享字典由 APISIX 自动配置,默认大小为 10MB。无需手动配置。
+
+### 自定义配置
+
+要自定义共享字典大小,请修改 `conf/config.yaml` 中的 `nginx_config.http.lua_shared_dict` 部分:
+
+```yaml
+nginx_config:
+  http:
+    lua_shared_dict:
+      balancer-least-conn: 20m  # 自定义大小(默认:10m)
+```
+
+### 上游配置
+
+#### 传统模式(默认)
+
+```yaml
+upstreams:
+  - id: 1
+    type: least_conn
+    nodes:
+      "127.0.0.1:8080": 1
+      "127.0.0.1:8081": 2
+      "127.0.0.1:8082": 1
+```
+
+#### 持久化连接计数模式
+
+##### WebSocket 负载均衡
+
+对于 WebSocket 等长连接场景,推荐启用持久化连接计数以获得更好的负载分布:
+
+```yaml
+upstreams:
+  - id: websocket_upstream
+    type: least_conn
+    scheme: websocket
+    persistent_conn_counting: true  # 显式启用持久化计数
+    nodes:
+      "127.0.0.1:8080": 1
+      "127.0.0.1:8081": 1
+      "127.0.0.1:8082": 1
+```
+
+##### 手动激活
+
+```yaml
+upstreams:
+  - id: custom_upstream
+    type: least_conn
+    persistent_conn_counting: true  # 显式启用持久化计数
+    nodes:
+      "127.0.0.1:8080": 1
+      "127.0.0.1:8081": 1
+      "127.0.0.1:8082": 1
+```
+
+## 性能特征
+
+### 时间复杂度
+
+- **服务器选择**:O(1) - 堆查看操作
+- **连接更新**:O(log n) - 堆更新操作
+- **轻量级清理**:O(n),其中 n = 每个上游的当前服务器数量
+- **全局清理**:O(k) 但批处理,其中 k = 所有上游的总键数
+
+### 内存使用
+
+- **每个服务器**:约 100 字节(键 + 值 + 开销)
+- **总计**:与所有上游的活跃服务器数量线性扩展
+- **优化**:零计数条目自动移除以最小化内存使用
+
+### 可扩展性
+
+- **服务器**:高效处理每个上游数百个服务器
+- **上游**:支持多个上游,具有隔离的连接跟踪
+- **请求**:最小的每请求开销
+- **性能**:无论共享字典大小如何,都具有可预测的扩展性
+
+### 性能优化
+
+#### 轻量级清理策略
+
+```lua
+-- 新方法:仅处理已知服务器(O(n) 复杂度)
+for server, _ in pairs(current_servers) do
+    local count = conn_count_dict:get(key)
+    if count == 0 then
+        conn_count_dict:delete(key)  -- 内存清理
+    end
+end
+```
+
+#### 批处理全局清理
+
+```lua
+-- 以批处理方式进行全局清理以防止阻塞
+while has_more do
+    local keys = conn_count_dict:get_keys(100)  -- 小批次
+    -- 处理批次...
+    if processed_count % 1000 == 0 then
+        ngx.sleep(0.001)  -- 周期性让步
+    end
+end
+```
+
+## 使用场景
+
+### 传统模式
+
+#### 最佳场景
+
+1. **高吞吐量 HTTP API**:快速、短连接
+2. **微服务**:请求/响应模式
+3. **标准 Web 应用**:常规 HTTP 流量
+
+#### 优势
+
+- 较低的内存使用
+- 短连接的更好性能
+- 简单配置
+
+### 持久化连接计数模式
+
+#### 最佳场景
+
+1. **WebSocket 应用**:长连接在扩容操作中受益于准确的负载分布
+2. **服务器发送事件(SSE)**:持久流连接
+3. **长轮询**:扩展的 HTTP 连接
+4. **可变处理时间**:持续时间不可预测的请求
+5. **数据库连接池**:面向连接的服务
+
+#### 节点扩容后的使用
+
+特别适用于以下情况:
+
+- 向现有部署添加新的上游节点
+- 现有长连接保留在原始节点上
+- 需要在所有可用节点间平衡负载
+
+### 注意事项
+
+1. **短连接**:传统模式对于非常短的请求开销更低
+2. **内存使用**:持久化模式需要共享内存来存储连接状态
+3. **向后兼容性**:传统模式保持现有行为
+
+## WebSocket 负载均衡改进
+
+### 解决的问题
+
+在此增强之前,当上游节点扩容(例如从 2 个节点扩展到 3 个节点)时,WebSocket 负载均衡会出现不平衡分布:
+
+- 现有 WebSocket 长连接保留在原始节点上
+- 新连接分布在所有节点上
+- 结果:原始节点过载,新节点利用不足
+
+### 解决方案
+
+持久化连接计数模式专门通过以下方式解决此问题:
+
+1. **跟踪真实连接**:在共享内存中维护准确的连接计数
+2. **在扩容事件中保持状态**:连接计数在上游配置更改中持续存在
+3. **平衡新连接**:新连接自动路由到负载较轻的节点
+4. **逐步重平衡**:随着连接自然终止和重连,负载逐渐平衡
+
+### 示例场景
+
+**增强前:**
+
+```
+初始:Node1(50连接),Node2(50连接)
+扩容到3个节点后:Node1(50连接),Node2(50连接),Node3(0连接)
+新连接分布:Node1(60连接),Node2(60连接),Node3(40连接)
+```
+
+**使用持久化计数:**
+
+```
+初始:Node1(50连接),Node2(50连接)
+扩容到3个节点后:Node1(50连接),Node2(50连接),Node3(0连接)
+新连接路由到Node3直到平衡:Node1(50连接),Node2(50连接),Node3(50连接)
+```
+
+## 监控和调试
+
+### 日志消息
+
+#### 调试日志
+
+启用调试日志来监控负载均衡器行为:
+
+**负载均衡器创建**
+
+```
+creating new least_conn balancer for upstream: upstream_123
+```
+
+**连接数操作**
+
+```
+generated connection count key: conn_count:upstream_123:127.0.0.1:8080
+retrieved connection count for 127.0.0.1:8080: 5
+setting connection count for 127.0.0.1:8080 to 6
+incrementing connection count for 127.0.0.1:8080 by 1, new count: 6
+```
+
+**服务器选择**
+
+```
+selected server: 127.0.0.1:8080 with current score: 1.2
+after_balance for server: 127.0.0.1:8080, before_retry: false
+```
+
+**清理操作**
+
+```
+cleaning up stale connection counts for upstream: upstream_123
+cleaned up stale connection count for server: 127.0.0.1:8082
+```
+
+#### 初始化
+
+```
+initializing server 127.0.0.1:8080 with weight 1, base_score 1, conn_count 0, final_score 1
+```
+
+#### 错误
+
+```
+failed to set connection count for 127.0.0.1:8080: no memory
+failed to increment connection count for 127.0.0.1:8080: no memory
+```
+
+### 共享字典监控
+
+检查共享字典使用情况:
+
+```lua
+local dict = ngx.shared["balancer-least-conn"]
+local free_space = dict:free_space()
+local capacity = dict:capacity()
+```
+
+## 错误处理
+
+### 缺少共享字典
+
+如果共享字典不可用(在默认配置下不应该发生),负载均衡器将初始化失败并显示:
+
+```
+shared dict 'balancer-least-conn' not found
+```
+

Review Comment:
   This says "if the shared dict is unavailable, the load balancer will fail to initialize", but the current implementation in `apisix/balancer/least_conn.lua` falls back to traditional mode (and, under the existing logic, may not even print a warning). Please make the docs and the implementation consistent: state explicitly whether this is fail-fast or graceful degradation, and ensure there is a clear log message when the user explicitly enables `persistent_conn_counting` but the shared dict is missing.
   ```suggestion
   如果共享字典不可用(在默认配置下不应该发生),`least_conn` 负载均衡器不会初始化失败,而是:
   
   - 关闭持久连接计数功能(`persistent_conn_counting` 实际上不生效)
   - 自动回退到传统的 least_conn 行为
   - 继续对外提供服务
   
   因此,当你显式配置了 `persistent_conn_counting: true` 时,应确保在 `nginx.conf` 中正确定义并加载名为 `balancer-least-conn` 的共享字典,否则不会获得预期的连接计数效果。
   ```



##########
apisix/balancer/least_conn.lua:
##########
@@ -19,31 +19,154 @@ local core = require("apisix.core")
 local binaryHeap = require("binaryheap")
 local ipairs = ipairs
 local pairs = pairs
-
+local ngx = ngx
+local ngx_shared = ngx.shared
+local tostring = tostring
 
 local _M = {}
 
+-- Shared dictionary to store connection counts across balancer recreations
+local CONN_COUNT_DICT_NAME = "balancer-least-conn"
+local conn_count_dict
 
 local function least_score(a, b)
     return a.score < b.score
 end
 
+-- Get the connection count key for a specific upstream and server
+local function get_conn_count_key(upstream, server)
+    local upstream_id = upstream.id
+    if not upstream_id then
+        -- Fallback to a hash of the upstream configuration using stable encoding
+        upstream_id = ngx.crc32_short(core.json.stably_encode(upstream))
+        core.log.debug("generated upstream_id from hash: ", upstream_id)
+    end
+    local key = "conn_count:" .. tostring(upstream_id) .. ":" .. server
+    core.log.debug("generated connection count key: ", key)
+    return key
+end
+
+-- Get the current connection count for a server from shared dict
+local function get_server_conn_count(upstream, server)
+    local key = get_conn_count_key(upstream, server)
+    local count, err = conn_count_dict:get(key)
+    if err then
+        core.log.error("failed to get connection count for ", server, ": ", err)
+        return 0
+    end
+    local result = count or 0
+    core.log.debug("retrieved connection count for server ", server, ": ", result)
+    return result
+end
+
+-- Increment the connection count for a server
+local function incr_server_conn_count(upstream, server, delta)
+    local key = get_conn_count_key(upstream, server)
+    local new_count, err = conn_count_dict:incr(key, delta or 1, 0)
+    if not new_count then
+        core.log.error("failed to increment connection count for ", server, ": ", err)
+        return 0
+    end
+    core.log.debug("incremented connection count for server ", server, " by ", 
delta or 1,
+            ", new count: ", new_count)
+    return new_count
+end
+
+-- Clean up connection counts for servers that are no longer in the upstream
+-- Uses a lightweight strategy: only cleanup when explicitly needed
+local function cleanup_stale_conn_counts(upstream, current_servers)
+    local upstream_id = upstream.id
+    if not upstream_id then
+        upstream_id = ngx.crc32_short(core.json.stably_encode(upstream))
+    end
+
+    local prefix = "conn_count:" .. tostring(upstream_id) .. ":"
+    core.log.debug("lightweight cleanup for upstream: ", upstream_id)
+
+    -- Strategy: Only clean up entries we know about (current servers)
+    -- This avoids expensive get_keys() calls entirely
+
+    local cleaned_count = 0
+
+    -- For each current server, verify its connection count is still valid
+    -- If count is zero, we can optionally remove it to free memory
+    for server, _ in pairs(current_servers) do
+        local key = prefix .. server
+        local count, err = conn_count_dict:get(key)
+
+        if err then
+            core.log.error("failed to get connection count for ", server, ": ", err)
+        elseif count and count == 0 then
+            -- Remove zero-count entries to prevent memory accumulation
+            local ok, delete_err = conn_count_dict:delete(key)
+            if ok and not delete_err then
+                cleaned_count = cleaned_count + 1
+                core.log.debug("removed zero-count entry for server: ", server)
+            elseif delete_err then
+                core.log.warn("failed to remove zero-count entry for ", server, ": ", delete_err)
+            end
+        end
+    end
+
+    -- Note: Stale entries for removed servers will naturally expire over time
+    -- or can be cleaned up by the global cleanup_all() function
+    if cleaned_count > 0 then
+        core.log.debug("cleaned up ", cleaned_count, " zero-count entries")
+    end
+end
 
 function _M.new(up_nodes, upstream)
+    if not conn_count_dict then
+        conn_count_dict = ngx_shared[CONN_COUNT_DICT_NAME]
+    end
+
+    -- Enable persistent counting only when explicitly requested
+    -- This ensures complete backward compatibility with existing behavior
+    local use_persistent_counting = conn_count_dict ~= nil and
+        upstream.persistent_conn_counting == true
+
+    if not use_persistent_counting and conn_count_dict then
+        core.log.debug("shared dict available but persistent counting not enabled for scheme: ",
+                    "http,using traditional least_conn mode")
+    elseif use_persistent_counting and not conn_count_dict then
+        core.log.warn("persistent counting requested but shared dict '",
+        CONN_COUNT_DICT_NAME, "' not found, using traditional least_conn mode")
+        use_persistent_counting = false

Review Comment:
   `use_persistent_counting` is computed as `conn_count_dict ~= nil and upstream.persistent_conn_counting == true`, so when persistent counting is requested but the shared dict is missing, the code silently falls back to traditional mode and the warning branch below can never trigger. Consider basing the warning on `upstream.persistent_conn_counting == true` first, then checking `conn_count_dict` to decide whether to enable or log a warning/fallback.
   ```suggestion
       local use_persistent_counting = upstream.persistent_conn_counting == true

       if use_persistent_counting then
           if not conn_count_dict then
               core.log.warn("persistent counting requested but shared dict '",
                              CONN_COUNT_DICT_NAME, "' not found, using traditional least_conn mode")
               use_persistent_counting = false
           else
               core.log.debug("persistent counting enabled for scheme: ",
                              "http, using shared dict ", CONN_COUNT_DICT_NAME)
           end
       elseif conn_count_dict then
           core.log.debug("shared dict available but persistent counting not enabled for scheme: ",
                          "http, using traditional least_conn mode")
   ```



##########
apisix/balancer/least_conn.lua:
##########
@@ -19,31 +19,154 @@ local core = require("apisix.core")
 local binaryHeap = require("binaryheap")
 local ipairs = ipairs
 local pairs = pairs
-
+local ngx = ngx
+local ngx_shared = ngx.shared
+local tostring = tostring
 
 local _M = {}
 
+-- Shared dictionary to store connection counts across balancer recreations
+local CONN_COUNT_DICT_NAME = "balancer-least-conn"
+local conn_count_dict
 
 local function least_score(a, b)
     return a.score < b.score
 end
 
+-- Get the connection count key for a specific upstream and server
+local function get_conn_count_key(upstream, server)
+    local upstream_id = upstream.id
+    if not upstream_id then
+        -- Fallback to a hash of the upstream configuration using stable encoding
+        upstream_id = ngx.crc32_short(core.json.stably_encode(upstream))
+        core.log.debug("generated upstream_id from hash: ", upstream_id)
+    end
+    local key = "conn_count:" .. tostring(upstream_id) .. ":" .. server
+    core.log.debug("generated connection count key: ", key)
+    return key
+end
+
+-- Get the current connection count for a server from shared dict
+local function get_server_conn_count(upstream, server)
+    local key = get_conn_count_key(upstream, server)
+    local count, err = conn_count_dict:get(key)
+    if err then
+        core.log.error("failed to get connection count for ", server, ": ", err)
+        return 0
+    end
+    local result = count or 0
+    core.log.debug("retrieved connection count for server ", server, ": ", result)
+    return result
+end
+
+-- Increment the connection count for a server
+local function incr_server_conn_count(upstream, server, delta)
+    local key = get_conn_count_key(upstream, server)
+    local new_count, err = conn_count_dict:incr(key, delta or 1, 0)
+    if not new_count then
+        core.log.error("failed to increment connection count for ", server, ": ", err)
+        return 0
+    end
+    core.log.debug("incremented connection count for server ", server, " by ", 
delta or 1,
+            ", new count: ", new_count)
+    return new_count
+end
+
+-- Clean up connection counts for servers that are no longer in the upstream
+-- Uses a lightweight strategy: only cleanup when explicitly needed
+local function cleanup_stale_conn_counts(upstream, current_servers)
+    local upstream_id = upstream.id
+    if not upstream_id then
+        upstream_id = ngx.crc32_short(core.json.stably_encode(upstream))
+    end
+
+    local prefix = "conn_count:" .. tostring(upstream_id) .. ":"
+    core.log.debug("lightweight cleanup for upstream: ", upstream_id)
+
+    -- Strategy: Only clean up entries we know about (current servers)
+    -- This avoids expensive get_keys() calls entirely
+
+    local cleaned_count = 0
+
+    -- For each current server, verify its connection count is still valid
+    -- If count is zero, we can optionally remove it to free memory
+    for server, _ in pairs(current_servers) do
+        local key = prefix .. server
+        local count, err = conn_count_dict:get(key)
+
+        if err then
+            core.log.error("failed to get connection count for ", server, ": ", err)
+        elseif count and count == 0 then
+            -- Remove zero-count entries to prevent memory accumulation
+            local ok, delete_err = conn_count_dict:delete(key)
+            if ok and not delete_err then
+                cleaned_count = cleaned_count + 1
+                core.log.debug("removed zero-count entry for server: ", server)
+            elseif delete_err then
+                core.log.warn("failed to remove zero-count entry for ", server, ": ", delete_err)
+            end
+        end
+    end
+
+    -- Note: Stale entries for removed servers will naturally expire over time
+    -- or can be cleaned up by the global cleanup_all() function

Review Comment:
   The comment says stale entries for removed servers will “naturally expire over time”, but keys created via `dict:incr(..., init=0)` have no TTL and will persist until explicit deletion or LRU eviction. This can lead to unbounded growth if upstreams churn. Either implement an explicit TTL / periodic cleanup for removed servers, or adjust the comment/docs to reflect the actual behavior and risks.
   ```suggestion
       -- Note: Stale entries for removed servers are not automatically expired by TTL;
       -- they remain until explicitly deleted, evicted by LRU, or cleaned up by the global cleanup_all() function
   ```



##########
docs/zh/latest/balancer-least-conn.md:
##########
@@ -0,0 +1,520 @@
+---
+title: Least Connections Load Balancer
+keywords:
+  - APISIX
+  - API Gateway
+  - Route
+  - Least Connections
+  - Upstream
+description: This document introduces the least connections load balancer (`least_conn`) in Apache APISIX, including how it works, how to configure it, and its use cases.
+---
+
+<!--
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+-->
+
+## Overview
+
+The `least_conn` load balancer in Apache APISIX provides two operating modes:
+
+1. **Legacy mode** (default): a performance-optimized weighted round-robin algorithm
+2. **Persistent connection counting mode**: a true least-connections algorithm that keeps connection counts accurate across load balancer rebuilds
+
+This algorithm is especially useful when request processing times vary widely, or when handling long-lived connections such as WebSocket connections; the second mode offers a significant advantage for load distribution after an upstream scale-out.
+
+## Algorithm Details
+
+### Core Principles
+
+#### Legacy Mode (Default)
+
+In legacy mode, the algorithm uses a weighted round-robin approach with dynamic scoring:
+
+- Each server's score is initialized to `1 / weight`
+- On connect: the score increases by `1 / weight`
+- On completion: the score decreases by `1 / weight`
+
+This delivers good performance for most use cases while preserving backward compatibility.
+
+#### Persistent Connection Counting Mode
+
+When enabled, the algorithm maintains an accurate connection count for each upstream server in shared memory:
+
+- Tracks real connection counts across configuration reloads
+- Preserves state through upstream node scale-out operations
+- Provides true least-connections behavior for long-lived connections
+
+The algorithm uses a binary min-heap data structure to efficiently track and select the server with the lowest score.
+
+### Score Calculation
+
+#### Legacy Mode
+
+```lua
+-- initialization
+score = 1 / weight
+
+-- on connect
+score = score + (1 / weight)
+
+-- on completion
+score = score - (1 / weight)
+```
+
+#### Persistent Connection Counting Mode
+
+```lua
+-- initialization and updates
+score = (connection_count + 1) / weight
+```
+
+Where:
+
+- `connection_count` - the server's current number of active connections (persisted)
+- `weight` - the server's configured weight
+
+Servers with lower scores are preferred for new connections. In persistent mode, the `+1` represents the potential new connection under consideration.
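To make the scoring concrete, here is an illustrative Python sketch (not part of APISIX; the server addresses, weights, and counts are invented) that picks the lowest-score server under the persistent-mode formula:

```python
def pick_server(conn_counts, weights):
    """Return the server whose (count + 1) / weight score is lowest."""
    return min(weights, key=lambda s: (conn_counts.get(s, 0) + 1) / weights[s])

# Hypothetical upstream: three servers, one with double weight.
weights = {"127.0.0.1:8080": 1, "127.0.0.1:8081": 2, "127.0.0.1:8082": 1}
counts = {"127.0.0.1:8080": 3, "127.0.0.1:8081": 3, "127.0.0.1:8082": 5}

# 8081 wins: (3 + 1) / 2 = 2.0 beats (3 + 1) / 1 = 4.0 and (5 + 1) / 1 = 6.0
print(pick_server(counts, weights))
```

Note how the higher weight on `127.0.0.1:8081` lets it absorb more connections before its score catches up with the others.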
+
+### Connection State Management
+
+#### Legacy Mode
+
+- **Connection start**: the score is updated to `score + (1 / weight)`
+- **Connection end**: the score is updated to `score - (1 / weight)`
+- **State**: maintained only within the current balancer instance
+- **Heap maintenance**: the binary heap automatically reorders servers by score
+
+#### Persistent Connection Counting Mode
+
+- **Connection start**: the connection count is incremented and the score updated to `(new_count + 1) / weight`
+- **Connection end**: the connection count is decremented and the score updated to `(new_count - 1) / weight`
+- **Score protection**: the score is floored at 0 so it never goes negative
+- **Heap maintenance**: the binary heap automatically reorders servers by score
+
+##### Persistence Strategy
+
+Connection counts are stored in an nginx shared dictionary using structured keys:
+
+```
+conn_count:{upstream_id}:{server_address}
+```
+
+This ensures connection state is preserved across:
+
+- Upstream configuration changes
+- Load balancer instance rebuilds
+- Worker process restarts
+- Upstream node scale-out operations
+
+### Connection Tracking
+
+#### Persistent State Management
+
+The load balancer uses an nginx shared dictionary (`balancer-least-conn`) to maintain connection counts across:
+
+- Load balancer instance rebuilds
+- Upstream configuration changes
+- Worker process restarts
+- Node additions/removals
+
+#### Connection Count Keys
+
+Connection counts are stored under structured keys:
+
+```
+conn_count:{upstream_id}:{server_address}
+```
+
+Where:
+
+- `upstream_id` - the unique identifier of the upstream configuration
+- `server_address` - the server address (e.g. "127.0.0.1:8080")
+
+#### Upstream ID Generation
+
+1. **Primary**: use `upstream.id` if available
+2. **Fallback**: generate a CRC32 hash of the stable JSON encoding of the upstream configuration
+
+```lua
+local upstream_id = upstream.id
+if not upstream_id then
+    upstream_id = ngx.crc32_short(core.json.stably_encode(upstream))
+end
+```
+
+The implementation uses `core.json.stably_encode` to guarantee deterministic JSON serialization, which is essential for generating consistent upstream IDs across worker processes and configuration reloads. This is the recommended approach for stable JSON encoding in APISIX.
+
+### Connection Lifecycle
+
+#### 1. Connection Establishment
+
+When routing a new request:
+
+1. Select the server with the lowest score from the heap
+2. Update the server's score to `(current_count + 1) / weight`
+3. Increment the connection count in the shared dictionary
+4. Update the server's position in the heap
+
+#### 2. Connection Completion
+
+When a request completes:
+
+1. Compute the new score as `(current_count - 1) / weight`
+2. Floor the score at 0 so it never goes negative
+3. Decrement the connection count in the shared dictionary
+4. Update the server's position in the heap
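The two lifecycle phases above can be sketched in Python (illustrative only: a plain dict stands in for the nginx shared dictionary, and heap maintenance is omitted):

```python
counts = {}  # simulates the shared dict: server -> active connection count

def on_connect(server, weight):
    """Connection establishment: bump the count, return the new score."""
    counts[server] = counts.get(server, 0) + 1
    return (counts[server] + 1) / weight

def on_complete(server, weight):
    """Connection completion: score from (count - 1) / weight, floored at 0,
    then the count itself is decremented (never below zero)."""
    score = max((counts.get(server, 0) - 1) / weight, 0)
    counts[server] = max(counts.get(server, 0) - 1, 0)
    return score

s = on_connect("127.0.0.1:8080", weight=2)   # count 1, score (1 + 1) / 2 = 1.0
t = on_complete("127.0.0.1:8080", weight=2)  # count back to 0, score floored at 0
```

Because `counts` persists outside the balancer object, a rebuilt balancer would see the same numbers, which is the essence of persistent mode.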
+
+#### 3. Cleanup Process
+
+The load balancer employs a two-tier cleanup strategy for optimal performance:
+
+##### Lightweight Cleanup (During Balancer Rebuilds)
+
+- **Zero blocking**: O(n) complexity, where n = the number of current servers
+- **Smart cleanup**: removes only zero-count entries to free memory
+- **No scanning**: avoids expensive `get_keys()` calls entirely
+- **Strategy**: processes only known current servers and ignores stale entries
+
+##### Global Cleanup (Manual/Periodic)
+
+- **Batching**: processes keys in batches of 100 to bound memory usage
+- **Non-blocking**: yields periodically (1ms for every 1000 keys processed)
+- **Comprehensive**: removes all connection count entries for all upstreams
+- **Use case**: manual cleanup via `balancer.cleanup_all()` or periodic maintenance
+
+### Data Structures
+
+#### Binary Heap
+
+- **Type**: min-heap keyed on server scores
+- **Purpose**: efficient selection of the lowest-scoring server
+- **Operations**: O(log n) insertion, deletion, and updates
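A minimal Python sketch of min-heap selection over `(score, server)` pairs (illustrative only; a heap that supports in-place score updates for arbitrary servers, as APISIX's does, needs extra bookkeeping such as position indexes or lazy invalidation):

```python
import heapq

# (score, server) pairs; heapq keeps the smallest score at index 0.
heap = [(4.0, "127.0.0.1:8080"), (2.0, "127.0.0.1:8081"), (6.0, "127.0.0.1:8082")]
heapq.heapify(heap)                               # O(n) one-time build

score, server = heap[0]                           # O(1) peek at the best server
heapq.heapreplace(heap, (score + 1.0, server))    # O(log n) update of the root

print(server)      # 127.0.0.1:8081
print(heap[0][1])  # still 127.0.0.1:8081 (3.0 beats 4.0 and 6.0)
```

The peek-then-update pattern mirrors the selection step: the chosen server's score rises by the connection it just received, and the heap reorders automatically.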
+
+#### Shared Dictionary
+
+- **Name**: `balancer-least-conn`
+- **Size**: 10MB (configurable)
+- **Scope**: shared across all worker processes
+- **Persistence**: survives configuration reloads
+
+## Configuration
+
+### Automatic Setup
+
+The `balancer-least-conn` shared dictionary is configured automatically by APISIX with a default size of 10MB. No manual configuration is required.
+
+### Custom Configuration
+
+To customize the shared dictionary size, edit the `nginx_config.http.lua_shared_dict` section in `conf/config.yaml`:
+
+```yaml
+nginx_config:
+  http:
+    lua_shared_dict:
+      balancer-least-conn: 20m  # custom size (default: 10m)
+```
+
+### Upstream Configuration
+
+#### Legacy Mode (Default)
+
+```yaml
+upstreams:
+  - id: 1
+    type: least_conn
+    nodes:
+      "127.0.0.1:8080": 1
+      "127.0.0.1:8081": 2
+      "127.0.0.1:8082": 1
+```
+
+#### Persistent Connection Counting Mode
+
+##### WebSocket Load Balancing
+
+For long-lived connection scenarios such as WebSocket, enabling persistent connection counting is recommended for better load distribution:
+
+```yaml
+upstreams:
+  - id: websocket_upstream
+    type: least_conn
+    scheme: websocket
+    persistent_conn_counting: true  # explicitly enable persistent counting
+    nodes:
+      "127.0.0.1:8080": 1
+      "127.0.0.1:8081": 1
+      "127.0.0.1:8082": 1
+```
+
+##### Manual Activation
+
+```yaml
+upstreams:
+  - id: custom_upstream
+    type: least_conn
+    persistent_conn_counting: true  # explicitly enable persistent counting
+    nodes:
+      "127.0.0.1:8080": 1
+      "127.0.0.1:8081": 1
+      "127.0.0.1:8082": 1
+```
+
+## Performance Characteristics
+
+### Time Complexity
+
+- **Server selection**: O(1) - heap peek
+- **Connection updates**: O(log n) - heap update
+- **Lightweight cleanup**: O(n), where n = the number of current servers per upstream
+- **Global cleanup**: O(k), batched, where k = the total number of keys across all upstreams
+
+### Memory Usage
+
+- **Per server**: roughly 100 bytes (key + value + overhead)
+- **Total**: scales linearly with the number of active servers across all upstreams
+- **Optimization**: zero-count entries are removed automatically to minimize memory usage
+
+### Scalability
+
+- **Servers**: handles hundreds of servers per upstream efficiently
+- **Upstreams**: supports multiple upstreams with isolated connection tracking
+- **Requests**: minimal per-request overhead
+- **Performance**: scales predictably regardless of shared dictionary size
+
+### Performance Optimizations
+
+#### Lightweight Cleanup Strategy
+
+```lua
+-- new approach: process only known servers (O(n) complexity)
+for server, _ in pairs(current_servers) do
+    local count = conn_count_dict:get(key)
+    if count == 0 then
+        conn_count_dict:delete(key)  -- memory cleanup
+    end
+end
+```
+
+#### Batched Global Cleanup
+
+```lua
+-- run global cleanup in batches to avoid blocking
+while has_more do
+    local keys = conn_count_dict:get_keys(100)  -- small batches
+    -- process the batch...
+    if processed_count % 1000 == 0 then
+        ngx.sleep(0.001)  -- periodic yield
+    end
+end
+```
+
+## Use Cases
+
+### Legacy Mode
+
+#### Best For
+
+1. **High-throughput HTTP APIs**: fast, short-lived connections
+2. **Microservices**: request/response patterns
+3. **Standard web applications**: regular HTTP traffic
+
+#### Advantages
+
+- Lower memory usage
+- Better performance for short-lived connections
+- Simple configuration
+
+### Persistent Connection Counting Mode
+
+#### Best For
+
+1. **WebSocket applications**: long-lived connections benefit from accurate load distribution during scale-out
+2. **Server-Sent Events (SSE)**: persistent streaming connections
+3. **Long polling**: extended HTTP connections
+4. **Variable processing times**: requests with unpredictable durations
+5. **Database connection pools**: connection-oriented services
+
+#### After Node Scale-Out
+
+Particularly useful when:
+
+- New upstream nodes are added to an existing deployment
+- Existing long-lived connections remain on the original nodes
+- Load needs to be balanced across all available nodes
+
+### Considerations
+
+1. **Short-lived connections**: legacy mode has lower overhead for very short requests
+2. **Memory usage**: persistent mode requires shared memory to store connection state
+3. **Backward compatibility**: legacy mode preserves existing behavior
+
+## WebSocket Load Balancing Improvements
+
+### Problem Addressed
+
+Before this enhancement, WebSocket load balancing suffered from uneven distribution when upstream nodes were scaled out (for example, from 2 nodes to 3):
+
+- Existing long-lived WebSocket connections stayed on the original nodes
+- New connections were spread across all nodes
+- Result: the original nodes stayed overloaded while the new node was underutilized
+
+### Solution
+
+Persistent connection counting mode addresses this specifically by:
+
+1. **Tracking real connections**: maintaining accurate connection counts in shared memory
+2. **Preserving state across scale-out events**: connection counts persist through upstream configuration changes
+3. **Balancing new connections**: new connections are automatically routed to less-loaded nodes
+4. **Gradual rebalancing**: load evens out as connections naturally terminate and reconnect
+
+### Example Scenario
+
+**Before the enhancement:**
+
+```
+Initial: Node1 (50 connections), Node2 (50 connections)
+After scaling to 3 nodes: Node1 (50), Node2 (50), Node3 (0)
+New connections spread evenly: Node1 (60), Node2 (60), Node3 (40)
+```
+
+**With persistent counting:**
+
+```
+Initial: Node1 (50 connections), Node2 (50 connections)
+After scaling to 3 nodes: Node1 (50), Node2 (50), Node3 (0)
+New connections routed to Node3 until balanced: Node1 (50), Node2 (50), Node3 (50)
+```
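The scenario can be simulated in a few lines of Python (illustrative only; node names and counts are from the example above, and each new connection simply goes to the node with the fewest tracked connections, as persistent mode does):

```python
counts = {"Node1": 50, "Node2": 50, "Node3": 0}  # state right after scale-out

# Route 50 new connections, each to the currently least-loaded node.
for _ in range(50):
    target = min(counts, key=counts.get)
    counts[target] += 1

print(counts)  # {'Node1': 50, 'Node2': 50, 'Node3': 50}
```

All 50 new connections land on Node3 because its count stays strictly lowest until the cluster is balanced, matching the "with persistent counting" trace.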
+
+## Monitoring and Debugging
+
+### Log Messages
+
+#### Debug Logs
+
+Enable debug logging to observe balancer behavior:
+
+**Balancer creation**
+
+```
+creating new least_conn balancer for upstream: upstream_123
+```
+
+**Connection count operations**
+
+```
+generated connection count key: conn_count:upstream_123:127.0.0.1:8080
+retrieved connection count for 127.0.0.1:8080: 5
+incremented connection count for server 127.0.0.1:8080 by 1, new count: 6
+```
+
+**Server selection**
+
+```
+selected server: 127.0.0.1:8080 with current score: 1.2
+after_balance for server: 127.0.0.1:8080, before_retry: false
+```
+
+**Cleanup operations**
+
+```
+lightweight cleanup for upstream: upstream_123
+removed zero-count entry for server: 127.0.0.1:8082
+```
Review Comment:
   The log examples in this section (e.g. "setting connection count…" / "cleaning up stale connection counts…") do not match what `apisix/balancer/least_conn.lua` actually emits (the implementation logs "generated connection count key…" / "incremented connection count…" / "removed zero-count entry…"). Consider updating the documented examples to match the real log output, or adding these messages to the implementation if they are genuinely needed for troubleshooting/monitoring.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to