thunguo commented on PR #1423:
URL: https://github.com/apache/dubbo-admin/pull/1423#issuecomment-4147689278

   <html><head></head><body>
   <h2>Verification Steps</h2>
   <h3>1. Confirm Pod Status</h3>
   <p><strong>Action</strong>: Check the status of every Pod in the Deployment.</p>
   <pre><code class="language-bash">kubectl -n dubbo-system get pods -l app=dubbo-admin
   </code></pre>
   <p><strong>Result</strong>: Both Pods are in the Running state.</p>
   <pre><code>dubbo-admin-58ccd5767-b5wmh   1/1   Running   node1
   dubbo-admin-58ccd5767-jlrzt   1/1   Running   node2
   </code></pre>
   <hr>
   <h3>2. Verify Leader Election Initialization</h3>
   <p><strong>Action</strong>: Check the startup log of each Pod and filter for leader election output.</p>
   <pre><code class="language-bash">kubectl -n dubbo-system logs &lt;pod&gt; | grep -iE "leader|election|acquired|became"
   </code></pre>
   <p><strong>Result</strong>: All 3 components completed leader election initialization.</p>
   <p><strong>Pod1 (b5wmh) log</strong>:</p>
   <pre><code>counter: leader election initialized (holder: dubbo-admin-58ccd5767-b5wmh-...)
   discovery: leader election initialized (holder: dubbo-admin-58ccd5767-b5wmh-...)
   engine: leader election initialized (holder: dubbo-admin-58ccd5767-b5wmh-...)
   leader election: component resource discovery acquired leadership
   discovery: became leader, starting business logic
   leader election: component resource engine acquired leadership
   engine: became leader, starting business logic
   leader election: component counter manager acquired leadership
   counter: became leader, starting business logic
   </code></pre>
   <p><strong>Pod2 (jlrzt) log</strong>:</p>
   <pre><code>counter: leader election initialized (holder: dubbo-admin-58ccd5767-jlrzt-...)
   discovery: leader election initialized (holder: dubbo-admin-58ccd5767-jlrzt-...)
   engine: leader election initialized (holder: dubbo-admin-58ccd5767-jlrzt-...)
   # No "acquired" / "became" lines: Pod2 is waiting as a follower, as expected
   </code></pre>
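   <p>The leader/follower distinction above can be checked mechanically. The following is a minimal sketch in POSIX shell, assuming the log wording shown in this comment (<code>became leader</code>); <code>leader_components</code> is a hypothetical helper name, not part of dubbo-admin.</p>

```shell
# Reads a Pod log on stdin and prints the short name of every component
# for which the log contains a "became leader" line.
# Prints nothing for a follower Pod.
leader_components() {
  grep "became leader" \
    | sed 's/: became leader.*//' \
    | awk '{ print $NF }'
}

# Usage (not executed here; requires cluster access):
#   kubectl -n dubbo-system logs dubbo-admin-58ccd5767-b5wmh | leader_components
```

   <p>An empty result for a Pod means it holds no leadership, which matches the Pod2 log above.</p>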
   <hr>
   <h3>3. Verify Lease Table State</h3>
   <p><strong>Action</strong>: Query the current records in the <code>leader_leases</code> table in MySQL.</p>
   <pre><code class="language-bash">kubectl -n dubbo-system exec deploy/mysql -- mysql -uroot -pdubbo2026 dubbo_admin \
     -e "SELECT component, holder_id, acquired_at, expires_at, version FROM leader_leases;"
   </code></pre>
   <p><strong>Result</strong>: The table contains 3 lease records, all held by Pod1. The <code>version</code> field keeps increasing, which indicates that the renew mechanism is working.</p>
   <pre><code>component            holder_id                           version
   resource discovery   dubbo-admin-58ccd5767-b5wmh-...     5
   counter manager      dubbo-admin-58ccd5767-b5wmh-...     5
   resource engine      dubbo-admin-58ccd5767-b5wmh-...     5
   </code></pre>
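   <p>The "all held by one Pod" condition can also be asserted in a script. This is a sketch under two assumptions: the query is run with the <code>mysql</code> options <code>-N -B</code> (no column names, tab-separated output), and it selects <code>component, holder_id, version</code> in that order. <code>distinct_holders</code> and <code>single_holder</code> are hypothetical helper names.</p>

```shell
# Reads tab-separated rows (component, holder_id, version) on stdin
# and prints each distinct holder_id once.
distinct_holders() {
  awk -F '\t' '{ print $2 }' | sort -u
}

# Succeeds only when exactly one holder owns every lease row on stdin.
single_holder() {
  [ "$(distinct_holders | awk 'END { print NR }')" -eq 1 ]
}

# Usage (not executed here; requires cluster access):
#   kubectl -n dubbo-system exec deploy/mysql -- \
#     mysql -uroot -pdubbo2026 -N -B dubbo_admin \
#     -e "SELECT component, holder_id, version FROM leader_leases;" | single_holder
```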
   <hr>
   <h3>4. Verify Failover (Delete the Leader Pod)</h3>
   <p><strong>Action</strong>: Manually delete the leader Pod and observe whether the follower takes over once the lease expires.</p>
   <pre><code class="language-bash">kubectl -n dubbo-system delete pod dubbo-admin-58ccd5767-b5wmh
   </code></pre>
   <p><strong>Expected wait</strong>: about 35 seconds (lease duration 30s + acquire retry 5s).</p>
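   <p>Rather than waiting by hand, the takeover can be polled for. The following is a minimal sketch in POSIX shell; <code>wait_for</code> is a hypothetical helper, and the 60-second timeout in the usage comment is an assumed margin over the expected 35 seconds, not a value from the PR.</p>

```shell
# wait_for TIMEOUT INTERVAL CMD...
# Re-runs CMD every INTERVAL seconds until it succeeds.
# Returns 1 once at least TIMEOUT seconds have been spent sleeping.
wait_for() {
  wf_timeout=$1
  wf_interval=$2
  shift 2
  wf_waited=0
  while ! "$@"; do
    [ "$wf_waited" -ge "$wf_timeout" ] && return 1
    sleep "$wf_interval"
    wf_waited=$((wf_waited + wf_interval))
  done
}

# Usage (not executed here; requires cluster access):
#   wait_for 60 5 sh -c \
#     'kubectl -n dubbo-system logs dubbo-admin-58ccd5767-jlrzt | grep -q "became leader"'
```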
   <p><strong>Verification 4.1: Lease table transfer</strong>:</p>
   <pre><code class="language-bash">kubectl -n dubbo-system exec deploy/mysql -- mysql -uroot -pdubbo2026 dubbo_admin \
     -e "SELECT component, holder_id, version FROM leader_leases;"
   </code></pre>
   <p><strong>Result</strong>: The leases of all 3 components have moved to Pod2, and <code>version</code> keeps increasing.</p>
   <pre><code>component            holder_id                           version
   resource discovery   dubbo-admin-58ccd5767-jlrzt-...     12
   counter manager      dubbo-admin-58ccd5767-jlrzt-...     12
   resource engine      dubbo-admin-58ccd5767-jlrzt-...     12
   </code></pre>
   <p><strong>Verification 4.2: Pod2 log confirmation</strong>:</p>
   <p><strong>Result</strong>: Pod2 acquired leadership of all 3 components and started the business logic.</p>
   <pre><code>17:07:44  leader election: component counter manager acquired leadership (holder: ...jlrzt...)
   17:07:44  counter: became leader, starting business logic
   17:07:44  leader election: component resource engine acquired leadership (holder: ...jlrzt...)
   17:07:44  engine: became leader, starting business logic
   17:07:44  leader election: component resource discovery acquired leadership (holder: ...jlrzt...)
   17:07:44  discovery: became leader, starting business logic
   </code></pre>
   <hr>
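   <h2>Failover Timeline</h2>
   <table>
   <thead><tr><th>Time</th><th>Event</th></tr></thead>
   <tbody>
   <tr><td>17:06:40</td><td>Pod1 acquires leadership of all 3 components</td></tr>
   <tr><td>17:07:08</td><td>Pod1 is deleted; lease renewal stops</td></tr>
   <tr><td>17:07:08 ~ 17:07:38</td><td>Pod2 tries to acquire the lease every 5 seconds; the attempts keep failing because the lease has not yet expired</td></tr>
   <tr><td>17:07:44</td><td>Lease expires; Pod2 takes over all 3 components</td></tr>
   <tr><td>Total failover time</td><td>About 36 seconds (lease 30s + acquire retry 5s + network latency)</td></tr>
   </tbody>
   </table>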
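   <p>The total can be cross-checked from the two timestamps in the timeline (Pod1 deleted at 17:07:08, Pod2 takeover at 17:07:44). This is a small sketch in POSIX shell; <code>to_secs</code> and <code>elapsed</code> are hypothetical helper names, and the sketch assumes both timestamps fall on the same day.</p>

```shell
# Converts HH:MM:SS to seconds since midnight.
# One leading zero is stripped from each field so that "08" is not
# rejected as an invalid octal number in shell arithmetic.
to_secs() {
  h=${1%%:*}
  rest=${1#*:}
  m=${rest%%:*}
  s=${rest#*:}
  echo $(( ${h#0} * 3600 + ${m#0} * 60 + ${s#0} ))
}

# Prints the number of seconds between START ($1) and END ($2).
elapsed() {
  echo $(( $(to_secs "$2") - $(to_secs "$1") ))
}

elapsed 17:07:08 17:07:44   # prints 36
```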
   </body></html>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

