This is an automated email from the ASF dual-hosted git repository.

wenjun pushed a commit to branch dev
in repository https://gitbox.apache.org/repos/asf/dolphinscheduler.git


The following commit(s) were added to refs/heads/dev by this push:
     new 09d15a0de6 [Doc-16755] Improve architecture description with JDBC and Etcd supportImprove design doc (#17092)
09d15a0de6 is described below

commit 09d15a0de61c9c29e5568c78fcbbb2f0f57d0a80
Author: Oscar Cheng <[email protected]>
AuthorDate: Mon Jun 16 09:47:19 2025 +0800

    [Doc-16755] Improve architecture description with JDBC and Etcd supportImprove design doc (#17092)
---
 docs/docs/en/architecture/design.md | 64 ++++++++++++++++++++++++++++++++-----
 1 file changed, 56 insertions(+), 8 deletions(-)

diff --git a/docs/docs/en/architecture/design.md b/docs/docs/en/architecture/design.md
index 4bc8b06eed..0e2f5f61e3 100644
--- a/docs/docs/en/architecture/design.md
+++ b/docs/docs/en/architecture/design.md
@@ -22,7 +22,7 @@
 
 ### Architecture Description
 
-* **MasterServer**
+- **MasterServer**
 
   MasterServer adopts a distributed and decentralized design concept. MasterServer is mainly responsible for DAG task segmentation, task submission monitoring, and monitoring the health status of other MasterServer and WorkerServer at the same time.
   When the MasterServer service starts, register a temporary node with ZooKeeper, and perform fault tolerance by monitoring changes in the temporary node of ZooKeeper.
@@ -44,7 +44,7 @@
 
   - **FailoverExecuteThread** is mainly responsible for the logic of Master fault tolerance and Worker fault tolerance;
 
-* **WorkerServer**
+- **WorkerServer**
 
   WorkerServer also adopts a distributed and decentralized design concept. WorkerServer is mainly responsible for task execution and providing log services.
 
@@ -59,21 +59,70 @@
 
   - **RetryReportTaskStatusThread** is mainly responsible for regularly polling to report the task status to the Master until the Master replies to the status ack to avoid the loss of the task status;
 
-* **ZooKeeper**
+- **ZooKeeper**
 
-  ZooKeeper service, MasterServer and WorkerServer nodes in the system all use ZooKeeper for cluster management and fault tolerance. In addition, the system implements event monitoring and distributed locks based on ZooKeeper.
+  ZooKeeper service, MasterServer and WorkerServer nodes in the system all use ZooKeeper for cluster management and fault tolerance. With evolving needs and modern deployment environments, DolphinScheduler now supports event monitoring and distributed locks not only based on ZooKeeper, but also on **JDBC** and **Etcd** implementations.
+
+- **JDBC**
+  DolphinScheduler also provides a JDBC-based registry implementation, located in the `dolphinscheduler-registry/dolphinscheduler-registry-plugins/dolphinscheduler-registry-jdbc` module. Unlike external systems such as ZooKeeper or Etcd, the JDBC approach leverages a relational database to support event monitoring and distributed locking, making it well-suited for environments that already rely on SQL databases.
+
+  - **Event Monitoring**
+
+    - **Subscribe Method**
+      The `subscribe(String watchedPath, SubscribeListener listener)` method in `JdbcRegistry` registers a data change listener using the `JdbcRegistryDataChangeListenerAdapter`. When changes (such as creation, update, or deletion) occur for the specified key or path in the database, the adapter converts these changes into DolphinScheduler `Event` notifications and triggers the `SubscribeListener` callback.
+
+    - **Polling/Trigger Mechanism**
+      Internally, the system uses periodic polling or a trigger-based mechanism to detect changes in the registry data stored in the database, simulating a Watcher-like behavior similar to ZooKeeper.
+
+  - **Distributed Lock**
+
+    - **Lock Acquisition and Release**
+      The JDBC registry offers both `acquireLock(String key)` and `acquireLock(String key, long timeout)` methods, which correspond to blocking and timeout-based lock acquisition respectively. These methods internally call `JdbcRegistryClient.acquireJdbcRegistryLock(...)` to manage locks via database records, ensuring mutual exclusion in a distributed environment.
+
+    - **Ephemeral vs. Persistent Locks**
+      Data entries are classified as either **EPHEMERAL** or **PERSISTENT**. For ephemeral locks, if the client disconnects or fails, heartbeat mechanisms detect the lapse and clean up the lock record automatically, thus releasing the lock.
+
+    - **Lock Management**
+      Under the hood, components like `JdbcRegistryLockManager` (or equivalent) use row-level locking or specific database fields to ensure atomic lock operations, maintaining consistency even when multiple masters/workers compete for the same lock.
+
+    ***
+
+    By leveraging JDBC for both **event monitoring** and **distributed locking**, DolphinScheduler can achieve reliable task coordination and scheduling without relying on external registry centers, making it an attractive option for environments that prefer or already have robust database infrastructure.
+
+- **Etcd**
+
+  DolphinScheduler also provides an Etcd-based registry implementation. The Etcd-based registry, implemented in the module `dolphinscheduler-registry/dolphinscheduler-registry-plugins/dolphinscheduler-registry-etcd`, leverages the Jetcd client library to interact with an Etcd cluster. This implementation provides several key functionalities:
+
+  - **Event Monitoring**
+    - **Watch API**
+      The `EtcdRegistry` class uses Etcd’s Watch API to observe changes (creation, update, or deletion) on specified keys or key prefixes. Low-level Etcd watch events are translated into DolphinScheduler’s `Event` objects, triggering `SubscribeListener` callbacks for real-time notifications.
+  - **Distributed Lock**
+    - **Lease-Based Locking**
+      The `EtcdKeepAliveLeaseManager` grants a lease with a specified TTL, continuously kept alive via Etcd’s keep-alive mechanism. If the client disconnects, the lease expires automatically, releasing the lock without manual intervention.
+
+    - **Connection Health Monitoring**
+      The `EtcdConnectionStateListener` tracks the connection state between DolphinScheduler and the Etcd cluster. Upon disconnection or reconnection, it re-establishes locks or re-registers services as needed.
+
+  - **Configuration**
+
+    - **Flexible Configuration**
+      The behavior of the Etcd registry is controlled by `EtcdRegistryProperties`, which maps various settings (endpoints, namespace, SSL, authentication, etc.) from configuration files. These settings are integrated into the Spring Boot auto-configuration process via `EtcdRegistryAutoConfiguration`, ensuring that the Etcd registry is instantiated automatically when `registry.type` is set to `"etcd"`.
+
+    Together, these components ensure that DolphinScheduler can reliably use Etcd as an alternative registry center. This is especially useful in cloud-native environments where low latency, high scalability, and ease of deployment are critical.
+
+    ***
 
   We have also implemented queues based on Redis, but we hope DolphinScheduler depends on as few components as possible, so we finally removed the Redis implementation.
 
-* **AlertServer**
+- **AlertServer**
 
   Provides alarm services, and implements rich alarm methods through alarm plugins.
 
-* **API**
+- **API**
 
   The API interface layer is mainly responsible for processing requests from the front-end UI layer. The service uniformly provides RESTful APIs to provide request services to external.
 
-* **UI**
+- **UI**
 
   The front-end page of the system provides various visual operation interfaces of the system, see more at [Introduction to Functions](../guide/homepage.md) section.
 
@@ -222,4 +271,3 @@ In the early schedule design, if there is no priority design and use the fair sc
 ## Sum Up
 
 From the perspective of scheduling, this article preliminarily introduces the architecture principles and implementation ideas of the big data distributed workflow scheduling system: DolphinScheduler. To be continued.
-
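
Reading aid for the "Ephemeral vs. Persistent Locks" item added in this diff: a minimal Java sketch of how an ephemeral lock record could lapse once its owner stops sending heartbeats. This is not the code in `dolphinscheduler-registry-jdbc`. The class `EphemeralLockTable`, its method names, the in-memory map standing in for a database table, and the TTL value are all assumptions made for illustration. Timestamps are passed in explicitly so the behavior is deterministic.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical stand-in for a lock table: one "row" per lock key.
 * A real JDBC registry would keep these rows in a relational database.
 */
class EphemeralLockTable {
    private static final class Row {
        final String owner;
        long lastHeartbeatMillis;

        Row(String owner, long now) {
            this.owner = owner;
            this.lastHeartbeatMillis = now;
        }
    }

    private final Map<String, Row> rows = new HashMap<>();
    private final long ttlMillis; // assumed heartbeat timeout

    EphemeralLockTable(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    private boolean isExpired(Row row, long now) {
        return now - row.lastHeartbeatMillis > ttlMillis;
    }

    /** Takes the lock if no live row exists; an expired row is cleaned up first. */
    synchronized boolean tryAcquire(String key, String owner, long now) {
        Row row = rows.get(key);
        if (row != null && isExpired(row, now)) {
            rows.remove(key); // owner missed its heartbeats: release on its behalf
            row = null;
        }
        if (row != null) {
            return false; // someone else still holds a live lock
        }
        rows.put(key, new Row(owner, now));
        return true;
    }

    /** Refreshes the row; fails if the caller is not the owner or the row already expired. */
    synchronized boolean heartbeat(String key, String owner, long now) {
        Row row = rows.get(key);
        if (row == null || !row.owner.equals(owner) || isExpired(row, now)) {
            return false;
        }
        row.lastHeartbeatMillis = now;
        return true;
    }

    /** Deletes the row, but only when called by the current owner. */
    synchronized boolean release(String key, String owner) {
        Row row = rows.get(key);
        if (row == null || !row.owner.equals(owner)) {
            return false;
        }
        rows.remove(key);
        return true;
    }
}
```

In this sketch a second client can take the lock only after the first owner's last heartbeat is older than the TTL, which mirrors the "heartbeat lapse, then cleanup" behavior the doc text describes. A database-backed version would need an atomic insert or row-level lock in place of `synchronized`.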
