This is an automated email from the ASF dual-hosted git repository.

yiguolei pushed a commit to branch branch-2.1
in repository https://gitbox.apache.org/repos/asf/doris.git

commit aee49adf1efb3b0584758f1dccabcf0af928b6e9
Author: Mingyu Chen <morning...@163.com>
AuthorDate: Sat Feb 24 08:37:33 2024 +0800

    [opt](compute-node) refactor compute node doc and opt some default config (#31325)
    
    * [opt](compute-node) refactor compute node doc and opt some default config
    
    * 1
    
    * 1
---
 docs/en/docs/lakehouse/compute-node.md             | 145 +++++++++++++++++++++
 docs/sidebars.json                                 |   2 +-
 docs/zh-CN/docs/advanced/compute-node.md           | 111 ----------------
 docs/zh-CN/docs/lakehouse/compute-node.md          | 143 ++++++++++++++++++++
 .../main/java/org/apache/doris/common/Config.java  |  29 +++--
 .../org/apache/doris/system/BeSelectionPolicy.java |   8 +-
 6 files changed, 314 insertions(+), 124 deletions(-)

diff --git a/docs/en/docs/lakehouse/compute-node.md b/docs/en/docs/lakehouse/compute-node.md
new file mode 100644
index 00000000000..6d6f18a3d85
--- /dev/null
+++ b/docs/en/docs/lakehouse/compute-node.md
@@ -0,0 +1,145 @@
+---
+{
+    "title": "Compute Node",
+    "language": "en"
+}
+---
+
+<!-- 
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Compute Node
+
+<version since="1.2.1">
+</version>
+
+Starting from version 1.2.1, Doris supports the Compute Node.
+
+Starting from this release, BE nodes can be divided into two categories:
+
+- Mix
+
+       Mix node, the default BE node type. A mix node both participates in computation and stores user data.
+
+- Computation
+
+       Compute node. It performs computation only and does not store data.
+
+As a special type of BE node, a compute node has no data storage capability and is only responsible for computation.
+A compute node can therefore be regarded as a stateless BE node, and such nodes can be added and removed easily.
+
+Compute nodes are suitable for the following scenarios:
+
+- Query external data sources
+
+       Compute nodes can be used to query external data sources such as Hive, Iceberg, and JDBC. Doris does not store the data of external data sources, so adding compute nodes is an easy way to scale the computing capacity available for querying them. Compute nodes can also be configured with cache directories to cache hot data from external data sources and further accelerate reads.
+
+## Usage of Compute Node
+
+### Add Compute Node
+
+Add configuration in BE's `be.conf` file:
+
+`be_node_role=computation`
+
+Then start the BE node; it will run as a compute node.
+
+You can then connect to Doris through the MySQL client and execute:
+
+`ALTER SYSTEM ADD BACKEND`
+
+This adds the BE node to the cluster. Once it has been added, the `NodeRole` column of `SHOW BACKENDS` shows the node type as `computation`.
+
+### Use Compute Node
+
+To use compute nodes, the following conditions need to be met:
+
+- The cluster contains compute nodes.
+- `fe.conf` contains the setting `prefer_compute_node_for_external_table = true`.
+
+In addition, the following FE configuration affects how compute nodes are used:
+
+- `min_backend_num_for_external_table`
+
+       In Doris 2.0 and earlier, the default value of this parameter is 3. Starting from version 2.1, the default value is -1.
+       
+       This parameter is the minimum number of BE nodes expected to participate in external data queries. `-1` means that the value equals the number of compute nodes in the current cluster.
+       
+       For example, assume that there are 3 compute nodes and 5 mix nodes in the cluster.
+       
+       If `min_backend_num_for_external_table` is less than or equal to 3, external table queries use only the 3 compute nodes. If it is greater than 3, say 6, external table queries use the 3 compute nodes plus 3 additional mix nodes.
+       
+       In summary, this configuration sets the minimum number of BE nodes that participate in external table queries, and compute nodes are selected first.
+       
+> Note:
+>
+> 1. Setting `min_backend_num_for_external_table` to `-1` is supported only from version 2.1. In earlier versions, this parameter must be a positive number. This configuration takes effect only when `prefer_compute_node_for_external_table = true`.
+>
+> 2. If `prefer_compute_node_for_external_table` is `false`, external table queries may select any BE node.
+>
+> 3. If there are no compute nodes in the cluster, none of the above configurations take effect.
+>
+> 4. If `min_backend_num_for_external_table` is greater than the total number of BE nodes, at most all BE nodes are selected.
+>
+> 5. The above configurations can be modified at runtime without restarting the FE node, and they must be set on all FE nodes.
+
+## Best Practices
+
+### Resource isolation and elastic scaling for federated queries
+
+In federated query scenarios, users can deploy a dedicated group of compute nodes specifically for querying external table data. This allows for the isolation of query loads for external tables (such as large-scale analysis on Hive) from the query loads for internal tables (such as low-latency, fast data analysis).
+
+Moreover, as compute nodes are stateless Backend (BE) nodes, they can be easily scaled up or down. For instance, a cluster of elastic compute nodes can be deployed using Kubernetes, allowing for the utilization of more compute nodes for data lake analysis during peak business periods, and rapid scaling down during off-peak times to reduce costs.
+
+## FAQ
+
+1. Can compute nodes and mix nodes be interconverted?
+
+    Compute nodes can be converted to mix nodes. However, mix nodes cannot be converted to compute nodes.
+    
+    - Converting compute nodes to mix nodes
+
+        1. Stop the BE node.
+        2. Remove the `be_node_role` configuration from `be.conf`, or set it to `be_node_role=mix`.
+        3. Configure the correct `storage_root_path` as the data storage directory.
+        4. Start the BE node.
+
+    - Converting mix nodes to compute nodes
+
+        In principle, this operation is not supported because mix nodes already store data. If conversion is necessary, first perform a safe node decommission, then set the node up as a compute node as if it were a new node.
+
+               
+2. Do compute nodes need to configure a data storage directory?
+
+    Yes. The data storage directory of a compute node will not store user data but will hold some information files of the BE node itself, such as `cluster_id`, as well as some temporary files generated during running.
+
+    The storage directory for compute nodes requires very little disk space (on the order of MBs) and can be destroyed at any time along with the node without affecting user data.
+
+3. Can compute nodes and mix nodes configure a file cache directory?
+
+    [File cache](./filecache.md) accelerates subsequent queries for the same data by caching data files from recently accessed remote storage systems (HDFS or object storage).
+    
+    Both compute and mix nodes can set up a file cache directory, which needs to be created in advance.
+    
+    Additionally, Doris employs strategies like consistent hashing to minimize the probability of cache invalidation when nodes are scaled up or down.
+
+       
+4. Do compute nodes need to be decommissioned through the DECOMMISSION operation?
+
+    No. Compute nodes can be removed directly using the `DROP BACKEND` operation.
diff --git a/docs/sidebars.json b/docs/sidebars.json
index e4a10438eb4..6189b797e60 100644
--- a/docs/sidebars.json
+++ b/docs/sidebars.json
@@ -187,7 +187,6 @@
                 "advanced/sql-mode",
                 "advanced/small-file-mgr",
                 "advanced/cold-hot-separation",
-                "advanced/compute-node",
                 "advanced/lateral-view",
                 "advanced/auto-increment"
             ]
@@ -257,6 +256,7 @@
                 },
                 "lakehouse/file",
                 "lakehouse/filecache",
+                "lakehouse/compute-node",
                 "lakehouse/external-statistics",
                 "lakehouse/sql-dialect",
                 "lakehouse/fs-benchmark-tool",
diff --git a/docs/zh-CN/docs/advanced/compute-node.md b/docs/zh-CN/docs/advanced/compute-node.md
deleted file mode 100644
index 20c12413634..00000000000
--- a/docs/zh-CN/docs/advanced/compute-node.md
+++ /dev/null
@@ -1,111 +0,0 @@
----
-{
-    "title": "计算节点",
-    "language": "zh-CN"
-}
----
-
-<!-- 
-Licensed to the Apache Software Foundation (ASF) under one
-or more contributor license agreements.  See the NOTICE file
-distributed with this work for additional information
-regarding copyright ownership.  The ASF licenses this file
-to you under the Apache License, Version 2.0 (the
-"License"); you may not use this file except in compliance
-with the License.  You may obtain a copy of the License at
-
-  http://www.apache.org/licenses/LICENSE-2.0
-
-Unless required by applicable law or agreed to in writing,
-software distributed under the License is distributed on an
-"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
-KIND, either express or implied.  See the License for the
-specific language governing permissions and limitations
-under the License.
--->
-
-# Compute Node
-
-<version since="1.2.1">
-</version>
-
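-## Motivation
-Doris currently uses a typical shared-nothing architecture and achieves very good performance by binding data and compute resources to the same node.
-However, as the performance of the Doris compute engine keeps improving, more and more users are choosing to query data lake data directly with Doris.
-This is a shared-disk scenario: the data usually lives in remote HDFS/S3, the computation runs in Doris, and Doris fetches the data over the network and completes the computation in memory.
-When both workloads are mixed in the same cluster, the current Doris architecture has the following shortcomings:
-1. Poor resource isolation. The two workloads have different response requirements on the cluster, and they affect each other when deployed together.
-2. When scaling out the cluster, data lake queries only need more compute resources, but currently storage and compute can only be scaled together, which lowers disk utilization.
-3. Poor scaling efficiency. Scaling out triggers tablet data migration, and the overall process is slow. Data lake queries have pronounced peaks and troughs and need hour-level elasticity.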
-
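-## Solution
-Implement a BE node role dedicated to federated computing: the `compute node`. Compute nodes specifically handle remote federated queries such as data lake queries.
-The original BE node type is called the `mix node`; this type of node both executes SQL queries and manages tablet data storage.
-A `compute node` only executes SQL queries and does not store any data.
-
-With compute nodes, the cluster deployment topology also changes: mix nodes handle computation on OLAP tables and scale with storage demand, while compute nodes handle federated queries and scale with compute load.
-
-In addition, because compute nodes have no storage, they can be co-located on machines with HDD disks or deployed in containers.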
-
-## Usage of Compute Node
-
-### Configuration
-Add the following configuration item to the BE configuration file be.conf:
-```
-be_node_role=computation
-```
-
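-This item defaults to `mix`, the original BE node type. After it is set to `computation`, the node becomes a compute node.
-
-You can check the `NodeRole` field in the output of `show backends\G`: `mix` indicates a mix node, and `computation` indicates a compute node.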
-
-```sql
-*************************** 1. row ***************************
-              BackendId: 10010
-                Cluster: default_cluster
-                     IP: 10.248.181.219
-          HeartbeatPort: 9050
-                 BePort: 9060
-               HttpPort: 8040
-               BrpcPort: 8060
-          LastStartTime: 2022-11-30 23:01:40
-          LastHeartbeat: 2022-12-05 15:01:18
-                  Alive: true
-   SystemDecommissioned: false
-  ClusterDecommissioned: false
-              TabletNum: 753
-       DataUsedCapacity: 1.955 GB
-          AvailCapacity: 202.987 GB
-          TotalCapacity: 491.153 GB
-                UsedPct: 58.67 %
-         MaxDiskUsedPct: 58.67 %
-     RemoteUsedCapacity: 0.000
-                    Tag: {"location" : "default"}
-                 ErrMsg:
-                Version: doris-0.0.0-trunk-80baca264
-                 Status: {"lastSuccessReportTabletsTime":"2022-12-05 15:00:38","lastStreamLoadTime":-1,"isQueryDisabled":false,"isLoadDisabled":false}
-HeartbeatFailureCounter: 0
-               NodeRole: computation
-```
-
-### Usage
-
-Add the following configuration items to fe.conf:
-
-```
-prefer_compute_node_for_external_table=true
-min_backend_num_for_external_table=3
-```
-
-> For parameter descriptions, see [FE configuration items](../admin-manual/config/fe-config.md)
-
-When a query uses the [MultiCatalog](../lakehouse/multi-catalog/multi-catalog.md) feature, it is preferentially scheduled to compute nodes.
-
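-### Limitations
-
-- Compute nodes are controlled by a configuration item, but do not reconfigure a mix node as a compute node.
-
-## Future Work
-
-- Compute spillover: for queries on Doris internal tables, when the cluster load is high, upper-layer operators (those other than TableScan) are scheduled to compute nodes.
-- Graceful offline: when a node goes offline, new tasks are automatically scheduled to other nodes; the node goes offline only after all existing tasks complete; if existing tasks cannot finish in time, they can terminate themselves.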
diff --git a/docs/zh-CN/docs/lakehouse/compute-node.md b/docs/zh-CN/docs/lakehouse/compute-node.md
new file mode 100644
index 00000000000..5fb278b4574
--- /dev/null
+++ b/docs/zh-CN/docs/lakehouse/compute-node.md
@@ -0,0 +1,143 @@
+---
+{
+    "title": "计算节点",
+    "language": "zh-CN"
+}
+---
+
+<!-- 
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Compute Node
+
+<version since="1.2.1">
+</version>
+
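+Starting from version 1.2.1, Doris supports the Compute Node feature.
+
+Starting from this version, BE nodes fall into two categories:
+
+- Mix
+
+       Mix node, the default BE node type. This type of node both participates in computation and stores Doris data.
+
+- Computation
+
+       Compute node. It does not store data and is only responsible for computation.
+
+As a special type of BE node, a compute node has no data storage capability and is only responsible for computation.
+A compute node can therefore be regarded as a stateless BE node, and such nodes can be added and removed easily.
+
+Compute nodes are suitable for the following scenarios:
+
+- Query external data sources
+
+       Compute nodes can be used to query external data sources such as Hive, Iceberg, and JDBC. Doris does not store the data of external data sources, so compute nodes make it easy to scale the computing capacity for them. Compute nodes can also be configured with a cache directory to cache hot data from external data sources and further speed up reads.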
+
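+## Usage of Compute Node
+
+### Add Compute Node
+
+Add the following setting to the BE `be.conf` configuration file:
+
+`be_node_role=computation`
+
+Then start the BE node; it will run as a compute node.
+
+You can then connect to Doris with a MySQL client and execute:
+
+`ALTER SYSTEM ADD BACKEND`
+
+This adds the BE node. Once it has been added, the `NodeRole` column of `SHOW BACKENDS` shows the node type as `computation`.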
+
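+### Use Compute Node
+
+To use compute nodes, the following conditions must be met:
+
+- The cluster contains compute nodes.
+- `fe.conf` contains the setting `prefer_compute_node_for_external_table = true`.
+
+In addition, the following FE configuration item affects how compute nodes are used:
+
+- `min_backend_num_for_external_table`
+
+       In Doris 2.0 and earlier, the default value of this parameter is 3. Starting from version 2.1, the default value is -1.
+       
+       This parameter is the minimum number of BE nodes expected to participate in external table queries. `-1` means that the value equals the number of compute nodes in the current cluster.
+       
+       For example, assume that there are 3 compute nodes and 5 mix nodes in the cluster.
+       
+       If `min_backend_num_for_external_table` is less than or equal to 3, external table queries use only the 3 compute nodes. If it is greater than 3, say 6, external table queries use the 3 compute nodes plus 3 additional mix nodes.
+       
+       In summary, this parameter sets the minimum number of BE nodes that participate in external table computation, and compute nodes are selected first.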
+       
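+> Note:
+> 
+> 1. Setting `min_backend_num_for_external_table` to `-1` is supported only from version 2.1. In earlier versions, this parameter must be a positive number. This parameter takes effect only when `prefer_compute_node_for_external_table = true`.
+> 
+> 2. If `prefer_compute_node_for_external_table` is `false`, external table queries may select any BE node.
+> 
+> 3. If there are no compute nodes in the cluster, none of the above parameters take effect.
+> 
+> 4. If `min_backend_num_for_external_table` is greater than the total number of BE nodes, at most all BE nodes are selected.
+> 
+> 5. The above parameters can be modified at runtime without restarting the FE node, and they must be set on all FE nodes.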
+
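+## Best Practices
+
+### Load isolation and elastic scaling for federated queries
+
+In federated query scenarios, users can deploy a dedicated group of compute nodes for querying external table data. This isolates the query load of external tables (such as large-scale analysis on Hive) from the query load of internal tables (such as low-latency, fast data analysis).
+
+In addition, as stateless BE nodes, compute nodes can be scaled out and in easily. For example, an elastic compute node cluster can be deployed with k8s, using more compute nodes for data lake analysis during business peaks and scaling in quickly during off-peak periods to reduce costs.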
+
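+## FAQ
+
+1. Can mix nodes and compute nodes be converted into each other?
+
+       Compute nodes can be converted to mix nodes. However, mix nodes cannot be converted to compute nodes.
+       
+       - Converting a compute node to a mix node
+
+               1. Stop the BE node.
+               2. Remove the `be_node_role` configuration from `be.conf`, or set it to `be_node_role=mix`.
+               3. Configure the correct `storage_root_path` as the data storage directory.
+               4. Start the BE node.
+
+       - Converting a mix node to a compute node
+
+               In principle, this operation is not supported because mix nodes already store data. If conversion is necessary, first perform a safe node decommission (Decommission), then set the node up as a compute node as if it were a new node.
+               
+2. Do compute nodes need to configure a data storage directory?
+
+       Yes. The data storage directory of a compute node does not store user data. It only holds some information files of the BE node itself, such as `cluster_id`, and some temporary files generated at runtime.
+       
+       The storage directory of a compute node needs very little disk space (on the order of MBs) and can be destroyed together with the node at any time without affecting user data.
+       
+3. Can compute nodes and mix nodes configure a file cache directory?
+
+       [File cache](./filecache.md) accelerates subsequent queries on the same data by caching data files from recently accessed remote storage systems (HDFS or object storage).
+       
+       Both compute nodes and mix nodes can set up a file cache directory. The directory must be created in advance.
+       
+       In addition, Doris uses strategies such as consistent hashing to minimize the probability of cache invalidation when nodes are scaled out or in.
+       
+4. Do compute nodes need to be taken offline through the DECOMMISSION operation?
+
+       No. Compute nodes can be removed directly using the `DROP BACKEND` operation.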
\ No newline at end of file
diff --git a/fe/fe-common/src/main/java/org/apache/doris/common/Config.java b/fe/fe-common/src/main/java/org/apache/doris/common/Config.java
index 21e4f7ddcd5..48741df7bb3 100644
--- a/fe/fe-common/src/main/java/org/apache/doris/common/Config.java
+++ b/fe/fe-common/src/main/java/org/apache/doris/common/Config.java
@@ -1803,16 +1803,27 @@ public class Config extends ConfigBase {
      * And the max number of compute node is controlled by min_backend_num_for_external_table.
      * If set to false, query on external table will assign to any node.
      */
-    @ConfField(mutable = true, masterOnly = false)
+    @ConfField(mutable = true, description = {"如果设置为true,外部表的查询将优先分配给计算节点。",
+            "并且计算节点的最大数量由min_backend_num_for_external_table控制。",
+            "如果设置为false,外部表的查询将分配给任何节点。"
+                    + "如果集群内没有计算节点,则该参数不生效。",
+            "If set to true, query on external table will prefer to assign to compute node. "
+                    + "And the max number of compute node is controlled by min_backend_num_for_external_table. "
+                    + "If set to false, query on external table will assign to any node. "
+                    + "If there is no compute node in cluster, this config takes no effect."})
     public static boolean prefer_compute_node_for_external_table = false;
-    /**
-     * Only take effect when prefer_compute_node_for_external_table is true.
-     * If the compute node number is less than this value, query on external table will try to get some mix node
-     * to assign, to let the total number of node reach this value.
-     * If the compute node number is larger than this value, query on external table will assign to compute node only.
-     */
-    @ConfField(mutable = true, masterOnly = false)
-    public static int min_backend_num_for_external_table = 3;
+
+    @ConfField(mutable = true, description = {"只有当prefer_compute_node_for_external_table为true时生效,"
+            + "如果计算节点数小于这个值,外部表的查询会尝试获取一些混合节点来分配,以使节点总数达到这个值。"
+            + "如果计算节点数大于这个值,外部表的查询将只分配给计算节点。-1表示只是用当前数量的计算节点",
+            "Only take effect when prefer_compute_node_for_external_table is true. "
+                    + "If the compute node number is less than this value, "
+                    + "query on external table will try to get some mix node to assign, "
+                    + "to let the total number of node reach this value. "
+                    + "If the compute node number is larger than this value, "
+                    + "query on external table will assign to compute node only. "
+                    + "-1 means only use current compute node."})
+    public static int min_backend_num_for_external_table = -1;
 
     /**
      * Max query profile num.
diff --git a/fe/fe-core/src/main/java/org/apache/doris/system/BeSelectionPolicy.java b/fe/fe-core/src/main/java/org/apache/doris/system/BeSelectionPolicy.java
index ace2ab3e1e4..2c766221acb 100644
--- a/fe/fe-core/src/main/java/org/apache/doris/system/BeSelectionPolicy.java
+++ b/fe/fe-core/src/main/java/org/apache/doris/system/BeSelectionPolicy.java
@@ -181,8 +181,10 @@ public class BeSelectionPolicy {
             filterBackends = preLocationFilterBackends;
         }
         Collections.shuffle(filterBackends);
+        int numComputeNode = filterBackends.stream().filter(Backend::isComputeNode).collect(Collectors.toList()).size();
         List<Backend> candidates = new ArrayList<>();
-        if (preferComputeNode) {
+        if (preferComputeNode && numComputeNode > 0) {
+            int realExpectBeNum = expectBeNum == -1 ? numComputeNode : expectBeNum;
             int num = 0;
             // pick compute node first
             for (Backend backend : filterBackends) {
@@ -192,10 +194,10 @@ public class BeSelectionPolicy {
                 }
             }
             // fill with some mix node.
-            if (num < expectBeNum) {
+            if (num < realExpectBeNum) {
                 for (Backend backend : filterBackends) {
                     if (backend.isMixNode()) {
-                        if (num >= expectBeNum) {
+                        if (num >= realExpectBeNum) {
                             break;
                         }
                         candidates.add(backend);
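
The selection behavior changed by the hunks above can be illustrated with a small standalone sketch. It is not the actual `BeSelectionPolicy` code: `BeSelectionSketch`, `Node`, `select`, and `sampleCluster` are hypothetical names introduced only for illustration, the shuffle and the other filters are left out, and the fallback branch (use every backend when compute nodes are not preferred or none exist) is an assumption, since that part of the method is not visible in this diff. The sample cluster mirrors the documentation example of 3 compute nodes and 5 mix nodes.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; not the actual BeSelectionPolicy class. */
final class BeSelectionSketch {

    /** Hypothetical stand-in for Doris's Backend; only the node role matters here. */
    static final class Node {
        final String name;
        final boolean computeNode;

        Node(String name, boolean computeNode) {
            this.name = name;
            this.computeNode = computeNode;
        }
    }

    /** Picks backends for an external table query; expectBeNum == -1 means "compute nodes only". */
    static List<Node> select(List<Node> backends, boolean preferComputeNode, int expectBeNum) {
        int numComputeNode = 0;
        for (Node node : backends) {
            if (node.computeNode) {
                numComputeNode++;
            }
        }
        List<Node> candidates = new ArrayList<>();
        if (preferComputeNode && numComputeNode > 0) {
            // -1 resolves to the number of compute nodes currently in the cluster.
            int realExpectBeNum = expectBeNum == -1 ? numComputeNode : expectBeNum;
            // Pick every compute node first.
            for (Node node : backends) {
                if (node.computeNode) {
                    candidates.add(node);
                }
            }
            // Fill with mix nodes until the expected number is reached.
            for (Node node : backends) {
                if (candidates.size() >= realExpectBeNum) {
                    break;
                }
                if (!node.computeNode) {
                    candidates.add(node);
                }
            }
        } else {
            // Assumption: without the preference (or without compute nodes) any backend may be used.
            candidates.addAll(backends);
        }
        return candidates;
    }

    /** Sample cluster from the documentation example: 3 compute nodes and 5 mix nodes. */
    static List<Node> sampleCluster() {
        List<Node> cluster = new ArrayList<>();
        for (int i = 1; i <= 3; i++) {
            cluster.add(new Node("compute-" + i, true));
        }
        for (int i = 1; i <= 5; i++) {
            cluster.add(new Node("mix-" + i, false));
        }
        return cluster;
    }

    public static void main(String[] args) {
        List<Node> cluster = sampleCluster();
        for (int expect : new int[] {-1, 3, 6, 100}) {
            System.out.println(expect + " -> " + select(cluster, true, expect).size());
        }
    }
}
```

With the sample cluster, the values -1, 3, 6, and 100 select 3, 3, 6, and 8 backends respectively, which matches the documented behavior: compute nodes come first, mix nodes fill the remainder, and the result never exceeds the total number of BE nodes.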

