This is an automated email from the ASF dual-hosted git repository. yiguolei pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/doris-website.git
The following commit(s) were added to refs/heads/master by this push:
     new 703c8cd2f6 update workload-management: summary & concurrency EN (#1658)
703c8cd2f6 is described below

commit 703c8cd2f6d41610f384bbb8643a424f7ccccab5
Author: wangtianyi2004 <376612...@qq.com>
AuthorDate: Sat Dec 28 20:23:30 2024 +0800

    update workload-management: summary & concurrency EN (#1658)

    ## Versions

    - [ ] dev
    - [x] 3.0
    - [x] 2.1
    - [ ] 2.0

    ## Languages

    - [ ] Chinese
    - [x] English

    ## Docs Checklist

    - [ ] Checked by AI
    - [ ] Test Cases Built
---
 .../workload-management/compute-group.md           |   4 +-
 .../workload-management-summary.md                 |   2 +-
 .../workload-management/compute-group.md           | 142 ---------------------
 .../concurrency-control-and-queuing.md             | 104 +++++++++++++++
 .../workload-management-summary.md                 |  30 ++++-
 .../workload-management/compute-group.md           |   4 +-
 .../concurrency-control-and-queuing.md             | 104 +++++++++++++++
 .../workload-management-summary.md                 |  25 ++++
 8 files changed, 267 insertions(+), 148 deletions(-)

diff --git a/docs/admin-manual/workload-management/compute-group.md b/docs/admin-manual/workload-management/compute-group.md
index 50123e887c..dbaa506c03 100644
--- a/docs/admin-manual/workload-management/compute-group.md
+++ b/docs/admin-manual/workload-management/compute-group.md
@@ -1,6 +1,6 @@
 ---
 {
-"title": "Managing Compute Groups",
+"title": "Compute Groups",
 "language": "en"
 }
 ---
@@ -150,4 +150,4 @@ If the database or compute group name contains reserved keywords, the correspond

 ## Scaling Compute Groups

-You can scale compute groups by adding or removing BE using `ALTER SYSTEM ADD BACKEND` and `ALTER SYSTEM DECOMMISION BACKEND`.
\ No newline at end of file
+You can scale compute groups by adding or removing BE using `ALTER SYSTEM ADD BACKEND` and `ALTER SYSTEM DECOMMISION BACKEND`.
diff --git a/docs/admin-manual/workload-management/workload-management-summary.md b/docs/admin-manual/workload-management/workload-management-summary.md index e5ef94b3f7..783e3d3594 100644 --- a/docs/admin-manual/workload-management/workload-management-summary.md +++ b/docs/admin-manual/workload-management/workload-management-summary.md @@ -1,6 +1,6 @@ --- { -"title": "Workload Management Overview", +"title": "Overview", "language": "en" } --- diff --git a/versioned_docs/version-2.1/admin-manual/workload-management/compute-group.md b/versioned_docs/version-2.1/admin-manual/workload-management/compute-group.md deleted file mode 100644 index 1055a274bf..0000000000 --- a/versioned_docs/version-2.1/admin-manual/workload-management/compute-group.md +++ /dev/null @@ -1,142 +0,0 @@ ---- -{ -"title": "Managing Compute Groups", -"language": "en" -} ---- - -<!-- -Licensed to the Apache Software Foundation (ASF) under one -or more contributor license agreements. See the NOTICE file -distributed with this work for additional information -regarding copyright ownership. The ASF licenses this file -to you under the Apache License, Version 2.0 (the -"License"); you may not use this file except in compliance -with the License. You may obtain a copy of the License at - - http://www.apache.org/licenses/LICENSE-2.0 - -Unless required by applicable law or agreed to in writing, -software distributed under the License is distributed on an -"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -KIND, either express or implied. See the License for the -specific language governing permissions and limitations -under the License. ---> - -In a compute-storage decoupled architecture, one or more compute nodes (BE) can be grouped into a Compute Group. 
This document describes how to use compute groups, including operations such as: - -- Viewing all compute groups -- Granting compute group access -- Binding compute groups at the user level (`default_compute_group`) for user-level isolation - -*Note* -In versions prior to 3.0.2, this was referred to as a Compute Cluster. - -## Viewing All Compute Groups - -You can view all compute groups owned by the current repository using `SHOW COMPUTE GROUPS`. - -```sql -SHOW COMPUTE GROUPS; -``` - -## Adding Compute Groups - -Using [Add BE ](../sql-manual/sql-statements/Cluster-Management-Statements/ALTER-SYSTEM-ADD-BACKEND.md) to add a BE into a compute group, for example: - -```sql -ALTER SYSTEM ADD BACKEND 'host:9050' PROPERTIES ("tag.compute_group_name" = "new_group"); -``` - -The above sql will add `host:9050` to compute group `new_group`. The BE will be added to compute group `default_compute_group` if you omit PROPERTIES statement, for example: - -```sql -ALTER SYSTEM ADD BACKEND 'host:9050'; -``` - -## Granting Compute Group Access - -```sql -GRANT USAGE_PRIV ON COMPUTE GROUP {compute_group_name} TO {user} -``` - -## Revoking Compute Group Access - -```sql -REVOKE USAGE_PRIV ON COMPUTE GROUP {compute_group_name} FROM {user} -``` - -## Setting Default Compute Group - -To set the default compute group for the current user: - -```sql -SET PROPERTY 'default_compute_group' = '{clusterName}'; -``` - -To set the default compute group for other users (this operation requires Admin privileges): - -```sql -SET PROPERTY FOR {user} 'default_compute_group' = '{clusterName}'; -``` - -To view the current user's default compute group, the value of `default_compute_group` in the returned result is the default compute group: - -```sql -SHOW PROPERTY; -``` - -To view the default compute group of other users, this operation requires the current user to have relevant permissions, and the value of `default_compute_group` in the returned result is the default compute group: - -```sql -SHOW 
PROPERTY FOR {user}; -``` - -To view all available compute groups in the current repository: - -```sql -SHOW COMPUTE GROUPS; -``` - -:::info Note - -- If the current user has an Admin role, for example: `CREATE USER jack IDENTIFIED BY '123456' DEFAULT ROLE "admin"`, then: - - They can set the default compute group for themselves and other users; - - They can view their own and other users' `PROPERTY`. -- If the current user does not have an Admin role, for example: `CREATE USER jack1 IDENTIFIED BY '123456'`, then: - - They can set the default compute group for themselves; - - They can view their own `PROPERTY`; - - They cannot view all compute groups, as this operation requires `GRANT ADMIN` privileges. -- If the current user has not configured a default compute group, the existing system will trigger an error when performing data read/write operations. To resolve this issue, the user can execute the `use @cluster` command to specify the compute group used by the current context, or use the `SET PROPERTY` statement to set the default compute group. -- If the current user has configured a default compute group, but that cluster is subsequently deleted, an error will also be triggered during data read/write operations. The user can execute the `use @cluster` command to re-specify the compute group used by the current context, or use the `SET PROPERTY` statement to update the default cluster settings. - -::: - -## Default Compute Group Selection Mechanism - -When a user has not explicitly set a default compute group, the system will automatically select a compute group with Active BE that the user has usage permissions for. Once the default compute group is determined in a specific session, it will remain unchanged during that session unless the user explicitly changes the default setting. 
- -In different sessions, if the following situations occur, the system may automatically change the user's default compute group: - -- The user has lost usage permissions for the default compute group selected in the last session -- A compute group has been added or removed -- The previously selected default compute group no longer has Active BE - -Situations one and two will definitely lead to a change in the automatically selected default compute group, while situation three may lead to a change. - -## Switching Compute Groups - -Users can specify the database and compute group to use in a compute-storage decoupled architecture. - -**Syntax** - -```sql -USE { [catalog_name.]database_name[@compute_group_name] | @compute_group_name } -``` - -If the database or compute group name contains reserved keywords, the corresponding name must be enclosed in backticks ```. - -## Scaling Compute Groups - -You can scale compute groups by adding or removing BE using `ALTER SYSTEM ADD BACKEND` and `ALTER SYSTEM DECOMMISION BACKEND`. diff --git a/versioned_docs/version-2.1/admin-manual/workload-management/concurrency-control-and-queuing.md b/versioned_docs/version-2.1/admin-manual/workload-management/concurrency-control-and-queuing.md index e69de29bb2..3a40fee7a0 100644 --- a/versioned_docs/version-2.1/admin-manual/workload-management/concurrency-control-and-queuing.md +++ b/versioned_docs/version-2.1/admin-manual/workload-management/concurrency-control-and-queuing.md @@ -0,0 +1,104 @@ +--- +{ +"title": "Concurrency Control and Queuing", +"language": "en" +} +--- + +<!-- +Licensed to the Apache Software Foundation (ASF) under one +or more contributor license agreements. See the NOTICE file +distributed with this work for additional information +regarding copyright ownership. The ASF licenses this file +to you under the Apache License, Version 2.0 (the +"License"); you may not use this file except in compliance +with the License. 
You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, +software distributed under the License is distributed on an +"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +KIND, either express or implied. See the License for the +specific language governing permissions and limitations +under the License. +--> + +Concurrency control and queuing is a resource management mechanism. When multiple queries simultaneously request resources and reach the system's concurrency limit, Doris will manage the queries based on predefined strategies and restrictions, ensuring the system can still operate smoothly under high load and avoid issues like OOM (Out of Memory) or system freezes. + +Doris's concurrency control and queuing mechanism is primarily implemented through workload groups. A workload group defines the resource usage limits for queries, including maximum concurrency, queue length, and timeout parameters. By properly configuring these parameters, the goal of resource management can be achieved. + +## Basic usage + +``` +create workload group if not exists queue_group +properties ( + "max_concurrency" = "10", + "max_queue_size" = "20", + "queue_timeout" = "3000" +); +``` + +**Parameter description** + + +| Property | Data type | Default value | Value range | Description | +|-----------------|-----------|---------------|-----------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| max_concurrency | Integer | 2147483647 | [0, 2147483647] | Optional, the maximum number of concurrent queries. The default value is the maximum integer value, meaning there is no limit on concurrency. 
When the number of running queries reaches the maximum concurrency, new queries will enter the queuing process. | +| max_queue_size | Integer | 0 | [0, 2147483647] | Optional, the length of the query queue. When the queue is full, new queries will be rejected. The default value is 0, meaning no queuing. | +| queue_timeout | Integer | 0 | [0, 2147483647] | Optional, the maximum wait time for a query in the queue, in milliseconds. If the query exceeds this time in the queue, an exception will be thrown to the client. The default value is 0, meaning no queuing, and queries will immediately fail upon entering the queue. | + + +If there is currently 1 FE in the cluster, the meaning of this configuration is as follows: The maximum number of concurrent queries in the cluster is limited to 10. When the maximum concurrency is reached, new queries will enter the queue, with the queue length limited to 20. The maximum wait time for a query in the queue is 3 seconds, and queries that exceed 3 seconds in the queue will return a failure directly to the client. + +:::tip +The current queuing design does not take into account the number of FEs. The queuing parameters only take effect at the single FE level. For example: + +In a Doris cluster, if a workload group is configured with max_concurrency = 1, +If there is 1 FE in the cluster, the workload group will allow only one SQL query to run at a time in the cluster; +If there are 3 FEs in the cluster, the maximum number of SQL queries in the cluster could be 3. +::: + +## Check the queue status + +**grammar** + +``` +show workload groups +``` + +**example** + +``` +mysql [(none)]>show workload groups\G; +*************************** 1. 
row *************************** + Id: 1 + Name: normal + cpu_share: 20 + memory_limit: 50% + enable_memory_overcommit: true + max_concurrency: 2147483647 + max_queue_size: 0 + queue_timeout: 0 + cpu_hard_limit: 1% + scan_thread_num: 16 + max_remote_scan_thread_num: -1 + min_remote_scan_thread_num: -1 + memory_low_watermark: 50% + memory_high_watermark: 80% + tag: + read_bytes_per_second: -1 +remote_read_bytes_per_second: -1 + running_query_num: 0 + waiting_query_num: 0 +``` + +```running_query_num```Represents the number of queries currently running, ```waiting_query_num```Represents the number of queries in the queue. + +## Bypass the queuing + +In some operational scenarios, the administrator account needs to bypass the queuing logic to execute SQL for system management tasks. This can be done by setting session variables to bypass the queuing: + +``` +set bypass_workload_group = true; +``` diff --git a/versioned_docs/version-2.1/admin-manual/workload-management/workload-management-summary.md b/versioned_docs/version-2.1/admin-manual/workload-management/workload-management-summary.md index 10244c3d69..783e3d3594 100644 --- a/versioned_docs/version-2.1/admin-manual/workload-management/workload-management-summary.md +++ b/versioned_docs/version-2.1/admin-manual/workload-management/workload-management-summary.md @@ -1,3 +1,28 @@ +--- +{ +"title": "Overview", +"language": "en" +} +--- + +<!-- +Licensed to the Apache Software Foundation (ASF) under one +or more contributor license agreements. See the NOTICE file +distributed with this work for additional information +regarding copyright ownership. The ASF licenses this file +to you under the Apache License, Version 2.0 (the +"License"); you may not use this file except in compliance +with the License. 
You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, +software distributed under the License is distributed on an +"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +KIND, either express or implied. See the License for the +specific language governing permissions and limitations +under the License. +--> Workload management is an very important feature of Doris, playing a critical role in the overall operation of the system. Through reasonable workload management strategies, resource utilization can be optimized, system stability enhanced, and response time reduced. It has the following abilities: @@ -16,12 +41,15 @@ Doris can divide resource in the following three ways: - Workload Group: The resource (CPU, Memory, IO) within a BE are divided into multiple resource groups through Cgroup, enabling more fine-grained resource isolation. +- Compute Group: It is a way of resource partitioning method in compute-storage decoupled mode. Similar to Resource Group, it also takes BE as the minimum granularity to divide multiple groups. + The following table records the characteristics and advantageous scenarios of different resource partitioning methods: | Resource Isolation Method | Isolation Granularity | Soft/Hard Limit | Cross Resource Group Query | | ---------- | ----------- |-----|-----| | Resource Group | BE node level, with complete resource isolation, can isolate BE failures | Hard limit |Not support. And it is necessary to ensure that at least one copy of data is stored within the resource group. 
| | Workload Group | Isolation within BE process; cannot isolate BE failures | Both hard and soft limit | Support | +|Compute Group | BE node level, with complete resource isolation, can isolate BE failures | Hard limit | Not support | ## Soft Limit and Hard Limit @@ -30,4 +58,4 @@ The following table records the characteristics and advantageous scenarios of di - Soft Limit: The soft limit is a resource limit that can be exceeded, usually representing the recommended upper limit of resource usage. When the system is not busy, if a tenant requests more resources than the soft limit, it can borrow resources from other groups. When the system is busy and there is resource contention, if a tenant requests resources exceeding the soft limit, it will not be able to obtain additional resources. -When using Resource Group method to partition resources, only hard limit mode is supported. When using the Workload Group method to partition resources, both the soft limit and hard limit of Workload Group are supported. The soft limit of Workload Group is usually used for sudden resource control, such as temporary query peaks or short-term increases in data writing. +When using the Resource Group / Compute Group method to partition resources, only the hard limit mode is supported. When using the Workload Group method to partition resources, both the soft limit and hard limit of Workload Group are supported. The soft limit of Workload Group is usually used for sudden resource control, such as temporary query peaks or short-term increases in data writing. 
diff --git a/versioned_docs/version-3.0/admin-manual/workload-management/compute-group.md b/versioned_docs/version-3.0/admin-manual/workload-management/compute-group.md index 50123e887c..dbaa506c03 100644 --- a/versioned_docs/version-3.0/admin-manual/workload-management/compute-group.md +++ b/versioned_docs/version-3.0/admin-manual/workload-management/compute-group.md @@ -1,6 +1,6 @@ --- { -"title": "Managing Compute Groups", +"title": "Compute Groups", "language": "en" } --- @@ -150,4 +150,4 @@ If the database or compute group name contains reserved keywords, the correspond ## Scaling Compute Groups -You can scale compute groups by adding or removing BE using `ALTER SYSTEM ADD BACKEND` and `ALTER SYSTEM DECOMMISION BACKEND`. \ No newline at end of file +You can scale compute groups by adding or removing BE using `ALTER SYSTEM ADD BACKEND` and `ALTER SYSTEM DECOMMISION BACKEND`. diff --git a/versioned_docs/version-3.0/admin-manual/workload-management/concurrency-control-and-queuing.md b/versioned_docs/version-3.0/admin-manual/workload-management/concurrency-control-and-queuing.md index e69de29bb2..3a40fee7a0 100644 --- a/versioned_docs/version-3.0/admin-manual/workload-management/concurrency-control-and-queuing.md +++ b/versioned_docs/version-3.0/admin-manual/workload-management/concurrency-control-and-queuing.md @@ -0,0 +1,104 @@ +--- +{ +"title": "Concurrency Control and Queuing", +"language": "en" +} +--- + +<!-- +Licensed to the Apache Software Foundation (ASF) under one +or more contributor license agreements. See the NOTICE file +distributed with this work for additional information +regarding copyright ownership. The ASF licenses this file +to you under the Apache License, Version 2.0 (the +"License"); you may not use this file except in compliance +with the License. 
You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, +software distributed under the License is distributed on an +"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +KIND, either express or implied. See the License for the +specific language governing permissions and limitations +under the License. +--> + +Concurrency control and queuing is a resource management mechanism. When multiple queries simultaneously request resources and reach the system's concurrency limit, Doris will manage the queries based on predefined strategies and restrictions, ensuring the system can still operate smoothly under high load and avoid issues like OOM (Out of Memory) or system freezes. + +Doris's concurrency control and queuing mechanism is primarily implemented through workload groups. A workload group defines the resource usage limits for queries, including maximum concurrency, queue length, and timeout parameters. By properly configuring these parameters, the goal of resource management can be achieved. + +## Basic usage + +``` +create workload group if not exists queue_group +properties ( + "max_concurrency" = "10", + "max_queue_size" = "20", + "queue_timeout" = "3000" +); +``` + +**Parameter description** + + +| Property | Data type | Default value | Value range | Description | +|-----------------|-----------|---------------|-----------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| max_concurrency | Integer | 2147483647 | [0, 2147483647] | Optional, the maximum number of concurrent queries. The default value is the maximum integer value, meaning there is no limit on concurrency. 
When the number of running queries reaches the maximum concurrency, new queries will enter the queuing process. | +| max_queue_size | Integer | 0 | [0, 2147483647] | Optional, the length of the query queue. When the queue is full, new queries will be rejected. The default value is 0, meaning no queuing. | +| queue_timeout | Integer | 0 | [0, 2147483647] | Optional, the maximum wait time for a query in the queue, in milliseconds. If the query exceeds this time in the queue, an exception will be thrown to the client. The default value is 0, meaning no queuing, and queries will immediately fail upon entering the queue. | + + +If there is currently 1 FE in the cluster, the meaning of this configuration is as follows: The maximum number of concurrent queries in the cluster is limited to 10. When the maximum concurrency is reached, new queries will enter the queue, with the queue length limited to 20. The maximum wait time for a query in the queue is 3 seconds, and queries that exceed 3 seconds in the queue will return a failure directly to the client. + +:::tip +The current queuing design does not take into account the number of FEs. The queuing parameters only take effect at the single FE level. For example: + +In a Doris cluster, if a workload group is configured with max_concurrency = 1, +If there is 1 FE in the cluster, the workload group will allow only one SQL query to run at a time in the cluster; +If there are 3 FEs in the cluster, the maximum number of SQL queries in the cluster could be 3. +::: + +## Check the queue status + +**grammar** + +``` +show workload groups +``` + +**example** + +``` +mysql [(none)]>show workload groups\G; +*************************** 1. 
row *************************** + Id: 1 + Name: normal + cpu_share: 20 + memory_limit: 50% + enable_memory_overcommit: true + max_concurrency: 2147483647 + max_queue_size: 0 + queue_timeout: 0 + cpu_hard_limit: 1% + scan_thread_num: 16 + max_remote_scan_thread_num: -1 + min_remote_scan_thread_num: -1 + memory_low_watermark: 50% + memory_high_watermark: 80% + tag: + read_bytes_per_second: -1 +remote_read_bytes_per_second: -1 + running_query_num: 0 + waiting_query_num: 0 +``` + +```running_query_num```Represents the number of queries currently running, ```waiting_query_num```Represents the number of queries in the queue. + +## Bypass the queuing + +In some operational scenarios, the administrator account needs to bypass the queuing logic to execute SQL for system management tasks. This can be done by setting session variables to bypass the queuing: + +``` +set bypass_workload_group = true; +``` diff --git a/versioned_docs/version-3.0/admin-manual/workload-management/workload-management-summary.md b/versioned_docs/version-3.0/admin-manual/workload-management/workload-management-summary.md index 8432e653e2..783e3d3594 100644 --- a/versioned_docs/version-3.0/admin-manual/workload-management/workload-management-summary.md +++ b/versioned_docs/version-3.0/admin-manual/workload-management/workload-management-summary.md @@ -1,3 +1,28 @@ +--- +{ +"title": "Overview", +"language": "en" +} +--- + +<!-- +Licensed to the Apache Software Foundation (ASF) under one +or more contributor license agreements. See the NOTICE file +distributed with this work for additional information +regarding copyright ownership. The ASF licenses this file +to you under the Apache License, Version 2.0 (the +"License"); you may not use this file except in compliance +with the License. 
You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, +software distributed under the License is distributed on an +"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +KIND, either express or implied. See the License for the +specific language governing permissions and limitations +under the License. +--> Workload management is an very important feature of Doris, playing a critical role in the overall operation of the system. Through reasonable workload management strategies, resource utilization can be optimized, system stability enhanced, and response time reduced. It has the following abilities:
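
Reviewer note on the queuing semantics added in `concurrency-control-and-queuing.md`: the interaction of `max_concurrency`, `max_queue_size` and `queue_timeout` may be easier to check against a small model. The sketch below is a toy simulation of one FE's admission decision for a single workload group. It is not Doris code: the class `WorkloadGroupQueue` and its methods `submit` and `finish_one` are hypothetical names invented for illustration, and timeouts are only checked when a slot frees up, which is a simplification.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class WorkloadGroupQueue:
    """Toy model (hypothetical, not Doris internals) of one FE's queue."""
    max_concurrency: int = 2147483647  # documented default: no limit
    max_queue_size: int = 0            # documented default: no queuing
    queue_timeout_ms: int = 0          # documented default: fail at once
    running: int = 0
    waiting: deque = field(default_factory=deque)  # (query_id, enqueued_ms)

    def submit(self, query_id: str, now_ms: int) -> str:
        """Admit, queue, or reject a new query."""
        if self.running < self.max_concurrency:
            self.running += 1
            return "RUNNING"
        if len(self.waiting) >= self.max_queue_size:
            return "REJECTED"  # queue is full (or queuing is disabled)
        self.waiting.append((query_id, now_ms))
        return "QUEUED"

    def finish_one(self, now_ms: int) -> list:
        """A running query ended: drop expired waiters, promote the next."""
        self.running -= 1
        events = []
        while self.waiting:
            query_id, enqueued_ms = self.waiting.popleft()
            if now_ms - enqueued_ms > self.queue_timeout_ms:
                events.append((query_id, "TIMED_OUT"))
                continue
            self.running += 1
            events.append((query_id, "RUNNING"))
            break
        return events


if __name__ == "__main__":
    # The documented example: 10 running, 20 queued, the 31st is rejected.
    group = WorkloadGroupQueue(max_concurrency=10, max_queue_size=20,
                               queue_timeout_ms=3000)
    outcomes = [group.submit(f"q{i}", now_ms=0) for i in range(31)]
    print(outcomes.count("RUNNING"), outcomes.count("QUEUED"), outcomes[-1])

    # Limits apply per FE: three FEs with max_concurrency = 1 each admit one.
    fes = [WorkloadGroupQueue(max_concurrency=1) for _ in range(3)]
    print(sum(fe.submit("q", now_ms=0) == "RUNNING" for fe in fes))
```

Keeping one independent queue object per FE mirrors the tip in the new page that the queuing parameters only take effect at the single-FE level, so the cluster-wide ceiling scales with the number of FEs.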