(doris-website) 01/01: Update CCR manual

w41ter Sun, 22 Dec 2024 23:07:06 -0800

This is an automated email from the ASF dual-hosted git repository.

w41ter pushed a commit to branch update_ccr_manual
in repository https://gitbox.apache.org/repos/asf/doris-website.git


commit b46ff758ab03d147a77673b69a119e218b1fe4f2
Author: w41ter <w41te...@gmail.com>
AuthorDate: Mon Dec 23 07:06:41 2024 +0000

    Update CCR manual
---
 docs/admin-manual/data-admin/ccr/manual.md         | 192 ++++++++++++++-------
 .../current/admin-manual/data-admin/ccr/manual.md  | 150 ++++++++++------
 2 files changed, 229 insertions(+), 113 deletions(-)

diff --git a/docs/admin-manual/data-admin/ccr/manual.md 
b/docs/admin-manual/data-admin/ccr/manual.md
index d9cec3a7c12..a7f9ea5b0ac 100644
--- a/docs/admin-manual/data-admin/ccr/manual.md
+++ b/docs/admin-manual/data-admin/ccr/manual.md
@@ -24,14 +24,38 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-## Limitations
+## Usage Requirements
 
-### Network Constraints
+### Network Requirements
 
-- Syncer needs to be able to communicate with both the upstream and downstream 
FE (Frontend) and BE (Backend).
+- The Syncer must be able to communicate with both the upstream and downstream 
FE (Frontend) and BE (Backend).
 
-- The downstream BE and upstream BE are directly connected through the IP used 
by the Doris BE process (as seen in `show frontends/backends`).
+- The downstream BE and upstream BE must use the IP of the Doris BE process 
(as seen in `show frontends/backends`), which must be directly accessible.
 
+### Permission Requirements
+
+When syncing, the user must provide accounts for both upstream and downstream, 
and these accounts must have the following permissions:
+
+1. **Select_priv**: Read-only permissions on databases and tables.
+2. **Load_priv**: Write permissions on databases and tables, including Load, 
Insert, Delete, etc.
+3. **Alter_priv**: Permissions to modify databases and tables, including 
renaming databases/tables, adding/deleting/changing columns, adding/removing 
partitions, etc.
+4. **Create_priv**: Permissions to create databases, tables, and views.
+5. **drop_priv**: Permissions to delete databases, tables, and views.
+
+Additionally, **Admin privileges** are required (this may be removed in the 
future). These are needed to check the enable binlog configuration, which 
currently requires admin rights.
+
+### Version Requirements
+
+Minimum version required: v2.0.15
+
+:::caution
+**Starting from versions 2.1.8/3.0.4, the minimum supported Doris version for 
the ccr syncer is 2.1. Version 2.0 will no longer be supported.**
+:::
+
+#### Versions Not Recommended for Use
+
+Doris Versions:
+- 2.1.5/2.0.14: If upgrading from previous versions to these two versions, and 
the user has a drop partition operation, an NPE may occur during upgrade or 
restart. This is due to a new field introduced in these versions, which older 
versions don't have, causing a default value of null. This issue was fixed in 
2.1.6/2.0.15.
 
 ## Start Syncer
 
@@ -58,7 +82,7 @@ output_dir
 
 **Start options**
 
-**--daemon** 
+**--daemon**
 
 Run Syncer in the background, set to false by default.
 
@@ -66,7 +90,7 @@ Run Syncer in the background, set to false by default.
 bash bin/start_syncer.sh --daemon
 ```
 
-**--db_type** 
+**--db_type**
 
 Syncer can currently use two databases to store its metadata, `sqlite3 `(for 
local storage) and `mysql `(for local or remote storage).
 
@@ -78,7 +102,7 @@ The default value is sqlite3.
 
 When using MySQL to store metadata, Syncer will use `CREATE IF NOT EXISTS `to 
create a database called `ccr`, where the metadata table related to CCR will be 
saved.
 
-**--db_dir** 
+**--db_dir**
 
 **This option only works when db uses `sqlite3`.**
 
@@ -100,9 +124,9 @@ bash bin/start_syncer.sh --db_host 127.0.0.1 --db_port 3306 
--db_user root --db_
 
 The default values of db_host and db_port are shown in the example. The 
default values of db_user and db_password are empty.
 
-**--log_dir** 
+**--log_dir**
 
-Output path of the logs: 
+Output path of the logs:
 
 ```SQL
 bash bin/start_syncer.sh --log_dir /path/to/ccr_syncer.log
@@ -110,7 +134,7 @@ bash bin/start_syncer.sh --log_dir /path/to/ccr_syncer.log
 
 The default path is`SYNCER_OUTPUT_DIR/log` and the default file name is 
`ccr_syncer.log`.
 
-**--log_level** 
+**--log_level**
 
 Used to specify the output level of Syncer logs.
 
@@ -134,7 +158,7 @@ Under --daemon, the default value of log_level is `info`.
 
 When running in the foreground, log_level defaults to `trace`, and logs are 
saved to log_dir using the tee command.
 
-**--host && --port** 
+**--host && --port**
 
 Used to specify the host and port of Syncer, where host only plays the role of 
distinguishing itself in the cluster, which can be understood as the name of 
Syncer, and the name of Syncer in the cluster is `host: port`.
 
@@ -144,7 +168,7 @@ bash bin/start_syncer.sh --host 127.0.0.1 --port 9190
 
 The default value of host is 127.0.0.1, and the default value of port is 9190.
 
-**--pid_dir** 
+**--pid_dir**
 
 Used to specify the storage path of the pid file
 
@@ -195,7 +219,7 @@ Specify the names of the pid files to be stopped, wrap the 
names in `""` and sep
 
 Follow the default configurations.
 
-**--pid_dir** 
+**--pid_dir**
 
 Specify the directory where the pid file is located. The above three stopping 
methods all depend on the directory where the pid file is located for execution.
 
@@ -207,7 +231,7 @@ The effect of the above example is to close the Syncer 
corresponding to all pid
 
 The default value is `SYNCER_OUTPUT_DIR/bin`.
 
-**--host && --port** 
+**--host && --port**
 
 Stop the Syncer corresponding to host: port in the pid_dir path.
 
@@ -217,7 +241,7 @@ bash bin/stop_syncer.sh --host 127.0.0.1 --port 9190
 
 The default value of host is 127.0.0.1, and the default value of port is 
empty. That is, specifying the host alone will degrade **method 1** to **method 
3**. **Method 1** will only take effect when neither the host nor the port is 
empty.
 
-**--files** 
+**--files**
 
 Stop the Syncer corresponding to the specified pid file name in the pid_dir 
path.
 
@@ -298,7 +322,7 @@ The job_name is the name specified when create_ccr.
 ```shell
 curl -X POST -H "Content-Type: application/json" -d '{
     "name": "job_name"
-}' http://ccr_syncer_host:ccr_syncer_port/pause 
+}' http://ccr_syncer_host:ccr_syncer_port/pause
 ```
 
 ### Resume Job
@@ -388,74 +412,114 @@ output_dir
 bash bin/enable_db_binlog.sh -h host -p port -u user -P password -d db
 ```
 
-## High availability of Syncer
+## Syncer High Availability
 
-The high availability of Syncer relies on MySQL. If MySQL is used as the 
backend storage, the Syncer can discover other Syncers. If one Syncer crashes, 
the others will take over its tasks.
+Syncer high availability relies on MySQL. If MySQL is used as the backend 
storage, Syncer can detect other Syncers. If one crashes, others will take over 
its tasks.
 
-## Privilege requirements
+## Usage Notes
 
-1. `select_priv`: read-only privileges for databases and tables
-2. `load_priv`: write privileges for databases and tables, including load, 
insert, delete, etc.
-3. `alter_priv`: privilege to modify databases and tables, including renaming 
databases/tables, adding/deleting/changing columns, adding/deleting partitions, 
etc.
-4. `create_priv`: privilege to create databases, tables, and views
-5. `drop_priv`: privilege to drop databases, tables, and views
+### `IS_BEING_SYNCED` Attribute
 
-Admin privileges are required (We are planning on removing this in future 
versions). This is used to check the `enable binlog config`.
+When the CCR (Cluster-to-Cluster Replication) feature is enabled, a replica 
table (referred to as the target table, located in the target cluster) is 
created in the target cluster for each table in the source cluster’s sync scope 
(referred to as the source table, located in the source cluster). However, some 
features and attributes need to be disabled or erased during the creation of 
the replica table to ensure the correctness of the sync process.
 
-## Feature
+For example:
 
-### Rate limit
+- The source table may contain information that might not be synced to the 
target cluster, such as `storage_policy`, which could cause the target table 
creation to fail or behave abnormally.
+- The source table may include dynamic features, such as dynamic partitions, 
which could result in behavior inconsistencies in the target table, causing 
partitions to be inconsistent.
 
-BE-side configuration parameter
-
-```shell
-download_binlog_rate_limit_kbs=1024 # Limits the download speed of Binlog 
(including Local Snapshot) from the source cluster to 1 MB/s in a single BE node
-```
-
-1. The `download_binlog_rate_limit_kbs` parameter is configured on the BE 
nodes of the source cluster. By setting this parameter, the data pull rate can 
be effectively limited.
-
-2. The `download_binlog_rate_limit_kbs` parameter primarily controls the speed 
of data transfer for each single BE node. To calculate the overall cluster 
rate, one would multiply the parameter value by the number of nodes in the 
cluster.
-
-
-## IS_BEING_SYNCED
-
-:::tip 
-Doris v2.0 "is_being_synced" = "true" 
-:::
-
-During data synchronization using CCR, replica tables (referred to as target 
tables) are created in the target cluster for the tables within the 
synchronization scope of the source cluster (referred to as source tables). 
However, certain functionalities and attributes need to be disabled or cleared 
when creating replica tables to ensure the correctness of the synchronization 
process. For example:
-
-- The source tables may contain information that is not synchronized to the 
target cluster, such as `storage_policy`, which may cause the creation of the 
target table to fail or result in abnormal behavior.
-- The source tables may have dynamic functionalities, such as dynamic 
partitioning, which can lead to uncontrolled behavior in the target table and 
result in inconsistent partitions.
-
-The attributes that need to be cleared during replication are:
+Attributes that need to be erased due to invalidation during replication 
include:
 
 - `storage_policy`
 - `colocate_with`
 
-The functionalities that need to be disabled during synchronization are:
+Features that need to be disabled during synchronization include:
 
 - Automatic bucketing
-- Dynamic partitioning
+- Dynamic partitions
 
-### Implementation
+#### Implementation
 
-When creating the target table, the Syncer controls the addition or deletion 
of the `is_being_synced` property. In CCR, there are two approaches to creating 
a target table:
+When creating the target table, these attributes will be controlled by Syncer, 
either added or removed. In the CCR functionality, there are two ways to create 
a target table:
 
-1. During table synchronization, the Syncer performs a full copy of the source 
table using backup/restore to obtain the target table.
-2. During database synchronization, for existing tables, the Syncer also uses 
backup/restore to obtain the target table. For incremental tables, the Syncer 
creates the target table using the CreateTableRecord binlog.
+1. During table synchronization, Syncer performs a full copy of the source 
table using backup/restore to create the target table.
+2. During database synchronization, for existing tables, Syncer also uses 
backup/restore to create the target table. For incremental tables, Syncer 
creates the target table via binlog containing CreateTableRecord.
 
-Therefore, there are two entry points for inserting the `is_being_synced` 
property: the restore process during full synchronization and the getDdlStmt 
during incremental synchronization.
+Thus, there are two points of insertion for adding the `is_being_synced` 
attribute: during the restore process in full synchronization and during 
incremental synchronization via `getDdlStmt`.
 
-During the restoration process of full synchronization, the Syncer initiates a 
restore operation of the snapshot from the source cluster via RPC. During this 
process, the `is_being_synced` property is added to the RestoreStmt and takes 
effect in the final restoreJob, executing the relevant logic for 
`is_being_synced`.
+During the restore process in full synchronization, Syncer triggers a restore 
of the snapshot in the original cluster via RPC. In this process, it will add 
the `is_being_synced` attribute to the RestoreStmt, which will be applied in 
the final `restoreJob`, executing the related `isBeingSynced` logic. In 
incremental synchronization, the `getDdlStmt` method will be enhanced with a 
`boolean getDdlForSync` parameter to distinguish if it’s a controlled 
transformation into target table DDL, an [...]
 
-During incremental synchronization, add the `boolean getDdlForSync` parameter 
to the getDdlStmt method to differentiate whether it is a controlled 
transformation to the target table DDL, and execute the relevant logic for 
isBeingSynced during the creation of the target table.
+The erasure of invalid attributes needs no further explanation, but the 
disabling of the above features requires clarification:
 
-Regarding the disabling of the functionalities mentioned above:
+- **Automatic Bucketing**: Automatic bucketing is applied when creating the 
table to compute the appropriate number of buckets. This might cause the source 
table and the target table to have a different number of buckets. Therefore, 
during synchronization, the bucket count of the source table is retrieved, and 
it’s also necessary to check whether the source table is an automatically 
bucketed table so that the feature can be restored after synchronization. The 
current approach sets the au [...]
+- **Dynamic Partitions**: Dynamic partitions are controlled by adding 
`olapTable.isBeingSynced()` to the condition for performing add/drop partition 
operations. This ensures that during synchronization, the target table does not 
periodically perform add/drop partition operations.
 
-- Automatic bucketing: Automatic bucketing is enabled when creating a table. 
It calculates the appropriate number of buckets. This may result in a mismatch 
in the number of buckets between the source and target tables. Therefore, 
during synchronization, obtain the number of buckets from the source table, as 
well as the information about whether the source table is an automatic 
bucketing table in order to restore the functionality after synchronization. 
The current recommended approach is [...]
-- Dynamic partitioning: This is implemented by adding 
`olapTable.isBeingSynced()` to the condition for executing add/drop partition 
operations. This ensures that the target table does not perform periodic 
add/drop partition operations during synchronization.
+:::caution
 
-### Note
+Under normal circumstances, the `is_being_synced` attribute should be entirely 
controlled by Syncer, and users should not modify this attribute manually.
+
+:::
 
-The `is_being_synced` property should be fully controlled by the Syncer, and 
users should not modify this property manually unless there are exceptional 
circumstances.
+### Recommended Configuration Settings
+
+- `restore_reset_index_id`: If the table to be synced contains an inverted 
index, this must be set to `false` on the target cluster.
+- `ignore_backup_tmp_partitions`: If the upstream creates temporary 
partitions, Doris will prohibit performing backups, causing the ccr-syncer 
synchronization to break. This can be avoided by setting 
`ignore_backup_tmp_partitions=true` in the FE configuration.
+
+### Notes
+
+- During CCR synchronization, both backup/restore jobs and binlogs are stored 
in FE memory. Therefore, it is recommended to allocate at least 4GB of heap 
memory per CCR job (for both the source and target clusters). Additionally, 
consider modifying the following configurations to reduce memory consumption 
from unrelated jobs:
+    - Modify FE configuration `max_backup_restore_job_num_per_db`:
+        This configures the number of backup/restore jobs per DB stored in 
memory. The default value is 10, and setting it to 2 should suffice.
+    - Modify the source cluster's DB/table properties to set binlog retention 
limits:
+        - `binlog.max_bytes`: Maximum memory usage for binlogs. It is 
recommended to keep at least 4GB (default is unlimited).
+        - `binlog.ttl_seconds`: Binlog retention time. In versions prior to 
2.0.5, the default is unlimited; in later versions, the default is one day 
(86400 seconds).
+        For example, to set binlog TTL to one hour: `ALTER TABLE table SET 
("binlog.ttl_seconds"="3600")`
+- The correctness of CCR also depends on the transaction status in the target 
cluster. To ensure transactions are not prematurely reclaimed during 
synchronization, the following configurations should be increased:
+    - `label_num_threshold`: Controls the number of TXN labels.
+    - `stream_load_default_timeout_second`: Controls the TXN timeout duration.
+    - `label_keep_max_second`: Controls the retention time after TXN ends.
+    - `streaming_label_keep_max_second`: Same as above.
+- If it's a database synchronization and the source cluster has many tablets, 
the resulting CCR jobs may be very large. In this case, several FE 
configurations need to be adjusted:
+    - `max_backup_tablets_per_job`: The maximum number of tablets involved in 
a single backup job. Adjust this based on the tablet count (default is 
300,000). Too many tablets could risk FE OOM (Out of Memory), so consider 
reducing the tablet count if possible.
+    - `thrift_max_message_size`: Maximum RPC packet size allowed by the FE 
thrift server. The default is 100MB. If the snapshot info exceeds 100MB due to 
too many tablets, this limit needs to be adjusted (maximum is 2GB).
+        - The snapshot info size can be found in the CCR syncer logs, using 
the keyword: `snapshot response meta size: %d, job info size: %d`. The snapshot 
info size is approximately `meta size + job info size`.
+    - `fe_thrift_max_pkg_bytes`: Another parameter for RPC packet size, which 
needs to be adjusted in version 2.0. The default value is 20MB.
+    - `restore_download_task_num_per_be`: The maximum number of download tasks 
per BE. The default is 3, which may be too small for restore jobs. It should be 
adjusted to 0 (i.e., disable this limit). From versions 2.1.8 and 3.0.4, this 
configuration is no longer needed.
+    - `backup_upload_task_num_per_be`: The maximum number of upload tasks per 
BE. The default is 3, which may be too small for backup jobs. It should be 
adjusted to 0 (i.e., disable this limit). From versions 2.1.8 and 3.0.4, this 
configuration is no longer needed.
+    - In addition to the above FE configurations, if the CCR job's DB type is 
MySQL, some MySQL configurations need to be adjusted:
+        - MySQL server limits the size of the data packet returned/inserted in 
a single `select/insert`. Increase the following configuration to relax this 
limit, for example, adjusting it to the maximum of 1GB:
+        ```ini
+        [mysqld]
+        max_allowed_packet = 1024MB
+        ```
+        - MySQL client also has this limit. In CCR syncer versions 
2.1.6/2.0.15 and earlier, the limit is 128MB. In later versions, this can be 
adjusted via the `--mysql_max_allowed_packet` parameter (in bytes). The default 
value is 1024MB.
+        > Note: Starting from versions 2.1.8 and 3.0.4, CCR syncer no longer 
stores snapshot info in the DB, so the default data packet size is already 
sufficient.
+- Similarly, BE-side configurations need to be adjusted:
+    - `thrift_max_message_size`: The maximum RPC packet size allowed by the BE 
thrift server. The default is 100MB. If the agent task size exceeds 100MB due 
to too many tablets, this limit needs to be adjusted (maximum is 2GB).
+    - `be_thrift_max_pkg_bytes`: The same parameter as above, only needs 
adjustment in version 2.0. The default value is 20MB.
+- Even after modifying the above configurations, as the number of tablets 
increases, the resulting snapshot size might exceed 2GB, which is the threshold 
for Doris FE edit log and RPC message size, leading to synchronization failure. 
Starting from versions 2.1.8 and 3.0.4, Doris can compress the snapshot to 
further increase the number of tablets supported during backup and restore. 
Compression can be enabled with the following parameters:
+    - `restore_job_compressed_serialization`: Enable compression for restore 
jobs (affects metadata compatibility, default is off).
+    - `backup_job_compressed_serialization`: Enable compression for backup 
jobs (affects metadata compatibility, default is off).
+    - `enable_restore_snapshot_rpc_compression`: Enable compression for 
snapshot info, mainly affecting RPC (default is on).
+    > Note: Since identifying whether a backup/restore job is compressed 
requires additional code, and versions before 2.1.8 and 3.0.4 do not include 
this code, once a backup/restore job is generated, it cannot be rolled back to 
an earlier Doris version. There are two exceptions: Backup/restore jobs that 
are already canceled or finished will not be compressed. Therefore, waiting for 
the job to finish or canceling the job before rolling back will allow safe 
rollback.
+- Inside CCR, the database/table name is used as part of the internal job 
label. Therefore, if a CCR job exceeds the label length limit, the FE parameter 
`label_regex_length` can be adjusted to relax this limit (default value is 128).
+- Since backup does not currently support backing up tables with cooldown 
tablets, encountering such a table will cause synchronization to fail. 
Therefore, ensure that the `storage_policy` attribute is not set on any table 
before creating the CCR job.
+
+### Performance-Related Parameters
+
+- If the user's data volume is very large, and the time required for backup 
and restore exceeds one day (the default value), the following parameters need 
to be adjusted as needed:
+    - `backup_job_default_timeout_ms`: Timeout for backup/restore tasks. This 
needs to be configured on the FE of both the source and target clusters.
+    - Modify upstream binlog retention time: `ALTER DATABASE $db SET 
PROPERTIES ("binlog.ttl_seconds" = "xxxx")`
+
+- Downstream BE download speed is slow:
+    - `max_download_speed_kbps`: Maximum download speed limit for a single 
download thread on a downstream BE, the default is 50MB/s.
+    - `download_worker_count`: Number of threads performing download tasks on 
the downstream BE, the default is 1. This should be adjusted based on the 
customer’s machine type. The goal is to increase the thread count to the 
maximum without affecting normal read/write operations. If this parameter is 
adjusted, there's no need to modify `max_download_speed_kbps`.
+        - For example, if the customer's network card can provide a maximum 
bandwidth of 1GB, and the maximum allowed download speed per thread is 200MB, 
then `download_worker_count` should be set to 4, without changing the 
`max_download_speed_kbps`.
+
+- Limit downstream BE's binlog download speed:
+    - BE-side configuration parameter:
+    ```shell
+    download_binlog_rate_limit_kbs=1024 # Limit the speed of binlog (including 
Local Snapshot) download from the source cluster to 1MB/s for each BE node.
+    ```
+    Detailed parameters and explanations:
+    1. The `download_binlog_rate_limit_kbs` parameter is configured on the 
source cluster BE nodes. Setting this parameter effectively limits the data 
pull speed.
+    2. The `download_binlog_rate_limit_kbs` parameter mainly controls the 
speed of a single BE node. If the overall speed of the cluster is calculated, 
the parameter value should be multiplied by the number of BE nodes in the 
cluster.
\ No newline at end of file
diff --git 
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/admin-manual/data-admin/ccr/manual.md
 
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/admin-manual/data-admin/ccr/manual.md
index 819c923561a..2ab32e928e2 100644
--- 
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/admin-manual/data-admin/ccr/manual.md
+++ 
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/admin-manual/data-admin/ccr/manual.md
@@ -24,14 +24,38 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-## 使用限制
+## 使用要求
 
-### 网络约束
+### 网络要求
 
 - 需要 Syncer 与上下游的 FE 和 BE 是互通的
 
 - 下游 BE 与上游 BE 通过 Doris BE 进程使用的 IP （`show frontends/backends` 看到的） 是直通的。
 
+### 权限要求
+
+Syncer 同步时需要用户提供上下游的账户，该账户需要拥有下述权限：
+
+1. Select_priv 对数据库、表的只读权限。
+2. Load_priv 对数据库、表的写权限。包括 Load、Insert、Delete 等。
+3. Alter_priv 对数据库、表的更改权限。包括重命名 库/表、添加/删除/变更 列、添加/删除 分区等操作。
+4. Create_priv 创建数据库、表、视图的权限。
+5. drop_priv 删除数据库、表、视图的权限。
+
+此外还需要加上 Admin 权限 (之后考虑彻底移除), 这个是用来检测 enable binlog config 的，现在需要 admin 权限。
+
+### 版本要求
+
+版本最低要求：v2.0.15
+
+:::caution
+**从 2.1.8/3.0.4 开始，ccr syncer 支持的最小 doris 版本是 2.1，2.0 版本将不再支持。**
+:::
+
+#### 不建议使用版本
+
+Doris 版本
+- 2.1.5/2.0.14：如果从之前的版本升级到这两个版本，且用户有 drop partition 操作，那么会在升级、重启时碰到 
NPE，原因是这个版本引入了一个新字段，旧版本没有所以默认值为 null。这个问题在 2.1.6/2.0.15 修复。
 
 ## 启动 Syncer
 
@@ -402,76 +426,35 @@ bash bin/enable_db_binlog.sh -h host -p port -u user -P 
password -d db
 
 ## Syncer 高可用
 
-Syncer 高可用依赖 mysql，如果使用 mysql 作为后端存储，Syncer 可以发现其它 Syncer，如果一个 crash 
了，其他会分担他的任务
-
-## 权限要求
-
-1. Select_priv 对数据库、表的只读权限。
-
-2. Load_priv 对数据库、表的写权限。包括 Load、Insert、Delete 等。
-
-3. Alter_priv 对数据库、表的更改权限。包括重命名 库/表、添加/删除/变更 列、添加/删除 分区等操作。
-
-4. Create_priv 创建数据库、表、视图的权限。
-
-5. drop_priv 删除数据库、表、视图的权限。
-
-加上 Admin 权限 (之后考虑彻底移除), 这个是用来检测 enable binlog config 的，现在需要 admin
-
-
-### 版本要求
-
-版本最低要求：v2.0.15
-
-
-## Feature
-
-### 限速
-
-BE 端配置参数：
-
-```shell
-download_binlog_rate_limit_kbs=1024 # 限制单个 BE 节点从源集群拉取 Binlog（包括 Local 
Snapshot）的速度为 1 MB/s
-```
+Syncer 高可用依赖 mysql，如果使用 mysql 作为后端存储，Syncer 可以发现其它 Syncer，如果一个 crash 
了，其他会分担它的任务。
 
-详细参数加说明：
 
-1. `download_binlog_rate_limit_kbs` 参数在源集群 BE 节点配置，通过设置该参数能够有效限制数据拉取速度。
+## 使用须知
 
-2. `download_binlog_rate_limit_kbs` 参数主要用于设置单个 BE 节点的速度，若计算集群整体速率一般需要参数值乘以集群个数。
-
-
-
-## IS_BEING_SYNCED 属性
-
-从 Doris v2.0 "is_being_synced" = "true"
+### IS_BEING_SYNCED 属性
 
 CCR 
功能在建立同步时，会在目标集群中创建源集群同步范围中表（后称源表，位于源集群）的副本表（后称目标表，位于目标集群），但是在创建副本表时需要失效或者擦除一些功能和属性以保证同步过程中的正确性。
 
 如：
 
 - 源表中包含了可能没有被同步到目标集群的信息，如`storage_policy`等，可能会导致目标表创建失败或者行为异常。
-
 - 源表中可能包含一些动态功能，如动态分区等，可能导致目标表的行为不受 Syncer 控制导致 partition 不一致。
 
 在被复制时因失效而需要擦除的属性有：
 
 - `storage_policy`
-
 - `colocate_with`
 
 在被同步时需要失效的功能有：
 
 - 自动分桶
-
 - 动态分区
 
-### 实现
+#### 实现
 
 在创建目标表时，这条属性将会由 Syncer 控制添加或者删除，在 CCR 功能中，创建一个目标表有两个途径：
 
 1. 在表同步时，Syncer 通过 backup/restore 的方式对源表进行全量复制来得到目标表。
-
 2. 在库同步时，对于存量表而言，Syncer 同样通过 backup/restore 的方式来得到目标表，对于增量表而言，Syncer 会通过携带有 
CreateTableRecord 的 binlog 来创建目标表。
 
 综上，对于插入`is_being_synced`属性有两个切入点：全量同步中的 restore 过程和增量同步时的 getDdlStmt。
@@ -481,9 +464,78 @@ CCR 功能在建立同步时，会在目标集群中创建源集群同步范围
 对于失效属性的擦除无需多言，对于上述功能的失效需要进行说明：
 
 - 自动分桶 自动分桶会在创建表时生效，计算当前合适的 bucket 数量，这就可能导致源表和目的表的 bucket 数目不一致。因此在同步时需要获得源表的 
bucket 数目，并且也要获得源表是否为自动分桶表的信息以便结束同步后恢复功能。当前的做法是在获取 distribution 信息时默认 
autobucket 为 false，在恢复表时通过检查`_auto_bucket`属性来判断源表是否为自动分桶表，如是则将目标表的 autobucket 
字段设置为 true，以此来达到跳过计算 bucket 数量，直接应用源表 bucket 数量的目的。
-
 - 动态分区 动态分区则是通过将`olapTable.isBeingSynced()`添加到是否执行 add/drop partition 
的判断中来实现的，这样目标表在被同步的过程中就不会周期性的执行 add/drop partition 操作。
 
-### 注意
+:::caution
 
 在未出现异常时，`is_being_synced`属性应该完全由 Syncer 控制开启或关闭，用户不要自行修改该属性。
+
+:::
+
+### 建议打开的配置
+
+- `restore_reset_index_id`：如果要同步的表中带有 inverted index，那么必须在目标集群上配置为 `false`。
+- `ignore_backup_tmp_partitions`：如果上游有创建 tmp partition，那么 doris 会禁止做 backup，因此 
ccr-syncer 同步会中断；通过在 FE 设置 `ignore_backup_tmp_partitions=true` 可以避免这个问题。
+
+### 注意事项
+
+- CCR 同步期间 backup/restore job 和 binlogs 都在 FE 内存中，因此建议在 FE 给每个 ccr job 都留出 4GB 
及以上的堆内存（源和目标集群都需要），同时注意修改下列配置减少无关 job 对内存的消耗：
+    - 修改 FE 配置 `max_backup_restore_job_num_per_db`:
+        记录在内存中的每个 DB 的 backup/restore job 数量。默认值是 10，设置为 2 就可以了。
+    - 修改源集群 db/table property，设置 binlog 保留限制
+        - `binlog.max_bytes`: binlog 最大占用内存，建议至少保留 4GB（默认无限制）
+        - `binlog.ttl_seconds`: binlog 保留时间，从 2.0.5 
之前的老版本默认无限制；之后的版本默认值为一天（86400）
+        比如要修改 binlog ttl seconds 为保留一个小时: `ALTER TABLE  table SET 
("binlog.ttl_seconds"="3600")`
+- CCR 正确性依也赖于目标集群的事务状态，因此要保证在同步过程中事务不会过快被回收，需要调大下列配置
+    - `label_num_threshold`：用于控制 TXN Label 数量
+    - `stream_load_default_timeout_second`：用于控制 TXN 超时时间
+    - `label_keep_max_second`: 用于控制 TXN 结束后保留时间
+    - `streaming_label_keep_max_second`：同上
+- 如果是 db 同步且源集群的 tablet 数量较多，那么产生的 ccr job 可能非常大，需要修改几个 FE 的配置：
+    - `max_backup_tablets_per_job`:
+        一次 backup 任务涉及的 tablet 上限，需要根据 tablet 数量调整（默认值为 30w，过多的 tablet 数量会有 FE 
OOM 风险，优先考虑能否降低 tablet 数量）
+    - `thrift_max_message_size`:
+        FE thrift server 允许的单次 RPC packet 上限，默认值为 100MB，如果 tablet 数量太多导致 
snapshot info 大小超过 100MB，则需要调整该限制，最大 2GB
+        - Snapshot info 大小可以从 ccr syncer 日志中找到，关键字：`snapshot response meta 
size: %d, job info size: %d`，snapshot info 大小大约是 meta size + job info size。
+    - `fe_thrift_max_pkg_bytes`:
+        同上，一个额外的参数，2.0 中需要调整，默认值为 20MB
+    - `restore_download_task_num_per_be`:
+        发送给每个 BE download task 数量上限，默认值是 3，对 restore job 来说太小了，需要调整为 
0（也就是关闭这个限制）； 2.1.8 和 3.0.4 起不再需要这个配置。
+    - `backup_upload_task_num_per_be`:
+        发送给每个 BE upload task 数量上限，默认值是 3，对 backup job 来说太小了，需要调整为 0 
（也就是关闭这个限制）；2.1.8 和 3.0.4 起不再需要这个配置。
+    - 除了上述 FE 的配置外，如果 ccr job 的 db type 是 mysql，还需要调整 mysql 的一些配置：
+        - mysql 服务端会限制单次 select/insert 返回/插入数据包的大小。增加下列配置以放松该限制，比如调整到上限 1GB
+        ```
+        [mysqld]
+        max_allowed_packet = 1024MB
+        ```
+        - mysql client 也会有该限制，在 ccr syncer 2.1.6/2.0.15 及之前的版本，上限为 
128MB；之后的版本可以通过参数 `--mysql_max_allowed_packet` 调整（单位 bytes），默认值为 1024MB
+        > 注：在 2.1.8 和 3.0.4 以后，ccr syncer 不再将 snapshot info 保存在 db 
中，因此默认的数据包大小已经足够了。
+- 同上，BE 端也需要修改几个配置
+    - `thrift_max_message_size`: BE thrift server 允许的单次 RPC packet 上限，默认值为 
100MB，如果 tablet 数量太多导致 agent task 大小超过 100MB，则需要调整该限制，最大 2GB
+    - `be_thrift_max_pkg_bytes`：同上，只有 2.0 中需要调整的参数，默认值为 20MB
+- 即使修改了上述配置，当 tablet 继续上升时，产生的 snapshot 大小可能会超过 2GB，也就是 doris FE edit log 和 
RPC message size 的阈值，导致同步失败。从 2.1.8 和 3.0.4 开始，doris 可以通过压缩 snapshot 
来进一步提高备份恢复支持的 tablet 数量。可以通过下面几个参数开启压缩：
+    - `restore_job_compressed_serialization`: 开启对 restore job 
的压缩（影响元数据兼容性，默认关闭）
+    - `backup_job_compressed_serialization`: 开启对 backup job 的压缩（影响元数据兼容性，默认关闭）
+    - `enable_restore_snapshot_rpc_compression`: 开启对 snapshot info 的压缩，主要影响 
RPC（默认开启）
+    > 注：由于识别 backup/restore job 是否压缩需要额外的代码，而 2.1.8 和 3.0.4 
之前的代码中不包含相关代码，因此一旦有 backup/restore job 生成，那么就无法回退到更早的 doris 版本。有两种情况例外：已经 
cancel 或者 finished 的 backup/restore job 不会被压缩，因此在回退前等待 backup/restore job 
完成或者主动取消 job 后，就能安全回退。
+- Ccr 内部会使用 db/table 名作为一些内部 job 的 label，因此如果 ccr job 中碰到了 label 超过限制了，可以调整 FE 
参数 `label_regex_length` 来放松该限制（默认值为 128）
+- 由于 backup 暂时不支持备份带有 cooldown tablet 的表，如果碰到了会导致同步终端，因此需要在创建 ccr job 前检查是否有 
table 设置了 `storage_policy` 属性。
+
+### 性能相关参数
+
+- 如果用户的数据量非常大，备份、恢复执行完需要的时间可能会超过一天（默认值），那么需要按需调整下列参数
+    - `backup_job_default_timeout_ms` 备份/恢复任务超时时间，源、目标集群的 FE 都需要配置
+    - 上游修改 binlog 保留时间: `ALTER DATABASE $db SET PROPERTIES 
("binlog.ttl_seconds" = "xxxx")`
+- 下游 BE 下载速度慢
+    - `max_download_speed_kbps` 下游单个 BE 中单个下载线程的下载限速，默认值为 50MB/s
+    - `download_worker_count` 下游执行下载任务的线程数，默认值为 
1；需要结合客户机型调整，在不影响客户正常读写时跳到最大；如果调整了这个参数，就可以不用调整 `max_download_speed_kbps`。
+        - 比如客户机器网卡最大提供 1GB 的带宽，现在最大允许下载线程利用 200MB 的带宽，那么在不改变 
`max_download_speed_kbps` 的情况下，`download_worker_count` 应该配置成 4。
+- 限制下游 BE 下载 binlog 速度
+    BE 端配置参数：
+    ```shell
+    download_binlog_rate_limit_kbs=1024 # 限制单个 BE 节点从源集群拉取 Binlog（包括 Local 
Snapshot）的速度为 1 MB/s
+    ```
+    详细参数加说明：
+    1. `download_binlog_rate_limit_kbs` 参数在源集群 BE 节点配置，通过设置该参数能够有效限制数据拉取速度。
+    2. `download_binlog_rate_limit_kbs` 参数主要用于设置单个 BE 
节点的速度，若计算集群整体速率一般需要参数值乘以集群个数。


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org

(doris-website) 01/01: Update CCR manual

Reply via email to