(doris-website) branch master updated: [doc](load) optimize load doc (#1784)

liaoxin Tue, 14 Jan 2025 06:35:50 -0800

liaoxin pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris-website.git



The following commit(s) were added to refs/heads/master by this push:
     new ed47fb5447f [doc](load) optimize load doc (#1784)
ed47fb5447f is described below

commit ed47fb5447f22b60a483b4201b15a11a54d9b22d
Author: hui lai <lai...@selectdb.com>
AuthorDate: Tue Jan 14 22:35:15 2025 +0800

    [doc](load) optimize load doc (#1784)
---
 docs/data-operate/import/data-source/kafka.md                         | 4 +++-
 docs/data-operate/import/data-source/local-file.md                    | 2 +-
 .../current/data-operate/import/data-source/kafka.md                  | 4 +++-
 .../current/data-operate/import/data-source/local-file.md             | 2 +-
 .../current/data-operate/import/import-way/stream-load-manual.md      | 3 +--
 .../version-2.1/data-operate/import/data-source/kafka.md              | 4 +++-
 .../version-2.1/data-operate/import/data-source/local-file.md         | 2 +-
 .../version-2.1/data-operate/import/import-way/stream-load-manual.md  | 3 +--
 .../version-3.0/data-operate/import/data-source/kafka.md              | 4 +++-
 .../version-3.0/data-operate/import/data-source/local-file.md         | 2 +-
 .../version-3.0/data-operate/import/import-way/stream-load-manual.md  | 3 +--
 versioned_docs/version-2.1/data-operate/import/data-source/kafka.md   | 4 +++-
 .../version-2.1/data-operate/import/data-source/local-file.md         | 2 +-
 versioned_docs/version-3.0/data-operate/import/data-source/kafka.md   | 4 +++-
 .../version-3.0/data-operate/import/data-source/local-file.md         | 2 +-
 15 files changed, 27 insertions(+), 18 deletions(-)

diff --git a/docs/data-operate/import/data-source/kafka.md b/docs/data-operate/import/data-source/kafka.md
index c84d72d8758..51254997a11 100644
--- a/docs/data-operate/import/data-source/kafka.md
+++ b/docs/data-operate/import/data-source/kafka.md
@@ -34,6 +34,8 @@ Doris continuously consumes data from Kafka Topics through Routine Load. After s
 
 The Doris Kafka Connector is a tool for loading Kafka data streams into the Doris database. Users can easily load various serialization formats (such as JSON, Avro, Protobuf) through the Kafka Connect plugin, and it supports parsing data formats from the Debezium component. For more documentation, please refer to [Doris Kafka Connector](../../../ecosystem/doris-kafka-connector.md).
 
+In most cases, you can directly choose Routine Load for loading data without the need to integrate external components to consume Kafka data. When you need to load data in Avro or Protobuf formats, or data collected from upstream databases via Debezium, you can use the Doris Kafka Connector.
+
 ## Using Routine Load to consume Kafka data
 
 ### Usage Restrictions
@@ -99,7 +101,7 @@ mysql> select * from test_routineload_tbl;
 
 #### Multi-Table Load
 
-For scenarios that require loading multiple tables simultaneously, the data in Kafka must contain table name information. It supports obtaining dynamic table names from the Kafka Value, formatted as: `table_name|{"col1": "val1", "col2": "val2"}`. The CSV format is similar: `table_name|val1,val2,val3`. Note that the table name must match the table name in Doris; otherwise, the load will fail, and dynamic tables do not support the column_mapping configuration introduced later.
+In scenarios where multiple tables need to be loaded simultaneously, the data in Kafka must include table name information, formatted as: `table_name|data`. For example, when loading CSV data, the format should be: `table_name|val1,val2,val3`. Please note that the table name must exactly match the table name in Doris; otherwise, the loading will fail, and the column_mapping configuration introduced later is not supported.
 
 **Step 1: Prepare Data**
 
diff --git a/docs/data-operate/import/data-source/local-file.md b/docs/data-operate/import/data-source/local-file.md
index b62818d7868..993b401cc24 100644
--- a/docs/data-operate/import/data-source/local-file.md
+++ b/docs/data-operate/import/data-source/local-file.md
@@ -32,7 +32,7 @@ Load local files or data streams into Doris via HTTP protocol. Supports CSV, JSO
 
 ### 2. Streamloader Tool
 
-Streamloader is a dedicated client tool based on Stream Load, supporting concurrent loads, making it suitable for large data loads. For more information, refer to the [Streamloader documentation](../../../ecosystem/doris-streamloader).
+The Streamloader tool is a dedicated client tool for loading data into the Doris database, based on Stream Load. It can provide multi-file and multi-concurrent load capabilities, reducing the time required for loading large volumes of data. For more documentation, refer to [Streamloader](../../../ecosystem/doris-streamloader).
 
 ### 3. MySQL Load
 
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/import/data-source/kafka.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/import/data-source/kafka.md
index 2fb9f25f09f..752c1f0bf89 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/import/data-source/kafka.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/import/data-source/kafka.md
@@ -34,6 +34,8 @@ Doris 通过 Routine Load 持续消费 Kafka Topic 中的数据。提交 Routine
 
 Doris Kafka Connector 是将 Kafka 数据流导入 Doris 数据库的工具。用户可通过 Kafka Connect 插件轻松导入多种序列化格式（如 JSON、Avro、Protobuf），并支持解析 Debezium 组件的数据格式。更多文档请参考 [Doris Kafka Connector](../../../ecosystem/doris-kafka-connector.md)。
 
+在大多数情况下，可以直接选择 Routine Load 进行数据导入，无需集成外部组件即可消费 Kafka 数据。当需要加载 Avro、Protobuf 格式的数据，或通过 Debezium 采集的上游数据库数据时，可以使用 Doris Kafka Connector。
+
 ## 使用 Routine Load 消费 Kafka 数据
 
 ### 使用限制
@@ -99,7 +101,7 @@ mysql> select * from test_routineload_tbl;
 
 #### 多表导入
 
-对于需要同时导入多张表的场景，Kafka 中的数据需包含表名信息。支持从 Kafka 的 Value 中获取动态表名，格式为：`table_name|{"col1": "val1", "col2": "val2"}`。CSV 格式类似：`table_name|val1,val2,val3`。注意，表名必须与 Doris 中的表名一致，否则导入失败，且动态表不支持后面介绍的 column_mapping 配置。
+对于需要同时导入多张表的场景，Kafka 中的数据必须包含表名信息，格式为：`table_name|data`。例如，导入 CSV 数据时，格式应为：`table_name|val1,val2,val3`。请注意，表名必须与 Doris 中的表名完全一致，否则导入将失败，并且不支持后面介绍的 column_mapping 配置。
 
 **第 1 步：准备数据**
 
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/import/data-source/local-file.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/import/data-source/local-file.md
index dfd7a6b2c6d..b37a8ff0c80 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/import/data-source/local-file.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/import/data-source/local-file.md
@@ -32,7 +32,7 @@ Stream Load 是通过 HTTP 协议将本地文件或数据流导入到 Doris 中
 
 - **streamloader**
 
-Streamloader工具是一款用于将数据导入 Doris 数据库的专用客户端工具，底层基于Stream Load实现，可以提供多并发导入的功能，降低大数据量导入的耗时。支持并发导入CSV格式的数据，导入其他格式（JSON、Parquet 与 ORC ）时，可以同时导入多个文件，但是无法并发。更多文档参考[Streamloader](../../../ecosystem/doris-streamloader)。
+Streamloader工具是一款用于将数据导入 Doris 数据库的专用客户端工具，底层基于Stream Load实现，可以提供多文件，多并发导入的功能，降低大数据量导入的耗时。更多文档参考[Streamloader](../../../ecosystem/doris-streamloader)。
 
 - **MySQL Load**
 
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/import/import-way/stream-load-manual.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/import/import-way/stream-load-manual.md
index faec3913d83..7f04752770d 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/import/import-way/stream-load-manual.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/import/import-way/stream-load-manual.md
@@ -27,7 +27,6 @@ under the License.
 Stream Load 支持通过 HTTP 协议将本地文件或数据流导入到 Doris 中。Stream Load 是一个同步导入方式，执行导入后返回导入结果，可以通过请求的返回判断导入是否成功。一般来说，可以使用 Stream Load 导入 10GB 以下的文件，如果文件过大，建议将文件进行切分后使用 Stream Load 进行导入。Stream Load 可以保证一批导入任务的原子性，要么全部导入成功，要么全部导入失败。
 
 :::tip
-提示
 
 相比于直接使用 `curl` 的单并发导入，更推荐使用专用导入工具 Doris Streamloader。该工具是一款用于将数据导入 Doris 数据库的专用客户端工具，可以提供**多并发导入**的功能，降低大数据量导入的耗时。点击 [Doris Streamloader 文档](../../../ecosystem/doris-streamloader) 了解使用方法与实践详情。
 :::
@@ -299,7 +298,7 @@ Stream Load 操作支持 HTTP 分块导入（HTTP chunked）与 HTTP 非分块
 | column_separator             | 用于指定导入文件中的列分隔符，默认为`\t`。如果是不可见字符，则需要加`\x`作为前缀，使用十六进制来表示分隔符。可以使用多个字符的组合作为列分隔符。例如，hive 文件的分隔符 `\x01`，需要指定命令 `-H "column_separator:\x01"`。 |
 | line_delimiter               | 用于指定导入文件中的换行符，默认为 `\n`。可以使用做多个字符的组合作为换行符。例如，指定换行符为 `\n`，需要指定命令 `-H "line_delimiter:\n"`。 |
 | columns                      | 用于指定导入文件中的列和 table 中的列的对应关系。如果源文件中的列正好对应表中的内容，那么是不需要指定这个字段的内容的。如果源文件与表 schema 不对应，那么需要这个字段进行一些数据转换。有两种形式 column：直接对应导入文件中的字段，直接使用字段名表示衍生列，语法为 `column_name` = expression 详细案例参考 [导入过程中数据转换](../../../data-operate/import/load-data-convert)。 |
-| where                        | 用于抽取部分数据。用户如果有需要将不需要的数据过滤掉，那么可以通过设定这个选项来达到。例如，只导入大于 k1 列等于 20180601 的数据，那么可以在导入时候指定 `-H "where: k1 = 20180601"`。 |
+| where                        | 用于抽取部分数据。用户如果有需要将不需要的数据过滤掉，那么可以通过设定这个选项来达到。例如，只导入 k1 列等于 20180601 的数据，那么可以在导入时候指定 `-H "where: k1 = 20180601"`。 |
 | max_filter_ratio             | 最大容忍可过滤（数据不规范等原因）的数据比例，默认零容忍。取值范围是 0~1。当导入的错误率超过该值，则导入失败。数据不规范不包括通过 where 条件过滤掉的行。例如，最大程度保证所有正确的数据都可以导入（容忍度 100%），需要指定命令 `-H "max_filter_ratio:1"`。 |
 | partitions                   | 用于指定这次导入所涉及的 partition。如果用户能够确定数据对应的 partition，推荐指定该项。不满足这些分区的数据将被过滤掉。例如，指定导入到 p1, p2 分区，需要指定命令 `-H "partitions: p1, p2"`。 |
 | timeout                      | 指定导入的超时时间。单位秒。默认是 600 秒。可设置范围为 1 秒 ~ 259200 秒。例如，指定导入超时时间为 1200s，需要指定命令 `-H "timeout:1200"`。 |
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/data-operate/import/data-source/kafka.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/data-operate/import/data-source/kafka.md
index 5cc00328e08..cb1b29c1a47 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/data-operate/import/data-source/kafka.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/data-operate/import/data-source/kafka.md
@@ -34,6 +34,8 @@ Doris 通过 Routine Load 持续消费 Kafka Topic 中的数据。提交 Routine
 
 Doris Kafka Connector 是将 Kafka 数据流导入 Doris 数据库的工具。用户可通过 Kafka Connect 插件轻松导入多种序列化格式（如 JSON、Avro、Protobuf），并支持解析 Debezium 组件的数据格式。更多文档请参考 [Doris Kafka Connector](../../../ecosystem/doris-kafka-connector.md)。
 
+在大多数情况下，可以直接选择 Routine Load 进行数据导入，无需集成外部组件即可消费 Kafka 数据。当需要加载 Avro、Protobuf 格式的数据，或通过 Debezium 采集的上游数据库数据时，可以使用 Doris Kafka Connector。
+
 ## 使用 Routine Load 消费 Kafka 数据
 
 ### 使用限制
@@ -99,7 +101,7 @@ mysql> select * from test_routineload_tbl;
 
 #### 多表导入
 
-对于需要同时导入多张表的场景，Kafka 中的数据需包含表名信息。支持从 Kafka 的 Value 中获取动态表名，格式为：`table_name|{"col1": "val1", "col2": "val2"}`。CSV 格式类似：`table_name|val1,val2,val3`。注意，表名必须与 Doris 中的表名一致，否则导入失败，且动态表不支持后面介绍的 column_mapping 配置。
+对于需要同时导入多张表的场景，Kafka 中的数据必须包含表名信息，格式为：`table_name|data`。例如，导入 CSV 数据时，格式应为：`table_name|val1,val2,val3`。请注意，表名必须与 Doris 中的表名完全一致，否则导入将失败，并且不支持后面介绍的 column_mapping 配置。
 
 **第 1 步：准备数据**
 
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/data-operate/import/data-source/local-file.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/data-operate/import/data-source/local-file.md
index dfd7a6b2c6d..b37a8ff0c80 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/data-operate/import/data-source/local-file.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/data-operate/import/data-source/local-file.md
@@ -32,7 +32,7 @@ Stream Load 是通过 HTTP 协议将本地文件或数据流导入到 Doris 中
 
 - **streamloader**
 
-Streamloader工具是一款用于将数据导入 Doris 数据库的专用客户端工具，底层基于Stream Load实现，可以提供多并发导入的功能，降低大数据量导入的耗时。支持并发导入CSV格式的数据，导入其他格式（JSON、Parquet 与 ORC ）时，可以同时导入多个文件，但是无法并发。更多文档参考[Streamloader](../../../ecosystem/doris-streamloader)。
+Streamloader工具是一款用于将数据导入 Doris 数据库的专用客户端工具，底层基于Stream Load实现，可以提供多文件，多并发导入的功能，降低大数据量导入的耗时。更多文档参考[Streamloader](../../../ecosystem/doris-streamloader)。
 
 - **MySQL Load**
 
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/data-operate/import/import-way/stream-load-manual.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/data-operate/import/import-way/stream-load-manual.md
index 481779f1e4c..07f868cf98c 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/data-operate/import/import-way/stream-load-manual.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/data-operate/import/import-way/stream-load-manual.md
@@ -27,7 +27,6 @@ under the License.
 Stream Load 支持通过 HTTP 协议将本地文件或数据流导入到 Doris 中。Stream Load 是一个同步导入方式，执行导入后返回导入结果，可以通过请求的返回判断导入是否成功。一般来说，可以使用 Stream Load 导入 10GB 以下的文件，如果文件过大，建议将文件进行切分后使用 Stream Load 进行导入。Stream Load 可以保证一批导入任务的原子性，要么全部导入成功，要么全部导入失败。
 
 :::tip
-提示
 
 相比于直接使用 `curl` 的单并发导入，更推荐使用专用导入工具 Doris Streamloader。该工具是一款用于将数据导入 Doris 数据库的专用客户端工具，可以提供**多并发导入**的功能，降低大数据量导入的耗时。点击 [Doris Streamloader 文档](../../../ecosystem/doris-streamloader) 了解使用方法与实践详情。
 :::
@@ -299,7 +298,7 @@ Stream Load 操作支持 HTTP 分块导入（HTTP chunked）与 HTTP 非分块
 | column_separator             | 用于指定导入文件中的列分隔符，默认为`\t`。如果是不可见字符，则需要加`\x`作为前缀，使用十六进制来表示分隔符。可以使用多个字符的组合作为列分隔符。例如，hive 文件的分隔符 `\x01`，需要指定命令 `-H "column_separator:\x01"`。 |
 | line_delimiter               | 用于指定导入文件中的换行符，默认为 `\n`。可以使用做多个字符的组合作为换行符。例如，指定换行符为 `\n`，需要指定命令 `-H "line_delimiter:\n"`。 |
 | columns                      | 用于指定导入文件中的列和 table 中的列的对应关系。如果源文件中的列正好对应表中的内容，那么是不需要指定这个字段的内容的。如果源文件与表 schema 不对应，那么需要这个字段进行一些数据转换。有两种形式 column：直接对应导入文件中的字段，直接使用字段名表示衍生列，语法为 `column_name` = expression 详细案例参考 [导入过程中数据转换](../../../data-operate/import/load-data-convert)。 |
-| where                        | 用于抽取部分数据。用户如果有需要将不需要的数据过滤掉，那么可以通过设定这个选项来达到。例如，只导入大于 k1 列等于 20180601 的数据，那么可以在导入时候指定 `-H "where: k1 = 20180601"`。 |
+| where                        | 用于抽取部分数据。用户如果有需要将不需要的数据过滤掉，那么可以通过设定这个选项来达到。例如，只导入 k1 列等于 20180601 的数据，那么可以在导入时候指定 `-H "where: k1 = 20180601"`。 |
 | max_filter_ratio             | 最大容忍可过滤（数据不规范等原因）的数据比例，默认零容忍。取值范围是 0~1。当导入的错误率超过该值，则导入失败。数据不规范不包括通过 where 条件过滤掉的行。例如，最大程度保证所有正确的数据都可以导入（容忍度 100%），需要指定命令 `-H "max_filter_ratio:1"`。 |
 | partitions                   | 用于指定这次导入所涉及的 partition。如果用户能够确定数据对应的 partition，推荐指定该项。不满足这些分区的数据将被过滤掉。例如，指定导入到 p1, p2 分区，需要指定命令 `-H "partitions: p1, p2"`。 |
 | timeout                      | 指定导入的超时时间。单位秒。默认是 600 秒。可设置范围为 1 秒 ~ 259200 秒。例如，指定导入超时时间为 1200s，需要指定命令 `-H "timeout:1200"`。 |
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/data-operate/import/data-source/kafka.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/data-operate/import/data-source/kafka.md
index 5cc00328e08..cb1b29c1a47 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/data-operate/import/data-source/kafka.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/data-operate/import/data-source/kafka.md
@@ -34,6 +34,8 @@ Doris 通过 Routine Load 持续消费 Kafka Topic 中的数据。提交 Routine
 
 Doris Kafka Connector 是将 Kafka 数据流导入 Doris 数据库的工具。用户可通过 Kafka Connect 插件轻松导入多种序列化格式（如 JSON、Avro、Protobuf），并支持解析 Debezium 组件的数据格式。更多文档请参考 [Doris Kafka Connector](../../../ecosystem/doris-kafka-connector.md)。
 
+在大多数情况下，可以直接选择 Routine Load 进行数据导入，无需集成外部组件即可消费 Kafka 数据。当需要加载 Avro、Protobuf 格式的数据，或通过 Debezium 采集的上游数据库数据时，可以使用 Doris Kafka Connector。
+
 ## 使用 Routine Load 消费 Kafka 数据
 
 ### 使用限制
@@ -99,7 +101,7 @@ mysql> select * from test_routineload_tbl;
 
 #### 多表导入
 
-对于需要同时导入多张表的场景，Kafka 中的数据需包含表名信息。支持从 Kafka 的 Value 中获取动态表名，格式为：`table_name|{"col1": "val1", "col2": "val2"}`。CSV 格式类似：`table_name|val1,val2,val3`。注意，表名必须与 Doris 中的表名一致，否则导入失败，且动态表不支持后面介绍的 column_mapping 配置。
+对于需要同时导入多张表的场景，Kafka 中的数据必须包含表名信息，格式为：`table_name|data`。例如，导入 CSV 数据时，格式应为：`table_name|val1,val2,val3`。请注意，表名必须与 Doris 中的表名完全一致，否则导入将失败，并且不支持后面介绍的 column_mapping 配置。
 
 **第 1 步：准备数据**
 
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/data-operate/import/data-source/local-file.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/data-operate/import/data-source/local-file.md
index dfd7a6b2c6d..b37a8ff0c80 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/data-operate/import/data-source/local-file.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/data-operate/import/data-source/local-file.md
@@ -32,7 +32,7 @@ Stream Load 是通过 HTTP 协议将本地文件或数据流导入到 Doris 中
 
 - **streamloader**
 
-Streamloader工具是一款用于将数据导入 Doris 数据库的专用客户端工具，底层基于Stream Load实现，可以提供多并发导入的功能，降低大数据量导入的耗时。支持并发导入CSV格式的数据，导入其他格式（JSON、Parquet 与 ORC ）时，可以同时导入多个文件，但是无法并发。更多文档参考[Streamloader](../../../ecosystem/doris-streamloader)。
+Streamloader工具是一款用于将数据导入 Doris 数据库的专用客户端工具，底层基于Stream Load实现，可以提供多文件，多并发导入的功能，降低大数据量导入的耗时。更多文档参考[Streamloader](../../../ecosystem/doris-streamloader)。
 
 - **MySQL Load**
 
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/data-operate/import/import-way/stream-load-manual.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/data-operate/import/import-way/stream-load-manual.md
index 36c3de60fc2..35385151824 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/data-operate/import/import-way/stream-load-manual.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/data-operate/import/import-way/stream-load-manual.md
@@ -27,7 +27,6 @@ under the License.
 Stream Load 支持通过 HTTP 协议将本地文件或数据流导入到 Doris 中。Stream Load 是一个同步导入方式，执行导入后返回导入结果，可以通过请求的返回判断导入是否成功。一般来说，可以使用 Stream Load 导入 10GB 以下的文件，如果文件过大，建议将文件进行切分后使用 Stream Load 进行导入。Stream Load 可以保证一批导入任务的原子性，要么全部导入成功，要么全部导入失败。
 
 :::tip
-提示
 
 相比于直接使用 `curl` 的单并发导入，更推荐使用专用导入工具 Doris Streamloader。该工具是一款用于将数据导入 Doris 数据库的专用客户端工具，可以提供**多并发导入**的功能，降低大数据量导入的耗时。点击 [Doris Streamloader 文档](../../../ecosystem/doris-streamloader) 了解使用方法与实践详情。
 :::
@@ -299,7 +298,7 @@ Stream Load 操作支持 HTTP 分块导入（HTTP chunked）与 HTTP 非分块
 | column_separator             | 用于指定导入文件中的列分隔符，默认为`\t`。如果是不可见字符，则需要加`\x`作为前缀，使用十六进制来表示分隔符。可以使用多个字符的组合作为列分隔符。例如，hive 文件的分隔符 `\x01`，需要指定命令 `-H "column_separator:\x01"`。 |
 | line_delimiter               | 用于指定导入文件中的换行符，默认为 `\n`。可以使用做多个字符的组合作为换行符。例如，指定换行符为 `\n`，需要指定命令 `-H "line_delimiter:\n"`。 |
 | columns                      | 用于指定导入文件中的列和 table 中的列的对应关系。如果源文件中的列正好对应表中的内容，那么是不需要指定这个字段的内容的。如果源文件与表 schema 不对应，那么需要这个字段进行一些数据转换。有两种形式 column：直接对应导入文件中的字段，直接使用字段名表示衍生列，语法为 `column_name` = expression 详细案例参考 [导入过程中数据转换](../../../data-operate/import/load-data-convert)。 |
-| where                        | 用于抽取部分数据。用户如果有需要将不需要的数据过滤掉，那么可以通过设定这个选项来达到。例如，只导入大于 k1 列等于 20180601 的数据，那么可以在导入时候指定 `-H "where: k1 = 20180601"`。 |
+| where                        | 用于抽取部分数据。用户如果有需要将不需要的数据过滤掉，那么可以通过设定这个选项来达到。例如，只导入 k1 列等于 20180601 的数据，那么可以在导入时候指定 `-H "where: k1 = 20180601"`。 |
 | max_filter_ratio             | 最大容忍可过滤（数据不规范等原因）的数据比例，默认零容忍。取值范围是 0~1。当导入的错误率超过该值，则导入失败。数据不规范不包括通过 where 条件过滤掉的行。例如，最大程度保证所有正确的数据都可以导入（容忍度 100%），需要指定命令 `-H "max_filter_ratio:1"`。 |
 | partitions                   | 用于指定这次导入所涉及的 partition。如果用户能够确定数据对应的 partition，推荐指定该项。不满足这些分区的数据将被过滤掉。例如，指定导入到 p1, p2 分区，需要指定命令 `-H "partitions: p1, p2"`。 |
 | timeout                      | 指定导入的超时时间。单位秒。默认是 600 秒。可设置范围为 1 秒 ~ 259200 秒。例如，指定导入超时时间为 1200s，需要指定命令 `-H "timeout:1200"`。 |
diff --git a/versioned_docs/version-2.1/data-operate/import/data-source/kafka.md b/versioned_docs/version-2.1/data-operate/import/data-source/kafka.md
index c84d72d8758..51254997a11 100644
--- a/versioned_docs/version-2.1/data-operate/import/data-source/kafka.md
+++ b/versioned_docs/version-2.1/data-operate/import/data-source/kafka.md
@@ -34,6 +34,8 @@ Doris continuously consumes data from Kafka Topics through Routine Load. After s
 
 The Doris Kafka Connector is a tool for loading Kafka data streams into the Doris database. Users can easily load various serialization formats (such as JSON, Avro, Protobuf) through the Kafka Connect plugin, and it supports parsing data formats from the Debezium component. For more documentation, please refer to [Doris Kafka Connector](../../../ecosystem/doris-kafka-connector.md).
 
+In most cases, you can directly choose Routine Load for loading data without the need to integrate external components to consume Kafka data. When you need to load data in Avro or Protobuf formats, or data collected from upstream databases via Debezium, you can use the Doris Kafka Connector.
+
 ## Using Routine Load to consume Kafka data
 
 ### Usage Restrictions
@@ -99,7 +101,7 @@ mysql> select * from test_routineload_tbl;
 
 #### Multi-Table Load
 
-For scenarios that require loading multiple tables simultaneously, the data in Kafka must contain table name information. It supports obtaining dynamic table names from the Kafka Value, formatted as: `table_name|{"col1": "val1", "col2": "val2"}`. The CSV format is similar: `table_name|val1,val2,val3`. Note that the table name must match the table name in Doris; otherwise, the load will fail, and dynamic tables do not support the column_mapping configuration introduced later.
+In scenarios where multiple tables need to be loaded simultaneously, the data in Kafka must include table name information, formatted as: `table_name|data`. For example, when loading CSV data, the format should be: `table_name|val1,val2,val3`. Please note that the table name must exactly match the table name in Doris; otherwise, the loading will fail, and the column_mapping configuration introduced later is not supported.
 
 **Step 1: Prepare Data**
 
diff --git a/versioned_docs/version-2.1/data-operate/import/data-source/local-file.md b/versioned_docs/version-2.1/data-operate/import/data-source/local-file.md
index b62818d7868..993b401cc24 100644
--- a/versioned_docs/version-2.1/data-operate/import/data-source/local-file.md
+++ b/versioned_docs/version-2.1/data-operate/import/data-source/local-file.md
@@ -32,7 +32,7 @@ Load local files or data streams into Doris via HTTP protocol. Supports CSV, JSO
 
 ### 2. Streamloader Tool
 
-Streamloader is a dedicated client tool based on Stream Load, supporting concurrent loads, making it suitable for large data loads. For more information, refer to the [Streamloader documentation](../../../ecosystem/doris-streamloader).
+The Streamloader tool is a dedicated client tool for loading data into the Doris database, based on Stream Load. It can provide multi-file and multi-concurrent load capabilities, reducing the time required for loading large volumes of data. For more documentation, refer to [Streamloader](../../../ecosystem/doris-streamloader).
 
 ### 3. MySQL Load
 
diff --git a/versioned_docs/version-3.0/data-operate/import/data-source/kafka.md b/versioned_docs/version-3.0/data-operate/import/data-source/kafka.md
index c84d72d8758..51254997a11 100644
--- a/versioned_docs/version-3.0/data-operate/import/data-source/kafka.md
+++ b/versioned_docs/version-3.0/data-operate/import/data-source/kafka.md
@@ -34,6 +34,8 @@ Doris continuously consumes data from Kafka Topics through Routine Load. After s
 
 The Doris Kafka Connector is a tool for loading Kafka data streams into the Doris database. Users can easily load various serialization formats (such as JSON, Avro, Protobuf) through the Kafka Connect plugin, and it supports parsing data formats from the Debezium component. For more documentation, please refer to [Doris Kafka Connector](../../../ecosystem/doris-kafka-connector.md).
 
+In most cases, you can directly choose Routine Load for loading data without the need to integrate external components to consume Kafka data. When you need to load data in Avro or Protobuf formats, or data collected from upstream databases via Debezium, you can use the Doris Kafka Connector.
+
 ## Using Routine Load to consume Kafka data
 
 ### Usage Restrictions
@@ -99,7 +101,7 @@ mysql> select * from test_routineload_tbl;
 
 #### Multi-Table Load
 
-For scenarios that require loading multiple tables simultaneously, the data in Kafka must contain table name information. It supports obtaining dynamic table names from the Kafka Value, formatted as: `table_name|{"col1": "val1", "col2": "val2"}`. The CSV format is similar: `table_name|val1,val2,val3`. Note that the table name must match the table name in Doris; otherwise, the load will fail, and dynamic tables do not support the column_mapping configuration introduced later.
+In scenarios where multiple tables need to be loaded simultaneously, the data in Kafka must include table name information, formatted as: `table_name|data`. For example, when loading CSV data, the format should be: `table_name|val1,val2,val3`. Please note that the table name must exactly match the table name in Doris; otherwise, the loading will fail, and the column_mapping configuration introduced later is not supported.
 
 **Step 1: Prepare Data**
 
diff --git a/versioned_docs/version-3.0/data-operate/import/data-source/local-file.md b/versioned_docs/version-3.0/data-operate/import/data-source/local-file.md
index b62818d7868..993b401cc24 100644
--- a/versioned_docs/version-3.0/data-operate/import/data-source/local-file.md
+++ b/versioned_docs/version-3.0/data-operate/import/data-source/local-file.md
@@ -32,7 +32,7 @@ Load local files or data streams into Doris via HTTP protocol. Supports CSV, JSO
 
 ### 2. Streamloader Tool
 
-Streamloader is a dedicated client tool based on Stream Load, supporting concurrent loads, making it suitable for large data loads. For more information, refer to the [Streamloader documentation](../../../ecosystem/doris-streamloader).
+The Streamloader tool is a dedicated client tool for loading data into the Doris database, based on Stream Load. It can provide multi-file and multi-concurrent load capabilities, reducing the time required for loading large volumes of data. For more documentation, refer to [Streamloader](../../../ecosystem/doris-streamloader).
 
 ### 3. MySQL Load
 

