This is an automated email from the ASF dual-hosted git repository.

luzhijing pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris.git

The following commit(s) were added to refs/heads/master by this push:
     new 72dbd052c11 [docs](fix) Fix invalid link and unified name of Doris Streamloader (#30696)
72dbd052c11 is described below

commit 72dbd052c113e267277c047fcd8bc011ea476478
Author: KassieZ <139741991+kass...@users.noreply.github.com>
AuthorDate: Thu Feb 1 17:09:30 2024 +0800

    [docs](fix) Fix invalid link and unified name of Doris Streamloader (#30696)
---
 .../docs/data-operate/import/import-way/stream-load-manual.md | 4 ++--
 docs/en/docs/ecosystem/doris-streamloader.md | 10 +++++-----
 .../docs/data-operate/import/import-way/stream-load-manual.md | 4 ++--
 docs/zh-CN/docs/ecosystem/doris-streamloader.md | 8 +++++---
 4 files changed, 14 insertions(+), 12 deletions(-)

diff --git a/docs/en/docs/data-operate/import/import-way/stream-load-manual.md b/docs/en/docs/data-operate/import/import-way/stream-load-manual.md
index e7aefde9357..cab3ad56be2 100644
--- a/docs/en/docs/data-operate/import/import-way/stream-load-manual.md
+++ b/docs/en/docs/data-operate/import/import-way/stream-load-manual.md
@@ -32,7 +32,7 @@ Stream load is mainly suitable for importing local files or data from data strea
 :::tip
-In comparison to single-threaded load using `curl`, Doris-Streamloader is a client tool designed for loading data into Apache Doris. it reduces the ingestion latency of large datasets by its concurrent loading capabilities. It comes with the following features:
+In comparison to single-threaded load using `curl`, Doris Streamloader is a client tool designed for loading data into Apache Doris. it reduces the ingestion latency of large datasets by its concurrent loading capabilities. It comes with the following features:
 - **Parallel loading**: multi-threaded load for the Stream Load method. You can set the parallelism level using the `workers` parameter.
 - **Multi-file load:** simultaneously load of multiple files and directories with one shot. It supports recursive file fetching and allows you to specify file names with wildcard characters.
@@ -40,7 +40,7 @@ In comparison to single-threaded load using `curl`, Doris-Streamloader is a clie
 - **Resilience and continuity:** in case of partial load failures, it can resume data loading from the point of failure.
 - **Automatic retry mechanism:** in case of loading failures, it can automatically retry a default number of times. If the loading remains unsuccessful, it will print the command for manual retry.
-See [Doris-Streamloader](../docs/ecosystem/doris-streamloader) for detailed instructions and best practices.
+See [Doris Streamloader](https://doris.apache.org/docs/ecosystem/doris-streamloader) for detailed instructions and best practices.
 :::
 ## Basic Principles
diff --git a/docs/en/docs/ecosystem/doris-streamloader.md b/docs/en/docs/ecosystem/doris-streamloader.md
index 4dc46085241..14a61371166 100644
--- a/docs/en/docs/ecosystem/doris-streamloader.md
+++ b/docs/en/docs/ecosystem/doris-streamloader.md
@@ -1,6 +1,6 @@
 ---
 {
-    "title": "Doris-Streamloader",
+    "title": "Doris Streamloader",
     "language": "en"
 }
 ---
@@ -26,7 +26,7 @@ under the License.
 ## Overview
-Doris-Streamloader is a client tool designed for loading data into Apache Doris. In comparison to single-threaded load using `curl`, it reduces the load latency of large datasets by its concurrent loading capabilities. It comes with the following features:
+Doris Streamloader is a client tool designed for loading data into Apache Doris. In comparison to single-threaded load using `curl`, it reduces the load latency of large datasets by its concurrent loading capabilities. It comes with the following features:
 - **Parallel loading**: multi-threaded load for the Stream Load method. You can set the parallelism level using the `workers` parameter.
 - **Multi-file load:** simultaneously load of multiple files and directories with one shot. It supports recursive file fetching and allows you to specify file names with wildcard characters.
@@ -120,7 +120,7 @@ The parameters above are required, and the following parameters are optional:
 |---|---|---|---|
 | --u | Username of the database | root | |
 | --p | Password | empty string | |
-| --compress | Whether to compress data upon HTTP transmission | false | Remain as default. Compression and decompression can increase pressure on Doris-Streamloader side and the CPU resources on Doris BE side, so it is advised to only enable this when network bandwidth is constrained. |
+| --compress | Whether to compress data upon HTTP transmission | false | Remain as default. Compression and decompression can increase pressure on Doris Streamloader side and the CPU resources on Doris BE side, so it is advised to only enable this when network bandwidth is constrained. |
 |--timeout | Timeout of the HTTP request sent to Doris (seconds) | 60\*60\*10 | Remain as default |
 | --batch | Granularity of batch reading and sending of files (rows) | 4096 | Remain as default |
 | --batch_byte | Granularity of batch reading and sending of files (byte) | 943718400 (900MB) | Remain as default |
@@ -238,8 +238,8 @@ In most cases, you only need to set the required parameters and `workers`.
 ### FAQ
-- Before resumable loading was available, to fix any partial failures in loading would require deleting the current table and starting over. In this case, Doris-Streamloader would retry automatically. If the retry fails, a retry command will be printed so you can copy and execute it.
-- The default maximum data loading size for Doris-Streamloader is limited by BE config `streaming_load_max_mb` (default: 100GB). If you don't want to restart BE, you can also dial down `max_byte_per_task`.
+- Before resumable loading was available, to fix any partial failures in loading would require deleting the current table and starting over. In this case, Doris Streamloader would retry automatically. If the retry fails, a retry command will be printed so you can copy and execute it.
+- The default maximum data loading size for Doris Streamloader is limited by BE config `streaming_load_max_mb` (default: 100GB). If you don't want to restart BE, you can also dial down `max_byte_per_task`.
 To show current `streaming_load_max_mb`:
diff --git a/docs/zh-CN/docs/data-operate/import/import-way/stream-load-manual.md b/docs/zh-CN/docs/data-operate/import/import-way/stream-load-manual.md
index 69e47407906..78393892bdb 100644
--- a/docs/zh-CN/docs/data-operate/import/import-way/stream-load-manual.md
+++ b/docs/zh-CN/docs/data-operate/import/import-way/stream-load-manual.md
@@ -31,7 +31,7 @@ Stream load 是一个同步的导入方式,用户通过发送 HTTP 协议发
 Stream load 主要适用于导入本地文件,或通过程序导入数据流中的数据。
 :::tip
-相比于直接使用 `curl` 的单并发导入,更推荐使用 **专用导入工具 Doris-Streamloader** 该工具是一款用于将数据导入 Doris 数据库的专用客户端工具,可以提供 **多并发导入** 的功能,降低大数据量导入的耗时。拥有以下功能:
+相比于直接使用 `curl` 的单并发导入,更推荐使用 **专用导入工具 Doris Streamloader** 该工具是一款用于将数据导入 Doris 数据库的专用客户端工具,可以提供 **多并发导入** 的功能,降低大数据量导入的耗时。拥有以下功能:
 - 并发导入,实现 Stream Load 的多并发导入。可以通过 `workers` 值设置并发数。
 - 多文件导入,一次导入可以同时导入多个文件及目录,支持设置通配符以及会自动递归获取文件夹下的所有文件。
@@ -39,7 +39,7 @@ Stream load 主要适用于导入本地文件,或通过程序导入数据流
 - 自动重传,在导入出现失败的情况后,无需手动重传,工具会自动重传默认的次数,如果仍然不成功,打印出手动重传的命令。
-点击 [Doris-Streamloader 文档](../docs/ecosystem/doris-streamloader)了解使用方法与实践详情。
+点击 [Doris Streamloader 文档](https://doris.apache.org/zh-CN/docs/ecosystem/doris-streamloader) 了解使用方法与实践详情。
 :::
 ## 基本原理
diff --git a/docs/zh-CN/docs/ecosystem/doris-streamloader.md b/docs/zh-CN/docs/ecosystem/doris-streamloader.md
index 40e7b1b3fbc..96ca55e5aa6 100644
--- a/docs/zh-CN/docs/ecosystem/doris-streamloader.md
+++ b/docs/zh-CN/docs/ecosystem/doris-streamloader.md
@@ -1,7 +1,7 @@
 ---
 {
-    "title": "Doris-Streamloader",
+    "title": "Doris Streamloader",
     "language": "zh-CN"
 }
 ---
@@ -27,7 +27,7 @@ under the License.
 ## 概述
-[Doris-Streamloader](https://github.com/apache/doris-streamloader) 是一款用于将数据导入 Doris 数据库的专用客户端工具。相比于直接使用 `curl` 的单并发导入,该工具可以提供多并发导入的功能,降低大数据量导入的耗时。拥有以下功能:
+[Doris Streamloader](https://github.com/apache/doris-streamloader) 是一款用于将数据导入 Doris 数据库的专用客户端工具。相比于直接使用 `curl` 的单并发导入,该工具可以提供多并发导入的功能,降低大数据量导入的耗时。拥有以下功能:
 - 并发导入,实现 Stream Load 的多并发导入。可以通过 workers 值设置并发数。
 - 多文件导入,一次导入可以同时导入多个文件及目录,支持设置通配符以及会自动递归获取文件夹下的所有文件。
@@ -45,6 +45,7 @@ under the License.
 | v1.0 | 20240131 | x64 | https://apache-doris-releases.oss-accelerate.aliyuncs.com/apache-doris-streamloader-1.0.1-bin-x64.tar.xz|
 | v1.0 | 20240131 | arm64 | https://apache-doris-releases.oss-accelerate.aliyuncs.com/apache-doris-streamloader-1.0.1-bin-arm64.tar.xz|
+
 :::note
 获取结果即为可执行二进制。
 :::
@@ -101,7 +102,7 @@ doris-streamloader --source_file={FILE_LIST} --url={FE_OR_BE_SERVER_URL}:{PORT}
 ```
 :::tip
-当需要多个文件导入时,使用 Doris-Streamloader 也只会产生一个版本号
+当需要多个文件导入时,使用 Doris Streamloader 也只会产生一个版本号
 :::
@@ -240,6 +241,7 @@ Load Result: {
 查看 `streaming_load_max_mb` 大小的方法:
+
 ```Go
 -curl "http://127.0.0.1:8040/api/show_config"
 ```

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org