This is an automated email from the ASF dual-hosted git repository.

morningman pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris-website.git


The following commit(s) were added to refs/heads/master by this push:
     new d182b35eefc [opt](lake) add optimization doc (#3001)
d182b35eefc is described below

commit d182b35eefc8758d9de2bfc75dc70d7f0230592e
Author: Mingyu Chen (Rayner) <[email protected]>
AuthorDate: Thu Oct 23 17:09:33 2025 +0800

    [opt](lake) add optimization doc (#3001)
    
    ## Versions
    
    - [x] dev
    - [x] 3.x
    - [x] 2.1
    - [ ] 2.0
    
    ## Languages
    
    - [x] Chinese
    - [x] English
    
    ## Docs Checklist
    
    - [ ] Checked by AI
    - [ ] Test Cases Built
---
 docs/lakehouse/best-practices/optimization.md      | 96 ++++++++++++++++++++++
 .../lakehouse/best-practices/optimization.md       | 96 ++++++++++++++++++++++
 .../lakehouse/best-practices/optimization.md       | 96 ++++++++++++++++++++++
 .../lakehouse/best-practices/optimization.md       | 96 ++++++++++++++++++++++
 sidebars.json                                      |  1 +
 .../lakehouse/best-practices/optimization.md       | 96 ++++++++++++++++++++++
 .../lakehouse/best-practices/optimization.md       | 96 ++++++++++++++++++++++
 versioned_sidebars/version-2.1-sidebars.json       |  3 +-
 versioned_sidebars/version-3.x-sidebars.json       |  3 +-
 9 files changed, 581 insertions(+), 2 deletions(-)

diff --git a/docs/lakehouse/best-practices/optimization.md 
b/docs/lakehouse/best-practices/optimization.md
new file mode 100644
index 00000000000..b7877d50e9d
--- /dev/null
+++ b/docs/lakehouse/best-practices/optimization.md
@@ -0,0 +1,96 @@
+---
+{
+"title": "Data Lake Query Optimization",
+"language": "en"
+}
+---
+
+This document introduces optimization methods and strategies for querying data lake tables (Hive, Iceberg, Paimon, etc.).
+
+## Partition Pruning
+
+By specifying partition column conditions in queries, unnecessary partitions 
can be pruned, reducing the amount of data that needs to be read.
+
+You can use `EXPLAIN <SQL>` to view the `partition` section of `XXX_SCAN_NODE` 
to check whether partition pruning is effective and how many partitions need to 
be scanned in this query.
+
+For example:
+
+```
+0:VPAIMON_SCAN_NODE(88)
+    table: paimon_ctl.db.table
+    predicates: (user_id[#4] = 431304818)
+    inputSplitNum=15775, totalFileSize=951754154566, scanRanges=15775
+    partition=203/0
+```
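+
+As a minimal sketch (the `dt` partition column and the date value are illustrative, not taken from a real table), adding a partition condition to the `WHERE` clause lets the planner prune partitions, which you can confirm via `EXPLAIN`:
+
+```
+-- assumes the table is partitioned by `dt`; adjust names to your schema
+EXPLAIN SELECT count(*)
+FROM paimon_ctl.db.table
+WHERE dt = '2025-10-01' AND user_id = 431304818;
+```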
+
+## Local Data Cache
+
+Data Cache accelerates subsequent queries accessing the same data by caching 
recently accessed data files from remote storage systems (HDFS or object 
storage) to local disk.
+
+The cache feature is disabled by default. Please refer to the [Data 
Cache](../data-cache.md) documentation to configure and enable it.
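+
+As a rough sketch (the variable name below is an assumption; the linked Data Cache document is authoritative), enabling the cache for the current session may look like:
+
+```
+-- enable the local data cache for this session; verify the exact switch in the Data Cache doc
+SET enable_file_cache = true;
+```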
+
+## HDFS Read Optimization
+
+In some cases, high load on HDFS may cause long delays when reading data replicas, slowing down overall query efficiency. The HDFS client provides a Hedged Read feature: if a read request has not returned within a certain threshold, another read thread is started to read the same data, and whichever result returns first is used.
+
+Note: This feature may increase the load on the HDFS cluster; use it judiciously.
+
+You can enable this feature as follows:
+
+```
+CREATE CATALOG regression PROPERTIES (
+    'type' = 'hms',
+    'hive.metastore.uris' = 'thrift://172.21.16.47:7004',
+    'dfs.client.hedged.read.threadpool.size' = '128',
+    'dfs.client.hedged.read.threshold.millis' = '500'
+);
+```
+
+- `dfs.client.hedged.read.threadpool.size`: The number of threads used for Hedged Read; the threads are shared within a single HDFS client. Typically, a BE node shares one HDFS client per HDFS cluster.
+
+- `dfs.client.hedged.read.threshold.millis`: The read threshold in 
milliseconds. When a read request exceeds this threshold without returning, 
Hedged Read is triggered.
+
+After enabling it, you can see the related counters in the Query Profile:
+
+- `TotalHedgedRead`: Number of times Hedged Read was initiated.
+
+- `HedgedReadWins`: Number of times a Hedged Read succeeded (it was initiated and returned faster than the original request).
+
+Note that these values are cumulative for a single HDFS client, not for a single query. The same HDFS client is reused by multiple queries.
+
+## Merge IO Optimization
+
+For remote storage systems such as HDFS and object storage, Doris optimizes IO access through Merge IO, which merges multiple adjacent small IO requests into one large request. This reduces IOPS and increases IO throughput.
+
+For example, suppose the original request needs to read parts [0, 10] and [20, 50] of file `file1`:
+
+```
+Request Range: [0, 10], [20, 50]
+```
+
+Through Merge IO, they are merged into one request:
+
+```
+Request Range: [0, 50]
+```
+
+In this example, two IO requests are merged into one, but some extra data is also read (the bytes between 10 and 20). So while Merge IO reduces the number of IO operations, it may introduce read amplification.
+
+You can view Merge IO details in the Query Profile:
+
+```
+- MergedSmallIO:
+    - MergedBytes: 3.00 GB
+    - MergedIO: 424
+    - RequestBytes: 2.50 GB
+    - RequestIO: 65.555K (65555)
+```
+
+`RequestBytes` and `RequestIO` are the data volume and request count of the original requests, while `MergedBytes` and `MergedIO` are the data volume and request count after merging.
+
+If you find that `MergedBytes` is much larger than `RequestBytes`, read amplification is severe. You can tune this with the following parameter:
+
+- `merge_io_read_slice_size_bytes`
+
+    Session variable, supported since version 3.1.3. The default is 8MB. If read amplification is severe, you can reduce this parameter, for example to 64KB, and observe whether IO requests and query latency improve after the change.
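+
+For example, a sketch of lowering the slice size for the current session (64KB = 65536 bytes):
+
+```
+-- reduce the merge slice size to 64KB, then re-check MergedBytes vs RequestBytes in the Profile
+SET merge_io_read_slice_size_bytes = 65536;
+```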
\ No newline at end of file
diff --git 
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/best-practices/optimization.md
 
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/best-practices/optimization.md
new file mode 100644
index 00000000000..14e53738cae
--- /dev/null
+++ 
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/best-practices/optimization.md
@@ -0,0 +1,96 @@
+---
+{
+"title": "Data Lake Query Optimization",
+"language": "zh-CN"
+}
+---
+
+This document introduces optimization methods and strategies for querying data lake tables (Hive, Iceberg, Paimon, etc.).
+
+## Partition Pruning
+
+By specifying partition column conditions in queries, unnecessary partitions can be pruned, reducing the amount of data that needs to be read.
+
+You can use `EXPLAIN <SQL>` to view the `partition` section of `XXX_SCAN_NODE` to check whether partition pruning is effective and how many partitions the query needs to scan.
+
+For example:
+
+```
+0:VPAIMON_SCAN_NODE(88)
+    table: paimon_ctl.db.table
+    predicates: (user_id[#4] = 431304818)
+    inputSplitNum=15775, totalFileSize=951754154566, scanRanges=15775
+    partition=203/0
+```
+
+## Local Data Cache
+
+Data Cache accelerates subsequent queries that access the same data by caching recently accessed data files from remote storage systems (HDFS or object storage) to local disk.
+
+The cache feature is disabled by default. Please refer to the [Data Cache](../data-cache.md) documentation to configure and enable it.
+
+## HDFS Read Optimization
+
+In some cases, high load on HDFS may cause long delays when reading data replicas, slowing down overall query efficiency. The HDFS client provides a Hedged Read feature: if a read request has not returned within a certain threshold, another read thread is started to read the same data, and whichever result returns first is used.
+
+Note: This feature may increase the load on the HDFS cluster; use it judiciously.
+
+You can enable this feature as follows:
+
+```
+CREATE CATALOG regression PROPERTIES (
+    'type' = 'hms',
+    'hive.metastore.uris' = 'thrift://172.21.16.47:7004',
+    'dfs.client.hedged.read.threadpool.size' = '128',
+    'dfs.client.hedged.read.threshold.millis' = '500'
+);
+```
+
+- `dfs.client.hedged.read.threadpool.size`: The number of threads used for Hedged Read; the threads are shared within a single HDFS client. Typically, a BE node shares one HDFS client per HDFS cluster.
+
+- `dfs.client.hedged.read.threshold.millis`: The read threshold in milliseconds. When a read request exceeds this threshold without returning, Hedged Read is triggered.
+
+After enabling it, you can see the related counters in the Query Profile:
+
+- `TotalHedgedRead`: Number of times Hedged Read was initiated.
+
+- `HedgedReadWins`: Number of times a Hedged Read succeeded (it was initiated and returned faster than the original request).
+
+Note that these values are cumulative for a single HDFS client, not for a single query. The same HDFS client is reused by multiple queries.
+
+## Merge IO Optimization
+
+For remote storage systems such as HDFS and object storage, Doris optimizes IO access through Merge IO, which merges multiple adjacent small IO requests into one large request. This reduces IOPS and increases IO throughput.
+
+For example, suppose the original request needs to read parts [0, 10] and [20, 50] of file `file1`:
+
+```
+Request Range: [0, 10], [20, 50]
+```
+
+Through Merge IO, they are merged into one request:
+
+```
+Request Range: [0, 50]
+```
+
+In this example, two IO requests are merged into one, but some extra data is also read (the bytes between 10 and 20). So while Merge IO reduces the number of IO operations, it may introduce read amplification.
+
+You can view Merge IO details in the Query Profile:
+
+```
+- MergedSmallIO:
+    - MergedBytes: 3.00 GB
+    - MergedIO: 424
+    - RequestBytes: 2.50 GB
+    - RequestIO: 65.555K (65555)
+```
+
+`RequestBytes` and `RequestIO` are the data volume and request count of the original requests, while `MergedBytes` and `MergedIO` are the data volume and request count after merging.
+
+If you find that `MergedBytes` is much larger than `RequestBytes`, read amplification is severe. You can tune this with the following parameter:
+
+- `merge_io_read_slice_size_bytes`
+
+    Session variable, supported since version 3.1.3. The default is 8MB. If read amplification is severe, you can reduce this parameter, for example to 64KB, and observe whether IO requests and query latency improve after the change.
\ No newline at end of file
diff --git 
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse/best-practices/optimization.md
 
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse/best-practices/optimization.md
new file mode 100644
index 00000000000..14e53738cae
--- /dev/null
+++ 
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse/best-practices/optimization.md
@@ -0,0 +1,96 @@
+---
+{
+"title": "Data Lake Query Optimization",
+"language": "zh-CN"
+}
+---
+
+This document introduces optimization methods and strategies for querying data lake tables (Hive, Iceberg, Paimon, etc.).
+
+## Partition Pruning
+
+By specifying partition column conditions in queries, unnecessary partitions can be pruned, reducing the amount of data that needs to be read.
+
+You can use `EXPLAIN <SQL>` to view the `partition` section of `XXX_SCAN_NODE` to check whether partition pruning is effective and how many partitions the query needs to scan.
+
+For example:
+
+```
+0:VPAIMON_SCAN_NODE(88)
+    table: paimon_ctl.db.table
+    predicates: (user_id[#4] = 431304818)
+    inputSplitNum=15775, totalFileSize=951754154566, scanRanges=15775
+    partition=203/0
+```
+
+## Local Data Cache
+
+Data Cache accelerates subsequent queries that access the same data by caching recently accessed data files from remote storage systems (HDFS or object storage) to local disk.
+
+The cache feature is disabled by default. Please refer to the [Data Cache](../data-cache.md) documentation to configure and enable it.
+
+## HDFS Read Optimization
+
+In some cases, high load on HDFS may cause long delays when reading data replicas, slowing down overall query efficiency. The HDFS client provides a Hedged Read feature: if a read request has not returned within a certain threshold, another read thread is started to read the same data, and whichever result returns first is used.
+
+Note: This feature may increase the load on the HDFS cluster; use it judiciously.
+
+You can enable this feature as follows:
+
+```
+CREATE CATALOG regression PROPERTIES (
+    'type' = 'hms',
+    'hive.metastore.uris' = 'thrift://172.21.16.47:7004',
+    'dfs.client.hedged.read.threadpool.size' = '128',
+    'dfs.client.hedged.read.threshold.millis' = '500'
+);
+```
+
+- `dfs.client.hedged.read.threadpool.size`: The number of threads used for Hedged Read; the threads are shared within a single HDFS client. Typically, a BE node shares one HDFS client per HDFS cluster.
+
+- `dfs.client.hedged.read.threshold.millis`: The read threshold in milliseconds. When a read request exceeds this threshold without returning, Hedged Read is triggered.
+
+After enabling it, you can see the related counters in the Query Profile:
+
+- `TotalHedgedRead`: Number of times Hedged Read was initiated.
+
+- `HedgedReadWins`: Number of times a Hedged Read succeeded (it was initiated and returned faster than the original request).
+
+Note that these values are cumulative for a single HDFS client, not for a single query. The same HDFS client is reused by multiple queries.
+
+## Merge IO Optimization
+
+For remote storage systems such as HDFS and object storage, Doris optimizes IO access through Merge IO, which merges multiple adjacent small IO requests into one large request. This reduces IOPS and increases IO throughput.
+
+For example, suppose the original request needs to read parts [0, 10] and [20, 50] of file `file1`:
+
+```
+Request Range: [0, 10], [20, 50]
+```
+
+Through Merge IO, they are merged into one request:
+
+```
+Request Range: [0, 50]
+```
+
+In this example, two IO requests are merged into one, but some extra data is also read (the bytes between 10 and 20). So while Merge IO reduces the number of IO operations, it may introduce read amplification.
+
+You can view Merge IO details in the Query Profile:
+
+```
+- MergedSmallIO:
+    - MergedBytes: 3.00 GB
+    - MergedIO: 424
+    - RequestBytes: 2.50 GB
+    - RequestIO: 65.555K (65555)
+```
+
+`RequestBytes` and `RequestIO` are the data volume and request count of the original requests, while `MergedBytes` and `MergedIO` are the data volume and request count after merging.
+
+If you find that `MergedBytes` is much larger than `RequestBytes`, read amplification is severe. You can tune this with the following parameter:
+
+- `merge_io_read_slice_size_bytes`
+
+    Session variable, supported since version 3.1.3. The default is 8MB. If read amplification is severe, you can reduce this parameter, for example to 64KB, and observe whether IO requests and query latency improve after the change.
\ No newline at end of file
diff --git 
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/lakehouse/best-practices/optimization.md
 
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/lakehouse/best-practices/optimization.md
new file mode 100644
index 00000000000..14e53738cae
--- /dev/null
+++ 
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/lakehouse/best-practices/optimization.md
@@ -0,0 +1,96 @@
+---
+{
+"title": "Data Lake Query Optimization",
+"language": "zh-CN"
+}
+---
+
+This document introduces optimization methods and strategies for querying data lake tables (Hive, Iceberg, Paimon, etc.).
+
+## Partition Pruning
+
+By specifying partition column conditions in queries, unnecessary partitions can be pruned, reducing the amount of data that needs to be read.
+
+You can use `EXPLAIN <SQL>` to view the `partition` section of `XXX_SCAN_NODE` to check whether partition pruning is effective and how many partitions the query needs to scan.
+
+For example:
+
+```
+0:VPAIMON_SCAN_NODE(88)
+    table: paimon_ctl.db.table
+    predicates: (user_id[#4] = 431304818)
+    inputSplitNum=15775, totalFileSize=951754154566, scanRanges=15775
+    partition=203/0
+```
+
+## Local Data Cache
+
+Data Cache accelerates subsequent queries that access the same data by caching recently accessed data files from remote storage systems (HDFS or object storage) to local disk.
+
+The cache feature is disabled by default. Please refer to the [Data Cache](../data-cache.md) documentation to configure and enable it.
+
+## HDFS Read Optimization
+
+In some cases, high load on HDFS may cause long delays when reading data replicas, slowing down overall query efficiency. The HDFS client provides a Hedged Read feature: if a read request has not returned within a certain threshold, another read thread is started to read the same data, and whichever result returns first is used.
+
+Note: This feature may increase the load on the HDFS cluster; use it judiciously.
+
+You can enable this feature as follows:
+
+```
+CREATE CATALOG regression PROPERTIES (
+    'type' = 'hms',
+    'hive.metastore.uris' = 'thrift://172.21.16.47:7004',
+    'dfs.client.hedged.read.threadpool.size' = '128',
+    'dfs.client.hedged.read.threshold.millis' = '500'
+);
+```
+
+- `dfs.client.hedged.read.threadpool.size`: The number of threads used for Hedged Read; the threads are shared within a single HDFS client. Typically, a BE node shares one HDFS client per HDFS cluster.
+
+- `dfs.client.hedged.read.threshold.millis`: The read threshold in milliseconds. When a read request exceeds this threshold without returning, Hedged Read is triggered.
+
+After enabling it, you can see the related counters in the Query Profile:
+
+- `TotalHedgedRead`: Number of times Hedged Read was initiated.
+
+- `HedgedReadWins`: Number of times a Hedged Read succeeded (it was initiated and returned faster than the original request).
+
+Note that these values are cumulative for a single HDFS client, not for a single query. The same HDFS client is reused by multiple queries.
+
+## Merge IO Optimization
+
+For remote storage systems such as HDFS and object storage, Doris optimizes IO access through Merge IO, which merges multiple adjacent small IO requests into one large request. This reduces IOPS and increases IO throughput.
+
+For example, suppose the original request needs to read parts [0, 10] and [20, 50] of file `file1`:
+
+```
+Request Range: [0, 10], [20, 50]
+```
+
+Through Merge IO, they are merged into one request:
+
+```
+Request Range: [0, 50]
+```
+
+In this example, two IO requests are merged into one, but some extra data is also read (the bytes between 10 and 20). So while Merge IO reduces the number of IO operations, it may introduce read amplification.
+
+You can view Merge IO details in the Query Profile:
+
+```
+- MergedSmallIO:
+    - MergedBytes: 3.00 GB
+    - MergedIO: 424
+    - RequestBytes: 2.50 GB
+    - RequestIO: 65.555K (65555)
+```
+
+`RequestBytes` and `RequestIO` are the data volume and request count of the original requests, while `MergedBytes` and `MergedIO` are the data volume and request count after merging.
+
+If you find that `MergedBytes` is much larger than `RequestBytes`, read amplification is severe. You can tune this with the following parameter:
+
+- `merge_io_read_slice_size_bytes`
+
+    Session variable, supported since version 3.1.3. The default is 8MB. If read amplification is severe, you can reduce this parameter, for example to 64KB, and observe whether IO requests and query latency improve after the change.
\ No newline at end of file
diff --git a/sidebars.json b/sidebars.json
index 91061e23cb0..7b90fc3b0b1 100644
--- a/sidebars.json
+++ b/sidebars.json
@@ -500,6 +500,7 @@
                             "type": "category",
                             "label": "Lakehouse Best Practices",
                             "items": [
+                                "lakehouse/best-practices/optimization",
                                 "lakehouse/best-practices/doris-hudi",
                                 "lakehouse/best-practices/doris-paimon",
                                 "lakehouse/best-practices/doris-iceberg",
diff --git 
a/versioned_docs/version-2.1/lakehouse/best-practices/optimization.md 
b/versioned_docs/version-2.1/lakehouse/best-practices/optimization.md
new file mode 100644
index 00000000000..b7877d50e9d
--- /dev/null
+++ b/versioned_docs/version-2.1/lakehouse/best-practices/optimization.md
@@ -0,0 +1,96 @@
+---
+{
+"title": "Data Lake Query Optimization",
+"language": "en"
+}
+---
+
+This document introduces optimization methods and strategies for querying data lake tables (Hive, Iceberg, Paimon, etc.).
+
+## Partition Pruning
+
+By specifying partition column conditions in queries, unnecessary partitions 
can be pruned, reducing the amount of data that needs to be read.
+
+You can use `EXPLAIN <SQL>` to view the `partition` section of `XXX_SCAN_NODE` 
to check whether partition pruning is effective and how many partitions need to 
be scanned in this query.
+
+For example:
+
+```
+0:VPAIMON_SCAN_NODE(88)
+    table: paimon_ctl.db.table
+    predicates: (user_id[#4] = 431304818)
+    inputSplitNum=15775, totalFileSize=951754154566, scanRanges=15775
+    partition=203/0
+```
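+
+As a minimal sketch (the `dt` partition column and the date value are illustrative, not taken from a real table), adding a partition condition to the `WHERE` clause lets the planner prune partitions, which you can confirm via `EXPLAIN`:
+
+```
+-- assumes the table is partitioned by `dt`; adjust names to your schema
+EXPLAIN SELECT count(*)
+FROM paimon_ctl.db.table
+WHERE dt = '2025-10-01' AND user_id = 431304818;
+```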
+
+## Local Data Cache
+
+Data Cache accelerates subsequent queries accessing the same data by caching 
recently accessed data files from remote storage systems (HDFS or object 
storage) to local disk.
+
+The cache feature is disabled by default. Please refer to the [Data 
Cache](../data-cache.md) documentation to configure and enable it.
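+
+As a rough sketch (the variable name below is an assumption; the linked Data Cache document is authoritative), enabling the cache for the current session may look like:
+
+```
+-- enable the local data cache for this session; verify the exact switch in the Data Cache doc
+SET enable_file_cache = true;
+```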
+
+## HDFS Read Optimization
+
+In some cases, high load on HDFS may cause long delays when reading data replicas, slowing down overall query efficiency. The HDFS client provides a Hedged Read feature: if a read request has not returned within a certain threshold, another read thread is started to read the same data, and whichever result returns first is used.
+
+Note: This feature may increase the load on the HDFS cluster; use it judiciously.
+
+You can enable this feature as follows:
+
+```
+CREATE CATALOG regression PROPERTIES (
+    'type' = 'hms',
+    'hive.metastore.uris' = 'thrift://172.21.16.47:7004',
+    'dfs.client.hedged.read.threadpool.size' = '128',
+    'dfs.client.hedged.read.threshold.millis' = '500'
+);
+```
+
+- `dfs.client.hedged.read.threadpool.size`: The number of threads used for Hedged Read; the threads are shared within a single HDFS client. Typically, a BE node shares one HDFS client per HDFS cluster.
+
+- `dfs.client.hedged.read.threshold.millis`: The read threshold in 
milliseconds. When a read request exceeds this threshold without returning, 
Hedged Read is triggered.
+
+After enabling it, you can see the related counters in the Query Profile:
+
+- `TotalHedgedRead`: Number of times Hedged Read was initiated.
+
+- `HedgedReadWins`: Number of times a Hedged Read succeeded (it was initiated and returned faster than the original request).
+
+Note that these values are cumulative for a single HDFS client, not for a single query. The same HDFS client is reused by multiple queries.
+
+## Merge IO Optimization
+
+For remote storage systems such as HDFS and object storage, Doris optimizes IO access through Merge IO, which merges multiple adjacent small IO requests into one large request. This reduces IOPS and increases IO throughput.
+
+For example, suppose the original request needs to read parts [0, 10] and [20, 50] of file `file1`:
+
+```
+Request Range: [0, 10], [20, 50]
+```
+
+Through Merge IO, they are merged into one request:
+
+```
+Request Range: [0, 50]
+```
+
+In this example, two IO requests are merged into one, but some extra data is also read (the bytes between 10 and 20). So while Merge IO reduces the number of IO operations, it may introduce read amplification.
+
+You can view Merge IO details in the Query Profile:
+
+```
+- MergedSmallIO:
+    - MergedBytes: 3.00 GB
+    - MergedIO: 424
+    - RequestBytes: 2.50 GB
+    - RequestIO: 65.555K (65555)
+```
+
+`RequestBytes` and `RequestIO` are the data volume and request count of the original requests, while `MergedBytes` and `MergedIO` are the data volume and request count after merging.
+
+If you find that `MergedBytes` is much larger than `RequestBytes`, read amplification is severe. You can tune this with the following parameter:
+
+- `merge_io_read_slice_size_bytes`
+
+    Session variable, supported since version 3.1.3. The default is 8MB. If read amplification is severe, you can reduce this parameter, for example to 64KB, and observe whether IO requests and query latency improve after the change.
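+
+For example, a sketch of lowering the slice size for the current session (64KB = 65536 bytes):
+
+```
+-- reduce the merge slice size to 64KB, then re-check MergedBytes vs RequestBytes in the Profile
+SET merge_io_read_slice_size_bytes = 65536;
+```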
\ No newline at end of file
diff --git 
a/versioned_docs/version-3.x/lakehouse/best-practices/optimization.md 
b/versioned_docs/version-3.x/lakehouse/best-practices/optimization.md
new file mode 100644
index 00000000000..b7877d50e9d
--- /dev/null
+++ b/versioned_docs/version-3.x/lakehouse/best-practices/optimization.md
@@ -0,0 +1,96 @@
+---
+{
+"title": "Data Lake Query Optimization",
+"language": "en"
+}
+---
+
+This document introduces optimization methods and strategies for querying data lake tables (Hive, Iceberg, Paimon, etc.).
+
+## Partition Pruning
+
+By specifying partition column conditions in queries, unnecessary partitions 
can be pruned, reducing the amount of data that needs to be read.
+
+You can use `EXPLAIN <SQL>` to view the `partition` section of `XXX_SCAN_NODE` 
to check whether partition pruning is effective and how many partitions need to 
be scanned in this query.
+
+For example:
+
+```
+0:VPAIMON_SCAN_NODE(88)
+    table: paimon_ctl.db.table
+    predicates: (user_id[#4] = 431304818)
+    inputSplitNum=15775, totalFileSize=951754154566, scanRanges=15775
+    partition=203/0
+```
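+
+As a minimal sketch (the `dt` partition column and the date value are illustrative, not taken from a real table), adding a partition condition to the `WHERE` clause lets the planner prune partitions, which you can confirm via `EXPLAIN`:
+
+```
+-- assumes the table is partitioned by `dt`; adjust names to your schema
+EXPLAIN SELECT count(*)
+FROM paimon_ctl.db.table
+WHERE dt = '2025-10-01' AND user_id = 431304818;
+```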
+
+## Local Data Cache
+
+Data Cache accelerates subsequent queries accessing the same data by caching 
recently accessed data files from remote storage systems (HDFS or object 
storage) to local disk.
+
+The cache feature is disabled by default. Please refer to the [Data 
Cache](../data-cache.md) documentation to configure and enable it.
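+
+As a rough sketch (the variable name below is an assumption; the linked Data Cache document is authoritative), enabling the cache for the current session may look like:
+
+```
+-- enable the local data cache for this session; verify the exact switch in the Data Cache doc
+SET enable_file_cache = true;
+```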
+
+## HDFS Read Optimization
+
+In some cases, high load on HDFS may cause long delays when reading data replicas, slowing down overall query efficiency. The HDFS client provides a Hedged Read feature: if a read request has not returned within a certain threshold, another read thread is started to read the same data, and whichever result returns first is used.
+
+Note: This feature may increase the load on the HDFS cluster; use it judiciously.
+
+You can enable this feature as follows:
+
+```
+CREATE CATALOG regression PROPERTIES (
+    'type' = 'hms',
+    'hive.metastore.uris' = 'thrift://172.21.16.47:7004',
+    'dfs.client.hedged.read.threadpool.size' = '128',
+    'dfs.client.hedged.read.threshold.millis' = '500'
+);
+```
+
+- `dfs.client.hedged.read.threadpool.size`: The number of threads used for Hedged Read; the threads are shared within a single HDFS client. Typically, a BE node shares one HDFS client per HDFS cluster.
+
+- `dfs.client.hedged.read.threshold.millis`: The read threshold in 
milliseconds. When a read request exceeds this threshold without returning, 
Hedged Read is triggered.
+
+After enabling it, you can see the related counters in the Query Profile:
+
+- `TotalHedgedRead`: Number of times Hedged Read was initiated.
+
+- `HedgedReadWins`: Number of times a Hedged Read succeeded (it was initiated and returned faster than the original request).
+
+Note that these values are cumulative for a single HDFS client, not for a single query. The same HDFS client is reused by multiple queries.
+
+## Merge IO Optimization
+
+For remote storage systems such as HDFS and object storage, Doris optimizes IO access through Merge IO, which merges multiple adjacent small IO requests into one large request. This reduces IOPS and increases IO throughput.
+
+For example, suppose the original request needs to read parts [0, 10] and [20, 50] of file `file1`:
+
+```
+Request Range: [0, 10], [20, 50]
+```
+
+Through Merge IO, they are merged into one request:
+
+```
+Request Range: [0, 50]
+```
+
+In this example, two IO requests are merged into one, but some extra data is also read (the bytes between 10 and 20). So while Merge IO reduces the number of IO operations, it may introduce read amplification.
+
+You can view Merge IO details in the Query Profile:
+
+```
+- MergedSmallIO:
+    - MergedBytes: 3.00 GB
+    - MergedIO: 424
+    - RequestBytes: 2.50 GB
+    - RequestIO: 65.555K (65555)
+```
+
+`RequestBytes` and `RequestIO` are the data volume and request count of the original requests, while `MergedBytes` and `MergedIO` are the data volume and request count after merging.
+
+If you find that `MergedBytes` is much larger than `RequestBytes`, read amplification is severe. You can tune this with the following parameter:
+
+- `merge_io_read_slice_size_bytes`
+
+    Session variable, supported since version 3.1.3. The default is 8MB. If read amplification is severe, you can reduce this parameter, for example to 64KB, and observe whether IO requests and query latency improve after the change.
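+
+For example, a sketch of lowering the slice size for the current session (64KB = 65536 bytes):
+
+```
+-- reduce the merge slice size to 64KB, then re-check MergedBytes vs RequestBytes in the Profile
+SET merge_io_read_slice_size_bytes = 65536;
+```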
\ No newline at end of file
diff --git a/versioned_sidebars/version-2.1-sidebars.json 
b/versioned_sidebars/version-2.1-sidebars.json
index 6a4f6c6d7aa..d0be97fb724 100644
--- a/versioned_sidebars/version-2.1-sidebars.json
+++ b/versioned_sidebars/version-2.1-sidebars.json
@@ -439,6 +439,7 @@
                             "type": "category",
                             "label": "Lakehouse Best Practices",
                             "items": [
+                                "lakehouse/best-practices/optimization",
                                 "lakehouse/best-practices/doris-hudi",
                                 "lakehouse/best-practices/doris-paimon",
                                 "lakehouse/best-practices/doris-iceberg",
@@ -2228,4 +2229,4 @@
             ]
         }
     ]
-}
\ No newline at end of file
+}
diff --git a/versioned_sidebars/version-3.x-sidebars.json 
b/versioned_sidebars/version-3.x-sidebars.json
index 10114fe4588..8cd208a2963 100644
--- a/versioned_sidebars/version-3.x-sidebars.json
+++ b/versioned_sidebars/version-3.x-sidebars.json
@@ -464,6 +464,7 @@
                             "type": "category",
                             "label": "Lakehouse Best Practices",
                             "items": [
+                                "lakehouse/best-practices/optimization",
                                 "lakehouse/best-practices/doris-hudi",
                                 "lakehouse/best-practices/doris-paimon",
                                 "lakehouse/best-practices/doris-iceberg",
@@ -2335,4 +2336,4 @@
             ]
         }
     ]
-}
\ No newline at end of file
+}


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
