This is an automated email from the ASF dual-hosted git repository.

liaoxin pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris-website.git


The following commit(s) were added to refs/heads/master by this push:
     new 2a1b51d18a4 [opt](load) add load faq (#2333)
2a1b51d18a4 is described below

commit 2a1b51d18a48d580b1d3d6fdb40c7b8f6cee846c
Author: Xin Liao <liao...@selectdb.com>
AuthorDate: Tue Apr 29 09:51:59 2025 +0800

    [opt](load) add load faq (#2333)
---
 docs/faq/load-faq.md                               | 140 +++++++++++++++++++++
 docs/faq/routineload-faq.md                        |  55 --------
 .../faq/{routineload-faq.md => load-faq.md}        |  97 +++++++++++++-
 .../faq/{routineload-faq.md => load-faq.md}        |  97 +++++++++++++-
 .../faq/{routineload-faq.md => load-faq.md}        |  97 +++++++++++++-
 sidebars.json                                      |   2 +-
 versioned_docs/version-2.1/faq/load-faq.md         | 140 +++++++++++++++++++++
 versioned_docs/version-2.1/faq/routineload-faq.md  |  55 --------
 versioned_docs/version-3.0/faq/load-faq.md         | 140 +++++++++++++++++++++
 versioned_docs/version-3.0/faq/routineload-faq.md  |  55 --------
 versioned_sidebars/version-2.1-sidebars.json       |   2 +-
 versioned_sidebars/version-3.0-sidebars.json       |   2 +-
 12 files changed, 696 insertions(+), 186 deletions(-)

diff --git a/docs/faq/load-faq.md b/docs/faq/load-faq.md
new file mode 100644
index 00000000000..c9fe8f19ef3
--- /dev/null
+++ b/docs/faq/load-faq.md
@@ -0,0 +1,140 @@
+---
+{
+    "title": "Load FAQ",
+    "language": "en"
+}
+---
+
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+## General Load FAQ
+
+### Error "[DATA_QUALITY_ERROR] Encountered unqualified data"
+**Problem Description**: Data quality error during loading.
+
+**Solution**:
+- Stream Load and Insert Into operations will return an error URL, while for Broker Load you can check the error URL through the `Show Load` command.
+- Use a browser or curl command to access the error URL to view the specific data quality error reasons.
+- Use the strict_mode and max_filter_ratio parameters to control the acceptable error rate, as in the sketch below.
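+
+As an illustration, the error detail can be fetched and a bounded error rate tolerated like this (a sketch; the error URL comes from the load result, and host, database, and table names are placeholders):
+
+```shell
+# Inspect the rows that failed the quality checks
+curl "http://<be_host>:<be_http_port>/api/_load_error_log?file=..."
+
+# Tolerate up to 10% filtered rows in a Stream Load
+curl --location-trusted -u root:"" \
+    -H "max_filter_ratio:0.1" \
+    -T data.csv \
+    http://127.0.0.1:8030/api/db/tbl/_stream_load
+```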
+
+### Error "[E-235] Failed to init rowset builder"
+**Problem Description**: Error -235 occurs when the load frequency is too high and data hasn't been compacted in time, exceeding version limits.
+
+**Solution**:
+- Increase the batch size of data loading and reduce loading frequency.
+- Increase the `max_tablet_version_num` parameter in `be.conf`; it is recommended not to exceed 5000 (see the sketch below).
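+
+A minimal `be.conf` sketch (the value is illustrative, not a recommendation; restart the BE for it to take effect):
+
+```shell
+# be.conf
+# allow more uncompacted versions per tablet before loads fail with -235
+max_tablet_version_num = 2000
+```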
+
+### Error "[E-238] Too many segments in rowset"
+**Problem Description**: Error -238 occurs when the number of segments under a single rowset exceeds the limit.
+
+**Common Causes**:
+- The bucket number configured during table creation is too small.
+- Data skew occurs; consider using more balanced bucket keys (see the sketch below).
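+
+A hypothetical table definition that spreads data across more buckets with a higher-cardinality bucket key (all names and values are placeholders, issued via the MySQL client):
+
+```shell
+mysql -h fe_host -P 9030 -uroot -e "
+CREATE TABLE db.t (
+    user_id BIGINT,
+    dt DATE,
+    v INT
+)
+DUPLICATE KEY(user_id)
+DISTRIBUTED BY HASH(user_id) BUCKETS 32
+PROPERTIES ('replication_num' = '3');"
+```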
+
+### Error "Transaction commit successfully, BUT data will be visible later"
+**Problem Description**: Data load is successful but temporarily not visible.
+
+**Cause**: Usually due to transaction publish delay caused by system resource pressure.
+
+### Error "Failed to commit kv txn [...] Transaction exceeds byte limit"
+**Problem Description**: In shared-storage mode, too many partitions and tablets are involved in a single load, exceeding the transaction size limit.
+
+**Solution**:
+- Load data by partition in batches to reduce the number of partitions involved in a single load.
+- Optimize table structure to reduce the number of partitions and tablets.
+
+### Extra "\r" in the last column of CSV file
+**Problem Description**: Usually caused by Windows line endings.
+
+**Solution**:
+Specify the correct line delimiter: `-H "line_delimiter:\r\n"`
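+
+For example, as Stream Load headers (host and table are placeholders):
+
+```shell
+curl --location-trusted -u root:"" \
+    -H "line_delimiter:\r\n" \
+    -H "column_separator:," \
+    -T data.csv \
+    http://127.0.0.1:8030/api/db/tbl/_stream_load
+```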
+
+### CSV data with quotes imported as null
+**Problem Description**: CSV data with quotes becomes null after import.
+
+**Solution**:
+Use the `trim_double_quotes` parameter to remove double quotes around fields.
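+
+For example, as a Stream Load header (host and table are placeholders):
+
+```shell
+curl --location-trusted -u root:"" \
+    -H "trim_double_quotes:true" \
+    -T data.csv \
+    http://127.0.0.1:8030/api/db/tbl/_stream_load
+```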
+
+## Stream Load
+
+### Reasons for Slow Loading
+- Bottlenecks in CPU, IO, memory, or network card resources.
+- Slow network between the client machine and the BE machines; ping latency from the client to the BEs gives an initial diagnosis.
+- Webserver thread count bottleneck: too many concurrent Stream Loads on a single BE (exceeding the `webserver_num_workers` configuration in `be.conf`) may exhaust the webserver threads.
+- Memtable flush thread count bottleneck: check the BE metric `doris_be_flush_thread_pool_queue_size` to see whether queuing is severe. This can be relieved by increasing the `flush_thread_num_per_store` parameter in `be.conf`, as sketched below.
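+
+A `be.conf` sketch for the two thread-pool bottlenecks above (values are illustrative, not recommendations; restart the BE after changing them):
+
+```shell
+# be.conf
+# HTTP worker threads that serve Stream Load requests
+webserver_num_workers = 128
+# memtable flush threads per data directory
+flush_thread_num_per_store = 6
+```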
+
+### Handling Special Characters in Column Names
+When column names contain special characters, use single quotes with backticks to specify the columns parameter:
+```shell
+curl --location-trusted -u root:"" \
+    -H 'columns:`@coltime`,colint,colvar' \
+    -T a.csv \
+    -H "column_separator:," \
+    http://127.0.0.1:8030/api/db/loadtest/_stream_load
+```
+
+## Routine Load 
+
+### Major Bug Fixes
+
+| Issue Description | Trigger Conditions | Impact Scope | Temporary Solution | Affected Versions | Fixed Versions | Fix PR |
+|------------------|-------------------|--------------|-------------------|------------------|----------------|---------|
+| When at least one job times out while connecting to Kafka, it affects the import of other jobs, slowing down global Routine Load imports. | At least one job times out while connecting to Kafka. | Shared-nothing and shared-storage | Stop or manually pause the job to resolve the issue. | <2.1.9 <3.0.5 | 2.1.9 3.0.5 | [#47530](https://github.com/apache/doris/pull/47530) |
+| User data may be lost after restarting the FE Master. | The job's offset is set to OFFSET_END, and the FE is restarted. | Shared-storage | Change the consumption mode to OFFSET_BEGINNING. | 3.0.2-3.0.4 | 3.0.5 | [#46149](https://github.com/apache/doris/pull/46149) |
+| A large number of small transactions are generated during import, causing compaction to fail and resulting in continuous -235 errors. | Doris consumes data too quickly, or Kafka data flow is in small batches. | Shared-nothing and shared-storage | Pause the Routine Load job and execute the following command: `ALTER ROUTINE LOAD FOR jobname FROM kafka ("property.enable.partition.eof" = "false");` | <2.1.8 <3.0.4 | 2.1.8 3.0.4 | [#45528](https://github.com/apache/doris/pull/45528), [#4494 [...]
+| Kafka third-party library destructor hangs, causing data consumption to fail. | Kafka topic deletion (possibly other conditions). | Shared-nothing and shared-storage | Restart all BE nodes. | <2.1.8 <3.0.4 | 2.1.8 3.0.4 | [#44913](https://github.com/apache/doris/pull/44913) |
+| Routine Load scheduling hangs. | Timeout occurs when FE aborts a transaction in Meta Service. | Shared-storage | Restart the FE node. | <3.0.2 | 3.0.2 | [#41267](https://github.com/apache/doris/pull/41267) |
+| Routine Load restart issue. | Restarting BE nodes. | Shared-nothing and shared-storage | Manually resume the job. | <2.1.7 <3.0.2 | 2.1.7 3.0.2 | [#3727](https://github.com/selectdb/selectdb-core/pull/3727) |
+
+### Default Configuration Optimizations
+
+| Optimization Content | Applied Versions | Corresponding PR |
+|---------------------|------------------|------------------|
+| Increased the timeout duration for Routine Load. | 2.1.7 3.0.3 | [#42042](https://github.com/apache/doris/pull/42042), [#40818](https://github.com/apache/doris/pull/40818) |
+| Adjusted the default value of `max_batch_interval`. | 2.1.8 3.0.3 | [#42491](https://github.com/apache/doris/pull/42491) |
+| Removed the restriction on `max_batch_interval`. | 2.1.5 3.0.0 | [#29071](https://github.com/apache/doris/pull/29071) |
+| Adjusted the default values of `max_batch_rows` and `max_batch_size`. | 2.1.5 3.0.0 | [#36632](https://github.com/apache/doris/pull/36632) |
+
+### Observability Optimizations
+
+| Optimization Content | Applied Versions | Corresponding PR |
+|---------------------|------------------|------------------|
+| Added observability-related metrics. | 3.0.5 | [#48209](https://github.com/apache/doris/pull/48209), [#48171](https://github.com/apache/doris/pull/48171), [#48963](https://github.com/apache/doris/pull/48963) |
+
+### Error "failed to get latest offset"
+**Problem Description**: Routine Load cannot get the latest Kafka offset.
+
+**Common Causes**:
+- Usually a network connectivity issue with Kafka; verify by pinging or using telnet against the Kafka domain name (a quick check is sketched below).
+- A timeout caused by a third-party library bug, with the error: `java.util.concurrent.TimeoutException: Waited X seconds`
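+
+A quick connectivity check from the BE host (broker address and port are placeholders):
+
+```shell
+ping kafka-broker.example.com
+telnet kafka-broker.example.com 9092
+```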
+
+### Error "failed to get partition meta: Local:'Broker transport failure"
+**Problem Description**: Routine Load cannot get Kafka Topic Partition Meta.
+
+**Common Causes**:
+- Usually a network connectivity issue with Kafka; verify by pinging or using telnet against the Kafka domain name.
+- If domain names are used, try configuring a domain name mapping in `/etc/hosts`.
+
+### Error "Broker: Offset out of range"
+**Problem Description**: The consumed offset doesn't exist in Kafka, possibly because it has been cleaned up by Kafka.
+
+**Solution**:
+- Specify a new offset for consumption, for example, set the offset to OFFSET_BEGINNING (a sketch follows below).
+- Set appropriate Kafka log cleanup parameters based on the load speed: log.retention.hours, log.retention.bytes, etc.
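+
+A sketch of re-pointing a paused job at the earliest available offsets via the MySQL client (job name and partition list are placeholders; consult the Routine Load reference for your version):
+
+```shell
+mysql -h fe_host -P 9030 -uroot -e \
+  "ALTER ROUTINE LOAD FOR db.jobname FROM KAFKA (
+     \"kafka_partitions\" = \"0,1,2\",
+     \"kafka_offsets\" = \"OFFSET_BEGINNING,OFFSET_BEGINNING,OFFSET_BEGINNING\"
+   );"
+```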
\ No newline at end of file
diff --git a/docs/faq/routineload-faq.md b/docs/faq/routineload-faq.md
deleted file mode 100644
index 3960f67bfe8..00000000000
--- a/docs/faq/routineload-faq.md
+++ /dev/null
@@ -1,55 +0,0 @@
----
-{
-    "title": "Routine Load FAQ",
-    "language": "en"
-}
----
-
-<!--
-Licensed to the Apache Software Foundation (ASF) under one
-or more contributor license agreements.  See the NOTICE file
-distributed with this work for additional information
-regarding copyright ownership.  The ASF licenses this file
-to you under the Apache License, Version 2.0 (the
-"License"); you may not use this file except in compliance
-with the License.  You may obtain a copy of the License at
-
-  http://www.apache.org/licenses/LICENSE-2.0
-
-Unless required by applicable law or agreed to in writing,
-software distributed under the License is distributed on an
-"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
-KIND, either express or implied.  See the License for the
-specific language governing permissions and limitations
-under the License.
--->
-
-# Routine Load FAQ
-
-This document records common issues, bug fixes, and optimization improvements related to Routine Load in Doris. It will be updated periodically.
-
-## Major Bug Fixes
-
-| Issue Description | Trigger Conditions | Impact Scope | Temporary Solution | Affected Versions | Fixed Versions | Fix PR |
-| --- | --- | --- | --- | --- | --- | --- |
-| When at least one job times out while connecting to Kafka, it affects the import of other jobs, slowing down global Routine Load imports. | At least one job times out while connecting to Kafka. | Shared-nothing and shared-storage | Stop or manually pause the job to resolve the issue. | <2.1.9 <3.0.5 | 2.1.9 3.0.5 | [#47530](https://github.com/apache/doris/pull/47530) |
-| User data may be lost after restarting the FE Master. | The job's offset is set to OFFSET_END, and the FE is restarted. | Shared-storage | Change the consumption mode to OFFSET_BEGINNING. | 3.0.2-3.0.4 | 3.0.5 | [#46149](https://github.com/apache/doris/pull/46149) |
-| A large number of small transactions are generated during import, causing compaction to fail and resulting in continuous -235 errors. | Doris consumes data too quickly, or Kafka data flow is in small batches. | Shared-nothing and shared-storage | Pause the Routine Load job and execute the following command: `ALTER ROUTINE LOAD FOR jobname FROM kafka ("property.enable.partition.eof" = "false");` | <2.1.8 <3.0.4 | 2.1.8 3.0.4 | [#45528](https://github.com/apache/doris/pull/45528), [ [...]
-| Kafka third-party library destructor hangs, causing data consumption to fail. | Kafka topic deletion (possibly other conditions). | Shared-nothing and shared-storage | Restart all BE nodes. | <2.1.8 <3.0.4 | 2.1.8 3.0.4 | [#44913](https://github.com/apache/doris/pull/44913) |
-| Routine Load scheduling hangs. | Timeout occurs when FE aborts a transaction in Meta Service. | Shared-storage | Restart the FE node. | <3.0.2 | 3.0.2 | [#41267](https://github.com/apache/doris/pull/41267) |
-| Routine Load restart issue. | Restarting BE nodes. | Shared-nothing and shared-storage | Manually resume the job. | <2.1.7 <3.0.2 | 2.1.7 3.0.2 | [#3727](https://github.com/selectdb/selectdb-core/pull/3727) |
-
-## Default Configuration Optimizations
-
-| Optimization Content | Applied Versions | Corresponding PR |
-| --- | --- | --- |
-| Increased the timeout duration for Routine Load. | 2.1.7 3.0.3 | [#42042](https://github.com/apache/doris/pull/42042), [#40818](https://github.com/apache/doris/pull/40818) |
-| Adjusted the default value of `max_batch_interval`. | 2.1.8 3.0.3 | [#42491](https://github.com/apache/doris/pull/42491) |
-| Removed the restriction on `max_batch_interval`. | 2.1.5 3.0.0 | [#29071](https://github.com/apache/doris/pull/29071) |
-| Adjusted the default values of `max_batch_rows` and `max_batch_size`. | 2.1.5 3.0.0 | [#36632](https://github.com/apache/doris/pull/36632) |
-
-## Observability Optimizations
-
-| Optimization Content | Applied Versions | Corresponding PR |
-| --- | --- | --- |
-| Added observability-related metrics. | 3.0.5 | [#48209](https://github.com/apache/doris/pull/48209), [#48171](https://github.com/apache/doris/pull/48171), [#48963](https://github.com/apache/doris/pull/48963) |
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/faq/routineload-faq.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/faq/load-faq.md
similarity index 54%
rename from i18n/zh-CN/docusaurus-plugin-content-docs/current/faq/routineload-faq.md
rename to i18n/zh-CN/docusaurus-plugin-content-docs/current/faq/load-faq.md
index 84891e72d74..35d5ffbc946 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/faq/routineload-faq.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/faq/load-faq.md
@@ -1,6 +1,6 @@
 ---
 {
-    "title": "Routine Load FAQ",
+    "title": "Load FAQ",
     "language": "zh-CN"
 }
 ---
@@ -24,11 +24,75 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-# Routine Load FAQ
+## General Load FAQ
 
-This document records common issues, bug fixes, and optimization improvements related to Routine Load in Doris, and will be updated from time to time.
+### Error "[DATA_QUALITY_ERROR] Encountered unqualified data"
+**Problem Description**: The load reports a data quality error.
 
-## Major Bug Fixes
+**Solution**:
+- Stream Load and Insert Into return an error URL in their results; for Broker Load, check the corresponding error URL through the `Show Load` command.
+- Access the error URL with a browser or the curl command to view the specific data quality error reasons.
+- Use the strict_mode and max_filter_ratio parameters to control the tolerable error rate.
+
+### Error "[E-235] Failed to init rowset builder"
+**Problem Description**: Error -235 occurs because the load frequency is too high and data is not compacted in time, exceeding the version limit.
+
+**Solution**:
+- Increase the amount of data loaded per batch and reduce the load frequency.
+- Increase the `max_tablet_version_num` parameter in `be.conf`; it is recommended not to exceed 5000.
+
+### Error "[E-238] Too many segments in rowset"
+**Problem Description**: Error -238 occurs because the number of segments under a single rowset exceeds the limit.
+
+**Common Causes**:
+- The bucket number configured at table creation is too small.
+- Data skew occurs; consider using more balanced bucket keys.
+
+### Error "Transaction commit successfully, BUT data will be visible later"
+**Problem Description**: The data load succeeds but the data is temporarily not visible.
+
+**Cause**: Usually due to transaction publish delay caused by system resource pressure.
+
+### Error "Failed to commit kv txn [...] Transaction exceeds byte limit"
+**Problem Description**: In shared-storage mode, a single load involves too many partitions and tablets, exceeding the transaction size limit.
+
+**Solution**:
+- Load data by partition in batches to reduce the number of partitions involved in a single load.
+- Optimize the table structure to reduce the number of partitions and tablets.
+
+### Extra "\r" in the last column of a CSV file
+**Problem Description**: Usually caused by Windows line endings.
+
+**Solution**:
+Specify the correct line delimiter: `-H "line_delimiter:\r\n"`
+
+### CSV data with quotes imported as null
+**Problem Description**: Quoted CSV data becomes null after import.
+
+**Solution**:
+Use the `trim_double_quotes` parameter to strip the double quotes around fields.
+
+## Stream Load
+
+### Reasons for Slow Loading
+- Bottlenecks in CPU, IO, memory, or network card resources.
+- Slow network between the client machine and the BE machines; ping latency from the client to the BEs gives an initial diagnosis.
+- Webserver thread count bottleneck: too many concurrent Stream Loads on a single BE (exceeding the `webserver_num_workers` configuration in `be.conf`) may exhaust the webserver threads.
+- Memtable flush thread count bottleneck: check the BE metric `doris_be_flush_thread_pool_queue_size` to see whether queuing is severe. This can be relieved by increasing the `flush_thread_num_per_store` parameter in `be.conf`.
+
+### Handling Special Characters in Column Names
+When column names contain special characters, use single quotes with backticks to specify the columns parameter:
+```shell
+curl --location-trusted -u root:"" \
+    -H 'columns:`@coltime`,colint,colvar' \
+    -T a.csv \
+    -H "column_separator:," \
+    http://127.0.0.1:8030/api/db/loadtest/_stream_load
+```
+
+## Routine Load
+
+### Major Bug Fixes
 
 | Issue Description | Trigger Conditions | Impact Scope | Temporary Solution | Affected Versions | Fixed Versions | Fix PR |
 | --- | --- | --- | --- | --- | --- | --- |
@@ -39,7 +103,7 @@ under the License.
 | Routine Load scheduling hangs. | Timeout occurs when FE aborts a transaction in Meta Service. | Shared-storage | Restart the FE node. | <3.0.2 | 3.0.2 | [#41267](https://github.com/apache/doris/pull/41267) |
 | Routine Load restart issue. | Restarting BE nodes. | Shared-nothing and shared-storage | Manually resume the job. | <2.1.7 <3.0.2 | 2.1.7 3.0.2 | [#3727](https://github.com/selectdb/selectdb-core/pull/3727) |
 
-## Default Configuration Optimizations
+### Default Configuration Optimizations
 
 | Optimization Content | Applied Versions | Corresponding PR |
 | --- | --- | --- |
@@ -48,8 +112,29 @@ under the License.
 | Removed the restriction on max_batch_interval. | 2.1.5 3.0.0 | [#29071](https://github.com/apache/doris/pull/29071) |
 | Adjusted the default values of max_batch_rows and max_batch_size. | 2.1.5 3.0.0 | [#36632](https://github.com/apache/doris/pull/36632) |
 
-## Observability Optimizations
+### Observability Optimizations
 
 | Optimization Content | Applied Versions | Corresponding PR |
 | --- | --- | --- |
 | Added observability-related metrics. | 3.0.5 | [#48209](https://github.com/apache/doris/pull/48209), [#48171](https://github.com/apache/doris/pull/48171), [#48963](https://github.com/apache/doris/pull/48963) |
+
+### Error "failed to get latest offset"
+**Problem Description**: Routine Load cannot get the latest Kafka offset.
+
+**Common Causes**:
+- Usually a network connectivity issue with Kafka; verify by pinging or using telnet against the Kafka domain name.
+- A timeout caused by a third-party library bug, with the error: `java.util.concurrent.TimeoutException: Waited X seconds`
+
+### Error "failed to get partition meta: Local:'Broker transport failure"
+**Problem Description**: Routine Load cannot get the Partition Meta of the Kafka Topic.
+
+**Common Causes**:
+- Usually a network connectivity issue with Kafka; verify by pinging or using telnet against the Kafka domain name.
+- If domain names are used, configure a domain name mapping in `/etc/hosts`.
+
+### Error "Broker: Offset out of range"
+**Problem Description**: The consumed offset does not exist in Kafka, possibly because it has been cleaned up by Kafka.
+
+**Solution**:
+- Specify a new offset for consumption, for example, set the offset to OFFSET_BEGINNING.
+- Set appropriate Kafka log cleanup parameters based on the load speed: log.retention.hours, log.retention.bytes, etc.
\ No newline at end of file
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/faq/routineload-faq.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/faq/load-faq.md
similarity index 54%
rename from i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/faq/routineload-faq.md
rename to i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/faq/load-faq.md
index 84891e72d74..35d5ffbc946 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/faq/routineload-faq.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/faq/load-faq.md
@@ -1,6 +1,6 @@
 ---
 {
-    "title": "Routine Load FAQ",
+    "title": "Load FAQ",
     "language": "zh-CN"
 }
 ---
@@ -24,11 +24,75 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-# Routine Load FAQ
+## General Load FAQ
 
-This document records common issues, bug fixes, and optimization improvements related to Routine Load in Doris, and will be updated from time to time.
+### Error "[DATA_QUALITY_ERROR] Encountered unqualified data"
+**Problem Description**: The load reports a data quality error.
 
-## Major Bug Fixes
+**Solution**:
+- Stream Load and Insert Into return an error URL in their results; for Broker Load, check the corresponding error URL through the `Show Load` command.
+- Access the error URL with a browser or the curl command to view the specific data quality error reasons.
+- Use the strict_mode and max_filter_ratio parameters to control the tolerable error rate.
+
+### Error "[E-235] Failed to init rowset builder"
+**Problem Description**: Error -235 occurs because the load frequency is too high and data is not compacted in time, exceeding the version limit.
+
+**Solution**:
+- Increase the amount of data loaded per batch and reduce the load frequency.
+- Increase the `max_tablet_version_num` parameter in `be.conf`; it is recommended not to exceed 5000.
+
+### Error "[E-238] Too many segments in rowset"
+**Problem Description**: Error -238 occurs because the number of segments under a single rowset exceeds the limit.
+
+**Common Causes**:
+- The bucket number configured at table creation is too small.
+- Data skew occurs; consider using more balanced bucket keys.
+
+### Error "Transaction commit successfully, BUT data will be visible later"
+**Problem Description**: The data load succeeds but the data is temporarily not visible.
+
+**Cause**: Usually due to transaction publish delay caused by system resource pressure.
+
+### Error "Failed to commit kv txn [...] Transaction exceeds byte limit"
+**Problem Description**: In shared-storage mode, a single load involves too many partitions and tablets, exceeding the transaction size limit.
+
+**Solution**:
+- Load data by partition in batches to reduce the number of partitions involved in a single load.
+- Optimize the table structure to reduce the number of partitions and tablets.
+
+### Extra "\r" in the last column of a CSV file
+**Problem Description**: Usually caused by Windows line endings.
+
+**Solution**:
+Specify the correct line delimiter: `-H "line_delimiter:\r\n"`
+
+### CSV data with quotes imported as null
+**Problem Description**: Quoted CSV data becomes null after import.
+
+**Solution**:
+Use the `trim_double_quotes` parameter to strip the double quotes around fields.
+
+## Stream Load
+
+### Reasons for Slow Loading
+- Bottlenecks in CPU, IO, memory, or network card resources.
+- Slow network between the client machine and the BE machines; ping latency from the client to the BEs gives an initial diagnosis.
+- Webserver thread count bottleneck: too many concurrent Stream Loads on a single BE (exceeding the `webserver_num_workers` configuration in `be.conf`) may exhaust the webserver threads.
+- Memtable flush thread count bottleneck: check the BE metric `doris_be_flush_thread_pool_queue_size` to see whether queuing is severe. This can be relieved by increasing the `flush_thread_num_per_store` parameter in `be.conf`.
+
+### Handling Special Characters in Column Names
+When column names contain special characters, use single quotes with backticks to specify the columns parameter:
+```shell
+curl --location-trusted -u root:"" \
+    -H 'columns:`@coltime`,colint,colvar' \
+    -T a.csv \
+    -H "column_separator:," \
+    http://127.0.0.1:8030/api/db/loadtest/_stream_load
+```
+
+## Routine Load
+
+### Major Bug Fixes
 
 | Issue Description | Trigger Conditions | Impact Scope | Temporary Solution | Affected Versions | Fixed Versions | Fix PR |
 | --- | --- | --- | --- | --- | --- | --- |
@@ -39,7 +103,7 @@ under the License.
 | Routine Load scheduling hangs. | Timeout occurs when FE aborts a transaction in Meta Service. | Shared-storage | Restart the FE node. | <3.0.2 | 3.0.2 | [#41267](https://github.com/apache/doris/pull/41267) |
 | Routine Load restart issue. | Restarting BE nodes. | Shared-nothing and shared-storage | Manually resume the job. | <2.1.7 <3.0.2 | 2.1.7 3.0.2 | [#3727](https://github.com/selectdb/selectdb-core/pull/3727) |
 
-## Default Configuration Optimizations
+### Default Configuration Optimizations
 
 | Optimization Content | Applied Versions | Corresponding PR |
 | --- | --- | --- |
@@ -48,8 +112,29 @@ under the License.
 | Removed the restriction on max_batch_interval. | 2.1.5 3.0.0 | [#29071](https://github.com/apache/doris/pull/29071) |
 | Adjusted the default values of max_batch_rows and max_batch_size. | 2.1.5 3.0.0 | [#36632](https://github.com/apache/doris/pull/36632) |
 
-## Observability Optimizations
+### Observability Optimizations
 
 | Optimization Content | Applied Versions | Corresponding PR |
 | --- | --- | --- |
 | Added observability-related metrics. | 3.0.5 | [#48209](https://github.com/apache/doris/pull/48209), [#48171](https://github.com/apache/doris/pull/48171), [#48963](https://github.com/apache/doris/pull/48963) |
+
+### Error "failed to get latest offset"
+**Problem Description**: Routine Load cannot get the latest Kafka offset.
+
+**Common Causes**:
+- Usually a network connectivity issue with Kafka; verify by pinging or using telnet against the Kafka domain name.
+- A timeout caused by a third-party library bug, with the error: `java.util.concurrent.TimeoutException: Waited X seconds`
+
+### Error "failed to get partition meta: Local:'Broker transport failure"
+**Problem Description**: Routine Load cannot get the Partition Meta of the Kafka Topic.
+
+**Common Causes**:
+- Usually a network connectivity issue with Kafka; verify by pinging or using telnet against the Kafka domain name.
+- If domain names are used, configure a domain name mapping in `/etc/hosts`.
+
+### Error "Broker: Offset out of range"
+**Problem Description**: The consumed offset does not exist in Kafka, possibly because it has been cleaned up by Kafka.
+
+**Solution**:
+- Specify a new offset for consumption, for example, set the offset to OFFSET_BEGINNING.
+- Set appropriate Kafka log cleanup parameters based on the load speed: log.retention.hours, log.retention.bytes, etc.
\ No newline at end of file
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/faq/routineload-faq.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/faq/load-faq.md
similarity index 54%
rename from i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/faq/routineload-faq.md
rename to i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/faq/load-faq.md
index 84891e72d74..35d5ffbc946 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/faq/routineload-faq.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/faq/load-faq.md
@@ -1,6 +1,6 @@
 ---
 {
-    "title": "Routine Load FAQ",
+    "title": "Load FAQ",
     "language": "zh-CN"
 }
 ---
@@ -24,11 +24,75 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-# Routine Load FAQ
+## General Load FAQ
 
-This document records common issues, bug fixes, and optimization improvements related to Routine Load in Doris, and will be updated from time to time.
+### Error "[DATA_QUALITY_ERROR] Encountered unqualified data"
+**Problem Description**: The load reports a data quality error.
 
-## Major Bug Fixes
+**Solution**:
+- Stream Load and Insert Into return an error URL in their results; for Broker Load, check the corresponding error URL through the `Show Load` command.
+- Access the error URL with a browser or the curl command to view the specific data quality error reasons.
+- Use the strict_mode and max_filter_ratio parameters to control the tolerable error rate.
+
+### Error "[E-235] Failed to init rowset builder"
+**Problem Description**: Error -235 occurs because the load frequency is too high and data is not compacted in time, exceeding the version limit.
+
+**Solution**:
+- Increase the amount of data loaded per batch and reduce the load frequency.
+- Increase the `max_tablet_version_num` parameter in `be.conf`; it is recommended not to exceed 5000.
+
+### Error "[E-238] Too many segments in rowset"
+**Problem Description**: Error -238 occurs because the number of segments under a single rowset exceeds the limit.
+
+**Common Causes**:
+- The bucket number configured at table creation is too small.
+- Data skew occurs; consider using more balanced bucket keys.
+
+### Error "Transaction commit successfully, BUT data will be visible later"
+**Problem Description**: The data load succeeds but the data is temporarily not visible.
+
+**Cause**: Usually due to transaction publish delay caused by system resource pressure.
+
+### Error "Failed to commit kv txn [...] Transaction exceeds byte limit"
+**Problem Description**: In shared-storage mode, a single load involves too many partitions and tablets, exceeding the transaction size limit.
+
+**Solution**:
+- Load data by partition in batches to reduce the number of partitions involved in a single load.
+- Optimize the table structure to reduce the number of partitions and tablets.
+
+### Extra "\r" in the last column of a CSV file
+**Problem Description**: Usually caused by Windows line endings.
+
+**Solution**:
+Specify the correct line delimiter: `-H "line_delimiter:\r\n"`
+
+### CSV data with quotes imported as null
+**Problem Description**: Quoted CSV data becomes null after import.
+
+**Solution**:
+Use the `trim_double_quotes` parameter to strip the double quotes around fields.
+
+## Stream Load
+
+### Reasons for Slow Loading
+- Bottlenecks in CPU, IO, memory, or network card resources.
+- Slow network between the client machine and the BE machines; ping latency from the client to the BEs gives an initial diagnosis.
+- Webserver thread count bottleneck: too many concurrent Stream Loads on a single BE (exceeding the `webserver_num_workers` configuration in `be.conf`) may exhaust the webserver threads.
+- Memtable flush thread count bottleneck: check the BE metric `doris_be_flush_thread_pool_queue_size` to see whether queuing is severe. This can be relieved by increasing the `flush_thread_num_per_store` parameter in `be.conf`.
+
+### Handling Special Characters in Column Names
+When column names contain special characters, use single quotes with backticks to specify the columns parameter:
+```shell
+curl --location-trusted -u root:"" \
+    -H 'columns:`@coltime`,colint,colvar' \
+    -T a.csv \
+    -H "column_separator:," \
+    http://127.0.0.1:8030/api/db/loadtest/_stream_load
+```
+
+## Routine Load
+
+### Major Bug Fixes
 
 | Issue Description | Trigger Conditions | Impact Scope | Temporary Solution | Affected Versions | Fixed Versions | Fix PR |
 | --- | --- | --- | --- | --- | --- | --- |
@@ -39,7 +103,7 @@ under the License.
 | Routine Load scheduling hangs. | Timeout occurs when FE aborts a transaction in Meta Service. | Shared-storage | Restart the FE node. | <3.0.2 | 3.0.2 | [#41267](https://github.com/apache/doris/pull/41267) |
 | Routine Load restart issue. | Restarting BE nodes. | Shared-nothing and shared-storage | Manually resume the job. | <2.1.7 <3.0.2 | 2.1.7 3.0.2 | [#3727](https://github.com/selectdb/selectdb-core/pull/3727) |
 
-## Default Configuration Optimizations
+### Default Configuration Optimizations
 
 | Optimization Content | Applied Versions | Corresponding PR |
 | --- | --- | --- |
@@ -48,8 +112,29 @@ under the License.
 | Removed the restriction on max_batch_interval. | 2.1.5 3.0.0 | [#29071](https://github.com/apache/doris/pull/29071) |
 | Adjusted the default values of max_batch_rows and max_batch_size. | 2.1.5 3.0.0 | [#36632](https://github.com/apache/doris/pull/36632) |
 
-## Observability Optimizations
+### Observability Optimizations
 
 | Optimization Content | Applied Versions | Corresponding PR |
 | --- | --- | --- |
 | Added observability-related metrics. | 3.0.5 | [#48209](https://github.com/apache/doris/pull/48209), [#48171](https://github.com/apache/doris/pull/48171), [#48963](https://github.com/apache/doris/pull/48963) |
+
+### Error "failed to get latest offset"
+**Problem Description**: Routine Load cannot get the latest Kafka offset.
+
+**Common Causes**:
+- Usually a network connectivity issue with Kafka; verify by pinging or using telnet against the Kafka domain name.
+- A timeout caused by a third-party library bug, with the error: `java.util.concurrent.TimeoutException: Waited X seconds`
+
+### Error "failed to get partition meta: Local:'Broker transport failure"
+**Problem Description**: Routine Load cannot get the Partition Meta of the Kafka Topic.
+
+**Common Causes**:
+- Usually a network connectivity issue with Kafka; verify by pinging or using telnet against the Kafka domain name.
+- If domain names are used, configure a domain name mapping in `/etc/hosts`.
+
+### Error "Broker: Offset out of range"
+**Problem Description**: The consumed offset does not exist in Kafka, possibly because it has been cleaned up by Kafka.
+
+**Solution**:
+- Specify a new offset for consumption, for example, set the offset to OFFSET_BEGINNING.
+- Set appropriate Kafka log cleanup parameters based on the load speed: log.retention.hours, log.retention.bytes, etc.
\ No newline at end of file
diff --git a/sidebars.json b/sidebars.json
index 87a9b6b274a..6ceb86500ed 100644
--- a/sidebars.json
+++ b/sidebars.json
@@ -828,7 +828,7 @@
                 "faq/lakehouse-faq",
                 "faq/bi-faq",
                 "faq/correctness-faq",
-                "faq/routineload-faq"
+                "faq/load-faq"
             ]
         },
         {
diff --git a/versioned_docs/version-2.1/faq/load-faq.md b/versioned_docs/version-2.1/faq/load-faq.md
new file mode 100644
index 00000000000..c9fe8f19ef3
--- /dev/null
+++ b/versioned_docs/version-2.1/faq/load-faq.md
@@ -0,0 +1,140 @@
+---
+{
+    "title": "Load FAQ",
+    "language": "en"
+}
+---
+
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+## General Load FAQ
+
+### Error "[DATA_QUALITY_ERROR] Encountered unqualified data"
+**Problem Description**: Data quality error during loading.
+
+**Solution**:
+- Stream Load and Insert Into operations will return an error URL, while for Broker Load you can check the error URL through the `Show Load` command.
+- Use a browser or curl command to access the error URL to view the specific data quality error reasons.
+- Use the strict_mode and max_filter_ratio parameters to control the acceptable error rate.
+
+### Error "[E-235] Failed to init rowset builder"
+**Problem Description**: Error -235 occurs when the load frequency is too high and data hasn't been compacted in time, exceeding version limits.
+
+**Solution**:
+- Increase the batch size of data loading and reduce loading frequency.
+- Increase the `max_tablet_version_num` parameter in `be.conf`; it is recommended not to exceed 5000.
+
+### Error "[E-238] Too many segments in rowset"
+**Problem Description**: Error -238 occurs when the number of segments under a single rowset exceeds the limit.
+
+**Common Causes**:
+- The bucket number configured during table creation is too small.
+- Data skew occurs; consider using more balanced bucket keys.
+
+### Error "Transaction commit successfully, BUT data will be visible later"
+**Problem Description**: Data load is successful but temporarily not visible.
+
+**Cause**: Usually due to transaction publish delay caused by system resource pressure.
+
+### Error "Failed to commit kv txn [...] Transaction exceeds byte limit"
+**Problem Description**: In shared-storage mode, too many partitions and tablets are involved in a single load, exceeding the transaction size limit.
+
+**Solution**:
+- Load data by partition in batches to reduce the number of partitions involved in a single load.
+- Optimize table structure to reduce the number of partitions and tablets.
+
+### Extra "\r" in the last column of CSV file
+**Problem Description**: Usually caused by Windows line endings.
+
+**Solution**:
+Specify the correct line delimiter: `-H "line_delimiter:\r\n"`
+
+### CSV data with quotes imported as null
+**Problem Description**: CSV data with quotes becomes null after import.
+
+**Solution**:
+Use the `trim_double_quotes` parameter to remove double quotes around fields.
+
+## Stream Load
+
+### Reasons for Slow Loading
+- Bottlenecks in CPU, IO, memory, or network card resources.
+- Slow network between the client machine and the BE machines; ping latency from the client to the BEs gives an initial diagnosis.
+- Webserver thread count bottleneck: too many concurrent Stream Loads on a single BE (exceeding the `webserver_num_workers` configuration in `be.conf`) may exhaust the webserver threads.
+- Memtable flush thread count bottleneck: check the BE metric `doris_be_flush_thread_pool_queue_size` to see whether queuing is severe. This can be relieved by increasing the `flush_thread_num_per_store` parameter in `be.conf`.
+
+### Handling Special Characters in Column Names
+When column names contain special characters, use single quotes with backticks to specify the columns parameter:
+```shell
+curl --location-trusted -u root:"" \
+    -H 'columns:`@coltime`,colint,colvar' \
+    -T a.csv \
+    -H "column_separator:," \
+    http://127.0.0.1:8030/api/db/loadtest/_stream_load
+```
+
+## Routine Load 
+
+### Major Bug Fixes
+
+| Issue Description | Trigger Conditions | Impact Scope | Temporary Solution | Affected Versions | Fixed Versions | Fix PR |
+|------------------|-------------------|--------------|-------------------|------------------|----------------|---------|
+| When at least one job times out while connecting to Kafka, it affects the import of other jobs, slowing down global Routine Load imports. | At least one job times out while connecting to Kafka. | Shared-nothing and shared-storage | Stop or manually pause the job to resolve the issue. | <2.1.9 <3.0.5 | 2.1.9 3.0.5 | [#47530](https://github.com/apache/doris/pull/47530) |
+| User data may be lost after restarting the FE Master. | The job's offset is set to OFFSET_END, and the FE is restarted. | Shared-storage | Change the consumption mode to OFFSET_BEGINNING. | 3.0.2-3.0.4 | 3.0.5 | [#46149](https://github.com/apache/doris/pull/46149) |
+| A large number of small transactions are generated during import, causing compaction to fail and resulting in continuous -235 errors. | Doris consumes data too quickly, or Kafka data flow is in small batches. | Shared-nothing and shared-storage | Pause the Routine Load job and execute the following command: `ALTER ROUTINE LOAD FOR jobname FROM kafka ("property.enable.partition.eof" = "false");` | <2.1.8 <3.0.4 | 2.1.8 3.0.4 | [#45528](https://github.com/apache/doris/pull/45528), [#4494 [...]
+| Kafka third-party library destructor hangs, causing data consumption to fail. | Kafka topic deletion (possibly other conditions). | Shared-nothing and shared-storage | Restart all BE nodes. | <2.1.8 <3.0.4 | 2.1.8 3.0.4 | [#44913](https://github.com/apache/doris/pull/44913) |
+| Routine Load scheduling hangs. | Timeout occurs when FE aborts a transaction in Meta Service. | Shared-storage | Restart the FE node. | <3.0.2 | 3.0.2 | [#41267](https://github.com/apache/doris/pull/41267) |
+| Routine Load restart issue. | Restarting BE nodes. | Shared-nothing and shared-storage | Manually resume the job. | <2.1.7 <3.0.2 | 2.1.7 3.0.2 | [#3727](https://github.com/selectdb/selectdb-core/pull/3727) |
+
+### Default Configuration Optimizations
+
+| Optimization Content | Applied Versions | Corresponding PR |
+|---------------------|------------------|------------------|
+| Increased the timeout duration for Routine Load. | 2.1.7 3.0.3 | [#42042](https://github.com/apache/doris/pull/42042), [#40818](https://github.com/apache/doris/pull/40818) |
+| Adjusted the default value of `max_batch_interval`. | 2.1.8 3.0.3 | [#42491](https://github.com/apache/doris/pull/42491) |
+| Removed the restriction on `max_batch_interval`. | 2.1.5 3.0.0 | [#29071](https://github.com/apache/doris/pull/29071) |
+| Adjusted the default values of `max_batch_rows` and `max_batch_size`. | 2.1.5 3.0.0 | [#36632](https://github.com/apache/doris/pull/36632) |
+
+### Observability Optimizations
+
+| Optimization Content | Applied Versions | Corresponding PR |
+|---------------------|------------------|------------------|
+| Added observability-related metrics. | 3.0.5 | [#48209](https://github.com/apache/doris/pull/48209), [#48171](https://github.com/apache/doris/pull/48171), [#48963](https://github.com/apache/doris/pull/48963) |
+
+### Error "failed to get latest offset"
+**Problem Description**: Routine Load cannot get the latest Kafka offset.
+
+**Common Causes**:
+- Usually a network connectivity issue with Kafka; verify by pinging or using telnet against the Kafka domain name.
+- A timeout caused by a third-party library bug, with the error: `java.util.concurrent.TimeoutException: Waited X seconds`
+
+### Error "failed to get partition meta: Local:'Broker transport failure"
+**Problem Description**: Routine Load cannot get Kafka Topic Partition Meta.
+
+**Common Causes**:
+- Usually a network connectivity issue with Kafka; verify by pinging or using telnet against the Kafka domain name.
+- If domain names are used, try configuring a domain name mapping in `/etc/hosts`.
+
+### Error "Broker: Offset out of range"
+**Problem Description**: The consumed offset doesn't exist in Kafka, possibly because it has been cleaned up by Kafka.
+
+**Solution**:
+- Specify a new offset for consumption, for example, set the offset to OFFSET_BEGINNING.
+- Set appropriate Kafka log cleanup parameters based on the load speed: log.retention.hours, log.retention.bytes, etc.
\ No newline at end of file
diff --git a/versioned_docs/version-2.1/faq/routineload-faq.md b/versioned_docs/version-2.1/faq/routineload-faq.md
deleted file mode 100644
index 3960f67bfe8..00000000000
--- a/versioned_docs/version-2.1/faq/routineload-faq.md
+++ /dev/null
@@ -1,55 +0,0 @@
----
-{
-    "title": "Routine Load FAQ",
-    "language": "en"
-}
----
-
-<!--
-Licensed to the Apache Software Foundation (ASF) under one
-or more contributor license agreements.  See the NOTICE file
-distributed with this work for additional information
-regarding copyright ownership.  The ASF licenses this file
-to you under the Apache License, Version 2.0 (the
-"License"); you may not use this file except in compliance
-with the License.  You may obtain a copy of the License at
-
-  http://www.apache.org/licenses/LICENSE-2.0
-
-Unless required by applicable law or agreed to in writing,
-software distributed under the License is distributed on an
-"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
-KIND, either express or implied.  See the License for the
-specific language governing permissions and limitations
-under the License.
--->
-
-# Routine Load FAQ
-
-This document records common issues, bug fixes, and optimization improvements related to Routine Load in Doris. It will be updated periodically.
-
-## Major Bug Fixes
-
-| Issue Description | Trigger Conditions | Impact Scope | Temporary Solution | Affected Versions | Fixed Versions | Fix PR |
-| --- | --- | --- | --- | --- | --- | --- |
-| When at least one job times out while connecting to Kafka, it affects the import of other jobs, slowing down global Routine Load imports. | At least one job times out while connecting to Kafka. | Shared-nothing and shared-storage | Stop or manually pause the job to resolve the issue. | <2.1.9 <3.0.5 | 2.1.9 3.0.5 | [#47530](https://github.com/apache/doris/pull/47530) |
-| User data may be lost after restarting the FE Master. | The job's offset is set to OFFSET_END, and the FE is restarted. | Shared-storage | Change the consumption mode to OFFSET_BEGINNING. | 3.0.2-3.0.4 | 3.0.5 | [#46149](https://github.com/apache/doris/pull/46149) |
-| A large number of small transactions are generated during import, causing compaction to fail and resulting in continuous -235 errors. | Doris consumes data too quickly, or Kafka data flow is in small batches. | Shared-nothing and shared-storage | Pause the Routine Load job and execute the following command: `ALTER ROUTINE LOAD FOR jobname FROM kafka ("property.enable.partition.eof" = "false");` | <2.1.8 <3.0.4 | 2.1.8 3.0.4 | [#45528](https://github.com/apache/doris/pull/45528), [ [...]
-| Kafka third-party library destructor hangs, causing data consumption to fail. | Kafka topic deletion (possibly other conditions). | Shared-nothing and shared-storage | Restart all BE nodes. | <2.1.8 <3.0.4 | 2.1.8 3.0.4 | [#44913](https://github.com/apache/doris/pull/44913) |
-| Routine Load scheduling hangs. | Timeout occurs when FE aborts a transaction in Meta Service. | Shared-storage | Restart the FE node. | <3.0.2 | 3.0.2 | [#41267](https://github.com/apache/doris/pull/41267) |
-| Routine Load restart issue. | Restarting BE nodes. | Shared-nothing and shared-storage | Manually resume the job. | <2.1.7 <3.0.2 | 2.1.7 3.0.2 | [#3727](https://github.com/selectdb/selectdb-core/pull/3727) |
-
-## Default Configuration Optimizations
-
-| Optimization Content | Applied Versions | Corresponding PR |
-| --- | --- | --- |
-| Increased the timeout duration for Routine Load. | 2.1.7 3.0.3 | [#42042](https://github.com/apache/doris/pull/42042), [#40818](https://github.com/apache/doris/pull/40818) |
-| Adjusted the default value of `max_batch_interval`. | 2.1.8 3.0.3 | [#42491](https://github.com/apache/doris/pull/42491) |
-| Removed the restriction on `max_batch_interval`. | 2.1.5 3.0.0 | [#29071](https://github.com/apache/doris/pull/29071) |
-| Adjusted the default values of `max_batch_rows` and `max_batch_size`. | 2.1.5 3.0.0 | [#36632](https://github.com/apache/doris/pull/36632) |
-
-## Observability Optimizations
-
-| Optimization Content | Applied Versions | Corresponding PR |
-| --- | --- | --- |
-| Added observability-related metrics. | 3.0.5 | [#48209](https://github.com/apache/doris/pull/48209), [#48171](https://github.com/apache/doris/pull/48171), [#48963](https://github.com/apache/doris/pull/48963) |
diff --git a/versioned_docs/version-3.0/faq/load-faq.md b/versioned_docs/version-3.0/faq/load-faq.md
new file mode 100644
index 00000000000..c9fe8f19ef3
--- /dev/null
+++ b/versioned_docs/version-3.0/faq/load-faq.md
@@ -0,0 +1,140 @@
+---
+{
+    "title": "Load FAQ",
+    "language": "en"
+}
+---
+
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+## General Load FAQ
+
+### Error "[DATA_QUALITY_ERROR] Encountered unqualified data"
+**Problem Description**: Data quality error during loading.
+
+**Solution**:
+- Stream Load and Insert Into operations will return an error URL, while for Broker Load you can check the error URL through the `Show Load` command.
+- Use a browser or curl command to access the error URL to view the specific data quality error reasons.
+- Use the strict_mode and max_filter_ratio parameters to control the acceptable error rate.
+
+### Error "[E-235] Failed to init rowset builder"
+**Problem Description**: Error -235 occurs when the load frequency is too high and data hasn't been compacted in time, exceeding version limits.
+
+**Solution**:
+- Increase the batch size of data loading and reduce loading frequency.
+- Increase the `max_tablet_version_num` parameter in `be.conf`; it is recommended not to exceed 5000.
+
+### Error "[E-238] Too many segments in rowset"
+**Problem Description**: Error -238 occurs when the number of segments under a single rowset exceeds the limit.
+
+**Common Causes**:
+- The bucket number configured during table creation is too small.
+- Data skew occurs; consider using more balanced bucket keys.
+
+### Error "Transaction commit successfully, BUT data will be visible later"
+**Problem Description**: Data load is successful but temporarily not visible.
+
+**Cause**: Usually due to transaction publish delay caused by system resource pressure.
+
+### Error "Failed to commit kv txn [...] Transaction exceeds byte limit"
+**Problem Description**: In shared-storage mode, too many partitions and tablets are involved in a single load, exceeding the transaction size limit.
+
+**Solution**:
+- Load data by partition in batches to reduce the number of partitions involved in a single load.
+- Optimize table structure to reduce the number of partitions and tablets.
+
+### Extra "\r" in the last column of CSV file
+**Problem Description**: Usually caused by Windows line endings.
+
+**Solution**:
+Specify the correct line delimiter: `-H "line_delimiter:\r\n"`
+
+### CSV data with quotes imported as null
+**Problem Description**: CSV data with quotes becomes null after import.
+
+**Solution**:
+Use the `trim_double_quotes` parameter to remove double quotes around fields.
+
+## Stream Load
+
+### Reasons for Slow Loading
+- Bottlenecks in CPU, IO, memory, or network card resources.
+- Slow network between the client machine and the BE machines; ping latency from the client to the BEs gives an initial diagnosis.
+- Webserver thread count bottleneck: too many concurrent Stream Loads on a single BE (exceeding the `webserver_num_workers` configuration in `be.conf`) may exhaust the webserver threads.
+- Memtable flush thread count bottleneck: check the BE metric `doris_be_flush_thread_pool_queue_size` to see whether queuing is severe. This can be relieved by increasing the `flush_thread_num_per_store` parameter in `be.conf`.
+
+### Handling Special Characters in Column Names
+When column names contain special characters, use single quotes with backticks to specify the columns parameter:
+```shell
+curl --location-trusted -u root:"" \
+    -H 'columns:`@coltime`,colint,colvar' \
+    -T a.csv \
+    -H "column_separator:," \
+    http://127.0.0.1:8030/api/db/loadtest/_stream_load
+```
+
+## Routine Load 
+
+### Major Bug Fixes
+
+| Issue Description | Trigger Conditions | Impact Scope | Temporary Solution | Affected Versions | Fixed Versions | Fix PR |
+|------------------|-------------------|--------------|-------------------|------------------|----------------|---------|
+| When at least one job times out while connecting to Kafka, it affects the import of other jobs, slowing down global Routine Load imports. | At least one job times out while connecting to Kafka. | Shared-nothing and shared-storage | Stop or manually pause the job to resolve the issue. | <2.1.9 <3.0.5 | 2.1.9 3.0.5 | [#47530](https://github.com/apache/doris/pull/47530) |
+| User data may be lost after restarting the FE Master. | The job's offset is set to OFFSET_END, and the FE is restarted. | Shared-storage | Change the consumption mode to OFFSET_BEGINNING. | 3.0.2-3.0.4 | 3.0.5 | [#46149](https://github.com/apache/doris/pull/46149) |
+| A large number of small transactions are generated during import, causing compaction to fail and resulting in continuous -235 errors. | Doris consumes data too quickly, or Kafka data flow is in small batches. | Shared-nothing and shared-storage | Pause the Routine Load job and execute the following command: `ALTER ROUTINE LOAD FOR jobname FROM kafka ("property.enable.partition.eof" = "false");` | <2.1.8 <3.0.4 | 2.1.8 3.0.4 | [#45528](https://github.com/apache/doris/pull/45528), [#4494 [...]
+| Kafka third-party library destructor hangs, causing data consumption to fail. | Kafka topic deletion (possibly other conditions). | Shared-nothing and shared-storage | Restart all BE nodes. | <2.1.8 <3.0.4 | 2.1.8 3.0.4 | [#44913](https://github.com/apache/doris/pull/44913) |
+| Routine Load scheduling hangs. | Timeout occurs when FE aborts a transaction in Meta Service. | Shared-storage | Restart the FE node. | <3.0.2 | 3.0.2 | [#41267](https://github.com/apache/doris/pull/41267) |
+| Routine Load restart issue. | Restarting BE nodes. | Shared-nothing and shared-storage | Manually resume the job. | <2.1.7 <3.0.2 | 2.1.7 3.0.2 | [#3727](https://github.com/selectdb/selectdb-core/pull/3727) |
+
+### Default Configuration Optimizations
+
+| Optimization Content | Applied Versions | Corresponding PR |
+|---------------------|------------------|------------------|
+| Increased the timeout duration for Routine Load. | 2.1.7 3.0.3 | [#42042](https://github.com/apache/doris/pull/42042), [#40818](https://github.com/apache/doris/pull/40818) |
+| Adjusted the default value of `max_batch_interval`. | 2.1.8 3.0.3 | [#42491](https://github.com/apache/doris/pull/42491) |
+| Removed the restriction on `max_batch_interval`. | 2.1.5 3.0.0 | [#29071](https://github.com/apache/doris/pull/29071) |
+| Adjusted the default values of `max_batch_rows` and `max_batch_size`. | 2.1.5 3.0.0 | [#36632](https://github.com/apache/doris/pull/36632) |
+
+### Observability Optimizations
+
+| Optimization Content | Applied Versions | Corresponding PR |
+|---------------------|------------------|------------------|
+| Added observability-related metrics. | 3.0.5 | [#48209](https://github.com/apache/doris/pull/48209), [#48171](https://github.com/apache/doris/pull/48171), [#48963](https://github.com/apache/doris/pull/48963) |
+
+### Error "failed to get latest offset"
+**Problem Description**: Routine Load cannot get the latest Kafka offset.
+
+**Common Causes**:
+- Usually a network connectivity issue with Kafka; verify by pinging or using telnet against the Kafka domain name.
+- A timeout caused by a third-party library bug, with the error: `java.util.concurrent.TimeoutException: Waited X seconds`
+
+### Error "failed to get partition meta: Local:'Broker transport failure"
+**Problem Description**: Routine Load cannot get Kafka Topic Partition Meta.
+
+**Common Causes**:
+- Usually a network connectivity issue with Kafka; verify by pinging or using telnet against the Kafka domain name.
+- If domain names are used, try configuring a domain name mapping in `/etc/hosts`.
+
+### Error "Broker: Offset out of range"
+**Problem Description**: The consumed offset doesn't exist in Kafka, possibly because it has been cleaned up by Kafka.
+
+**Solution**:
+- Specify a new offset for consumption, for example, set the offset to OFFSET_BEGINNING.
+- Set appropriate Kafka log cleanup parameters based on the load speed: log.retention.hours, log.retention.bytes, etc.
\ No newline at end of file
diff --git a/versioned_docs/version-3.0/faq/routineload-faq.md b/versioned_docs/version-3.0/faq/routineload-faq.md
deleted file mode 100644
index 3960f67bfe8..00000000000
--- a/versioned_docs/version-3.0/faq/routineload-faq.md
+++ /dev/null
@@ -1,55 +0,0 @@
----
-{
-    "title": "Routine Load FAQ",
-    "language": "en"
-}
----
-
-<!--
-Licensed to the Apache Software Foundation (ASF) under one
-or more contributor license agreements.  See the NOTICE file
-distributed with this work for additional information
-regarding copyright ownership.  The ASF licenses this file
-to you under the Apache License, Version 2.0 (the
-"License"); you may not use this file except in compliance
-with the License.  You may obtain a copy of the License at
-
-  http://www.apache.org/licenses/LICENSE-2.0
-
-Unless required by applicable law or agreed to in writing,
-software distributed under the License is distributed on an
-"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
-KIND, either express or implied.  See the License for the
-specific language governing permissions and limitations
-under the License.
--->
-
-# Routine Load FAQ
-
-This document records common issues, bug fixes, and optimization improvements related to Routine Load in Doris. It will be updated periodically.
-
-## Major Bug Fixes
-
-| Issue Description | Trigger Conditions | Impact Scope | Temporary Solution | Affected Versions | Fixed Versions | Fix PR |
-| --- | --- | --- | --- | --- | --- | --- |
-| When at least one job times out while connecting to Kafka, it affects the import of other jobs, slowing down global Routine Load imports. | At least one job times out while connecting to Kafka. | Shared-nothing and shared-storage | Stop or manually pause the job to resolve the issue. | <2.1.9 <3.0.5 | 2.1.9 3.0.5 | [#47530](https://github.com/apache/doris/pull/47530) |
-| User data may be lost after restarting the FE Master. | The job's offset is set to OFFSET_END, and the FE is restarted. | Shared-storage | Change the consumption mode to OFFSET_BEGINNING. | 3.0.2-3.0.4 | 3.0.5 | [#46149](https://github.com/apache/doris/pull/46149) |
-| A large number of small transactions are generated during import, causing compaction to fail and resulting in continuous -235 errors. | Doris consumes data too quickly, or Kafka data flow is in small batches. | Shared-nothing and shared-storage | Pause the Routine Load job and execute the following command: `ALTER ROUTINE LOAD FOR jobname FROM kafka ("property.enable.partition.eof" = "false");` | <2.1.8 <3.0.4 | 2.1.8 3.0.4 | [#45528](https://github.com/apache/doris/pull/45528), [ [...]
-| Kafka third-party library destructor hangs, causing data consumption to fail. | Kafka topic deletion (possibly other conditions). | Shared-nothing and shared-storage | Restart all BE nodes. | <2.1.8 <3.0.4 | 2.1.8 3.0.4 | [#44913](https://github.com/apache/doris/pull/44913) |
-| Routine Load scheduling hangs. | Timeout occurs when FE aborts a transaction in Meta Service. | Shared-storage | Restart the FE node. | <3.0.2 | 3.0.2 | [#41267](https://github.com/apache/doris/pull/41267) |
-| Routine Load restart issue. | Restarting BE nodes. | Shared-nothing and shared-storage | Manually resume the job. | <2.1.7 <3.0.2 | 2.1.7 3.0.2 | [#3727](https://github.com/selectdb/selectdb-core/pull/3727) |
-
-## Default Configuration Optimizations
-
-| Optimization Content | Applied Versions | Corresponding PR |
-| --- | --- | --- |
-| Increased the timeout duration for Routine Load. | 2.1.7 3.0.3 | [#42042](https://github.com/apache/doris/pull/42042), [#40818](https://github.com/apache/doris/pull/40818) |
-| Adjusted the default value of `max_batch_interval`. | 2.1.8 3.0.3 | [#42491](https://github.com/apache/doris/pull/42491) |
-| Removed the restriction on `max_batch_interval`. | 2.1.5 3.0.0 | [#29071](https://github.com/apache/doris/pull/29071) |
-| Adjusted the default values of `max_batch_rows` and `max_batch_size`. | 2.1.5 3.0.0 | [#36632](https://github.com/apache/doris/pull/36632) |
-
-## Observability Optimizations
-
-| Optimization Content | Applied Versions | Corresponding PR |
-| --- | --- | --- |
-| Added observability-related metrics. | 3.0.5 | [#48209](https://github.com/apache/doris/pull/48209), [#48171](https://github.com/apache/doris/pull/48171), [#48963](https://github.com/apache/doris/pull/48963) |
diff --git a/versioned_sidebars/version-2.1-sidebars.json b/versioned_sidebars/version-2.1-sidebars.json
index 83ac7c6ec6d..0a7ede3136a 100644
--- a/versioned_sidebars/version-2.1-sidebars.json
+++ b/versioned_sidebars/version-2.1-sidebars.json
@@ -832,7 +832,7 @@
                 "faq/lakehouse-faq",
                 "faq/bi-faq",
                 "faq/correctness-faq",
-                "faq/routineload-faq"
+                "faq/load-faq"
             ]
         },
         {
diff --git a/versioned_sidebars/version-3.0-sidebars.json b/versioned_sidebars/version-3.0-sidebars.json
index 4d80dc9d719..ed9fbeb39f9 100644
--- a/versioned_sidebars/version-3.0-sidebars.json
+++ b/versioned_sidebars/version-3.0-sidebars.json
@@ -885,7 +885,7 @@
                 "faq/lakehouse-faq",
                 "faq/bi-faq",
                 "faq/correctness-faq",
-                "faq/routineload-faq"
+                "faq/load-faq"
             ]
         },
         {


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org
