This is an automated email from the ASF dual-hosted git repository.

liaoxin pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris-website.git


The following commit(s) were added to refs/heads/master by this push:
     new 84d87625084 [doc](stream load) optimize stream load doc (#1330)
84d87625084 is described below

commit 84d876250844d207e75f4409bf4535208dbfc97c
Author: hui lai <1353307...@qq.com>
AuthorDate: Wed Nov 13 22:05:54 2024 +0800

    [doc](stream load) optimize stream load doc (#1330)
---
 .../import/import-way/stream-load-manual.md        | 479 +--------------------
 .../import/import-way/stream-load-manual.md        |  45 +-
 .../import/import-way/stream-load-manual.md        |  61 ++-
 .../import/import-way/stream-load-manual.md        |  39 +-
 .../import/import-way/stream-load-manual.md        | 479 +--------------------
 .../import/import-way/stream-load-manual.md        | 477 +-------------------
 6 files changed, 64 insertions(+), 1516 deletions(-)

diff --git a/docs/data-operate/import/import-way/stream-load-manual.md 
b/docs/data-operate/import/import-way/stream-load-manual.md
index 05249f1cc7e..765e0f48ef1 100644
--- a/docs/data-operate/import/import-way/stream-load-manual.md
+++ b/docs/data-operate/import/import-way/stream-load-manual.md
@@ -43,13 +43,7 @@ See [Doris 
Streamloader](../../../ecosystem/doris-streamloader) for detailed ins
 
 ## User guide
 
-### Supported formats
-
-Stream Load supports importing data in CSV, JSON, Parquet, and ORC formats.
-
-### Usage limitations
-
-When importing CSV files, it's important to distinguish between null values 
and empty strings:
+Stream Load supports importing data in CSV, JSON, Parquet, and ORC formats from local or remote sources via HTTP. When importing CSV files, it is important to distinguish between null values and empty strings:
 
 - Null values: Use `\N` to represent null. For example, `a,\N,b` indicates the 
middle column is null.
 - Empty string: An empty string is represented when there are no characters 
between two delimiters. For example, in `a,,b`, there are no characters between 
the two commas, indicating that the middle column value is an empty string.
@@ -287,10 +281,6 @@ Stream Load operations support both HTTP chunked and 
non-chunked import methods.
 
 Parameter Description: The default timeout for Stream Load. The load job will 
be canceled by the system if it is not completed within the set timeout (in 
seconds). If the source file cannot be imported within the specified time, the 
user can set an individual timeout in the Stream Load request. Alternatively, 
adjust the `stream_load_default_timeout_second` parameter on the FE to set the 
global default timeout.
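
For illustration, the per-request timeout can be set through the `timeout` header, as in the following sketch (the database, table, and file names are placeholders):

```shell
# Give this particular load up to one hour instead of the global default
curl --location-trusted -u <doris_user>:<doris_password> \
    -H "Expect:100-continue" \
    -H "timeout:3600" \
    -T streamload_example.csv \
    -XPUT http://<fe_ip>:<fe_http_port>/api/testdb/test_streamload/_stream_load
```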
 
-2. `enable_pipeline_load`
-
-Determines whether to enable the Pipeline engine to execute Streamload tasks. 
See the [import](../load-manual) documentation for more details.
-
 #### BE configuration
 
 1. `streaming_load_max_mb`
@@ -316,11 +306,11 @@ Determines whether to enable the Pipeline engine to 
execute Streamload tasks. Se
 | strict_mode                  | Used to specify whether to enable strict mode 
for this import, which is disabled by default. For example, to enable strict 
mode, use the command `-H "strict_mode:true"`. |
 | timezone                     | Used to specify the timezone to be used for 
this import, which defaults to GMT+8. This parameter affects the results of all 
timezone-related functions involved in the import. For example, to specify the 
import timezone as Africa/Abidjan, use the command `-H 
"timezone:Africa/Abidjan"`. |
 | exec_mem_limit               | The memory limit for the import, which 
defaults to 2GB. The unit is bytes. |
-| format                       | Used to specify the format of the imported 
data, which defaults to CSV. Currently supported formats include: csv, json, 
arrow, csv_with_names (supports filtering the first row of the csv file), 
csv_with_names_and_types (supports filtering the first two rows of the csv 
file), parquet, and orc. For example, to specify the imported data format as 
json, use the command `-H "format:json"`. |
+| format                       | Used to specify the format of the imported 
data, which defaults to CSV. Currently supported formats include: CSV, JSON, 
arrow, csv_with_names (supports filtering the first row of the csv file), 
csv_with_names_and_types (supports filtering the first two rows of the csv 
file), Parquet, and ORC. For example, to specify the imported data format as 
JSON, use the command `-H "format:json"`. |
| jsonpaths                    | There are two ways to import JSON data: Simple Mode and Matching Mode. If no jsonpaths are specified, Simple Mode is used, which requires the JSON data to be of the object type. Matching Mode is used when the JSON data is relatively complex and the corresponding values need to be matched through the jsonpaths parameter. In Simple Mode, the keys in the JSON are required to correspond one-to-one with the column names in the table. For example, in the JSON dat [...]
 | strip_outer_array            | When `strip_outer_array` is set to true, it 
indicates that the JSON data starts with an array object and flattens the 
objects within the array. The default value is false. When the outermost layer 
of the JSON data is represented by `[]`, which denotes an array, 
`strip_outer_array` should be set to true. For example, with the following 
data, setting `strip_outer_array` to true will result in two rows of data being 
generated when imported into Doris: `[{"k1 [...]
 | json_root                    | `json_root` is a valid jsonpath string that 
specifies the root node of a JSON document, with a default value of "". |
-| merge_type                   | There are three types of data merging: 
APPEND, DELETE, and MERGE. APPEND is the default value, indicating that this 
batch of data needs to be appended to the existing data. DELETE means to remove 
all rows that have the same keys as this batch of data. MERGE semantics need to 
be used in conjunction with delete conditions. It means that data satisfying 
the delete conditions will be processed according to DELETE semantics, while 
the rest will be processed ac [...]
+| merge_type                   | The merge type of data. Three types are 
supported:<br/>- APPEND (default): Indicates that all data in this batch will 
be appended to existing data<br/>- DELETE: Indicates deletion of all rows with 
Keys matching this batch of data<br/>- MERGE: Must be used in conjunction with 
DELETE conditions. Data meeting DELETE conditions will be processed according 
to DELETE semantics, while the rest will be processed according to APPEND 
semantics<br/>For example, to s [...]
 | delete                       | It is only meaningful under MERGE, 
representing the deletion conditions for data. |
 | function_column.sequence_col | It is suitable only for the UNIQUE KEYS 
model. Within the same Key column, it ensures that the Value column is replaced 
according to the specified source_sequence column. The source_sequence can 
either be a column from the data source or an existing column in the table 
structure. |
 | fuzzy_parse                  | It is a boolean type. If set to true, the 
JSON will be parsed with the first row as the schema. Enabling this option can 
improve the efficiency of JSON imports, but it requires that the order of the 
keys in all JSON objects be consistent with the first line. The default is 
false and it is only used for JSON format. |
@@ -637,7 +627,7 @@ When a table with a Unique Key has a Sequence column, the 
value of the Sequence
 curl --location-trusted -u <doris_user>:<doris_password> \
     -H "Expect:100-continue" \
     -H "merge_type: DELETE" \
-    -H "function_column.sequence_col: age" 
+    -H "function_column.sequence_col: age" \
     -H "column_separator:," \
     -H "columns: name, gender, age" 
     -T streamload_example.csv \
@@ -1009,7 +999,7 @@ And use to_bitmap to convert the data into the Bitmap type.
 ```sql
 curl --location-trusted -u <doris_user>:<doris_password> \
     -H "Expect:100-continue" \
-    -H "columns:typ_id,hou,arr,arr=to_bitmap(arr)"
+    -H "columns:typ_id,hou,arr,arr=to_bitmap(arr)" \
     -T streamload_example.csv \
     -XPUT http://<fe_ip>:<fe_http_port>/api/testdb/test_streamload/_stream_load
 ```
@@ -1053,13 +1043,9 @@ curl --location-trusted -u <doris_user>:<doris_password> 
\
     -XPUT http://<fe_ip>:<fe_http_port>/api/testdb/test_streamload/_stream_load
 ```
 
-### Label, loading transaction, multi-table atomicity
-
-All load jobs in Doris are atomically effective. And multiple tables loading 
in the same load job can also guarantee atomicity. At the same time, Doris can 
also use the Label mechanism to ensure that data loading is not lost or 
duplicated. For specific instructions, please refer to the [Import Transactions 
and Atomicity](../../../data-operate/transaction) documentation.
-
 ### Column mapping, derived columns, and filtering
 
-Doris supports a very rich set of column transformations and filtering 
operations in load statements. Supports most built-in functions and UDFs. For 
how to use this feature correctly, please refer to the [Data 
Transformation](../../../data-operate/import/load-data-convert) documentation.
+Doris supports a very rich set of column transformations and filtering 
operations in load statements. Supports most built-in functions. For how to use 
this feature correctly, please refer to the [Data 
Transformation](../../../data-operate/import/load-data-convert) documentation.
 
 ### Enable strict mode import
 
@@ -1072,456 +1058,3 @@ For how to express partial column updates during 
import, please refer to the Dat
 ## More help
 
For more detailed syntax and best practices on using Stream Load, please refer to the [Stream Load](../../../sql-manual/sql-statements/Data-Manipulation-Statements/Load/STREAM-LOAD) Command Manual. You can also enter HELP STREAM LOAD in the MySQL client command line to get more help information.
-
-
-
-
-
-
-
-
-
-
-
-
-
-Stream load submits and transfers data through HTTP protocol. Here, the `curl` 
command shows how to submit an import.
-
-Users can also operate through other HTTP clients.
-
-```
-curl --location-trusted -u user:passwd [-H ""...] -T data.file -XPUT 
http://fe_host:http_port/api/{db}/{table}/_stream_load
-
-The properties supported in the header are described in "Load Parameters" below.
-The format is: -H "key1: value1"
-```
-
-Examples:
-
-```
-curl --location-trusted -u root -T date -H "label:123" 
http://abc.com:8030/api/test/date/_stream_load
-```
-You can view the detailed syntax for creating an import by executing ``HELP STREAM LOAD``. The following section focuses on the meaning of some of the parameters used when creating a Stream load.
-
-**Signature parameters**
-
-+ user/passwd
-
-  Stream load creates the import over the HTTP protocol and signs the request through Basic Access authentication. The Doris system verifies user identity and import permissions based on the signature.
-
-**Load Parameters**
-
-Stream load uses the HTTP protocol, so all parameters related to the import task are set in the header. The main parameters of a Stream load import task are introduced below.
-
-+ label
-
-  The identifier of the import task. Each import task has a unique label within a single database. The label is a user-defined name in the import command. With this label, users can view the execution status of the corresponding import task.
-
-  Another function of label is to prevent users from importing the same data 
repeatedly. **It is strongly recommended that users use the same label for the 
same batch of data. This way, repeated requests for the same batch of data will 
only be accepted once, guaranteeing at-Most-Once**
-
-  When the corresponding import operation state of label is CANCELLED, the 
label can be used again.
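
  For illustration, a retry-safe pattern consistent with the note above might look like the following sketch (the label value, file, and endpoint are placeholders):

  ```shell
  # Reuse the same label when re-sending the same batch; a duplicate request is
  # rejected with "Label Already Exists" instead of loading the data twice.
  curl --location-trusted -u user:passwd -H "label:batch_20241113_001" \
      -T data.file -XPUT http://fe_host:http_port/api/{db}/{table}/_stream_load
  ```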
-
-
-+ column_separator
-
-    Used to specify the column separator in the load file. The default is `\t`. If it is an invisible character, you need to add `\x` as a prefix and use hexadecimal to represent the separator.
-
-    For example, the separator `\x01` of the hive file needs to be specified 
as `-H "column_separator:\x01"`.
-
-    You can use a combination of multiple characters as the column separator.
-
-+ line_delimiter
-
-   Used to specify the line delimiter in the load file. The default is `\n`.
-
-   You can use a combination of multiple characters as the line delimiter.
-
-+ max\_filter\_ratio
-
-  The maximum tolerated error ratio of the import task. It is 0 by default, and the range of values is 0-1. When the import error ratio exceeds this value, the import fails.
-
-  If the user wishes to ignore erroneous rows, the import can succeed by setting this parameter to a value greater than 0.
-
-  The calculation formula is as follows:
-
-    ``` (dpp.abnorm.ALL / (dpp.abnorm.ALL + dpp.norm.ALL ) ) > 
max_filter_ratio ```
-
-  ```dpp.abnorm.ALL``` denotes the number of rows whose data quality is not up to standard, such as type mismatch, column count mismatch, length mismatch, and so on.
-
-  ```dpp.norm.ALL``` refers to the number of correct rows in the import process. The correct amount of data for the import task can be queried with the `SHOW LOAD` command.
-
-  The number of rows in the original file = `dpp.abnorm.ALL + dpp.norm.ALL`
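
  As a worked illustration of the formula (the numbers are made up):

  ```
  Suppose the source file has 1,000,000 rows and 5 rows fail quality checks:
      dpp.abnorm.ALL = 5, dpp.norm.ALL = 999,995
      error ratio    = 5 / (5 + 999,995) = 0.000005
  The load succeeds only if max_filter_ratio is larger than this value,
  e.g. -H "max_filter_ratio:0.00001".
  ```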
-
-+ where
-
-    The filter condition specified for the import task. Stream load supports specifying a where clause to filter the raw data. The filtered data will not be imported and will not participate in the calculation of the filter ratio, but it will be counted in `num_rows_unselected`.
-
-+ partitions
-
-    The Partition information of the table to be imported. Data that does not belong to the specified partitions will not be imported and will be counted in `dpp.abnorm.ALL`.
-
-+ columns
-
-    The transformation configuration of the data to be imported, including reordering of columns and expression transformation. The expression transformation syntax is the same as in a query statement.
-
-    ```
-    Examples of column order transformation: There are three columns of original data (src_c1, src_c2, src_c3), and there are also three columns (dst_c1, dst_c2, dst_c3) in the Doris table.
-    When the first column src_c1 of the original file corresponds to the dst_c1 column of the target table, the second column src_c2 corresponds to the dst_c2 column, and the third column src_c3 corresponds to the dst_c3 column, it is written as follows:
-    columns: dst_c1, dst_c2, dst_c3
-    
-    When the first column src_c1 of the original file corresponds to the dst_c2 column of the target table, the second column src_c2 corresponds to the dst_c3 column, and the third column src_c3 corresponds to the dst_c1 column, it is written as follows:
-    columns: dst_c2, dst_c3, dst_c1
-    
-    Example of expression transformation: There are two columns in the original file and two columns in the target table (c1, c2). However, both columns in the original file need to be transformed by functions to correspond to the two columns in the target table.
-    columns: tmp_c1, tmp_c2, c1 = year(tmp_c1), c2 = month(tmp_c2)
-    tmp_* is a placeholder, representing the two original columns in the original file.
-    ```
-  
-+ format
-
-  Specifies the format of the imported data. Supports csv and json; the default is csv.
-
-  Also supports `csv_with_names` (filters the header line of the csv file), `csv_with_names_and_types` (filters the first two lines of the csv file), parquet, and orc.
-
-+ exec\_mem\_limit
-
-    Memory limit. Default is 2GB. Unit is Bytes
-
-+ merge\_type
-
-     The type of data merging. Three types are supported: APPEND, DELETE, and MERGE. APPEND is the default value, meaning this batch of data is appended to the existing data. DELETE means deleting all rows with the same key as this batch of data. MERGE semantics must be used in conjunction with the delete condition: data that meets the delete condition is processed according to DELETE semantics, and the rest is processed according to APPEND semantics.
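
  For illustration, a sketch combining MERGE with a delete condition (the column name `flag`, the file, and the endpoint are placeholders):

  ```shell
  # Rows matching flag=1 are deleted; all other rows are appended
  curl --location-trusted -u user:passwd \
      -H "merge_type: MERGE" -H "delete: flag=1" \
      -T data.file -XPUT http://fe_host:http_port/api/{db}/{table}/_stream_load
  ```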
-
-+ two\_phase\_commit
-
-  Stream load import can enable two-stage transaction commit mode: in the 
stream load process, the data is written and the information is returned to the 
user. At this time, the data is invisible and the transaction status is 
`PRECOMMITTED`. After the user manually triggers the commit operation, the data 
is visible.
-
-+ enclose
-  
-  When a csv data field contains row delimiters or column delimiters, a single-byte character can be specified as an enclosing character to prevent accidental truncation. For example, if the column separator is "," and the enclosing character is "'", then for the data "a,'b,c'", "b,c" will be parsed as one field.
-  Note: when the enclosing character is `"`, trim\_double\_quotes must be set to true.
-
-+ escape
-
-  Used to escape characters in a csv field that are identical to the enclosing character. For example, if the data is "a,'b,'c'", enclose is "'", and you want "b,'c" to be parsed as one field, you need to specify a single-byte escape character, such as "\", and modify the data to "a,'b,\'c'".
-
-  Example:
-
-    1. Initiate a stream load pre-commit operation
-  ```shell
-  curl  --location-trusted -u user:passwd -H "two_phase_commit:true" -T 
test.txt http://fe_host:http_port/api/{db}/{table}/_stream_load
-  {
-      "TxnId": 18036,
-      "Label": "55c8ffc9-1c40-4d51-b75e-f2265b3602ef",
-      "TwoPhaseCommit": "true",
-      "Status": "Success",
-      "Message": "OK",
-      "NumberTotalRows": 100,
-      "NumberLoadedRows": 100,
-      "NumberFilteredRows": 0,
-      "NumberUnselectedRows": 0,
-      "LoadBytes": 1031,
-      "LoadTimeMs": 77,
-      "BeginTxnTimeMs": 1,
-      "StreamLoadPutTimeMs": 1,
-      "ReadDataTimeMs": 0,
-      "WriteDataTimeMs": 58,
-      "CommitAndPublishTimeMs": 0
-  }
-  ```
-    1. Trigger the commit operation on the transaction.
-      Note 1) The request can be sent to either the FE or the BE.
-      Note 2) `{table}` in the URL can be omitted when committing.
-      Using the txn id:
-  ```shell
-  curl -X PUT --location-trusted -u user:passwd  -H "txn_id:18036" -H 
"txn_operation:commit"  
http://fe_host:http_port/api/{db}/{table}/_stream_load_2pc
-  {
-      "status": "Success",
-      "msg": "transaction [18036] commit successfully."
-  }
-  ```
-  Using the label:
-  ```shell
-  curl -X PUT --location-trusted -u user:passwd  -H 
"label:55c8ffc9-1c40-4d51-b75e-f2265b3602ef" -H "txn_operation:commit"  
http://fe_host:http_port/api/{db}/{table}/_stream_load_2pc
-  {
-      "status": "Success",
-      "msg": "label [55c8ffc9-1c40-4d51-b75e-f2265b3602ef] commit 
successfully."
-  }
-  ```
-    1. Trigger an abort operation on the transaction.
-      Note 1) The request can be sent to either the FE or the BE.
-      Note 2) `{table}` in the URL can be omitted when aborting.
-      Using the txn id:
-  ```shell
-  curl -X PUT --location-trusted -u user:passwd  -H "txn_id:18037" -H 
"txn_operation:abort"  
http://fe_host:http_port/api/{db}/{table}/_stream_load_2pc
-  {
-      "status": "Success",
-      "msg": "transaction [18037] abort successfully."
-  }
-  ```
-  Using the label:
-  ```shell
-  curl -X PUT --location-trusted -u user:passwd  -H 
"label:55c8ffc9-1c40-4d51-b75e-f2265b3602ef" -H "txn_operation:abort"  
http://fe_host:http_port/api/{db}/{table}/_stream_load_2pc
-  {
-      "status": "Success",
-      "msg": "label [55c8ffc9-1c40-4d51-b75e-f2265b3602ef] abort successfully."
-  }
-  ```
-
-+ enable_profile
-
-  When `enable_profile` is true, the Stream Load profile will be printed to 
logs (be.INFO).
-
-+ memtable_on_sink_node
-
-  Whether to enable MemTable on DataSink node when loading data, default is 
false.
-
-  Build MemTable on DataSink node, and send segments to other backends through 
brpc streaming.
-  It reduces duplicate work among replicas, and saves time in data 
serialization & deserialization.
-+ partial_columns
-
-   Whether to enable partial column updates. Boolean type; true means partial column updates are used, and the default value is false. This parameter may only be set when the table model is Unique and Merge-on-Write is used.
-
-   eg: `curl  --location-trusted -u root: -H "partial_columns:true" -H 
"column_separator:," -H "columns:id,balance,last_access_time" -T /tmp/test.csv 
http://127.0.0.1:48037/api/db1/user_profile/_stream_load`
-
-### Use stream load with SQL
-
-You can add a `sql` parameter to the header to replace the `column_separator`, `line_delimiter`, `where`, and `columns` parameters described above, which is more convenient to use.
-
-```
-curl --location-trusted -u user:passwd [-H "sql: ${load_sql}"...] -T data.file 
-XPUT http://fe_host:http_port/api/_http_stream
-
-
-# -- load_sql
-# insert into db.table (col, ...) select stream_col, ... from 
http_stream("property1"="value1");
-
-# http_stream
-# (
-#     "column_separator" = ",",
-#     "format" = "CSV",
-#     ...
-# )
-```
-
-Examples:
-
-```
-curl  --location-trusted -u root: -T test.csv  -H "sql:insert into 
demo.example_tbl_1(user_id, age, cost) select c1, c4, c7 * 2 from 
http_stream("format" = "CSV", "column_separator" = "," ) where age >= 30"  
http://127.0.0.1:28030/api/_http_stream
-```
-
-### Return results
-
-Since Stream load is a synchronous import method, the result of the import is returned to the user directly in the response to the import request.
-
-Examples:
-
-```
-{
-    "TxnId": 1003,
-    "Label": "b6f3bc78-0d2c-45d9-9e4c-faa0a0149bee",
-    "Status": "Success",
-    "ExistingJobStatus": "FINISHED", // optional
-    "Message": "OK",
-    "NumberTotalRows": 1000000,
-    "NumberLoadedRows": 1000000,
-    "NumberFilteredRows": 1,
-    "NumberUnselectedRows": 0,
-    "LoadBytes": 40888898,
-    "LoadTimeMs": 2144,
-    "BeginTxnTimeMs": 1,
-    "StreamLoadPutTimeMs": 2,
-    "ReadDataTimeMs": 325,
-    "WriteDataTimeMs": 1933,
-    "CommitAndPublishTimeMs": 106,
-    "ErrorURL": 
"http://192.168.1.1:8042/api/_load_error_log?file=__shard_0/error_log_insert_stmt_db18266d4d9b4ee5-abb00ddd64bdf005_db18266d4d9b4ee5_abb00ddd64bdf005";
-}
-```
-
-The following main explanations are given for the Stream load import result 
parameters:
-
-+ TxnId: The transaction ID of the import. Users generally do not need it.
-
-+ Label: Import Label. User specified or automatically generated by the system.
-
-+ Status: Import completion status.
-
-  "Success": Indicates successful import.
-
-  "Publish Timeout": This state also indicates that the import has been 
completed, except that the data may be delayed and visible without retrying.
-
-  "Label Already Exists": Label duplicate, need to be replaced Label.
-
-  "Fail": Import failed.
-
-+ ExistingJobStatus: The state of the load job corresponding to the existing 
Label.
-
-    This field is displayed only when the status is "Label Already Exists". 
The user can know the status of the load job corresponding to Label through 
this state. "RUNNING" means that the job is still executing, and "FINISHED" 
means that the job is successful.
-
-+ Message: Import error messages.
-
-+ NumberTotalRows: The total number of rows processed.
-
-+ NumberLoadedRows: Number of rows successfully imported.
-
-+ NumberFilteredRows: Number of rows that failed data quality checks.
-
-+ NumberUnselectedRows: Number of rows filtered by where condition.
-
-+ LoadBytes: Number of bytes imported.
-
-+ LoadTimeMs: Import completion time. Unit milliseconds.
-
-+ BeginTxnTimeMs: The time cost of the RPC to the FE to begin a transaction, in milliseconds.
-
-+ StreamLoadPutTimeMs: The time cost of the RPC to the FE to get a Stream load plan, in milliseconds.
-
-+ ReadDataTimeMs: Time spent reading data, in milliseconds.
-
-+ WriteDataTimeMs: Time spent writing data, in milliseconds.
-
-+ CommitAndPublishTimeMs: The time cost of the RPC to the FE to commit and publish a transaction, in milliseconds.
-
-+ ErrorURL: If you have data quality problems, visit this URL to see specific 
error lines.
-
-:::info Note
-Since Stream load is a synchronous import mode, import information is not recorded in the Doris system. Users cannot view Stream load jobs asynchronously with import status commands. You need to check the return value of the import request to get the import result.
-:::
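
For illustration, since the result is only available in the HTTP response, a minimal sketch for capturing and checking it could look like this (host, port, and file are placeholders):

```shell
# Stream Load is synchronous: the import result is the response body itself
RESP=$(curl --location-trusted -u user:passwd -T data.csv \
    -XPUT http://fe_host:http_port/api/testdb/test_streamload/_stream_load)
# Inspect the Status field of the returned JSON
echo "$RESP" | grep -o '"Status": *"[^"]*"'
```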
-
-### Cancel Load
-
-Users can't cancel Stream load manually. Stream load will be cancelled 
automatically by the system after a timeout or import error.
-
-### View Stream Load
-
-Users can view completed stream load tasks through `show stream load`.
-
-By default, the BE does not keep Stream Load records. To view these records, recording needs to be enabled on the BE with the configuration parameter `enable_stream_load_record=true`. For details, please refer to [BE Configuration Items](../../../admin-manual/config/be-config).
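
A minimal sketch of the two steps above (the be.conf path depends on your deployment and is a placeholder):

```shell
# On each BE, enable Stream Load record keeping
echo "enable_stream_load_record = true" >> /path/to/be/conf/be.conf
# Then, from the MySQL client:
#   show stream load;
```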
-
-## Relevant System Configuration
-
-### FE configuration
-
-+ stream\_load\_default\_timeout\_second
-
-  The timeout of the import task, in seconds. An import task that is not completed within the set timeout will be cancelled by the system and become CANCELLED.
-
-  At present, Stream load does not support a custom timeout per import; all Stream load imports share the same timeout. The default timeout is 600 seconds. If the source file cannot finish importing within the specified time, the FE parameter ```stream_load_default_timeout_second``` needs to be adjusted.
-
-+ enable\_pipeline\_load
-
-  Whether or not to enable the Pipeline engine to execute Streamload tasks. 
See the [Import](../../../data-operate/import/load-manual) documentation.
-
-### BE configuration
-
-+ streaming\_load\_max\_mb
-
-  The maximum import size of Stream load is 10G by default, in MB. If the 
user's original file exceeds this value, the BE parameter 
```streaming_load_max_mb``` needs to be adjusted.
-
-## Best Practices
-
-### Application scenarios
-
-Stream load is most appropriate when the original file is in memory or on disk. In addition, since Stream load is a synchronous import method, it is also suitable for users who want to obtain the import result synchronously.
-
-### Data volume
-
-Since Stream load relies on the BE to import and distribute data, the recommended amount of imported data is between 1G and 10G. Because the default maximum Stream load import size is 10G, the BE configuration ```streaming_load_max_mb``` needs to be modified to import files exceeding 10G.
-
-```
-For example, the size of the file to be imported is 15G
-Modify the BE configuration streaming_load_max_mb to 16000
-```
-
-The default Stream load timeout is 600 seconds. Given the current maximum import speed of Doris, files larger than about 3G require the default import task timeout to be increased.
-
-```
-Import Task Timeout = Import Data Volume / 10M/s (the actual average import speed needs to be estimated by users based on their cluster conditions)
-For example, to import a 10G file
-Timeout = 1000s, i.e. 10G / 10M/s
-```
-
-### Complete examples
-
-Data situation: The data to be imported is located in the local disk path /home/store_sales of the host sending the import request. It is about 15G and is to be imported into the table store\_sales of the database bj_sales.
-
-Cluster situation: The concurrency of Stream load is not affected by cluster 
size.
-
-+ Step 1: Check whether the import file size exceeds the default maximum import size of 10G. If so, modify the BE configuration:
-
-  ```
-  BE conf
-  streaming_load_max_mb = 16000
-  ```
-+ Step 2: Calculate whether the approximate import time exceeds the default 
timeout value
-
-  ```
-  Estimated import time = 15000MB / 10MB/s = 1500s
-  This exceeds the default timeout, so the FE configuration needs to be modified:
-  stream_load_default_timeout_second = 1500
-  ```
-
-+ Step 3: Create Import Tasks
-
-    ```
-    curl --location-trusted -u user:password -T /home/store_sales -H 
"label:abc" http://abc.com:8030/api/bj_sales/store_sales/_stream_load
-    ```
-
-### Coding with StreamLoad
-
-You can initiate HTTP requests for Stream Load using any language. Before 
initiating HTTP requests, you need to set several necessary headers:
-
-```http
-Content-Type: text/plain; charset=UTF-8
-Expect: 100-continue
-Authorization: Basic <Base64 encoded username and password>
-```
-
-`<Base64 encoded username and password>`: the Doris `username`, `:`, and `password` concatenated and then Base64 encoded.
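
For example, the header value can be produced on the command line as follows (username `root` with an empty password, purely as a placeholder):

```shell
echo -n 'root:' | base64      # prints cm9vdDo=
# Resulting header: Authorization: Basic cm9vdDo=
```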
-
-Additionally, it should be noted that if you directly initiate an HTTP request 
to FE, as Doris will redirect to BE, some frameworks will remove the 
`Authorization` HTTP header during this process, which requires manual 
processing.
-
-Doris provides StreamLoad examples in three languages: 
[Java](https://github.com/apache/doris/tree/master/samples/stream_load/java), 
[Go](https://github.com/apache/doris/tree/master/samples/stream_load/go), and 
[Python](https://github.com/apache/doris/tree/master/samples/stream_load/python)
 for reference.
-
-## Common Questions
-
-* Label Already Exists
-
-  The steps to troubleshoot a duplicate Stream load Label are as follows:
-
-  1. Does the Label conflict with a Label already used by another import method?
-
-    Because import Labels in the Doris system do not distinguish between import methods, another import method may already have used the same Label.
-
-    Run ``SHOW LOAD WHERE LABEL = "xxx"``, where xxx is the duplicate Label string, to check whether a FINISHED import with the same Label as the one created by the user already exists.
-
-  2. Are Stream loads submitted repeatedly for the same job?
-
-    Since Stream load creates the import task via an HTTP request, HTTP clients in various languages usually have their own request retry logic. After receiving the first request, the Doris system has already started processing the Stream load, but because the result is not returned to the client in time, the client retries the request. At this point, the Doris system is already handling the first request, so the second request is reported as Label Already Exists.
-
-    To check for this case, search the FE Master's log for the Label and see whether there are two ``redirect load action to destination=`` entries for the same Label. If so, the request was submitted repeatedly by the client side.
-
-    It is recommended that the user estimate the approximate import time based on the amount of data in the request, and set the client-side request timeout to a value greater than the import timeout to avoid the client submitting the request multiple times.
-
-  3. Connection reset exception
-
-    In community version 0.14.0 and earlier, a connection reset exception could occur after HTTP V2 was enabled, because the built-in web container is Tomcat, and Tomcat has issues with 307 (Temporary Redirect). With this flaw in the protocol implementation, importing a large amount of data with Stream load could trigger a connection reset exception, because Tomcat started data transmission before the 307 redirect, which resulted in the lack of au [...]
-
-    After the upgrade, also upgrade the HTTP client version of your program to `4.5.13` and introduce the following dependency in your pom.xml file:
-
-    ```xml
-        <dependency>
-          <groupId>org.apache.httpcomponents</groupId>
-          <artifactId>httpclient</artifactId>
-          <version>4.5.13</version>
-        </dependency>
-    ```
-
-* After enabling the Stream Load record on the BE, the record cannot be queried
-
-  This is caused by slow fetching of the records; you can try adjusting the following parameters:
-
-  1. Increase the BE configuration `stream_load_record_batch_size`. This 
configuration indicates how many Stream load records can be pulled from BE each 
time. The default value is 50, which can be increased to 500.
-  2. Reduce the FE configuration `fetch_stream_load_record_interval_second`, 
this configuration indicates the interval for obtaining Stream load records, 
the default is to fetch once every 120 seconds, and it can be adjusted to 60 
seconds.
-  3. If you want to save more Stream load records (not recommended, it will 
take up more resources of FE), you can increase the configuration 
`max_stream_load_record_size` of FE, the default is 5000.
-
-## More Help
-
-For more detailed syntax used by **Stream Load**, you can enter `HELP STREAM LOAD` on the MySQL client command line for more help.
diff --git 
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/import/import-way/stream-load-manual.md
 
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/import/import-way/stream-load-manual.md
index e1209ae1ad3..65e5a648000 100644
--- 
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/import/import-way/stream-load-manual.md
+++ 
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/import/import-way/stream-load-manual.md
@@ -29,23 +29,12 @@ Stream Load 支持通过 HTTP 协议将本地文件或数据流导入到 Doris 
 :::tip
 提示
 
-相比于直接使用 `curl` 的单并发导入,更推荐使用 专用导入工具 Doris Streamloader 该工具是一款用于将数据导入 Doris 
数据库的专用客户端工具,可以提供**多并发导入**的功能,降低大数据量导入的耗时。拥有以下功能:
-
-- 并发导入,实现 Stream Load 的多并发导入。可以通过 `workers` 值设置并发数。
-- 多文件导入,一次导入可以同时导入多个文件及目录,支持设置通配符以及会自动递归获取文件夹下的所有文件。
-- 断点续传,在导入过程中可能出现部分失败的情况,支持在失败点处进行继续传输。
-- 自动重传,在导入出现失败的情况后,无需手动重传,工具会自动重传默认的次数,如果仍然不成功,打印出手动重传的命令。
-
-点击 [Doris Streamloader 文档](../../../ecosystem/doris-streamloader) 了解使用方法与实践详情。
+相比于直接使用 `curl` 的单并发导入,更推荐使用专用导入工具 Doris Streamloader。该工具是一款用于将数据导入 Doris 
数据库的专用客户端工具,可以提供**多并发导入**的功能,降低大数据量导入的耗时。点击 [Doris Streamloader 
文档](../../../ecosystem/doris-streamloader) 了解使用方法与实践详情。
 :::
 
 ## 使用场景
 
-### 支持格式
-
-Stream Load 支持导入 CSV、JSON、Parquet 与 ORC 格式的数据。
-
-### 使用限制
+Stream Load 支持从本地或远程通过 HTTP 的方式导入 CSV、JSON、Parquet 与 ORC 格式的数据。
 
 在导入 CSV 文件时,需要明确区分空值(null)与空字符串:
 
@@ -55,7 +44,7 @@ Stream Load 支持导入 CSV、JSON、Parquet 与 ORC 格式的数据。
 
 ## 基本原理
 
-在使用 Stream Load 时,需要通过 HTTP 协议发起导入作业给 FE 节点,FE 会以轮询方式,重定向(redirect)请求给一个 BE 
节点以达到负载均衡的效果。也可以直接发送 HTTP 请求作业给指定的 BE 节点。在 Stream Load 中,Doris 会选定一个节点做为 
Coordinator 节点。Coordinator 节点负责接受数据并分发数据到其他节点上。
+在使用 Stream Load 时,需要通过 HTTP 协议发起导入作业给 FE 节点,FE 会以轮询方式,重定向(redirect)请求给一个 BE 
节点以达到负载均衡的效果。也可以直接发送 HTTP 请求作业给指定的 BE 节点。在 Stream Load 中,Doris 会选定一个节点作为 
Coordinator 节点。Coordinator 节点负责接受数据并分发数据到其他节点上。
 
 下图展示了 Stream Load 的主要流程:
 
@@ -87,7 +76,7 @@ Stream Load 需要对目标表的 INSERT 权限。如果没有 INSERT 权限,
 
 1. 创建导入数据
 
-    创建 csv 文件 streamload_example.csv 文件。具体内容如下
+    创建 CSV 文件 streamload_example.csv 文件。具体内容如下
 
     ```sql
     1,Emily,25
@@ -290,10 +279,6 @@ Stream Load 操作支持 HTTP 分块导入(HTTP chunked)与 HTTP 非分块
 
 参数描述:Stream Load 默认的超时时间。导入任务的超时时间(以秒为单位),导入任务在设定的 timeout 时间内未完成则会被系统取消,变成 
CANCELLED。如果导入的源文件无法在规定时间内完成导入,用户可以在 Stream Load 请求中设置单独的超时时间。或者调整 FE 
的参数`stream_load_default_timeout_second` 来设置全局的默认超时时间。
 
-2. enable_pipeline_load
-
-  是否开启 Pipeline 引擎执行 Streamload 任务。详见[导入](../load-manual)文档。
-
 **BE 配置**
 
 1. streaming_load_max_mb
@@ -321,16 +306,16 @@ Stream Load 操作支持 HTTP 分块导入(HTTP chunked)与 HTTP 非分块
 | strict_mode                  | 用户指定此次导入是否开启严格模式,默认为关闭。例如,指定开启严格模式,需要指定命令 `-H 
"strict_mode:true"`。 |
 | timezone                     | 
指定本次导入所使用的时区。默认为东八区。该参数会影响所有导入涉及的和时区有关的函数结果。例如,指定导入时区为 Africa/Abidjan,需要指定命令 
`-H "timezone:Africa/Abidjan"`。 |
 | exec_mem_limit               | 导入内存限制。默认为 2GB。单位为字节。                       |
-| format                       | 指定导入数据格式,默认是 CSV 格式。目前支持以下格式:csv, json, 
arrow, csv_with_names(支持 csv 文件行首过滤)csv_with_names_and_types(支持 csv 
文件前两行过滤)parquet, orc 例如,指定导入数据格式为 json,需要指定命令 `-H "format:json"`。 |
+| format                       | 指定导入数据格式,默认是 CSV 格式。目前支持以下格式:CSV, JSON, 
arrow, csv_with_names(支持 csv 文件行首过滤)csv_with_names_and_types(支持 CSV 
文件前两行过滤)Parquet, ORC 例如,指定导入数据格式为 JSON,需要指定命令 `-H "format:json"`。 |
 | jsonpaths                    | 导入 JSON 数据格式有两种方式:简单模式:没有指定 jsonpaths 
为简单模式,这种模式要求 JSON 数据是对象类型匹配模式:用于 JSON 数据相对复杂,需要通过 jsonpaths 参数匹配对应的 value 
在简单模式下,要求 JSON 中的 key 列与表中的列名是一一对应的,如 JSON 数据 {"k1":1, "k2":2, "k3":"hello"},其中 
k1、k2 及 k3 分别对应表中的列。 |
 | strip_outer_array            | 指定 strip_outer_array 为 true 时表示 JSON 
数据以数组对象开始且将数组对象中进行展平,默认为 false。在 JSON 数据的最外层是 [] 表示的数组时,需要设置 strip_outer_array 
为 true。如以下示例数据,在设置 strip_outer_array 为 true 后,导入 Doris 中生成两行数据`    [{"k1" : 1, 
"v1" : 2},{"k1" : 3, "v1" : 4}]` |
 | json_root                    | json_root 为合法的 jsonpath 字符串,用于指定 json 
document 的根节点,默认值为 ""。 |
-| merge_type                   | 数据的合并类型,一共支持三种类型 APPEND、DELETE、MERGE;APPEND 
是默认值,表示这批数据全部需要追加到现有数据中;DELETE 表示删除与这批数据 key 相同的所有行 MERGE 语义 需要与 DELETE  
条件联合使用,表示满足 DELETE 条件的数据按照 DELETE 语义处理其余的按照 APPEND 语义处理例如,指定合并模式为 
MERGE,需要指定命令`-H "merge_type: MERGE" -H "delete: flag=1"`。 |
+| merge_type                   | 数据的合并类型,支持三种类型:<br/>- 
APPEND(默认值):表示这批数据全部追加到现有数据中<br/>- DELETE:表示删除与这批数据 Key 相同的所有行<br/>- MERGE:需要与 
DELETE 条件联合使用,表示满足 DELETE 条件的数据按照 DELETE 语义处理,其余的按照 APPEND 语义处理<br/>例如,指定合并模式为 
MERGE:`-H "merge_type: MERGE" -H "delete: flag=1"` |
 | delete                       | 仅在 MERGE 下有意义,表示数据的删除条件                      |
 | function_column.sequence_col | 只适用于 UNIQUE KEYS 模型,相同 Key 列下,保证 Value 列按照 
source_sequence 列进行 REPLACE。source_sequence 可以是数据源中的列,也可以是表结构中的一列。 |
 | fuzzy_parse                  | 布尔类型,为 true 表示 JSON 将以第一行为 schema 
进行解析。开启这个选项可以提高 json 导入效率,但是要求所有 json 对象的 key 的顺序和第一行一致,默认为 false,仅用于 JSON 格式 |
 | num_as_string                | 布尔类型,为 true 表示在解析 JSON 
数据时会将数字类型转为字符串,确保不会出现精度丢失的情况下进行导入。 |
-| read_json_by_line            | 布尔类型,为 true 表示支持每行读取一个 json 对象,默认值为 false。 |
+| read_json_by_line            | 布尔类型,为 true 表示支持每行读取一个 JSON 对象,默认值为 false。 |
 | send_batch_parallelism       | 整型,用于设置发送批处理数据的并行度,如果并行度的值超过 BE 配置中的 
`max_send_batch_parallelism_per_job`,那么作为协调点的 BE 将使用 
`max_send_batch_parallelism_per_job` 的值。 |
 | hidden_columns               | 用于指定导入数据中包含的隐藏列,在 Header 中不包含 Columns 时生效,多个 
hidden column 用逗号分割。系统会使用用户指定的数据导入数据。在下例中,导入数据中最后一列数据为 
`__DORIS_SEQUENCE_COL__`。`hidden_columns: 
__DORIS_DELETE_SIGN__,__DORIS_SEQUENCE_COL__` |
 | load_to_single_tablet        | 布尔类型,为 true 表示支持一个任务只导入数据到对应分区的一个 Tablet,默认值为 
false。该参数只允许在对带有 random 分桶的 OLAP 表导数的时候设置。 |
@@ -453,7 +438,7 @@ curl  --location-trusted -u root: -T test.csv  -H 
"sql:insert into demo.example_
 ```Shell
 curl --location-trusted -u <doris_user>:<doris_password> \
     -H "Expect:100-continue" \
-    -H "timeout:3000"
+    -H "timeout:3000" \
     -H "column_separator:," \
     -H "columns:user_id,name,age" \
     -T streamload_example.csv \
@@ -643,9 +628,9 @@ curl --location-trusted -u <doris_user>:<doris_password> \
 curl --location-trusted -u <doris_user>:<doris_password> \
     -H "Expect:100-continue" \
     -H "merge_type: DELETE" \
-    -H "function_column.sequence_col: age" 
+    -H "function_column.sequence_col: age" \
     -H "column_separator:," \
-    -H "columns: name, gender, age" 
+    -H "columns: name, gender, age" \
     -T streamload_example.csv \
     -XPUT http://<fe_ip>:<fe_http_port>/api/testdb/test_streamload/_stream_load
 ```
@@ -865,7 +850,7 @@ curl --location-trusted -u <doris_user>:<doris_password> \
 
 ### 指定 JSON 根节点导入数据
 
-如果 JSON 数据包含了嵌套 JSON 字段,需要指定导入 json 的根节点。默认值为“”。
+如果 JSON 数据包含了嵌套 JSON 字段,需要指定导入 JSON 的根节点。默认值为“”。
 
 如下列数据,期望将 comment 列中的数据导入到表中:
 
@@ -1015,7 +1000,7 @@ DISTRIBUTED BY HASH(typ_id,hou) BUCKETS 10;
 ```sql
 curl --location-trusted -u <doris_user>:<doris_password> \
     -H "Expect:100-continue" \
-    -H "columns:typ_id,hou,arr,arr=to_bitmap(arr)"
+    -H "columns:typ_id,hou,arr,arr=to_bitmap(arr)" \
     -T streamload_example.csv \
     -XPUT http://<fe_ip>:<fe_http_port>/api/testdb/test_streamload/_stream_load
 ```
@@ -1059,13 +1044,9 @@ curl --location-trusted -u <doris_user>:<doris_password> 
\
     -XPUT http://<fe_ip>:<fe_http_port>/api/testdb/test_streamload/_stream_load
 ```
 
-### Label、导入事务、多表原子性
-
-Doris 中所有导入任务都是原子生效的。并且在同一个导入任务中对多张表的导入也能够保证原子性。同时,Doris 还可以通过 Label 
的机制来保证数据导入的不丢不重。具体说明可以参阅 
[导入事务和原子性](../../../data-operate/import/load-atomicity) 文档。
-
 ### 列映射、衍生列和过滤
 
-Doris 可以在导入语句中支持非常丰富的列转换和过滤操作。支持绝大多数内置函数和 UDF。关于如何正确的使用这个功能,可参阅 
[数据转换](../../../data-operate/import/load-data-convert) 文档。
+Doris 可以在导入语句中支持非常丰富的列转换和过滤操作。支持绝大多数内置函数。关于如何正确的使用这个功能,可参阅 
[数据转换](../../../data-operate/import/load-data-convert) 文档。
 
 ### 启用严格模式导入
 
diff --git 
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/data-operate/import/import-way/stream-load-manual.md
 
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/data-operate/import/import-way/stream-load-manual.md
index 7af495f1fb4..2dfb29f12b6 100644
--- 
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/data-operate/import/import-way/stream-load-manual.md
+++ 
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/data-operate/import/import-way/stream-load-manual.md
@@ -29,23 +29,12 @@ Stream Load 支持通过 HTTP 协议将本地文件或数据流导入到 Doris 
 :::tip
 提示
 
-相比于直接使用 `curl` 的单并发导入,更推荐使用 专用导入工具 Doris Streamloader 该工具是一款用于将数据导入 Doris 
数据库的专用客户端工具,可以提供**多并发导入**的功能,降低大数据量导入的耗时。拥有以下功能:
-
-- 并发导入,实现 Stream Load 的多并发导入。可以通过 `workers` 值设置并发数。
-- 多文件导入,一次导入可以同时导入多个文件及目录,支持设置通配符以及会自动递归获取文件夹下的所有文件。
-- 断点续传,在导入过程中可能出现部分失败的情况,支持在失败点处进行继续传输。
-- 自动重传,在导入出现失败的情况后,无需手动重传,工具会自动重传默认的次数,如果仍然不成功,打印出手动重传的命令。
-
-点击 [Doris Streamloader 文档](../../../ecosystem/doris-streamloader) 了解使用方法与实践详情。
+相比于直接使用 `curl` 的单并发导入,更推荐使用专用导入工具 Doris Streamloader。该工具是一款用于将数据导入 Doris 
数据库的专用客户端工具,可以提供**多并发导入**的功能,降低大数据量导入的耗时。点击 [Doris Streamloader 
文档](../../../ecosystem/doris-streamloader) 了解使用方法与实践详情。
 :::
 
 ## 使用场景
 
-### 支持格式
-
-Stream Load 支持导入 CSV、JSON、Parquet 与 ORC 格式的数据。
-
-### 使用限制
+Stream Load 支持从本地或远程通过 HTTP 的方式导入 CSV、JSON、Parquet 与 ORC 格式的数据。
 
 在导入 CSV 文件时,需要明确区分空值(null)与空字符串:
 
@@ -55,7 +44,7 @@ Stream Load 支持导入 CSV、JSON、Parquet 与 ORC 格式的数据。
 
 ## 基本原理
 
-在使用 Stream Load 时,需要通过 HTTP 协议发起导入作业给 FE 节点,FE 会以轮询方式,重定向(redirect)请求给一个 BE 
节点以达到负载均衡的效果。也可以直接发送 HTTP 请求作业给指定的 BE 节点。在 Stream Load 中,Doris 会选定一个节点做为 
Coordinator 节点。Coordinator 节点负责接受数据并分发数据到其他节点上。
+在使用 Stream Load 时,需要通过 HTTP 协议发起导入作业给 FE 节点,FE 会以轮询方式,重定向(redirect)请求给一个 BE 
节点以达到负载均衡的效果。也可以直接发送 HTTP 请求作业给指定的 BE 节点。在 Stream Load 中,Doris 会选定一个节点作为 
Coordinator 节点。Coordinator 节点负责接受数据并分发数据到其他节点上。
 
 下图展示了 Stream Load 的主要流程:
 
@@ -87,7 +76,7 @@ Stream Load 需要对目标表的 INSERT 权限。如果没有 INSERT 权限,
 
 1. 创建导入数据
 
-    创建 csv 文件 streamload_example.csv 文件。具体内容如下
+    创建 CSV 文件 streamload_example.csv 文件。具体内容如下
 
     ```sql
     1,Emily,25
@@ -290,10 +279,6 @@ Stream Load 操作支持 HTTP 分块导入(HTTP chunked)与 HTTP 非分块
 
 参数描述:Stream Load 默认的超时时间。导入任务的超时时间(以秒为单位),导入任务在设定的 timeout 时间内未完成则会被系统取消,变成 
CANCELLED。如果导入的源文件无法在规定时间内完成导入,用户可以在 Stream Load 请求中设置单独的超时时间。或者调整 FE 
的参数`stream_load_default_timeout_second` 来设置全局的默认超时时间。
 
-2. enable_pipeline_load
-
-  是否开启 Pipeline 引擎执行 Streamload 任务。详见[导入](../load-manual)文档。
-
 **BE 配置**
 
 1. streaming_load_max_mb
@@ -321,16 +306,16 @@ Stream Load 操作支持 HTTP 分块导入(HTTP chunked)与 HTTP 非分块
 | strict_mode                  | 用户指定此次导入是否开启严格模式,默认为关闭。例如,指定开启严格模式,需要指定命令 `-H 
"strict_mode:true"`。 |
 | timezone                     | 
指定本次导入所使用的时区。默认为东八区。该参数会影响所有导入涉及的和时区有关的函数结果。例如,指定导入时区为 Africa/Abidjan,需要指定命令 
`-H "timezone:Africa/Abidjan"`。 |
 | exec_mem_limit               | 导入内存限制。默认为 2GB。单位为字节。                       |
-| format                       | 指定导入数据格式,默认是 CSV 格式。目前支持以下格式:csv, json, 
arrow, csv_with_names(支持 csv 文件行首过滤)csv_with_names_and_types(支持 csv 
文件前两行过滤)parquet, orc 例如,指定导入数据格式为 json,需要指定命令 `-H "format:json"`。 |
+| format                       | 指定导入数据格式,默认是 CSV 格式。目前支持以下格式:CSV, JSON, 
arrow, csv_with_names(支持 csv 文件行首过滤)csv_with_names_and_types(支持 CSV 
文件前两行过滤)Parquet, ORC 例如,指定导入数据格式为 JSON,需要指定命令 `-H "format:json"`。 |
 | jsonpaths                    | 导入 JSON 数据格式有两种方式:简单模式:没有指定 jsonpaths 
为简单模式,这种模式要求 JSON 数据是对象类型匹配模式:用于 JSON 数据相对复杂,需要通过 jsonpaths 参数匹配对应的 value 
在简单模式下,要求 JSON 中的 key 列与表中的列名是一一对应的,如 JSON 数据 {"k1":1, "k2":2, "k3":"hello"},其中 
k1、k2 及 k3 分别对应表中的列。 |
 | strip_outer_array            | 指定 strip_outer_array 为 true 时表示 JSON 
数据以数组对象开始且将数组对象中进行展平,默认为 false。在 JSON 数据的最外层是 [] 表示的数组时,需要设置 strip_outer_array 
为 true。如以下示例数据,在设置 strip_outer_array 为 true 后,导入 Doris 中生成两行数据`    [{"k1" : 1, 
"v1" : 2},{"k1" : 3, "v1" : 4}]` |
 | json_root                    | json_root 为合法的 jsonpath 字符串,用于指定 json 
document 的根节点,默认值为 ""。 |
-| merge_type                   | 数据的合并类型,一共支持三种类型 APPEND、DELETE、MERGE;APPEND 
是默认值,表示这批数据全部需要追加到现有数据中;DELETE 表示删除与这批数据 key 相同的所有行 MERGE 语义 需要与 DELETE  
条件联合使用,表示满足 DELETE 条件的数据按照 DELETE 语义处理其余的按照 APPEND 语义处理例如,指定合并模式为 
MERGE,需要指定命令`-H "merge_type: MERGE" -H "delete: flag=1"`。 |
+| merge_type                   | 数据的合并类型,支持三种类型:<br/>- 
APPEND(默认值):表示这批数据全部追加到现有数据中<br/>- DELETE:表示删除与这批数据 Key 相同的所有行<br/>- MERGE:需要与 
DELETE 条件联合使用,表示满足 DELETE 条件的数据按照 DELETE 语义处理,其余的按照 APPEND 语义处理<br/>例如,指定合并模式为 
MERGE:`-H "merge_type: MERGE" -H "delete: flag=1"` |
 | delete                       | 仅在 MERGE 下有意义,表示数据的删除条件                      |
 | function_column.sequence_col | 只适用于 UNIQUE KEYS 模型,相同 Key 列下,保证 Value 列按照 
source_sequence 列进行 REPLACE。source_sequence 可以是数据源中的列,也可以是表结构中的一列。 |
 | fuzzy_parse                  | 布尔类型,为 true 表示 JSON 将以第一行为 schema 
进行解析。开启这个选项可以提高 json 导入效率,但是要求所有 json 对象的 key 的顺序和第一行一致,默认为 false,仅用于 JSON 格式 |
 | num_as_string                | 布尔类型,为 true 表示在解析 JSON 
数据时会将数字类型转为字符串,确保不会出现精度丢失的情况下进行导入。 |
-| read_json_by_line            | 布尔类型,为 true 表示支持每行读取一个 json 对象,默认值为 false。 |
+| read_json_by_line            | 布尔类型,为 true 表示支持每行读取一个 JSON 对象,默认值为 false。 |
 | send_batch_parallelism       | 整型,用于设置发送批处理数据的并行度,如果并行度的值超过 BE 配置中的 
`max_send_batch_parallelism_per_job`,那么作为协调点的 BE 将使用 
`max_send_batch_parallelism_per_job` 的值。 |
 | hidden_columns               | 用于指定导入数据中包含的隐藏列,在 Header 中不包含 Columns 时生效,多个 
hidden column 用逗号分割。系统会使用用户指定的数据导入数据。在下例中,导入数据中最后一列数据为 
`__DORIS_SEQUENCE_COL__`。`hidden_columns: 
__DORIS_DELETE_SIGN__,__DORIS_SEQUENCE_COL__` |
 | load_to_single_tablet        | 布尔类型,为 true 表示支持一个任务只导入数据到对应分区的一个 Tablet,默认值为 
false。该参数只允许在对带有 random 分桶的 OLAP 表导数的时候设置。 |
@@ -420,13 +405,15 @@ curl --location-trusted -u user:passwd [-H "sql: 
${load_sql}"...] -T data.file -
 load_sql 举例:
 
 ```shell
-insert into db.table (col, ...) select stream_col, ... from 
http_stream("property1"="value1");
+insert into db.table (col1, col2, ...) select c1, c2, ... from 
http_stream("property1"="value1");
 ```
 
 http_stream 支持的参数:
 
 "column_separator" = ",", "format" = "CSV",
 
+导入 CSV 文件时,`select ... from http_stream` 子句中的列名格式必须为 `c1, c2, c3, ...`,见下方示例
+
 ...
 
 示例:
@@ -642,9 +629,9 @@ curl --location-trusted -u <doris_user>:<doris_password> \
 curl --location-trusted -u <doris_user>:<doris_password> \
     -H "Expect:100-continue" \
     -H "merge_type: DELETE" \
-    -H "function_column.sequence_col: age" 
+    -H "function_column.sequence_col: age" \
     -H "column_separator:," \
-    -H "columns: name, gender, age" 
+    -H "columns: name, gender, age" \
     -T streamload_example.csv \
     -XPUT http://<fe_ip>:<fe_http_port>/api/testdb/test_streamload/_stream_load
 ```
@@ -769,15 +756,19 @@ curl --location-trusted -u <doris_user>:<doris_password> \
 表结构:
 
 ```sql
-`id` bigint(30) NOT NULL,
-`order_code` varchar(30) DEFAULT NULL COMMENT '',
-`create_time` datetimev2(3) DEFAULT CURRENT_TIMESTAMP
+CREATE TABLE testDb.testTbl (
+    `id` BIGINT(30) NOT NULL,
+    `order_code` VARCHAR(30) DEFAULT NULL COMMENT '',
+    `create_time` DATETIMEv2(3) DEFAULT CURRENT_TIMESTAMP
+)
+DUPLICATE KEY(id)
+DISTRIBUTED BY HASH(id) BUCKETS 10;
 ```
 
 JSON 数据格式:
 
 ```Plain
-{"id":1,"order_Code":"avc"}
+{"id":1,"order_code":"avc"}
 ```
 
 导入命令:
@@ -864,7 +855,7 @@ curl --location-trusted -u <doris_user>:<doris_password> \
 
 ### 指定 JSON 根节点导入数据
 
-如果 JSON 数据包含了嵌套 JSON 字段,需要指定导入 json 的根节点。默认值为“”。
+如果 JSON 数据包含了嵌套 JSON 字段,需要指定导入 JSON 的根节点。默认值为“”。
 
 如下列数据,期望将 comment 列中的数据导入到表中:
 
@@ -1014,7 +1005,7 @@ DISTRIBUTED BY HASH(typ_id,hou) BUCKETS 10;
 ```sql
 curl --location-trusted -u <doris_user>:<doris_password> \
     -H "Expect:100-continue" \
-    -H "columns:typ_id,hou,arr,arr=to_bitmap(arr)"
+    -H "columns:typ_id,hou,arr,arr=to_bitmap(arr)" \
     -T streamload_example.csv \
     -XPUT http://<fe_ip>:<fe_http_port>/api/testdb/test_streamload/_stream_load
 ```
@@ -1058,17 +1049,13 @@ curl --location-trusted -u 
<doris_user>:<doris_password> \
     -XPUT http://<fe_ip>:<fe_http_port>/api/testdb/test_streamload/_stream_load
 ```
 
-### Label、导入事务、多表原子性
-
-Doris 中所有导入任务都是原子生效的。并且在同一个导入任务中对多张表的导入也能够保证原子性。同时,Doris 还可以通过 Label 
的机制来保证数据导入的不丢不重。具体说明可以参阅 
[导入事务和原子性](../../../data-operate/import/load-atomicity) 文档。
-
 ### 列映射、衍生列和过滤
 
-Doris 可以在导入语句中支持非常丰富的列转换和过滤操作。支持绝大多数内置函数和 UDF。关于如何正确的使用这个功能,可参阅 
[数据转换](../../../data-operate/import/load-data-convert) 文档。
+Doris 可以在导入语句中支持非常丰富的列转换和过滤操作。支持绝大多数内置函数。关于如何正确的使用这个功能,可参阅 
[数据转换](../../../data-operate/import/load-data-convert) 文档。
 
 ### 启用严格模式导入
 
-`strict_mode` 
属性用于设置导入任务是否运行在严格模式下。该属性会对列映射、转换和过滤的结果产生影响,它同时也将控制部分列更新的行为。关于严格模式的具体说明,可参阅 
[严格模式](../../../data-operate/import/load-strict-mode) 文档。
+`strict_mode` 
属性用于设置导入任务是否运行在严格模式下。该属性会对列映射、转换和过滤的结果产生影响,它同时也将控制部分列更新的行为。关于严格模式的具体说明,可参阅 
[错误数据处理](../../../data-operate/import/error-data-handling) 文档。
 
 ### 导入时进行部分列更新
 
diff --git 
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/data-operate/import/import-way/stream-load-manual.md
 
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/data-operate/import/import-way/stream-load-manual.md
index 3b30373f9d1..130c3ac7e46 100644
--- 
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/data-operate/import/import-way/stream-load-manual.md
+++ 
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/data-operate/import/import-way/stream-load-manual.md
@@ -29,23 +29,12 @@ Stream Load 支持通过 HTTP 协议将本地文件或数据流导入到 Doris 
 :::tip
 提示
 
-相比于直接使用 `curl` 的单并发导入,更推荐使用 专用导入工具 Doris Streamloader 该工具是一款用于将数据导入 Doris 
数据库的专用客户端工具,可以提供**多并发导入**的功能,降低大数据量导入的耗时。拥有以下功能:
-
-- 并发导入,实现 Stream Load 的多并发导入。可以通过 `workers` 值设置并发数。
-- 多文件导入,一次导入可以同时导入多个文件及目录,支持设置通配符以及会自动递归获取文件夹下的所有文件。
-- 断点续传,在导入过程中可能出现部分失败的情况,支持在失败点处进行继续传输。
-- 自动重传,在导入出现失败的情况后,无需手动重传,工具会自动重传默认的次数,如果仍然不成功,打印出手动重传的命令。
-
-点击 [Doris Streamloader 文档](../../../ecosystem/doris-streamloader) 了解使用方法与实践详情。
+相比于直接使用 `curl` 的单并发导入,更推荐使用专用导入工具 Doris Streamloader。该工具是一款用于将数据导入 Doris 
数据库的专用客户端工具,可以提供**多并发导入**的功能,降低大数据量导入的耗时。点击 [Doris Streamloader 
文档](../../../ecosystem/doris-streamloader) 了解使用方法与实践详情。
 :::
 
 ## 使用场景
 
-### 支持格式
-
-Stream Load 支持导入 CSV、JSON、Parquet 与 ORC 格式的数据。
-
-### 使用限制
+Stream Load 支持从本地或远程通过 HTTP 的方式导入 CSV、JSON、Parquet 与 ORC 格式的数据。
 
 在导入 CSV 文件时,需要明确区分空值(null)与空字符串:
 
@@ -55,7 +44,7 @@ Stream Load 支持导入 CSV、JSON、Parquet 与 ORC 格式的数据。
 
 ## 基本原理
 
-在使用 Stream Load 时,需要通过 HTTP 协议发起导入作业给 FE 节点,FE 会以轮询方式,重定向(redirect)请求给一个 BE 
节点以达到负载均衡的效果。也可以直接发送 HTTP 请求作业给指定的 BE 节点。在 Stream Load 中,Doris 会选定一个节点做为 
Coordinator 节点。Coordinator 节点负责接受数据并分发数据到其他节点上。
+在使用 Stream Load 时,需要通过 HTTP 协议发起导入作业给 FE 节点,FE 会以轮询方式,重定向(redirect)请求给一个 BE 
节点以达到负载均衡的效果。也可以直接发送 HTTP 请求作业给指定的 BE 节点。在 Stream Load 中,Doris 会选定一个节点作为 
Coordinator 节点。Coordinator 节点负责接受数据并分发数据到其他节点上。
 
 下图展示了 Stream Load 的主要流程:
 
@@ -87,7 +76,7 @@ Stream Load 需要对目标表的 INSERT 权限。如果没有 INSERT 权限,
 
 1. 创建导入数据
 
-    创建 csv 文件 streamload_example.csv 文件。具体内容如下
+    创建 CSV 文件 streamload_example.csv 文件。具体内容如下
 
     ```sql
     1,Emily,25
@@ -290,10 +279,6 @@ Stream Load 操作支持 HTTP 分块导入(HTTP chunked)与 HTTP 非分块
 
 参数描述:Stream Load 默认的超时时间。导入任务的超时时间(以秒为单位),导入任务在设定的 timeout 时间内未完成则会被系统取消,变成 
CANCELLED。如果导入的源文件无法在规定时间内完成导入,用户可以在 Stream Load 请求中设置单独的超时时间。或者调整 FE 
的参数`stream_load_default_timeout_second` 来设置全局的默认超时时间。
 
-2. enable_pipeline_load
-
-  是否开启 Pipeline 引擎执行 Streamload 任务。详见[导入](../load-manual)文档。
-
 **BE 配置**
 
 1. streaming_load_max_mb
@@ -321,16 +306,16 @@ Stream Load 操作支持 HTTP 分块导入(HTTP chunked)与 HTTP 非分块
 | strict_mode                  | 用户指定此次导入是否开启严格模式,默认为关闭。例如,指定开启严格模式,需要指定命令 `-H 
"strict_mode:true"`。 |
 | timezone                     | 
指定本次导入所使用的时区。默认为东八区。该参数会影响所有导入涉及的和时区有关的函数结果。例如,指定导入时区为 Africa/Abidjan,需要指定命令 
`-H "timezone:Africa/Abidjan"`。 |
 | exec_mem_limit               | 导入内存限制。默认为 2GB。单位为字节。                       |
-| format                       | 指定导入数据格式,默认是 CSV 格式。目前支持以下格式:csv, json, 
arrow, csv_with_names(支持 csv 文件行首过滤)csv_with_names_and_types(支持 csv 
文件前两行过滤)parquet, orc 例如,指定导入数据格式为 json,需要指定命令 `-H "format:json"`。 |
+| format                       | 指定导入数据格式,默认是 CSV 格式。目前支持以下格式:CSV, JSON, 
arrow, csv_with_names(支持 csv 文件行首过滤)csv_with_names_and_types(支持 CSV 
文件前两行过滤)Parquet, ORC 例如,指定导入数据格式为 JSON,需要指定命令 `-H "format:json"`。 |
 | jsonpaths                    | 导入 JSON 数据格式有两种方式:简单模式:没有指定 jsonpaths 
为简单模式,这种模式要求 JSON 数据是对象类型匹配模式:用于 JSON 数据相对复杂,需要通过 jsonpaths 参数匹配对应的 value 
在简单模式下,要求 JSON 中的 key 列与表中的列名是一一对应的,如 JSON 数据 {"k1":1, "k2":2, "k3":"hello"},其中 
k1、k2 及 k3 分别对应表中的列。 |
 | strip_outer_array            | 指定 strip_outer_array 为 true 时表示 JSON 
数据以数组对象开始且将数组对象中进行展平,默认为 false。在 JSON 数据的最外层是 [] 表示的数组时,需要设置 strip_outer_array 
为 true。如以下示例数据,在设置 strip_outer_array 为 true 后,导入 Doris 中生成两行数据`    [{"k1" : 1, 
"v1" : 2},{"k1" : 3, "v1" : 4}]` |
 | json_root                    | json_root 为合法的 jsonpath 字符串,用于指定 json 
document 的根节点,默认值为 ""。 |
-| merge_type                   | 数据的合并类型,一共支持三种类型 APPEND、DELETE、MERGE;APPEND 
是默认值,表示这批数据全部需要追加到现有数据中;DELETE 表示删除与这批数据 key 相同的所有行 MERGE 语义 需要与 DELETE  
条件联合使用,表示满足 DELETE 条件的数据按照 DELETE 语义处理其余的按照 APPEND 语义处理例如,指定合并模式为 
MERGE,需要指定命令`-H "merge_type: MERGE" -H "delete: flag=1"`。 |
+| merge_type                   | 数据的合并类型,支持三种类型:<br/>- 
APPEND(默认值):表示这批数据全部追加到现有数据中<br/>- DELETE:表示删除与这批数据 Key 相同的所有行<br/>- MERGE:需要与 
DELETE 条件联合使用,表示满足 DELETE 条件的数据按照 DELETE 语义处理,其余的按照 APPEND 语义处理<br/>例如,指定合并模式为 
MERGE:`-H "merge_type: MERGE" -H "delete: flag=1"` |
 | delete                       | 仅在 MERGE 下有意义,表示数据的删除条件                      |
 | function_column.sequence_col | 只适用于 UNIQUE KEYS 模型,相同 Key 列下,保证 Value 列按照 
source_sequence 列进行 REPLACE。source_sequence 可以是数据源中的列,也可以是表结构中的一列。 |
 | fuzzy_parse                  | 布尔类型,为 true 表示 JSON 将以第一行为 schema 
进行解析。开启这个选项可以提高 json 导入效率,但是要求所有 json 对象的 key 的顺序和第一行一致,默认为 false,仅用于 JSON 格式 |
 | num_as_string                | 布尔类型,为 true 表示在解析 JSON 
数据时会将数字类型转为字符串,确保不会出现精度丢失的情况下进行导入。 |
-| read_json_by_line            | 布尔类型,为 true 表示支持每行读取一个 json 对象,默认值为 false。 |
+| read_json_by_line            | 布尔类型,为 true 表示支持每行读取一个 JSON 对象,默认值为 false。 |
 | send_batch_parallelism       | 整型,用于设置发送批处理数据的并行度,如果并行度的值超过 BE 配置中的 
`max_send_batch_parallelism_per_job`,那么作为协调点的 BE 将使用 
`max_send_batch_parallelism_per_job` 的值。 |
 | hidden_columns               | 用于指定导入数据中包含的隐藏列,在 Header 中不包含 Columns 时生效,多个 
hidden column 用逗号分割。系统会使用用户指定的数据导入数据。在下例中,导入数据中最后一列数据为 
`__DORIS_SEQUENCE_COL__`。`hidden_columns: 
__DORIS_DELETE_SIGN__,__DORIS_SEQUENCE_COL__` |
 | load_to_single_tablet        | 布尔类型,为 true 表示支持一个任务只导入数据到对应分区的一个 Tablet,默认值为 
false。该参数只允许在对带有 random 分桶的 OLAP 表导数的时候设置。 |
@@ -758,7 +743,7 @@ curl --location-trusted -u <doris_user>:<doris_password> \
     -H "Expect:100-continue" \
     -H "column_separator:," \
     -H "enclose:'" \
-    -H "escape:\\" \    
+    -H "escape:\\" \
     -H "columns:username,age,address" \
     -T streamload_example.csv \
     -XPUT http://<fe_ip>:<fe_http_port>/api/testdb/test_streamload/_stream_load
@@ -870,7 +855,7 @@ curl --location-trusted -u <doris_user>:<doris_password> \
 
 ### 指定 JSON 根节点导入数据
 
-如果 JSON 数据包含了嵌套 JSON 字段,需要指定导入 json 的根节点。默认值为“”。
+如果 JSON 数据包含了嵌套 JSON 字段,需要指定导入 JSON 的根节点。默认值为“”。
 
 如下列数据,期望将 comment 列中的数据导入到表中:
 
@@ -1064,13 +1049,9 @@ curl --location-trusted -u <doris_user>:<doris_password> 
\
     -XPUT http://<fe_ip>:<fe_http_port>/api/testdb/test_streamload/_stream_load
 ```
 
-### Label、导入事务、多表原子性
-
-Doris 中所有导入任务都是原子生效的。并且在同一个导入任务中对多张表的导入也能够保证原子性。同时,Doris 还可以通过 Label 
的机制来保证数据导入的不丢不重。具体说明可以参阅 [导入事务和原子性](../../../data-operate/transaction) 文档。
-
 ### 列映射、衍生列和过滤
 
-Doris 可以在导入语句中支持非常丰富的列转换和过滤操作。支持绝大多数内置函数和 UDF。关于如何正确的使用这个功能,可参阅 
[数据转换](../../../data-operate/import/load-data-convert) 文档。
+Doris 可以在导入语句中支持非常丰富的列转换和过滤操作。支持绝大多数内置函数。关于如何正确的使用这个功能,可参阅 
[数据转换](../../../data-operate/import/load-data-convert) 文档。
 
 ### 启用严格模式导入
 
diff --git 
a/versioned_docs/version-2.1/data-operate/import/import-way/stream-load-manual.md
 
b/versioned_docs/version-2.1/data-operate/import/import-way/stream-load-manual.md
index d51fd7aa5ef..097d6c462c8 100644
--- 
a/versioned_docs/version-2.1/data-operate/import/import-way/stream-load-manual.md
+++ 
b/versioned_docs/version-2.1/data-operate/import/import-way/stream-load-manual.md
@@ -43,13 +43,7 @@ See [Doris 
Streamloader](../../../ecosystem/doris-streamloader) for detailed ins
 
 ## User guide
 
-### Supported formats
-
-Stream Load supports importing data in CSV, JSON, Parquet, and ORC formats.
-
-### Usage limitations
-
-When importing CSV files, it's important to distinguish between null values 
and empty strings:
+Stream Load supports importing CSV, JSON, Parquet, and ORC format data from 
local or remote sources via HTTP.
 
 - Null values: Use `\N` to represent null. For example, `a,\N,b` indicates the 
middle column is null.
 - Empty string: An empty string is represented when there are no characters 
between two delimiters. For example, in `a,,b`, there are no characters between 
the two commas, indicating that the middle column value is an empty string.
@@ -287,10 +281,6 @@ Stream Load operations support both HTTP chunked and 
non-chunked import methods.
 
 Parameter Description: The default timeout for Stream Load. The load job will 
be canceled by the system if it is not completed within the set timeout (in 
seconds). If the source file cannot be imported within the specified time, the 
user can set an individual timeout in the Stream Load request. Alternatively, 
adjust the `stream_load_default_timeout_second` parameter on the FE to set the 
global default timeout.
 
-2. `enable_pipeline_load`
-
-Determines whether to enable the Pipeline engine to execute Streamload tasks. 
See the [import](../load-manual) documentation for more details.
-
 #### BE configuration
 
 1. `streaming_load_max_mb`
@@ -316,11 +306,11 @@ Determines whether to enable the Pipeline engine to 
execute Streamload tasks. Se
 | strict_mode                  | Used to specify whether to enable strict mode 
for this import, which is disabled by default. For example, to enable strict 
mode, use the command `-H "strict_mode:true"`. |
 | timezone                     | Used to specify the timezone to be used for 
this import, which defaults to GMT+8. This parameter affects the results of all 
timezone-related functions involved in the import. For example, to specify the 
import timezone as Africa/Abidjan, use the command `-H 
"timezone:Africa/Abidjan"`. |
 | exec_mem_limit               | The memory limit for the import, which 
defaults to 2GB. The unit is bytes. |
-| format                       | Used to specify the format of the imported 
data, which defaults to CSV. Currently supported formats include: csv, json, 
arrow, csv_with_names (supports filtering the first row of the csv file), 
csv_with_names_and_types (supports filtering the first two rows of the csv 
file), parquet, and orc. For example, to specify the imported data format as 
json, use the command `-H "format:json"`. |
+| format                       | Used to specify the format of the imported 
data, which defaults to CSV. Currently supported formats include: CSV, JSON, 
arrow, csv_with_names (supports filtering the first row of the csv file), 
csv_with_names_and_types (supports filtering the first two rows of the csv 
file), Parquet, and ORC. For example, to specify the imported data format as 
JSON, use the command `-H "format:json"`. |
 | jsonpaths                    | There are two ways to import JSON data 
format: Simple Mode and Matching Mode.  If no jsonpaths are specified, it is 
the simple mode that requires the JSON data to be of the object type.Matching 
mode used when the JSON data is relatively complex and requires matching the 
corresponding values through the jsonpaths parameter.In simple mode, the keys 
in JSON are required to correspond one-to-one with the column names in the 
table. For example, in the JSON dat [...]
 | strip_outer_array            | When `strip_outer_array` is set to true, it 
indicates that the JSON data starts with an array object and flattens the 
objects within the array. The default value is false. When the outermost layer 
of the JSON data is represented by `[]`, which denotes an array, 
`strip_outer_array` should be set to true. For example, with the following 
data, setting `strip_outer_array` to true will result in two rows of data being 
generated when imported into Doris: `[{"k1 [...]
 | json_root                    | `json_root` is a valid jsonpath string that 
specifies the root node of a JSON document, with a default value of "". |
-| merge_type                   | There are three types of data merging: 
APPEND, DELETE, and MERGE. APPEND is the default value, indicating that this 
batch of data needs to be appended to the existing data. DELETE means to remove 
all rows that have the same keys as this batch of data. MERGE semantics need to 
be used in conjunction with delete conditions. It means that data satisfying 
the delete conditions will be processed according to DELETE semantics, while 
the rest will be processed ac [...]
+| merge_type                   | The merge type of data. Three types are 
supported:<br/>- APPEND (default): Indicates that all data in this batch will 
be appended to existing data<br/>- DELETE: Indicates deletion of all rows with 
Keys matching this batch of data<br/>- MERGE: Must be used in conjunction with 
DELETE conditions. Data meeting DELETE conditions will be processed according 
to DELETE semantics, while the rest will be processed according to APPEND 
semantics<br/>For example, to s [...]
 | delete                       | It is only meaningful under MERGE, 
representing the deletion conditions for data. |
 | function_column.sequence_col | It is suitable only for the UNIQUE KEYS 
model. Within the same Key column, it ensures that the Value column is replaced 
according to the specified source_sequence column. The source_sequence can 
either be a column from the data source or an existing column in the table 
structure. |
 | fuzzy_parse                  | It is a boolean type. If set to true, the 
JSON will be parsed with the first row as the schema. Enabling this option can 
improve the efficiency of JSON imports, but it requires that the order of the 
keys in all JSON objects be consistent with the first line. The default is 
false and it is only used for JSON format. |
@@ -636,7 +626,7 @@ When a table with a Unique Key has a Sequence column, the 
value of the Sequence
 curl --location-trusted -u <doris_user>:<doris_password> \
     -H "Expect:100-continue" \
     -H "merge_type: DELETE" \
-    -H "function_column.sequence_col: age" 
+    -H "function_column.sequence_col: age" \
     -H "column_separator:," \
     -H "columns: name, gender, age" 
     -T streamload_example.csv \
@@ -1008,7 +998,7 @@ And use to_bitmap to convert the data into the Bitmap type.
 ```sql
 curl --location-trusted -u <doris_user>:<doris_password> \
     -H "Expect:100-continue" \
-    -H "columns:typ_id,hou,arr,arr=to_bitmap(arr)"
+    -H "columns:typ_id,hou,arr,arr=to_bitmap(arr)" \
     -T streamload_example.csv \
     -XPUT http://<fe_ip>:<fe_http_port>/api/testdb/test_streamload/_stream_load
 ```
@@ -1052,13 +1042,9 @@ curl --location-trusted -u <doris_user>:<doris_password> 
\
     -XPUT http://<fe_ip>:<fe_http_port>/api/testdb/test_streamload/_stream_load
 ```
 
-### Label, loading transaction, multi-table atomicity
-
-All load jobs in Doris are atomically effective. And multiple tables loading 
in the same load job can also guarantee atomicity. At the same time, Doris can 
also use the Label mechanism to ensure that data loading is not lost or 
duplicated. For specific instructions, please refer to the [Import Transactions 
and Atomicity](../../../data-operate/import/load-atomicity) documentation.
-
 ### Column mapping, derived columns, and filtering
 
-Doris supports a very rich set of column transformations and filtering 
operations in load statements. Supports most built-in functions and UDFs. For 
how to use this feature correctly, please refer to the [Data 
Transformation](../../../data-operate/import/load-data-convert) documentation.
+Doris supports a very rich set of column transformations and filtering 
operations in load statements. Supports most built-in functions. For how to use 
this feature correctly, please refer to the [Data 
Transformation](../../../data-operate/import/load-data-convert) documentation.
 
 ### Enable strict mode import
 
@@ -1071,456 +1057,3 @@ For how to express partial column updates during 
import, please refer to the Dat
 ## More help
 
 For more detailed syntax and best practices on using Stream Load, please refer 
to the [Stream 
Load](../../../sql-manual/sql-statements/Data-Manipulation-Statements/Load/STREAM-LOAD)
 Command Manual. You can also enter HELP STREAM LOAD in the MySql client 
command line to get more help information.
-
-
-
-
-
-
-
-
-
-
-
-
-
-Stream load submits and transfers data through HTTP protocol. Here, the `curl` 
command shows how to submit an import.
-
-Users can also operate through other HTTP clients.
-
-```
-curl --location-trusted -u user:passwd [-H ""...] -T data.file -XPUT 
http://fe_host:http_port/api/{db}/{table}/_stream_load
-
-The properties supported in the header are described in "Load Parameters" below
-The format is: -H "key1: value1"
-```
-
-Examples:
-
-```
-curl --location-trusted -u root -T date -H "label:123" 
http://abc.com:8030/api/test/date/_stream_load
-```
-The detailed syntax for creating an import can be viewed by executing ``HELP STREAM LOAD``. The following section focuses on the meaning of some parameters used when creating a Stream load.
-
-**Signature parameters**
-
-+ user/passwd
-
-  Stream load uses the HTTP protocol to create the imported protocol and signs 
it through the Basic Access authentication. The Doris system verifies user 
identity and import permissions based on signatures.
-
-**Load Parameters**
-
-Stream load uses HTTP protocol, so all parameters related to import tasks are 
set in the header. The significance of some parameters of the import task 
parameters of Stream load is mainly introduced below.
-
-+ label
-
-  Identity of import task. Each import task has a unique label inside a single 
database. Label is a user-defined name in the import command. With this label, 
users can view the execution of the corresponding import task.
-
-  Another function of label is to prevent users from importing the same data 
repeatedly. **It is strongly recommended that users use the same label for the 
same batch of data. This way, repeated requests for the same batch of data will 
only be accepted once, guaranteeing at-Most-Once**
-
-  When the corresponding import operation state of label is CANCELLED, the 
label can be used again.
-
-
-+ column_separator
-
-    Used to specify the column separator in the load file. The default is 
`\t`. If it is an invisible character, you need to add `\x` as a prefix and 
hexadecimal to indicate the separator.
-
-    For example, the separator `\x01` of the hive file needs to be specified 
as `-H "column_separator:\x01"`.
-
-    You can use a combination of multiple characters as the column separator.
-
-+ line_delimiter
-
-   Used to specify the line delimiter in the load file. The default is `\n`.
-
-   You can use a combination of multiple characters as the line delimiter.
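-
-  For example, a file that uses `|` as the column separator and `\x02` as the line delimiter could be loaded as follows (the file, database, and table names are placeholders; the `\x` hex notation is assumed to apply to the line delimiter as well):
-
-  ```shell
-  curl --location-trusted -u user:passwd \
-      -H "column_separator:|" \
-      -H "line_delimiter:\x02" \
-      -T data.file \
-      -XPUT http://fe_host:http_port/api/{db}/{table}/_stream_load
-  ```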
-
-+ max\_filter\_ratio
-
-  The maximum tolerance rate of the import task is 0 by default, and the range 
of values is 0-1. When the import error rate exceeds this value, the import 
fails.
-
-  If the user wishes to ignore the wrong row, the import can be successful by 
setting this parameter greater than 0.
-
-  The calculation formula is as follows:
-
-    `(dpp.abnorm.ALL / (dpp.abnorm.ALL + dpp.norm.ALL)) > max_filter_ratio`
-
-  `dpp.abnorm.ALL` denotes the number of rows whose data quality is not up to standard, such as type mismatches, column count mismatches, and length mismatches.
-
-  `dpp.norm.ALL` refers to the number of correct rows in the import process. The correct row count for the import task can be queried with the `SHOW LOAD` command.
-
-  The number of rows in the original file = `dpp.abnorm.ALL + dpp.norm.ALL`
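-
-  For example, a job that should still succeed as long as no more than 10% of the rows fail quality checks could be submitted as follows (the file, database, and table names are placeholders):
-
-  ```shell
-  # The job fails only if dpp.abnorm.ALL / (dpp.abnorm.ALL + dpp.norm.ALL) > 0.1
-  curl --location-trusted -u user:passwd \
-      -H "max_filter_ratio:0.1" \
-      -T data.file \
-      -XPUT http://fe_host:http_port/api/{db}/{table}/_stream_load
-  ```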
-
-+ where
-
-    Import the filter conditions specified by the task. Stream load supports 
filtering of where statements specified for raw data. The filtered data will 
not be imported or participated in the calculation of filter ratio, but will be 
counted as `num_rows_unselected`.
-
-+ partitions
-
-    Partitions information for tables to be imported will not be imported if 
the data to be imported does not belong to the specified Partition. These data 
will be included in `dpp.abnorm.ALL`.
-
-+ columns
-
-    The function transformation configuration of data to be imported includes 
the sequence change of columns and the expression transformation, in which the 
expression transformation method is consistent with the query statement.
-
-    ```
-    Examples of column order transformation: There are three columns of 
original data (src_c1,src_c2,src_c3), and there are also three columns 
(dst_c1,dst_c2,dst_c3) in the doris table at present.
-    when the first column src_c1 of the original file corresponds to the 
dst_c1 column of the target table, while the second column src_c2 of the 
original file corresponds to the dst_c2 column of the target table and the 
third column src_c3 of the original file corresponds to the dst_c3 column of 
the target table,which is written as follows:
-    columns: dst_c1, dst_c2, dst_c3
-    
-    when the first column src_c1 of the original file corresponds to the 
dst_c2 column of the target table, while the second column src_c2 of the 
original file corresponds to the dst_c3 column of the target table and the 
third column src_c3 of the original file corresponds to the dst_c1 column of 
the target table,which is written as follows:
-    columns: dst_c2, dst_c3, dst_c1
-    
-    Example of expression transformation: There are two columns in the 
original file and two columns in the target table (c1, c2). However, both 
columns in the original file need to be transformed by functions to correspond 
to the two columns in the target table.
-    columns: tmp_c1, tmp_c2, c1 = year(tmp_c1), c2 = month(tmp_c2)
-    Tmp_* is a placeholder, representing two original columns in the original 
file.
-    ```
-  
-+ format
-
-  Specify the import data format, support csv, json, the default is csv
-
- supports `csv_with_names` (csv file line header filter), 
`csv_with_names_and_types` (csv file first two lines filter), parquet, orc
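-
-  For example, a JSON file whose outermost layer is an array could be loaded as follows (placeholder names; `strip_outer_array` flattens the top-level array into individual rows):
-
-  ```shell
-  curl --location-trusted -u user:passwd \
-      -H "format:json" \
-      -H "strip_outer_array:true" \
-      -T data.json \
-      -XPUT http://fe_host:http_port/api/{db}/{table}/_stream_load
-  ```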
-
-+ exec\_mem\_limit
-
-    Memory limit. Default is 2GB. Unit is Bytes
-
-+ merge\_type
-
-     The type of data merging supports three types: APPEND, DELETE, and MERGE. 
APPEND is the default value, which means that all this batch of data needs to 
be appended to the existing data. DELETE means to delete all rows with the same 
key as this batch of data. MERGE semantics Need to be used in conjunction with 
the delete condition, which means that the data that meets the delete condition 
is processed according to DELETE semantics and the rest is processed according 
to APPEND semantics
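-
-  For example, deleting all existing rows whose keys appear in the loaded file could be expressed as follows (placeholder names):
-
-  ```shell
-  curl --location-trusted -u user:passwd \
-      -H "merge_type: DELETE" \
-      -T data.file \
-      -XPUT http://fe_host:http_port/api/{db}/{table}/_stream_load
-  ```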
-
-+ two\_phase\_commit
-
-  Stream load import can enable two-stage transaction commit mode: in the 
stream load process, the data is written and the information is returned to the 
user. At this time, the data is invisible and the transaction status is 
`PRECOMMITTED`. After the user manually triggers the commit operation, the data 
is visible.
-
-+ enclose
-  
-  When the csv data field contains row delimiters or column delimiters, to 
prevent accidental truncation, single-byte characters can be specified as 
brackets for protection. For example, the column separator is ",", the bracket 
is "'", and the data is "a,'b,c'", then "b,c" will be parsed as a field.
-  Note: when the bracket is `"`, trim\_double\_quotes must be set to true.
-
-+ escape
-
-  Used to escape characters that appear in a csv field identical to the 
enclosing characters. For example, if the data is "a,'b,'c'", enclose is "'", 
and you want "b,'c to be parsed as a field, you need to specify a single-byte 
escape character, such as "\", and then modify the data to "a,' b,\'c'".
-
-  Example of two-phase commit:
-
-    1. Initiate a stream load pre-commit operation
-  ```shell
-  curl  --location-trusted -u user:passwd -H "two_phase_commit:true" -T 
test.txt http://fe_host:http_port/api/{db}/{table}/_stream_load
-  {
-      "TxnId": 18036,
-      "Label": "55c8ffc9-1c40-4d51-b75e-f2265b3602ef",
-      "TwoPhaseCommit": "true",
-      "Status": "Success",
-      "Message": "OK",
-      "NumberTotalRows": 100,
-      "NumberLoadedRows": 100,
-      "NumberFilteredRows": 0,
-      "NumberUnselectedRows": 0,
-      "LoadBytes": 1031,
-      "LoadTimeMs": 77,
-      "BeginTxnTimeMs": 1,
-      "StreamLoadPutTimeMs": 1,
-      "ReadDataTimeMs": 0,
-      "WriteDataTimeMs": 58,
-      "CommitAndPublishTimeMs": 0
-  }
-  ```
-    1. Trigger the commit operation on the transaction.
-      Note 1) requesting to fe and be both works
-      Note 2) `{table}` in url can be omit when commit
-      using txn id
-  ```shell
-  curl -X PUT --location-trusted -u user:passwd  -H "txn_id:18036" -H 
"txn_operation:commit"  
http://fe_host:http_port/api/{db}/{table}/_stream_load_2pc
-  {
-      "status": "Success",
-      "msg": "transaction [18036] commit successfully."
-  }
-  ```
-  using label
-  ```shell
-  curl -X PUT --location-trusted -u user:passwd  -H 
"label:55c8ffc9-1c40-4d51-b75e-f2265b3602ef" -H "txn_operation:commit"  
http://fe_host:http_port/api/{db}/{table}/_stream_load_2pc
-  {
-      "status": "Success",
-      "msg": "label [55c8ffc9-1c40-4d51-b75e-f2265b3602ef] commit 
successfully."
-  }
-  ```
-    1. Trigger an abort operation on a transaction
-      Note 1) requesting to fe and be both works
-      Note 2) `{table}` in url can be omit when abort
-      using txn id
-  ```shell
-  curl -X PUT --location-trusted -u user:passwd  -H "txn_id:18037" -H 
"txn_operation:abort"  
http://fe_host:http_port/api/{db}/{table}/_stream_load_2pc
-  {
-      "status": "Success",
-      "msg": "transaction [18037] abort successfully."
-  }
-  ```
-  using label
-  ```shell
-  curl -X PUT --location-trusted -u user:passwd  -H 
"label:55c8ffc9-1c40-4d51-b75e-f2265b3602ef" -H "txn_operation:abort"  
http://fe_host:http_port/api/{db}/{table}/_stream_load_2pc
-  {
-      "status": "Success",
-      "msg": "label [55c8ffc9-1c40-4d51-b75e-f2265b3602ef] abort successfully."
-  }
-  ```
-
-+ enable_profile
-
-  When `enable_profile` is true, the Stream Load profile will be printed to 
logs (be.INFO).
-
-+ memtable_on_sink_node
-
-  Whether to enable MemTable on DataSink node when loading data, default is 
false.
-
-  Build MemTable on DataSink node, and send segments to other backends through 
brpc streaming.
-  It reduces duplicate work among replicas, and saves time in data 
serialization & deserialization.
-- partial_columns
-   Whether to enable partial column updates, Boolean type, True means that use 
partial column update, the default value is false, this parameter is only 
allowed to be set when the table model is Unique and Merge on Write is used.
-
-   eg: `curl  --location-trusted -u root: -H "partial_columns:true" -H 
"column_separator:," -H "columns:id,balance,last_access_time" -T /tmp/test.csv 
http://127.0.0.1:48037/api/db1/user_profile/_stream_load`
-
-### Use stream load with SQL
-
-You can add a `sql` parameter to the `Header` to replace the 
`column_separator`, `line_delimiter`, `where`, `columns` in the previous 
parameter, which is convenient to use.
-
-```
-curl --location-trusted -u user:passwd [-H "sql: ${load_sql}"...] -T data.file 
-XPUT http://fe_host:http_port/api/_http_stream
-
-
-# -- load_sql
-# insert into db.table (col, ...) select stream_col, ... from 
http_stream("property1"="value1");
-
-# http_stream
-# (
-#     "column_separator" = ",",
-#     "format" = "CSV",
-#     ...
-# )
-```
-
-Examples:
-
-```
-curl  --location-trusted -u root: -T test.csv  -H "sql:insert into 
demo.example_tbl_1(user_id, age, cost) select c1, c4, c7 * 2 from 
http_stream("format" = "CSV", "column_separator" = "," ) where age >= 30"  
http://127.0.0.1:28030/api/_http_stream
-```
-
-### Return results
-
-Since Stream load is a synchronous import method, the result of the import is 
directly returned to the user by creating the return value of the import.
-
-Examples:
-
-```
-{
-    "TxnId": 1003,
-    "Label": "b6f3bc78-0d2c-45d9-9e4c-faa0a0149bee",
-    "Status": "Success",
-    "ExistingJobStatus": "FINISHED", // optional
-    "Message": "OK",
-    "NumberTotalRows": 1000000,
-    "NumberLoadedRows": 1000000,
-    "NumberFilteredRows": 1,
-    "NumberUnselectedRows": 0,
-    "LoadBytes": 40888898,
-    "LoadTimeMs": 2144,
-    "BeginTxnTimeMs": 1,
-    "StreamLoadPutTimeMs": 2,
-    "ReadDataTimeMs": 325,
-    "WriteDataTimeMs": 1933,
-    "CommitAndPublishTimeMs": 106,
-    "ErrorURL": 
"http://192.168.1.1:8042/api/_load_error_log?file=__shard_0/error_log_insert_stmt_db18266d4d9b4ee5-abb00ddd64bdf005_db18266d4d9b4ee5_abb00ddd64bdf005";
-}
-```
-
-The following main explanations are given for the Stream load import result 
parameters:
-
-+ TxnId: The imported transaction ID. Users do not perceive.
-
-+ Label: Import Label. User specified or automatically generated by the system.
-
-+ Status: Import completion status.
-
-  "Success": Indicates successful import.
-
-  "Publish Timeout": This state also indicates that the import has been 
completed, except that the data may be delayed and visible without retrying.
-
-  "Label Already Exists": Label duplicate, need to be replaced Label.
-
-  "Fail": Import failed.
-
-+ ExistingJobStatus: The state of the load job corresponding to the existing 
Label.
-
-    This field is displayed only when the status is "Label Already Exists". 
The user can know the status of the load job corresponding to Label through 
this state. "RUNNING" means that the job is still executing, and "FINISHED" 
means that the job is successful.
-
-+ Message: Import error messages.
-
-+ NumberTotalRows: Number of rows imported for total processing.
-
-+ NumberLoadedRows: Number of rows successfully imported.
-
-+ NumberFilteredRows: Number of rows that do not qualify for data quality.
-
-+ NumberUnselectedRows: Number of rows filtered by where condition.
-
-+ LoadBytes: Number of bytes imported.
-
-+ LoadTimeMs: Import completion time. Unit milliseconds.
-
-+ BeginTxnTimeMs: The time cost for RPC to Fe to begin a transaction, Unit 
milliseconds.
-
-+ StreamLoadPutTimeMs: The time cost for RPC to Fe to get a stream load plan, 
Unit milliseconds.
-
-+ ReadDataTimeMs: Read data time, Unit milliseconds.
-
-+ WriteDataTimeMs: Write data time, Unit milliseconds.
-
-+ CommitAndPublishTimeMs: The time cost for RPC to Fe to commit and publish a 
transaction, Unit milliseconds.
-
-+ ErrorURL: If you have data quality problems, visit this URL to see specific 
error lines.
-
-:::info Note
-Since Stream load is a synchronous import mode, import information will not be 
recorded in Doris system. Users cannot see Stream load asynchronously by 
looking at import commands. You need to listen for the return value of the 
create import request to get the import result.
-:::
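-
-Since the result is returned synchronously, a caller can, for example, capture the response and check the `Status` field (a sketch with placeholder host, credentials, and file):
-
-```shell
-RESP=$(curl --location-trusted -u user:passwd -H "Expect:100-continue" \
-    -T data.file -XPUT http://fe_host:http_port/api/{db}/{table}/_stream_load)
-
-# Treat anything other than "Success" as a failure
-echo "$RESP" | grep -q '"Status": "Success"' || { echo "stream load failed: $RESP" >&2; exit 1; }
-```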
-
-### Cancel Load
-
-Users can't cancel Stream load manually. Stream load will be cancelled 
automatically by the system after a timeout or import error.
-
-### View Stream Load
-
-Users can view completed stream load tasks through `show stream load`.
-
-By default, the BE does not keep Stream Load records. To view them, recording needs to be enabled on the BE with the configuration parameter `enable_stream_load_record=true`. For details, please refer to [BE Configuration Items](../../../admin-manual/config/be-config)
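-
-For example, a minimal sketch of enabling the record, assuming it is set in the BE configuration file be.conf:
-
-```
-# be.conf
-enable_stream_load_record = true
-```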
-
-## Relevant System Configuration
-
-### FE configuration
-
-+ stream\_load\_default\_timeout\_second
-
-  The timeout time of the import task (in seconds) will be cancelled by the 
system if the import task is not completed within the set timeout time, and 
will become CANCELLED.
-
-  At present, Stream load does not support custom import timeout time. All 
Stream load import timeout time is uniform. The default timeout time is 600 
seconds. If the imported source file can no longer complete the import within 
the specified time, the FE parameter ```stream_load_default_timeout_second``` 
needs to be adjusted.
-
-+ enable\_pipeline\_load
-
-  Whether or not to enable the Pipeline engine to execute Streamload tasks. 
See the [Import](../../../data-operate/import/load-manual) documentation.
-
-### BE configuration
-
-+ streaming\_load\_max\_mb
-
-  The maximum import size of Stream load is 10G by default, in MB. If the 
user's original file exceeds this value, the BE parameter 
```streaming_load_max_mb``` needs to be adjusted.
-
-## Best Practices
-
-### Application scenarios
-
-The most appropriate scenario for using Stream load is that the original file 
is in memory or on disk. Secondly, since Stream load is a synchronous import 
method, users can also use this import if they want to obtain the import 
results in a synchronous manner.
-
-### Data volume
-
-Since Stream load is based on the BE initiative to import and distribute data, 
the recommended amount of imported data is between 1G and 10G. Since the 
default maximum Stream load import data volume is 10G, the configuration of BE 
```streaming_load_max_mb``` needs to be modified if files exceeding 10G are to 
be imported.
-
-```
-For example, the size of the file to be imported is 15G
-Modify the BE configuration streaming_load_max_mb to 16000
-```
-
-The default Stream load timeout is 600 seconds. Given the current maximum import speed of Doris, files larger than about 3G require the default import task timeout to be adjusted.
-
-```
-Import Task Timeout = Import Data Volume / 10M / s (Specific Average Import 
Speed Requires Users to Calculate Based on Their Cluster Conditions)
-For example, import a 10G file
-Timeout = 1000s, i.e., 10G / 10M/s
-```
-
-### Complete examples
-
-Data situation: In the local disk path /home/store_sales of the sending and 
importing requester, the imported data is about 15G, and it is hoped to be 
imported into the table store\_sales of the database bj_sales.
-
-Cluster situation: The concurrency of Stream load is not affected by cluster 
size.
-
-+ Step 1: Does the import file size exceed the default maximum import size of 
10G
-
-  ```
-  BE conf
-  streaming_load_max_mb = 16000
-  ```
-+ Step 2: Calculate whether the approximate import time exceeds the default 
timeout value
-
-  ```
-  Import time 15000/10 = 1500s
-  Over the default timeout time, you need to modify the FE configuration
-  stream_load_default_timeout_second = 1500
-  ```
-
-+ Step 3: Create Import Tasks
-
-    ```
-    curl --location-trusted -u user:password -T /home/store_sales -H 
"label:abc" http://abc.com:8030/api/bj_sales/store_sales/_stream_load
-    ```
-
-### Coding with StreamLoad
-
-You can initiate HTTP requests for Stream Load using any language. Before 
initiating HTTP requests, you need to set several necessary headers:
-
-```http
-Content-Type: text/plain; charset=UTF-8
-Expect: 100-continue
-Authorization: Basic <Base64 encoded username and password>
-```
-
-`<Base64 encoded username and password>`: a string consisting of Doris's `username`, a `:`, and the `password`, encoded with Base64.
-
-Additionally, it should be noted that if you directly initiate an HTTP request 
to FE, as Doris will redirect to BE, some frameworks will remove the 
`Authorization` HTTP header during this process, which requires manual 
processing.
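-
-As a minimal sketch (hypothetical `root` user with an empty password), the Basic authentication header can be computed and attached explicitly:
-
-```shell
-# Base64-encode "username:password" for HTTP Basic authentication
-AUTH=$(echo -n 'root:' | base64)
-
-curl -XPUT --location-trusted \
-    -H "Expect: 100-continue" \
-    -H "Content-Type: text/plain; charset=UTF-8" \
-    -H "Authorization: Basic ${AUTH}" \
-    -T data.file \
-    http://fe_host:http_port/api/{db}/{table}/_stream_load
-```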
-
-Doris provides StreamLoad examples in three languages: 
[Java](https://github.com/apache/doris/tree/master/samples/stream_load/java), 
[Go](https://github.com/apache/doris/tree/master/samples/stream_load/go), and 
[Python](https://github.com/apache/doris/tree/master/samples/stream_load/python)
 for reference.
-
-## Common Questions
-
-* Label Already Exists
-
-  The Label repeat checking steps of Stream load are as follows:
-
-  1. Is there an import Label conflict that already exists with other import 
methods?
-
-    Because imported Label in Doris system does not distinguish between import 
methods, there is a problem that other import methods use the same Label.
-
-    Through ``SHOW LOAD WHERE LABEL = "xxx"'``, where XXX is a duplicate Label 
string, see if there is already a Label imported by FINISHED that is the same 
as the Label created by the user.
-
-  2. Are Stream loads submitted repeatedly for the same job?
-
-    Since Stream load is an HTTP protocol submission creation import task, 
HTTP Clients in various languages usually have their own request retry logic. 
After receiving the first request, the Doris system has started to operate 
Stream load, but because the result is not returned to the Client side in time, 
the Client side will retry to create the request. At this point, the Doris 
system is already operating on the first request, so the second request will be 
reported to Label Already Exists.
-
-    To check for this case, search the FE Master's log for the Label and see whether the same Label produced two `redirect load action to destination = ` entries. If so, the request was submitted repeatedly by the Client side.
-
-    It is recommended that users estimate the approximate import time based on the amount of data in the request, and increase the client-side request timeout to a value greater than the import timeout, to avoid the client submitting the request multiple times.
-
-  3. Connection reset abnormal
-
-    In the community version 0.14.0 and earlier versions, the connection reset 
exception occurred after Http V2 was enabled, because the built-in web 
container is tomcat, and Tomcat has pits in 307 (Temporary Redirect). There is 
a problem with the implementation of this protocol. All In the case of using 
Stream load to import a large amount of data, a connect reset exception will 
occur. This is because tomcat started data transmission before the 307 jump, 
which resulted in the lack of au [...]
-
-    After the upgrade, also upgrade the HTTP client version of your program to `4.5.13`, and add the following dependency to your pom.xml file:
-
-    ```xml
-        <dependency>
-          <groupId>org.apache.httpcomponents</groupId>
-          <artifactId>httpclient</artifactId>
-          <version>4.5.13</version>
-        </dependency>
-    ```
-
-* After enabling the Stream Load record on the BE, the record cannot be queried
-
-  This is caused by the slowness of fetching records, you can try to adjust 
the following parameters:
-
-  1. Increase the BE configuration `stream_load_record_batch_size`. This 
configuration indicates how many Stream load records can be pulled from BE each 
time. The default value is 50, which can be increased to 500.
-  2. Reduce the FE configuration `fetch_stream_load_record_interval_second`, 
this configuration indicates the interval for obtaining Stream load records, 
the default is to fetch once every 120 seconds, and it can be adjusted to 60 
seconds.
-  3. If you want to save more Stream load records (not recommended, it will 
take up more resources of FE), you can increase the configuration 
`max_stream_load_record_size` of FE, the default is 5000.
-
-## More Help
-
-For more detailed syntax used by **Stream Load**,  you can enter `HELP STREAM 
LOAD` on the Mysql client command line for more help.
diff --git 
a/versioned_docs/version-3.0/data-operate/import/import-way/stream-load-manual.md
 
b/versioned_docs/version-3.0/data-operate/import/import-way/stream-load-manual.md
index 65ce670fdb0..3849ae1017f 100644
--- 
a/versioned_docs/version-3.0/data-operate/import/import-way/stream-load-manual.md
+++ 
b/versioned_docs/version-3.0/data-operate/import/import-way/stream-load-manual.md
@@ -43,13 +43,7 @@ See [Doris 
Streamloader](../../../ecosystem/doris-streamloader) for detailed ins
 
 ## User guide
 
-### Supported formats
-
-Stream Load supports importing data in CSV, JSON, Parquet, and ORC formats.
-
-### Usage limitations
-
-When importing CSV files, it's important to distinguish between null values 
and empty strings:
+Stream Load supports importing CSV, JSON, Parquet, and ORC format data from 
local or remote sources via HTTP.
 
 - Null values: Use `\N` to represent null. For example, `a,\N,b` indicates the 
middle column is null.
 - Empty string: An empty string is represented when there are no characters 
between two delimiters. For example, in `a,,b`, there are no characters between 
the two commas, indicating that the middle column value is an empty string.
@@ -287,10 +281,6 @@ Stream Load operations support both HTTP chunked and 
non-chunked import methods.
 
 Parameter Description: The default timeout for Stream Load. The load job will 
be canceled by the system if it is not completed within the set timeout (in 
seconds). If the source file cannot be imported within the specified time, the 
user can set an individual timeout in the Stream Load request. Alternatively, 
adjust the `stream_load_default_timeout_second` parameter on the FE to set the 
global default timeout.
 
-2. `enable_pipeline_load`
-
-Determines whether to enable the Pipeline engine to execute Streamload tasks. 
See the [import](../load-manual) documentation for more details.
-
 #### BE configuration
 
 1. `streaming_load_max_mb`
@@ -316,11 +306,11 @@ Determines whether to enable the Pipeline engine to 
execute Streamload tasks. Se
 | strict_mode                  | Used to specify whether to enable strict mode 
for this import, which is disabled by default. For example, to enable strict 
mode, use the command `-H "strict_mode:true"`. |
 | timezone                     | Used to specify the timezone to be used for 
this import, which defaults to GMT+8. This parameter affects the results of all 
timezone-related functions involved in the import. For example, to specify the 
import timezone as Africa/Abidjan, use the command `-H 
"timezone:Africa/Abidjan"`. |
 | exec_mem_limit               | The memory limit for the import, which 
defaults to 2GB. The unit is bytes. |
-| format                       | Used to specify the format of the imported 
data, which defaults to CSV. Currently supported formats include: csv, json, 
arrow, csv_with_names (supports filtering the first row of the csv file), 
csv_with_names_and_types (supports filtering the first two rows of the csv 
file), parquet, and orc. For example, to specify the imported data format as 
json, use the command `-H "format:json"`. |
+| format                       | Used to specify the format of the imported 
data, which defaults to CSV. Currently supported formats include: CSV, JSON, 
arrow, csv_with_names (supports filtering the first row of the csv file), 
csv_with_names_and_types (supports filtering the first two rows of the csv 
file), Parquet, and ORC. For example, to specify the imported data format as 
JSON, use the command `-H "format:json"`. |
 | jsonpaths                    | There are two ways to import JSON data 
format: Simple Mode and Matching Mode.  If no jsonpaths are specified, it is 
the simple mode that requires the JSON data to be of the object type.Matching 
mode used when the JSON data is relatively complex and requires matching the 
corresponding values through the jsonpaths parameter.In simple mode, the keys 
in JSON are required to correspond one-to-one with the column names in the 
table. For example, in the JSON dat [...]
 | strip_outer_array            | When `strip_outer_array` is set to true, it 
indicates that the JSON data starts with an array object and flattens the 
objects within the array. The default value is false. When the outermost layer 
of the JSON data is represented by `[]`, which denotes an array, 
`strip_outer_array` should be set to true. For example, with the following 
data, setting `strip_outer_array` to true will result in two rows of data being 
generated when imported into Doris: `[{"k1 [...]
 | json_root                    | `json_root` is a valid jsonpath string that 
specifies the root node of a JSON document, with a default value of "". |
-| merge_type                   | There are three types of data merging: 
APPEND, DELETE, and MERGE. APPEND is the default value, indicating that this 
batch of data needs to be appended to the existing data. DELETE means to remove 
all rows that have the same keys as this batch of data. MERGE semantics need to 
be used in conjunction with delete conditions. It means that data satisfying 
the delete conditions will be processed according to DELETE semantics, while 
the rest will be processed ac [...]
+| merge_type                   | The merge type of data. Three types are 
supported:<br/>- APPEND (default): Indicates that all data in this batch will 
be appended to existing data<br/>- DELETE: Indicates deletion of all rows with 
Keys matching this batch of data<br/>- MERGE: Must be used in conjunction with 
DELETE conditions. Data meeting DELETE conditions will be processed according 
to DELETE semantics, while the rest will be processed according to APPEND 
semantics<br/>For example, to s [...]
 | delete                       | It is only meaningful under MERGE, 
representing the deletion conditions for data. |
 | function_column.sequence_col | It is suitable only for the UNIQUE KEYS 
model. Within the same Key column, it ensures that the Value column is replaced 
according to the specified source_sequence column. The source_sequence can 
either be a column from the data source or an existing column in the table 
structure. |
 | fuzzy_parse                  | It is a boolean type. If set to true, the 
JSON will be parsed with the first row as the schema. Enabling this option can 
improve the efficiency of JSON imports, but it requires that the order of the 
keys in all JSON objects be consistent with the first line. The default is 
false and it is only used for JSON format. |
@@ -640,7 +630,7 @@ curl --location-trusted -u <doris_user>:<doris_password> \
     -H "merge_type: DELETE" \
     -H "function_column.sequence_col: age" \
     -H "column_separator:," \
-    -H "columns: name, gender, age" \
+    -H "columns: name, gender, age" 
     -T streamload_example.csv \
     -XPUT http://<fe_ip>:<fe_http_port>/api/testdb/test_streamload/_stream_load
 ```
@@ -1058,13 +1048,9 @@ curl --location-trusted -u <doris_user>:<doris_password> 
\
     -XPUT http://<fe_ip>:<fe_http_port>/api/testdb/test_streamload/_stream_load
 ```
 
-### Label, loading transaction, multi-table atomicity
-
-All load jobs in Doris are atomically effective. And multiple tables loading 
in the same load job can also guarantee atomicity. At the same time, Doris can 
also use the Label mechanism to ensure that data loading is not lost or 
duplicated. For specific instructions, please refer to the [Import Transactions 
and Atomicity](../../../data-operate/import/load-atomicity) documentation.
-
 ### Column mapping, derived columns, and filtering
 
-Doris supports a very rich set of column transformations and filtering 
operations in load statements. Supports most built-in functions and UDFs. For 
how to use this feature correctly, please refer to the [Data 
Transformation](../../../data-operate/import/load-data-convert) documentation.
+Doris supports a very rich set of column transformations and filtering 
operations in load statements. Supports most built-in functions. For how to use 
this feature correctly, please refer to the [Data 
Transformation](../../../data-operate/import/load-data-convert) documentation.
 
 ### Enable strict mode import
 
@@ -1077,456 +1063,3 @@ For how to express partial column updates during 
import, please refer to the Dat
 ## More help
 
 For more detailed syntax and best practices on using Stream Load, please refer 
to the [Stream 
Load](../../../sql-manual/sql-statements/Data-Manipulation-Statements/Load/STREAM-LOAD)
 Command Manual. You can also enter HELP STREAM LOAD in the MySql client 
command line to get more help information.
-
-
-
-
-
-
-
-
-
-
-
-
-
-Stream load submits and transfers data through HTTP protocol. Here, the `curl` 
command shows how to submit an import.
-
-Users can also operate through other HTTP clients.
-
-```
-curl --location-trusted -u user:passwd [-H ""...] -T data.file -XPUT 
http://fe_host:http_port/api/{db}/{table}/_stream_load
-
-The properties supported in the header are described in "Load Parameters" below
-The format is: -H "key1: value1"
-```
-
-Examples:
-
-```
-curl --location-trusted -u root -T date -H "label:123" 
http://abc.com:8030/api/test/date/_stream_load
-```
-The detailed syntax for creating an import can be viewed by executing ``HELP STREAM LOAD``. The following section focuses on the meaning of some parameters used when creating a Stream load.
-
-**Signature parameters**
-
-+ user/passwd
-
-  Stream load uses the HTTP protocol to create the imported protocol and signs 
it through the Basic Access authentication. The Doris system verifies user 
identity and import permissions based on signatures.
-
-**Load Parameters**
-
-Stream load uses HTTP protocol, so all parameters related to import tasks are 
set in the header. The significance of some parameters of the import task 
parameters of Stream load is mainly introduced below.
-
-+ label
-
-  Identity of import task. Each import task has a unique label inside a single 
database. Label is a user-defined name in the import command. With this label, 
users can view the execution of the corresponding import task.
-
-  Another function of label is to prevent users from importing the same data 
repeatedly. **It is strongly recommended that users use the same label for the 
same batch of data. This way, repeated requests for the same batch of data will 
only be accepted once, guaranteeing at-Most-Once**
-
-  When the corresponding import operation state of label is CANCELLED, the 
label can be used again.
-
-
-+ column_separator
-
-    Used to specify the column separator in the load file. The default is 
`\t`. If it is an invisible character, you need to add `\x` as a prefix and 
hexadecimal to indicate the separator.
-
-    For example, the separator `\x01` of the hive file needs to be specified 
as `-H "column_separator:\x01"`.
-
-    You can use a combination of multiple characters as the column separator.
-
-+ line_delimiter
-
-   Used to specify the line delimiter in the load file. The default is `\n`.
-
-   You can use a combination of multiple characters as the line delimiter.
-
-+ max\_filter\_ratio
-
-  The maximum tolerance rate of the import task is 0 by default, and the range 
of values is 0-1. When the import error rate exceeds this value, the import 
fails.
-
-  If the user wishes to ignore the wrong row, the import can be successful by 
setting this parameter greater than 0.
-
-  The calculation formula is as follows:
-
-    `(dpp.abnorm.ALL / (dpp.abnorm.ALL + dpp.norm.ALL)) > max_filter_ratio`
-
-  `dpp.abnorm.ALL` denotes the number of rows whose data quality is not up to standard, such as type mismatches, column count mismatches, and length mismatches.
-
-  `dpp.norm.ALL` refers to the number of correct rows in the import process. The correct row count for the import task can be queried with the `SHOW LOAD` command.
-
-  The number of rows in the original file = `dpp.abnorm.ALL + dpp.norm.ALL`
-
-+ where
-
-    Import the filter conditions specified by the task. Stream load supports 
filtering of where statements specified for raw data. The filtered data will 
not be imported or participated in the calculation of filter ratio, but will be 
counted as `num_rows_unselected`.
-
-+ partitions
-
-    Partitions information for tables to be imported will not be imported if 
the data to be imported does not belong to the specified Partition. These data 
will be included in `dpp.abnorm.ALL`.
-
-+ columns
-
-    The function transformation configuration of data to be imported includes 
the sequence change of columns and the expression transformation, in which the 
expression transformation method is consistent with the query statement.
-
-    ```
-    Examples of column order transformation: There are three columns of 
original data (src_c1,src_c2,src_c3), and there are also three columns 
(dst_c1,dst_c2,dst_c3) in the doris table at present.
-    when the first column src_c1 of the original file corresponds to the 
dst_c1 column of the target table, while the second column src_c2 of the 
original file corresponds to the dst_c2 column of the target table and the 
third column src_c3 of the original file corresponds to the dst_c3 column of 
the target table,which is written as follows:
-    columns: dst_c1, dst_c2, dst_c3
-    
-    when the first column src_c1 of the original file corresponds to the 
dst_c2 column of the target table, while the second column src_c2 of the 
original file corresponds to the dst_c3 column of the target table and the 
third column src_c3 of the original file corresponds to the dst_c1 column of 
the target table,which is written as follows:
-    columns: dst_c2, dst_c3, dst_c1
-    
-    Example of expression transformation: There are two columns in the 
original file and two columns in the target table (c1, c2). However, both 
columns in the original file need to be transformed by functions to correspond 
to the two columns in the target table.
-    columns: tmp_c1, tmp_c2, c1 = year(tmp_c1), c2 = month(tmp_c2)
-    Tmp_* is a placeholder, representing two original columns in the original 
file.
-    ```
-  
-+ format
-
-  Specify the import data format, support csv, json, the default is csv
-
- supports `csv_with_names` (csv file line header filter), 
`csv_with_names_and_types` (csv file first two lines filter), parquet, orc
-
-+ exec\_mem\_limit
-
-    Memory limit. Default is 2GB. Unit is Bytes
-
-+ merge\_type
-
-     The type of data merging supports three types: APPEND, DELETE, and MERGE. 
APPEND is the default value, which means that all this batch of data needs to 
be appended to the existing data. DELETE means to delete all rows with the same 
key as this batch of data. MERGE semantics Need to be used in conjunction with 
the delete condition, which means that the data that meets the delete condition 
is processed according to DELETE semantics and the rest is processed according 
to APPEND semantics
-
-+ two\_phase\_commit
-
-  Stream load import can enable two-stage transaction commit mode: in the 
stream load process, the data is written and the information is returned to the 
user. At this time, the data is invisible and the transaction status is 
`PRECOMMITTED`. After the user manually triggers the commit operation, the data 
is visible.
-
-+ enclose
-  
-  When the csv data field contains row delimiters or column delimiters, to 
prevent accidental truncation, single-byte characters can be specified as 
brackets for protection. For example, the column separator is ",", the bracket 
is "'", and the data is "a,'b,c'", then "b,c" will be parsed as a field.
-  Note: when the bracket is `"`, trim\_double\_quotes must be set to true.
-
-+ escape
-
-  Used to escape characters that appear in a csv field identical to the 
enclosing characters. For example, if the data is "a,'b,'c'", enclose is "'", 
and you want "b,'c to be parsed as a field, you need to specify a single-byte 
escape character, such as "\", and then modify the data to "a,' b,\'c'".
-
-  Example of two-phase commit:
-
-    1. Initiate a stream load pre-commit operation
-  ```shell
-  curl  --location-trusted -u user:passwd -H "two_phase_commit:true" -T 
test.txt http://fe_host:http_port/api/{db}/{table}/_stream_load
-  {
-      "TxnId": 18036,
-      "Label": "55c8ffc9-1c40-4d51-b75e-f2265b3602ef",
-      "TwoPhaseCommit": "true",
-      "Status": "Success",
-      "Message": "OK",
-      "NumberTotalRows": 100,
-      "NumberLoadedRows": 100,
-      "NumberFilteredRows": 0,
-      "NumberUnselectedRows": 0,
-      "LoadBytes": 1031,
-      "LoadTimeMs": 77,
-      "BeginTxnTimeMs": 1,
-      "StreamLoadPutTimeMs": 1,
-      "ReadDataTimeMs": 0,
-      "WriteDataTimeMs": 58,
-      "CommitAndPublishTimeMs": 0
-  }
-  ```
-    1. Trigger the commit operation on the transaction.
-      Note 1) requesting to fe and be both works
-      Note 2) `{table}` in url can be omit when commit
-      using txn id
-  ```shell
-  curl -X PUT --location-trusted -u user:passwd  -H "txn_id:18036" -H 
"txn_operation:commit"  
http://fe_host:http_port/api/{db}/{table}/_stream_load_2pc
-  {
-      "status": "Success",
-      "msg": "transaction [18036] commit successfully."
-  }
-  ```
-  using label
-  ```shell
-  curl -X PUT --location-trusted -u user:passwd  -H 
"label:55c8ffc9-1c40-4d51-b75e-f2265b3602ef" -H "txn_operation:commit"  
http://fe_host:http_port/api/{db}/{table}/_stream_load_2pc
-  {
-      "status": "Success",
-      "msg": "label [55c8ffc9-1c40-4d51-b75e-f2265b3602ef] commit 
successfully."
-  }
-  ```
-    1. Trigger an abort operation on a transaction
-      Note 1) requesting to fe and be both works
-      Note 2) `{table}` in url can be omit when abort
-      using txn id
-  ```shell
-  curl -X PUT --location-trusted -u user:passwd  -H "txn_id:18037" -H 
"txn_operation:abort"  
http://fe_host:http_port/api/{db}/{table}/_stream_load_2pc
-  {
-      "status": "Success",
-      "msg": "transaction [18037] abort successfully."
-  }
-  ```
-  using label
-  ```shell
-  curl -X PUT --location-trusted -u user:passwd  -H 
"label:55c8ffc9-1c40-4d51-b75e-f2265b3602ef" -H "txn_operation:abort"  
http://fe_host:http_port/api/{db}/{table}/_stream_load_2pc
-  {
-      "status": "Success",
-      "msg": "label [55c8ffc9-1c40-4d51-b75e-f2265b3602ef] abort successfully."
-  }
-  ```
-
-+ enable_profile
-
-  When `enable_profile` is true, the Stream Load profile will be printed to 
logs (be.INFO).
-
-+ memtable_on_sink_node
-
-  Whether to enable MemTable on DataSink node when loading data, default is 
false.
-
-  Build MemTable on DataSink node, and send segments to other backends through 
brpc streaming.
-  It reduces duplicate work among replicas, and saves time in data 
serialization & deserialization.
-- partial_columns
-   Whether to enable partial column updates, Boolean type, True means that use 
partial column update, the default value is false, this parameter is only 
allowed to be set when the table model is Unique and Merge on Write is used.
-
-   eg: `curl  --location-trusted -u root: -H "partial_columns:true" -H 
"column_separator:," -H "columns:id,balance,last_access_time" -T /tmp/test.csv 
http://127.0.0.1:48037/api/db1/user_profile/_stream_load`
-
-### Use stream load with SQL
-
-You can add a `sql` parameter to the `Header` to replace the 
`column_separator`, `line_delimiter`, `where`, `columns` in the previous 
parameter, which is convenient to use.
-
-```
-curl --location-trusted -u user:passwd [-H "sql: ${load_sql}"...] -T data.file 
-XPUT http://fe_host:http_port/api/_http_stream
-
-
-# -- load_sql
-# insert into db.table (col1, col2, ...) select c1, c2, ... from 
http_stream("property1"="value1");
-
-# http_stream
-# (
-#     "column_separator" = ",",
-#     "format" = "CSV",
-#     ...
-# )
-```
-
-Examples:
-
-```
-curl  --location-trusted -u root: -T test.csv  -H "sql:insert into 
demo.example_tbl_1(user_id, age, cost) select c1, c4, c7 * 2 from 
http_stream("format" = "CSV", "column_separator" = "," ) where age >= 30"  
http://127.0.0.1:28030/api/_http_stream
-```
-
-### Return results
-
-Since Stream load is a synchronous import method, the result of the import is 
directly returned to the user by creating the return value of the import.
-
-Examples:
-
-```
-{
-    "TxnId": 1003,
-    "Label": "b6f3bc78-0d2c-45d9-9e4c-faa0a0149bee",
-    "Status": "Success",
-    "ExistingJobStatus": "FINISHED", // optional
-    "Message": "OK",
-    "NumberTotalRows": 1000000,
-    "NumberLoadedRows": 1000000,
-    "NumberFilteredRows": 1,
-    "NumberUnselectedRows": 0,
-    "LoadBytes": 40888898,
-    "LoadTimeMs": 2144,
-    "BeginTxnTimeMs": 1,
-    "StreamLoadPutTimeMs": 2,
-    "ReadDataTimeMs": 325,
-    "WriteDataTimeMs": 1933,
-    "CommitAndPublishTimeMs": 106,
-    "ErrorURL": 
"http://192.168.1.1:8042/api/_load_error_log?file=__shard_0/error_log_insert_stmt_db18266d4d9b4ee5-abb00ddd64bdf005_db18266d4d9b4ee5_abb00ddd64bdf005";
-}
-```
-
-The following main explanations are given for the Stream load import result 
parameters:
-
-+ TxnId: The imported transaction ID. Users do not perceive.
-
-+ Label: Import Label. User specified or automatically generated by the system.
-
-+ Status: Import completion status.
-
-  "Success": Indicates successful import.
-
-  "Publish Timeout": This state also indicates that the import has been 
completed, except that the data may be delayed and visible without retrying.
-
-  "Label Already Exists": Label duplicate, need to be replaced Label.
-
-  "Fail": Import failed.
-
-+ ExistingJobStatus: The state of the load job corresponding to the existing 
Label.
-
-    This field is displayed only when the status is "Label Already Exists". 
The user can know the status of the load job corresponding to Label through 
this state. "RUNNING" means that the job is still executing, and "FINISHED" 
means that the job is successful.
-
-+ Message: Import error messages.
-
-+ NumberTotalRows: Number of rows imported for total processing.
-
-+ NumberLoadedRows: Number of rows successfully imported.
-
-+ NumberFilteredRows: Number of rows that do not qualify for data quality.
-
-+ NumberUnselectedRows: Number of rows filtered by where condition.
-
-+ LoadBytes: Number of bytes imported.
-
-+ LoadTimeMs: Import completion time. Unit milliseconds.
-
-+ BeginTxnTimeMs: The time cost for RPC to Fe to begin a transaction, Unit 
milliseconds.
-
-+ StreamLoadPutTimeMs: The time cost for RPC to Fe to get a stream load plan, 
Unit milliseconds.
-
-+ ReadDataTimeMs: Read data time, Unit milliseconds.
-
-+ WriteDataTimeMs: Write data time, Unit milliseconds.
-
-+ CommitAndPublishTimeMs: The time cost for RPC to Fe to commit and publish a 
transaction, Unit milliseconds.
-
-+ ErrorURL: If you have data quality problems, visit this URL to see specific 
error lines.
-
-:::info Note
-Since Stream load is a synchronous import mode, import information will not be 
recorded in Doris system. Users cannot see Stream load asynchronously by 
looking at import commands. You need to listen for the return value of the 
create import request to get the import result.
-:::
-
-### Cancel Load
-
-Users can't cancel Stream load manually. Stream load will be cancelled 
automatically by the system after a timeout or import error.
-
-### View Stream Load
-
-Users can view completed stream load tasks through `show stream load`.
-
-By default, the BE does not keep Stream Load records. To view them, recording needs to be enabled on the BE with the configuration parameter `enable_stream_load_record=true`. For details, please refer to [BE Configuration Items](../../../admin-manual/config/be-config)
-
-## Relevant System Configuration
-
-### FE configuration
-
-+ stream\_load\_default\_timeout\_second
-
-  The timeout time of the import task (in seconds) will be cancelled by the 
system if the import task is not completed within the set timeout time, and 
will become CANCELLED.
-
-  At present, Stream load does not support custom import timeout time. All 
Stream load import timeout time is uniform. The default timeout time is 600 
seconds. If the imported source file can no longer complete the import within 
the specified time, the FE parameter ```stream_load_default_timeout_second``` 
needs to be adjusted.
-
-+ enable\_pipeline\_load
-
-  Whether or not to enable the Pipeline engine to execute Streamload tasks. 
See the [Import](../../../data-operate/import/load-manual) documentation.
-
-### BE configuration
-
-+ streaming\_load\_max\_mb
-
-  The maximum import size of Stream load is 10G by default, in MB. If the 
user's original file exceeds this value, the BE parameter 
```streaming_load_max_mb``` needs to be adjusted.
-
-## Best Practices
-
-### Application scenarios
-
-The most appropriate scenario for using Stream load is that the original file 
is in memory or on disk. Secondly, since Stream load is a synchronous import 
method, users can also use this import if they want to obtain the import 
results in a synchronous manner.
-
-### Data volume
-
-Since Stream load is based on the BE initiative to import and distribute data, 
the recommended amount of imported data is between 1G and 10G. Since the 
default maximum Stream load import data volume is 10G, the configuration of BE 
```streaming_load_max_mb``` needs to be modified if files exceeding 10G are to 
be imported.
-
-```
-For example, the size of the file to be imported is 15G
-Modify the BE configuration streaming_load_max_mb to 16000
-```
-
-The default Stream load timeout is 600 seconds. Given the current maximum import speed of Doris, files larger than about 3G require the default import task timeout to be adjusted.
-
-```
-Import Task Timeout = Import Data Volume / 10M / s (Specific Average Import 
Speed Requires Users to Calculate Based on Their Cluster Conditions)
-For example, import a 10G file
-Timeout = 1000s -31561;. 20110G / 10M /s
-```
-
-### Complete examples
-
-Data situation: The data to be imported is about 15G and resides in the local disk path /home/store_sales on the host sending the import request. It is to be imported into the table store\_sales of the database bj_sales.
-
-Cluster situation: The concurrency of Stream Load is not affected by the cluster size.
-
-+ Step 1: Check whether the import file size exceeds the default maximum import size of 10G.
-
-  ```
-  BE conf
-  streaming_load_max_mb = 16000
-  ```
-+ Step 2: Calculate whether the approximate import time exceeds the default 
timeout value
-
-  ```
-  Estimated import time: 15000MB / 10MB/s = 1500s
-  This exceeds the default timeout, so the FE configuration needs to be modified:
-  stream_load_default_timeout_second = 1500
-  ```
-
-+ Step 3: Create Import Tasks
-
-    ```
-    curl --location-trusted -u user:password -T /home/store_sales -H 
"label:abc" http://abc.com:8030/api/bj_sales/store_sales/_stream_load
-    ```
-
-### Coding with StreamLoad
-
-You can initiate a Stream Load HTTP request in any language. Before sending the request, you need to set several required headers:
-
-```http
-Content-Type: text/plain; charset=UTF-8
-Expect: 100-continue
-Authorization: Basic <Base64 encoded username and password>
-```
-
-`<Base64 encoded username and password>`: the string formed by concatenating Doris's `username`, `:`, and `password`, then Base64-encoding the result.
-
-Additionally, note that if you send the HTTP request directly to the FE, Doris will redirect it to a BE, and some frameworks drop the `Authorization` HTTP header during this redirect, so it has to be re-added manually (see the example below).
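-
-For illustration, the same headers can be seen in a plain curl call; the host, port, database, table, file, and credentials below are placeholders:
-
-```
-# curl builds the header "Authorization: Basic cm9vdDo=" from -u, where "cm9vdDo=" is the
-# Base64 encoding of the placeholder credentials "root:" (user "root", empty password);
-# --location-trusted lets curl keep sending the credentials after the FE redirects to a BE
-curl -v --location-trusted -u root: \
-    -H "Expect: 100-continue" \
-    -H "label:example_label_1" \
-    -T example_data.csv \
-    http://fe_host:8030/api/example_db/example_tbl/_stream_load
-```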
-
-Doris provides StreamLoad examples in three languages: 
[Java](https://github.com/apache/doris/tree/master/samples/stream_load/java), 
[Go](https://github.com/apache/doris/tree/master/samples/stream_load/go), and 
[Python](https://github.com/apache/doris/tree/master/samples/stream_load/python)
 for reference.
-
-## Common Questions
-
-* Label Already Exists
-
-  The steps for checking a duplicate Label in Stream Load are as follows:
-
-  1. Does the Label conflict with a Label already used by another import method?
-
-    Because Labels in the Doris system are not scoped to a particular import method, another import method may already have used the same Label.
-
-    Run `SHOW LOAD WHERE LABEL = "xxx"`, where xxx is the duplicate Label string, to check whether a FINISHED import with the same Label already exists.
-
-  2. Are Stream loads submitted repeatedly for the same job?
-
-    Since Stream Load creates the import task via an HTTP request, HTTP clients in most languages have their own request retry logic. After the Doris system receives the first request, it starts executing the Stream Load, but because the result is not returned to the client in time, the client retries and sends the same request again. At that point Doris is already handling the first request, so the second one is rejected with Label Already Exists.
-
-    To confirm this, search the FE Master log for the Label and check whether the line `redirect load action to destination=` appears twice for the same Label. If it does, the request was submitted repeatedly by the client.
-
-    It is recommended to estimate the approximate import time based on the amount of data in the request and to set the client-side request timeout to a value greater than the import timeout, so that the client does not resubmit the request multiple times.
-
-  3. Connection reset exception
-
-    In community version 0.14.0 and earlier, a connection reset exception could occur after HTTP V2 was enabled, because the built-in web container is Tomcat, and Tomcat's implementation of 307 (Temporary Redirect) is problematic. When Stream Load imports a large amount of data, a connection reset exception occurs because Tomcat starts transmitting data before the 307 redirect, which results in the lack of au [...]
-
-    After upgrading, also upgrade the HTTP client version used by your program to `4.5.13` by introducing the following dependency in your pom.xml file:
-
-    ```xml
-        <dependency>
-          <groupId>org.apache.httpcomponents</groupId>
-          <artifactId>httpclient</artifactId>
-          <version>4.5.13</version>
-        </dependency>
-    ```
-
-* After enabling Stream Load records on the BE, the records cannot be queried
-
-  This is caused by the FE fetching records slowly. You can try adjusting the following parameters (an example is sketched after this list):
-
-  1. Increase the BE configuration `stream_load_record_batch_size`. This configuration controls how many Stream Load records are pulled from the BE each time. The default value is 50, which can be increased to 500.
-  2. Decrease the FE configuration `fetch_stream_load_record_interval_second`. This configuration controls the interval at which Stream Load records are fetched. The default is once every 120 seconds, which can be reduced to 60 seconds.
-  3. To keep more Stream Load records (not recommended, as it consumes more FE resources), increase the FE configuration `max_stream_load_record_size`; the default is 5000.
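-
-  As a hedged sketch of these adjustments (the values are examples only, and whether the FE configs can be changed at runtime in your version is an assumption; otherwise set them in fe.conf and restart):
-
-  ```
-  BE conf
-  stream_load_record_batch_size = 500
-  ```
-
-  ```sql
-  ADMIN SET FRONTEND CONFIG ("fetch_stream_load_record_interval_second" = "60");
-  ADMIN SET FRONTEND CONFIG ("max_stream_load_record_size" = "10000");
-  ```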
-
-## More Help
-
-For more detailed syntax of **Stream Load**, enter `HELP STREAM LOAD` on the MySQL client command line for more help.

