This is an automated email from the ASF dual-hosted git repository.
dataroaring pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris-website.git
The following commit(s) were added to refs/heads/master by this push:
new 8a5d656eb0a update partial update doc (#3211)
8a5d656eb0a is described below
commit 8a5d656eb0a9a5054e96b6f313697859842709d3
Author: bobhan1 <[email protected]>
AuthorDate: Wed Dec 31 04:51:48 2025 +0800
update partial update doc (#3211)
## Versions
- [x] dev
- [x] 4.x
- [x] 3.x
- [ ] 2.1
## Languages
- [x] Chinese
- [x] English
## Docs Checklist
- [ ] Checked by AI
- [ ] Test Cases Built
---
docs/data-operate/import/import-way/routine-load-manual.md | 1 +
docs/data-operate/update/partial-column-update.md | 2 --
.../current/data-operate/import/import-way/routine-load-manual.md | 1 +
.../current/data-operate/update/partial-column-update.md | 2 --
.../version-3.x/data-operate/import/handling-messy-data.md | 6 +++++-
.../data-operate/import/import-way/routine-load-manual.md | 1 +
.../data-operate/import/import-way/stream-load-manual.md | 1 +
.../version-3.x/data-operate/update/partial-column-update.md | 7 +++++--
.../data-operate/import/import-way/routine-load-manual.md | 1 +
.../version-4.x/data-operate/update/partial-column-update.md | 2 --
.../version-3.x/data-operate/import/handling-messy-data.md | 6 +++++-
.../data-operate/import/import-way/routine-load-manual.md | 1 +
.../data-operate/import/import-way/stream-load-manual.md | 1 +
.../version-3.x/data-operate/update/partial-column-update.md | 7 +++++--
.../data-operate/import/import-way/routine-load-manual.md | 1 +
.../version-4.x/data-operate/update/partial-column-update.md | 2 --
16 files changed, 28 insertions(+), 14 deletions(-)
diff --git a/docs/data-operate/import/import-way/routine-load-manual.md
b/docs/data-operate/import/import-way/routine-load-manual.md
index 0771dff3b88..5dd95494d2d 100644
--- a/docs/data-operate/import/import-way/routine-load-manual.md
+++ b/docs/data-operate/import/import-way/routine-load-manual.md
@@ -407,6 +407,7 @@ Here are the available parameters for the job_properties
clause:
| send_batch_parallelism | Used to set the parallelism of sending batch
data. If the parallelism value exceeds the `max_send_batch_parallelism_per_job`
in BE configuration, the coordinating BE will use the value of
`max_send_batch_parallelism_per_job`. |
| load_to_single_tablet | Supports importing data to only one tablet in
the corresponding partition per task. Default value is false. This parameter
can only be set when importing data to OLAP tables with random bucketing. |
| partial_columns | Specifies whether to enable partial column
update feature. Default value is false. This parameter can only be set when the
table model is Unique and uses Merge on Write. Multi-table streaming does not
support this parameter. For details, refer to [Partial Column
Update](../../../data-operate/update/update-of-unique-model) |
+| partial_update_new_key_behavior | When performing partial column updates on
Unique Merge on Write tables, this parameter controls how new rows are handled.
There are two types: `APPEND` and `ERROR`.<br/>- `APPEND`: Allows inserting new
row data<br/>- `ERROR`: Fails and reports an error when inserting new rows |
| max_filter_ratio | The maximum allowed filter ratio within the
sampling window. Must be between 0 and 1 inclusive. Default value is 1.0,
indicating any error rows can be tolerated. The sampling window is
`max_batch_rows * 10`. If the ratio of error rows to total rows within the
sampling window exceeds `max_filter_ratio`, the routine job will be suspended
and require manual intervention to check data quality issues. Rows filtered by
WHERE conditions are not counted as error rows. |
| enclose | Specifies the enclosing character. When CSV data
fields contain line or column separators, a single-byte character can be
specified as an enclosing character for protection to prevent accidental
truncation. For example, if the column separator is "," and the enclosing
character is "'", the data "a,'b,c'" will have "b,c" parsed as one field. |
| escape | Specifies the escape character. Used to escape
characters in fields that are identical to the enclosing character. For
example, if the data is "a,'b,'c'", the enclosing character is "'", and you
want "b,'c" to be parsed as one field, you need to specify a single-byte escape
character, such as "\", and modify the data to "a,'b,\'c'". |
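The semantics of the new `partial_update_new_key_behavior` property can be illustrated with a minimal in-memory sketch. This is plain Python, not Doris code: the table is modeled as a dict keyed by the Unique key, and a "load" is a batch of partial rows applied atomically.

```python
# Minimal in-memory sketch of partial_update_new_key_behavior semantics.
# Illustrative only, not Doris code: the "table" is a dict keyed by the
# Unique key; each value is a dict of column -> value.

def partial_update(table, rows, behavior="APPEND"):
    """Apply a batch of partial-column updates to `table`.

    behavior="APPEND": rows with a new key are inserted as new rows.
    behavior="ERROR":  rows with a new key make the whole load fail.
    """
    if behavior not in ("APPEND", "ERROR"):
        raise ValueError("behavior must be APPEND or ERROR")
    staged = {k: dict(v) for k, v in table.items()}  # loads apply atomically
    for key, partial_row in rows:
        if key not in staged:
            if behavior == "ERROR":
                raise KeyError(f"new key {key!r} not allowed under ERROR")
            staged[key] = {}
        staged[key].update(partial_row)  # only the given columns change
    return staged

table = {1: {"order_status": "Pending Shipment", "amount": 100}}

# APPEND: key 2 does not exist, so a new row is inserted.
t = partial_update(table, [(1, {"order_status": "Shipped"}),
                           (2, {"order_status": "Pending Shipment"})])
print(sorted(t))       # keys 1 and 2 are both present
print(t[1]["amount"])  # untouched columns keep their old value

# ERROR: the same load fails because key 2 is new.
try:
    partial_update(table, [(2, {"order_status": "Shipped"})], "ERROR")
except KeyError as e:
    print("load failed:", e)
```

Note that under `ERROR` the whole batch is rejected, mirroring a failed load, while under `APPEND` columns not supplied for a new row are simply absent (in Doris they would take default/NULL values).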
diff --git a/docs/data-operate/update/partial-column-update.md
b/docs/data-operate/update/partial-column-update.md
index 161b40e1c57..42daf124af4 100644
--- a/docs/data-operate/update/partial-column-update.md
+++ b/docs/data-operate/update/partial-column-update.md
@@ -78,8 +78,6 @@ SET enable_unique_key_partial_update=true;
INSERT INTO order_tbl (order_id, order_status) VALUES (1, 'Pending Shipment');
```
-Note that the session variable `enable_insert_strict` defaults to true,
enabling strict mode by default. In strict mode, partial column updates do not
allow updating non-existent keys. To insert non-existent keys using the insert
statement for partial column updates, set `enable_unique_key_partial_update` to
true and `enable_insert_strict` to false.
-
#### Flink Connector
If using Flink Connector, add the following configuration:
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/import/import-way/routine-load-manual.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/import/import-way/routine-load-manual.md
index f33cd334a8b..98d2e385c2d 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/import/import-way/routine-load-manual.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/import/import-way/routine-load-manual.md
@@ -418,6 +418,7 @@ Here are the available parameters for the job_properties clause:
| send_batch_parallelism | Used to set the parallelism of sending batch data. If the parallelism value exceeds the `max_send_batch_parallelism_per_job` in BE configuration, the coordinating BE will use the value of `max_send_batch_parallelism_per_job`. |
| load_to_single_tablet | Supports importing data to only one tablet in the corresponding partition per task. Default value is false. This parameter can only be set when importing data to OLAP tables with random bucketing. |
| partial_columns | Specifies whether to enable the partial column update feature. Default value is false. This parameter can only be set when the table model is Unique and uses Merge on Write. Multi-table streaming does not support this parameter. For details, refer to [Partial Column Update](../../../data-operate/update/partial-column-update.md) |
+| partial_update_new_key_behavior | When performing partial column updates on Unique Merge on Write tables, this parameter controls how new rows are handled. There are two types: `APPEND` and `ERROR`.<br/>- `APPEND`: Allows inserting new row data<br/>- `ERROR`: The import fails and reports an error when inserting new rows |
| max_filter_ratio | The maximum allowed filter ratio within the sampling window. Must be between 0 and 1 inclusive. Default value is 1.0, indicating any error rows can be tolerated. The sampling window is `max_batch_rows * 10`. If the ratio of error rows to total rows within the sampling window exceeds `max_filter_ratio`, the routine job will be suspended and require manual intervention to check data quality issues. Rows filtered by WHERE conditions are not counted as error rows. |
| enclose | Specifies the enclosing character. When CSV data fields contain line or column separators, a single-byte character can be specified as an enclosing character for protection to prevent accidental truncation. For example, if the column separator is "," and the enclosing character is "'", the data "a,'b,c'" will have "b,c" parsed as one field. |
| escape | Specifies the escape character. Used to escape characters in fields that are identical to the enclosing character. For example, if the data is "a,'b,'c'", the enclosing character is "'", and you want "b,'c" to be parsed as one field, you need to specify a single-byte escape character, such as "\", and modify the data to "a,'b,\'c'". |
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/update/partial-column-update.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/update/partial-column-update.md
index 96abe0468d4..4ccc8f9ac9d 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/update/partial-column-update.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/update/partial-column-update.md
@@ -80,8 +80,6 @@ SET enable_unique_key_partial_update=true;
INSERT INTO order_tbl (order_id, order_status) VALUES (1, 'Pending Shipment');
```
-Note that the session variable `enable_insert_strict`, which controls whether the insert statement runs in strict mode, defaults to true, i.e. strict mode is enabled by default for insert statements. In strict mode, partial column updates do not allow updating non-existent keys. Therefore, to insert non-existent keys when using the insert statement for partial column updates, set `enable_unique_key_partial_update` to true and also set `enable_insert_strict` to false.
-
#### Flink Connector
If using Flink Connector, add the following configuration:
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/data-operate/import/handling-messy-data.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/data-operate/import/handling-messy-data.md
index 0aa4db77164..5cf52458c22 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/data-operate/import/handling-messy-data.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/data-operate/import/handling-messy-data.md
@@ -17,7 +17,7 @@
Strict mode serves two primary purposes:
1. Filtering out data rows where column type conversion fails during load.
-2. Restricting updates to existing columns only in partial column update scenarios.
+2. Restricting updates to existing columns only in partial column update scenarios (this applies to 3.0.x and earlier; since 3.1.0, this behavior is controlled by the load property/session variable `partial_update_new_key_behavior`).
### Filtering Strategy for Column Type Conversion Failures
@@ -64,6 +64,10 @@
### Restricting Partial Column Updates to Existing Columns Only
+:::tip
+This applies to 3.0.x and earlier; since 3.1.0, this behavior is controlled by the load property/session variable `partial_update_new_key_behavior`.
+:::
+
In strict mode, each row inserted in a partial column update must have its Key already exist in the table. In non-strict mode, partial column updates can both update rows whose Key already exists and insert new rows whose Key does not exist.
For example, given a table structure as follows:
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/data-operate/import/import-way/routine-load-manual.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/data-operate/import/import-way/routine-load-manual.md
index ec3858d3cfc..7081c20d539 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/data-operate/import/import-way/routine-load-manual.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/data-operate/import/import-way/routine-load-manual.md
@@ -412,6 +412,7 @@ Here are the available parameters for the job_properties clause:
| send_batch_parallelism | Used to set the parallelism of sending batch data. If the parallelism value exceeds the `max_send_batch_parallelism_per_job` in BE configuration, the coordinating BE will use the value of `max_send_batch_parallelism_per_job`. |
| load_to_single_tablet | Supports importing data to only one tablet in the corresponding partition per task. Default value is false. This parameter can only be set when importing data to OLAP tables with random bucketing. |
| partial_columns | Specifies whether to enable the partial column update feature. Default value is false. This parameter can only be set when the table model is Unique and uses Merge on Write. Multi-table streaming does not support this parameter. For details, refer to [Partial Column Update](../../../data-operate/update/update-of-unique-model) |
+| partial_update_new_key_behavior<br/>(since 3.1.0) | When performing partial column updates on Unique Merge on Write tables, this parameter controls how new rows are handled. There are two types: `APPEND` and `ERROR`.<br/>- `APPEND`: Allows inserting new row data<br/>- `ERROR`: The import fails and reports an error when inserting new rows |
| max_filter_ratio | The maximum allowed filter ratio within the sampling window. Must be between 0 and 1 inclusive. Default value is 1.0, indicating any error rows can be tolerated. The sampling window is `max_batch_rows * 10`. If the ratio of error rows to total rows within the sampling window exceeds `max_filter_ratio`, the routine job will be suspended and require manual intervention to check data quality issues. Rows filtered by WHERE conditions are not counted as error rows. |
| enclose | Specifies the enclosing character. When CSV data fields contain line or column separators, a single-byte character can be specified as an enclosing character for protection to prevent accidental truncation. For example, if the column separator is "," and the enclosing character is "'", the data "a,'b,c'" will have "b,c" parsed as one field. |
| escape | Specifies the escape character. Used to escape characters in fields that are identical to the enclosing character. For example, if the data is "a,'b,'c'", the enclosing character is "'", and you want "b,'c" to be parsed as one field, you need to specify a single-byte escape character, such as "\", and modify the data to "a,'b,\'c'". |
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/data-operate/import/import-way/stream-load-manual.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/data-operate/import/import-way/stream-load-manual.md
index 9f0b420dfce..19d1bceb659 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/data-operate/import/import-way/stream-load-manual.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/data-operate/import/import-way/stream-load-manual.md
@@ -308,6 +308,7 @@ Stream Load operations support HTTP chunked import (HTTP chunked) and HTTP non-chunked
| escape | Specifies the escape character. Used to escape characters in fields that are identical to the enclosing character. For example, if the data is "a,'b,'c'", the enclosing character is "'", and you want "b,'c" to be parsed as one field, you need to specify a single-byte escape character, such as "\", and modify the data to "a,'b,\'c'". |
| memtable_on_sink_node | Whether to enable MemTable on DataSink node when loading data. Default is false. |
| unique_key_update_mode | The update mode on Unique tables, currently only effective for Merge-On-Write Unique tables. Supports three types: `UPSERT`, `UPDATE_FIXED_COLUMNS`, `UPDATE_FLEXIBLE_COLUMNS`. `UPSERT`: loads data with upsert semantics; `UPDATE_FIXED_COLUMNS`: loads data as a [partial column update](../../../data-operate/update/update-of-unique-model); `UPDATE_FLEXIBLE_COLUMNS`: loads data as a [flexible partial column update](../../../data-operate/update/update-of-unique-model)|
+| partial_update_new_key_behavior<br/>(since 3.1.0) | When performing partial column updates or flexible column updates on Unique tables, this parameter controls how new rows are handled. There are two types: `APPEND` and `ERROR`.<br/>- `APPEND`: Allows inserting new row data<br/>- `ERROR`: The import fails and reports an error when inserting new rows |
### Load return value
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/data-operate/update/partial-column-update.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/data-operate/update/partial-column-update.md
index 96abe0468d4..1ce4e7eeb5e 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/data-operate/update/partial-column-update.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/data-operate/update/partial-column-update.md
@@ -80,7 +80,9 @@ SET enable_unique_key_partial_update=true;
INSERT INTO order_tbl (order_id, order_status) VALUES (1, 'Pending Shipment');
```
-Note that the session variable `enable_insert_strict`, which controls whether the insert statement runs in strict mode, defaults to true, i.e. strict mode is enabled by default for insert statements. In strict mode, partial column updates do not allow updating non-existent keys. Therefore, to insert non-existent keys when using the insert statement for partial column updates, set `enable_unique_key_partial_update` to true and also set `enable_insert_strict` to false.
+:::caution Note:
+Note that the session variable `enable_insert_strict`, which controls whether the insert statement runs in strict mode, defaults to true, i.e. strict mode is enabled by default for insert statements. In 3.0.x and earlier, partial column updates in strict mode do not allow updating non-existent keys. Therefore, to insert non-existent keys when using the insert statement for partial column updates, set `enable_unique_key_partial_update` to true and also set `enable_insert_strict` to false.
+:::
#### Flink Connector
@@ -267,7 +269,8 @@ MySQL [email protected]:d1> select * from t1;
### Handling New Rows in Partial Column Updates / Flexible Column Updates
-The session variable or load property `partial_update_new_key_behavior` controls the behavior of new rows inserted during partial column updates and flexible column updates.
+In 3.0.x, whether strict mode is enabled for the load controls the behavior of new rows inserted during partial column updates; for details, see the [Strict Mode](../import/handling-messy-data.md#限定部分列更新只能更新已有的列) documentation.
+Starting from 3.1.0, the session variable or load property `partial_update_new_key_behavior` controls the behavior of new rows inserted during partial column updates and flexible column updates.
When `partial_update_new_key_behavior=ERROR`, each inserted row must have a Key that already exists in the table. When `partial_update_new_key_behavior=APPEND`, partial column updates and flexible column updates can both update rows whose Key already exists and insert new rows whose Key does not exist.
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/data-operate/import/import-way/routine-load-manual.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/data-operate/import/import-way/routine-load-manual.md
index 0f57c86b527..c6cc28a4f00 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/data-operate/import/import-way/routine-load-manual.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/data-operate/import/import-way/routine-load-manual.md
@@ -418,6 +418,7 @@ Here are the available parameters for the job_properties clause:
| send_batch_parallelism | Used to set the parallelism of sending batch data. If the parallelism value exceeds the `max_send_batch_parallelism_per_job` in BE configuration, the coordinating BE will use the value of `max_send_batch_parallelism_per_job`. |
| load_to_single_tablet | Supports importing data to only one tablet in the corresponding partition per task. Default value is false. This parameter can only be set when importing data to OLAP tables with random bucketing. |
| partial_columns | Specifies whether to enable the partial column update feature. Default value is false. This parameter can only be set when the table model is Unique and uses Merge on Write. Multi-table streaming does not support this parameter. For details, refer to [Partial Column Update](../../../data-operate/update/update-of-unique-model) |
+| partial_update_new_key_behavior | When performing partial column updates on Unique Merge on Write tables, this parameter controls how new rows are handled. There are two types: `APPEND` and `ERROR`.<br/>- `APPEND`: Allows inserting new row data<br/>- `ERROR`: The import fails and reports an error when inserting new rows |
| max_filter_ratio | The maximum allowed filter ratio within the sampling window. Must be between 0 and 1 inclusive. Default value is 1.0, indicating any error rows can be tolerated. The sampling window is `max_batch_rows * 10`. If the ratio of error rows to total rows within the sampling window exceeds `max_filter_ratio`, the routine job will be suspended and require manual intervention to check data quality issues. Rows filtered by WHERE conditions are not counted as error rows. |
| enclose | Specifies the enclosing character. When CSV data fields contain line or column separators, a single-byte character can be specified as an enclosing character for protection to prevent accidental truncation. For example, if the column separator is "," and the enclosing character is "'", the data "a,'b,c'" will have "b,c" parsed as one field. |
| escape | Specifies the escape character. Used to escape characters in fields that are identical to the enclosing character. For example, if the data is "a,'b,'c'", the enclosing character is "'", and you want "b,'c" to be parsed as one field, you need to specify a single-byte escape character, such as "\", and modify the data to "a,'b,\'c'". |
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/data-operate/update/partial-column-update.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/data-operate/update/partial-column-update.md
index 96abe0468d4..4ccc8f9ac9d 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/data-operate/update/partial-column-update.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/data-operate/update/partial-column-update.md
@@ -80,8 +80,6 @@ SET enable_unique_key_partial_update=true;
INSERT INTO order_tbl (order_id, order_status) VALUES (1, 'Pending Shipment');
```
-Note that the session variable `enable_insert_strict`, which controls whether the insert statement runs in strict mode, defaults to true, i.e. strict mode is enabled by default for insert statements. In strict mode, partial column updates do not allow updating non-existent keys. Therefore, to insert non-existent keys when using the insert statement for partial column updates, set `enable_unique_key_partial_update` to true and also set `enable_insert_strict` to false.
-
#### Flink Connector
If using Flink Connector, add the following configuration:
diff --git
a/versioned_docs/version-3.x/data-operate/import/handling-messy-data.md
b/versioned_docs/version-3.x/data-operate/import/handling-messy-data.md
index 8f2a867330b..23f5ae2a4c7 100644
--- a/versioned_docs/version-3.x/data-operate/import/handling-messy-data.md
+++ b/versioned_docs/version-3.x/data-operate/import/handling-messy-data.md
@@ -19,7 +19,7 @@ This makes it easier to handle data loading problems and
keeps data management s
Strict mode serves two primary purposes:
1. Filtering out data rows where column type conversion fails during load
-2. Restricting updates to existing columns only in partial column update scenarios
+2. Restricting updates to existing columns only in partial column update scenarios (this applies to 3.0.x and earlier; since 3.1.0, this behavior is controlled by the load property/session variable `partial_update_new_key_behavior`)
### Filtering Strategy for Column Type Conversion Failures
@@ -65,6 +65,10 @@ The system employs different strategies based on the strict
mode setting:
### Restricting Partial Column Updates to Existing Columns Only
+:::tip
+This applies to 3.0.x and earlier; since 3.1.0, this behavior is controlled by the load property/session variable `partial_update_new_key_behavior`.
+:::
+
In strict mode, each row in a partial column update must have its Key already
exist in the table. In non-strict mode, partial column updates can both update
existing rows (where Key exists) and insert new rows (where Key doesn't exist).
For example, given a table structure as follows:
diff --git
a/versioned_docs/version-3.x/data-operate/import/import-way/routine-load-manual.md
b/versioned_docs/version-3.x/data-operate/import/import-way/routine-load-manual.md
index 22c280afe4c..bfe4993038c 100644
---
a/versioned_docs/version-3.x/data-operate/import/import-way/routine-load-manual.md
+++
b/versioned_docs/version-3.x/data-operate/import/import-way/routine-load-manual.md
@@ -399,6 +399,7 @@ Here are the available parameters for the job_properties
clause:
| send_batch_parallelism | Used to set the parallelism of sending batch
data. If the parallelism value exceeds the `max_send_batch_parallelism_per_job`
in BE configuration, the coordinating BE will use the value of
`max_send_batch_parallelism_per_job`. |
| load_to_single_tablet | Supports importing data to only one tablet in
the corresponding partition per task. Default value is false. This parameter
can only be set when importing data to OLAP tables with random bucketing. |
| partial_columns | Specifies whether to enable partial column
update feature. Default value is false. This parameter can only be set when the
table model is Unique and uses Merge on Write. Multi-table streaming does not
support this parameter. For details, refer to [Partial Column
Update](../../../data-operate/update/update-of-unique-model) |
+| partial_update_new_key_behavior<br/>(since 3.1.0) | When performing partial
column updates on Unique Merge on Write tables, this parameter controls how new
rows are handled. There are two types: `APPEND` and `ERROR`.<br/>- `APPEND`:
Allows inserting new row data<br/>- `ERROR`: Fails and reports an error when
inserting new rows |
| max_filter_ratio | The maximum allowed filter ratio within the
sampling window. Must be between 0 and 1 inclusive. Default value is 1.0,
indicating any error rows can be tolerated. The sampling window is
`max_batch_rows * 10`. If the ratio of error rows to total rows within the
sampling window exceeds `max_filter_ratio`, the routine job will be suspended
and require manual intervention to check data quality issues. Rows filtered by
WHERE conditions are not counted as error rows. |
| enclose | Specifies the enclosing character. When CSV data
fields contain line or column separators, a single-byte character can be
specified as an enclosing character for protection to prevent accidental
truncation. For example, if the column separator is "," and the enclosing
character is "'", the data "a,'b,c'" will have "b,c" parsed as one field. |
| escape | Specifies the escape character. Used to escape
characters in fields that are identical to the enclosing character. For
example, if the data is "a,'b,'c'", the enclosing character is "'", and you
want "b,'c" to be parsed as one field, you need to specify a single-byte escape
character, such as "\", and modify the data to "a,'b,\'c'". |
diff --git
a/versioned_docs/version-3.x/data-operate/import/import-way/stream-load-manual.md
b/versioned_docs/version-3.x/data-operate/import/import-way/stream-load-manual.md
index 522840bbcf5..9bef0573094 100644
---
a/versioned_docs/version-3.x/data-operate/import/import-way/stream-load-manual.md
+++
b/versioned_docs/version-3.x/data-operate/import/import-way/stream-load-manual.md
@@ -310,6 +310,7 @@ Parameter Description: The default timeout for Stream Load.
The load job will be
| escape | Specify the escape character. It is used to
escape characters that are the same as the enclosure character within a field.
For example, if the data is "a,'b,'c'", and the enclosure is "'", and you want
"b,'c" to be parsed as a single field, you need to specify a single-byte escape
character, such as "\", and modify the data to "a,'b,\'c'". |
| memtable_on_sink_node | Whether to enable MemTable on DataSink node
when loading data, default is false. |
| unique_key_update_mode | The update mode on Unique tables, currently
only effective for Merge-On-Write Unique tables. Supports three types:
`UPSERT`, `UPDATE_FIXED_COLUMNS`, and `UPDATE_FLEXIBLE_COLUMNS`. `UPSERT`:
Indicates that data is loaded with upsert semantics; `UPDATE_FIXED_COLUMNS`:
Indicates that data is loaded through partial updates;
`UPDATE_FLEXIBLE_COLUMNS`: Indicates that data is loaded through flexible
partial updates.|
+| partial_update_new_key_behavior<br/>(since 3.1.0) | When performing partial
column updates or flexible column updates on Unique tables, this parameter
controls how new rows are handled. There are two types: `APPEND` and
`ERROR`.<br/>- `APPEND`: Allows inserting new row data<br/>- `ERROR`: Fails and
reports an error when inserting new rows |
### Load return value
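For the Stream Load path, the headers documented above can be combined into a single request. The sketch below assembles the equivalent `curl` invocation in Python; the host, credentials, database, and table names are hypothetical placeholders, while the header names (`unique_key_update_mode`, `partial_update_new_key_behavior`, `columns`) come from the table above, and the FE HTTP port 8030 is an assumption:

```python
# Sketch: assembling a Stream Load request for a partial-column update
# that rejects new keys. Host/credentials/db/table are placeholders;
# header names come from the Stream Load parameter table.

def stream_load_curl(host, db, table, data_file, columns):
    headers = {
        "unique_key_update_mode": "UPDATE_FIXED_COLUMNS",
        "partial_update_new_key_behavior": "ERROR",  # fail on unseen keys
        "columns": ",".join(columns),
        "format": "csv",
        "Expect": "100-continue",
    }
    parts = ["curl", "--location-trusted", "-u", "user:passwd"]
    for k, v in headers.items():
        parts += ["-H", f"{k}:{v}"]
    parts += ["-T", data_file,
              f"http://{host}:8030/api/{db}/{table}/_stream_load"]
    return parts

cmd = stream_load_curl("127.0.0.1", "db1", "order_tbl", "data.csv",
                       ["order_id", "order_status"])
print(" ".join(cmd))
```

With `partial_update_new_key_behavior:ERROR`, any CSV row whose key is absent from the table would cause the whole load to fail; switching the header to `APPEND` would let such rows be inserted instead.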
diff --git
a/versioned_docs/version-3.x/data-operate/update/partial-column-update.md
b/versioned_docs/version-3.x/data-operate/update/partial-column-update.md
index 161b40e1c57..dd096b680b4 100644
--- a/versioned_docs/version-3.x/data-operate/update/partial-column-update.md
+++ b/versioned_docs/version-3.x/data-operate/update/partial-column-update.md
@@ -78,7 +78,9 @@ SET enable_unique_key_partial_update=true;
INSERT INTO order_tbl (order_id, order_status) VALUES (1, 'Pending Shipment');
```
-Note that the session variable `enable_insert_strict` defaults to true,
enabling strict mode by default. In strict mode, partial column updates do not
allow updating non-existent keys. To insert non-existent keys using the insert
statement for partial column updates, set `enable_unique_key_partial_update` to
true and `enable_insert_strict` to false.
+:::caution Note:
+Note that the session variable `enable_insert_strict` defaults to true,
enabling strict mode by default. In 3.0.x and earlier, partial column updates
do not allow updating non-existent keys in strict mode. To insert non-existent
keys using the insert statement for partial column updates, set
`enable_unique_key_partial_update` to true and `enable_insert_strict` to false.
+:::
+:::
#### Flink Connector
@@ -265,7 +267,8 @@ MySQL [email protected]:d1> select * from t1;
### Handling New Rows in Partial Column Updates
-The session variable or import property `partial_update_new_key_behavior`
controls the behavior when inserting new rows during partial column updates.
+In the 3.0.x series, whether strict mode is enabled during import controls the
behavior of newly inserted rows in partial column updates. For details, see the
documentation on [Strict
Mode](../import/handling-messy-data.md#restricting-partial-column-updates-to-existing-columns-only).
+Starting from version 3.1.0, the session variable or import property
`partial_update_new_key_behavior` controls the behavior when inserting new rows
during partial column updates.
When `partial_update_new_key_behavior=ERROR`, each inserted row must have a
key that already exists in the table. When
`partial_update_new_key_behavior=APPEND`, partial column updates can update
existing rows with matching keys or insert new rows with keys that do not exist
in the table.
diff --git
a/versioned_docs/version-4.x/data-operate/import/import-way/routine-load-manual.md
b/versioned_docs/version-4.x/data-operate/import/import-way/routine-load-manual.md
index 0771dff3b88..5dd95494d2d 100644
---
a/versioned_docs/version-4.x/data-operate/import/import-way/routine-load-manual.md
+++
b/versioned_docs/version-4.x/data-operate/import/import-way/routine-load-manual.md
@@ -407,6 +407,7 @@ Here are the available parameters for the job_properties
clause:
| send_batch_parallelism | Used to set the parallelism of sending batch
data. If the parallelism value exceeds the `max_send_batch_parallelism_per_job`
in BE configuration, the coordinating BE will use the value of
`max_send_batch_parallelism_per_job`. |
| load_to_single_tablet | Supports importing data to only one tablet in
the corresponding partition per task. Default value is false. This parameter
can only be set when importing data to OLAP tables with random bucketing. |
| partial_columns | Specifies whether to enable partial column
update feature. Default value is false. This parameter can only be set when the
table model is Unique and uses Merge on Write. Multi-table streaming does not
support this parameter. For details, refer to [Partial Column
Update](../../../data-operate/update/update-of-unique-model) |
+| partial_update_new_key_behavior | When performing partial column updates on
Unique Merge on Write tables, this parameter controls how new rows are handled.
There are two types: `APPEND` and `ERROR`.<br/>- `APPEND`: Allows inserting new
row data<br/>- `ERROR`: Fails and reports an error when inserting new rows |
| max_filter_ratio | The maximum allowed filter ratio within the
sampling window. Must be between 0 and 1 inclusive. Default value is 1.0,
indicating any error rows can be tolerated. The sampling window is
`max_batch_rows * 10`. If the ratio of error rows to total rows within the
sampling window exceeds `max_filter_ratio`, the routine job will be suspended
and require manual intervention to check data quality issues. Rows filtered by
WHERE conditions are not counted as error rows. |
| enclose | Specifies the enclosing character. When CSV data
fields contain line or column separators, a single-byte character can be
specified as an enclosing character for protection to prevent accidental
truncation. For example, if the column separator is "," and the enclosing
character is "'", the data "a,'b,c'" will have "b,c" parsed as one field. |
| escape | Specifies the escape character. Used to escape
characters in fields that are identical to the enclosing character. For
example, if the data is "a,'b,'c'", the enclosing character is "'", and you
want "b,'c" to be parsed as one field, you need to specify a single-byte escape
character, such as "\", and modify the data to "a,'b,\'c'". |
diff --git
a/versioned_docs/version-4.x/data-operate/update/partial-column-update.md
b/versioned_docs/version-4.x/data-operate/update/partial-column-update.md
index 161b40e1c57..42daf124af4 100644
--- a/versioned_docs/version-4.x/data-operate/update/partial-column-update.md
+++ b/versioned_docs/version-4.x/data-operate/update/partial-column-update.md
@@ -78,8 +78,6 @@ SET enable_unique_key_partial_update=true;
INSERT INTO order_tbl (order_id, order_status) VALUES (1, 'Pending Shipment');
```
-Note that the session variable `enable_insert_strict` defaults to true,
enabling strict mode by default. In strict mode, partial column updates do not
allow updating non-existent keys. To insert non-existent keys using the insert
statement for partial column updates, set `enable_unique_key_partial_update` to
true and `enable_insert_strict` to false.
-
#### Flink Connector
If using Flink Connector, add the following configuration:
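For the INSERT path that these pages document, the 3.1.0+ session-variable route can be sketched as follows. The table and values are taken from the examples in this patch; the exact `SET` syntax should be verified against the updated pages:

```sql
-- Sketch of the session-variable route for partial column updates (3.1.0+).
SET enable_unique_key_partial_update = true;

-- Reject loads that would insert rows whose key is not yet in the table:
SET partial_update_new_key_behavior = 'ERROR';
INSERT INTO order_tbl (order_id, order_status) VALUES (1, 'Pending Shipment');

-- Or allow new keys to be appended as new rows:
SET partial_update_new_key_behavior = 'APPEND';
INSERT INTO order_tbl (order_id, order_status) VALUES (2, 'Pending Shipment');
```

In 3.0.x and earlier, the second `SET` is unavailable and the same distinction is made via `enable_insert_strict`, as described in the caution notes above.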
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]