(doris-website) branch master updated: [doc](data update)Address comment on update of agg model and translate en doc by LLM (#1721)

zhangchen Mon, 13 Jan 2025 22:08:20 -0800

This is an automated email from the ASF dual-hosted git repository.

zhangchen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris-website.git



The following commit(s) were added to refs/heads/master by this push:
     new 1816a0eeaa [doc](data update)Address comment on update of agg model and translate en doc by LLM (#1721)
1816a0eeaa is described below

commit 1816a0eeaa066f8655ac80d5fb998193fd7e82af
Author: zhannngchen <zhangc...@selectdb.com>
AuthorDate: Tue Jan 14 14:08:07 2025 +0800

    [doc](data update)Address comment on update of agg model and translate en doc by LLM (#1721)
    
    ## Versions
    
    - [x] dev
    - [x] 3.0
    - [x] 2.1
    - [ ] 2.0
    
    ## Languages
    
    - [x] Chinese
    - [x] English
    
    ## Docs Checklist
    
    - [x] Checked by AI
    - [x] Test Cases Built
---
 .../update/update-of-aggregate-model.md            | 50 +++++++++-------------
 .../update/update-of-aggregate-model.md            | 10 +----
 .../update/update-of-aggregate-model.md            | 10 +----
 .../update/update-of-aggregate-model.md            | 10 +----
 .../update/update-of-aggregate-model.md            | 50 +++++++++-------------
 .../update/update-of-aggregate-model.md            | 50 +++++++++-------------
 6 files changed, 69 insertions(+), 111 deletions(-)

diff --git a/docs/data-operate/update/update-of-aggregate-model.md b/docs/data-operate/update/update-of-aggregate-model.md
index 4b5ef90675..3fde759a85 100644
--- a/docs/data-operate/update/update-of-aggregate-model.md
+++ b/docs/data-operate/update/update-of-aggregate-model.md
@@ -1,7 +1,7 @@
----
+-
 {
-    "title": "Updating Data on Aggregate Key Model",
-    "language": "en"
+  "title": "Updating Data on Aggregate Key Model",
+  "language": "en"
 }
 ---
 
@@ -24,23 +24,21 @@ specific language governing permissions and limitations
 under the License.
 -->
 
+This document primarily introduces how to update the Doris Aggregate model based on data load.
 
+## Whole Row Update
 
-This guide is about ingestion-based data updates for the Aggregate Key model in Doris.
-
-## Update all columns
-
-When importing data into an Aggregate Key model in Doris by methods like Stream Load, Broker Load, Routine Load, and Insert Into, the new values are combined with the old values to produce new aggregated values based on the column's aggregation function. These values might be generated during insertion or produced asynchronously during compaction. However, when querying, users will always receive the same returned values.
+When loading data into the Aggregate model table using Doris-supported methods such as Stream Load, Broker Load, Routine Load, Insert Into, etc., the new values will be aggregated with the old values according to the column's aggregation function to produce new aggregated values. This value may be produced at the time of insertion or during asynchronous compaction, but users will get the same return value when querying.
 
-## Partial column update for Aggregate Key model
+## Partial Column Update of Aggregate Model
 
-Tables in the Aggregate Key model are primarily used in cases with pre-aggregation requirements rather than data updates, but Doris allows partial column updates for them, too. Simply set the aggregation function to `REPLACE_IF_NOT_NULL`.
+The Aggregate table is mainly used in pre-aggregation scenarios rather than data update scenarios, but partial column updates can be achieved by setting the aggregation function to REPLACE_IF_NOT_NULL.
 
-**Create table**
+**Create Table**
 
-For the columns that need to be updated, set the aggregation function to `REPLACE_IF_NOT_NULL`.
+Set the aggregation function of the fields that need to be updated to `REPLACE_IF_NOT_NULL`.
 
-```Plain
+```sql
 CREATE TABLE order_tbl (
   order_id int(11) NULL,
   order_amount int(11) REPLACE_IF_NOT_NULL NULL,
@@ -52,38 +50,32 @@ DISTRIBUTED BY HASH(order_id) BUCKETS 1
 PROPERTIES (
 "replication_allocation" = "tag.location.default: 1"
 );
-+----------+--------------+-----------------+
-| order_id | order_amount | order_status    |
-+----------+--------------+-----------------+
-| 1        |          100 | Pending Payment |
-+----------+--------------+-----------------+
-1 row in set (0.01 sec)
 ```
 
-**Ingest data**
+**Data Insertion**
 
-For Stream Load, Broker Load, Routine Load, or INSERT INTO, you can directly write the updates to the fields.
+Whether it is Stream Load, Broker Load, Routine Load, or `INSERT INTO`, directly write the data of the fields to be updated.
 
 **Example**
 
-Using the same example as above, the corresponding Stream Load command would be (no additional headers required):
+Similar to the previous example, the corresponding Stream Load command is (no additional header required):
 
 ```shell
 $ cat update.csv
 
 1,To be shipped
 
-$ curl  --location-trusted -u root: -H "column_separator:," -H "columns:order_id,order_status" -T /tmp/update.csv http://127.0.0.1:8030/api/db1/order_tbl/_stream_load
+curl  --location-trusted -u root: -H "column_separator:," -H "columns:order_id,order_status" -T /tmp/update.csv http://127.0.0.1:8030/api/db1/order_tbl/_stream_load
 ```
 
-The corresponding `INSERT INTO` statement would be (no additional session variables required):
+The corresponding `INSERT INTO` statement is (no additional session variable settings required):
 
-```Plain
-INSERT INTO order_tbl (order_id, order_status) values (1,'Delivery Pending');
+```sql
+INSERT INTO order_tbl (order_id, order_status) values (1,'Shipped');
 ```
 
-## Note
+## Notes on Partial Column Updates
 
-The Aggregate Key model does not perform additional data processing during data writing, so the writing performance in this model is the same as other models. However, aggregation during queries can result in performance loss. Typical aggregation queries can be 5~10 times slower than queries on Merge-on-Write tables in the Unique Key model.
+The Aggregate Key model does not perform any additional processing during the write process, so the write performance is not affected and is the same as normal data load. However, the cost of aggregation during query is relatively high, and the typical aggregation query performance is 5-10 times lower than the Merge-on-Write implementation of the Unique Key model.
 
-Under this circumstance, users cannot set a field from non-NULL to NULL, because NULL values written will be automatically neglected by the REPLACE_IF_NOT_NULL aggregation function.
+Since the `REPLACE_IF_NOT_NULL` aggregation function only takes effect when the value is not NULL, users cannot change a field value to NULL.
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/update/update-of-aggregate-model.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/update/update-of-aggregate-model.md
index fe2fbe3f98..b1ad17ce72 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/update/update-of-aggregate-model.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/update/update-of-aggregate-model.md
@@ -26,7 +26,7 @@ under the License.
 
 这篇文档主要介绍 Doris 聚合模型上基于导入的更新。
 
-## 所有列更新
+## 整行更新
 
 使用 Doris 支持的 Stream Load，Broker Load，Routine Load，Insert Into 等导入方式，往聚合模型（Agg 模型）中进行数据导入时，都会将新的值与旧的聚合值，根据列的聚合函数产出新的聚合值，这个值可能是插入时产出，也可能是异步 Compaction 时产出，但是用户查询时，都会得到一样的返回值。
 
@@ -50,12 +50,6 @@ DISTRIBUTED BY HASH(order_id) BUCKETS 1
 PROPERTIES (
 "replication_allocation" = "tag.location.default: 1"
 );
-+----------+--------------+--------------+
-| order_id | order_amount | order_status |
-+----------+--------------+--------------+
-| 1        |          100 | 待付款        |
-+----------+--------------+--------------+
-1 row in set (0.01 sec)
 ```
 
 **数据写入**
@@ -84,4 +78,4 @@ INSERT INTO order_tbl (order_id, order_status) values (1,'待发货');
 
 Aggregate Key 模型在写入过程中不做任何额外处理，所以写入性能不受影响，与普通的数据导入相同。但是在查询时进行聚合的代价较大，典型的聚合查询性能相比 Unique Key 模型的 Merge-on-Write 实现会有 5-10 倍的下降。
 
-用户无法通过将某个字段由非 NULL 设置为 NULL，写入的 NULL 值在`REPLACE_IF_NOT_NULL`聚合函数的处理中会自动忽略。
+由于 `REPLACE_IF_NOT_NULL` 聚合函数仅在非 NULL 值时才会生效，因此用户无法将某个字段值修改为NULL值。
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/data-operate/update/update-of-aggregate-model.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/data-operate/update/update-of-aggregate-model.md
index fe2fbe3f98..b1ad17ce72 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/data-operate/update/update-of-aggregate-model.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/data-operate/update/update-of-aggregate-model.md
@@ -26,7 +26,7 @@ under the License.
 
 这篇文档主要介绍 Doris 聚合模型上基于导入的更新。
 
-## 所有列更新
+## 整行更新
 
 使用 Doris 支持的 Stream Load，Broker Load，Routine Load，Insert Into 等导入方式，往聚合模型（Agg 模型）中进行数据导入时，都会将新的值与旧的聚合值，根据列的聚合函数产出新的聚合值，这个值可能是插入时产出，也可能是异步 Compaction 时产出，但是用户查询时，都会得到一样的返回值。
 
@@ -50,12 +50,6 @@ DISTRIBUTED BY HASH(order_id) BUCKETS 1
 PROPERTIES (
 "replication_allocation" = "tag.location.default: 1"
 );
-+----------+--------------+--------------+
-| order_id | order_amount | order_status |
-+----------+--------------+--------------+
-| 1        |          100 | 待付款        |
-+----------+--------------+--------------+
-1 row in set (0.01 sec)
 ```
 
 **数据写入**
@@ -84,4 +78,4 @@ INSERT INTO order_tbl (order_id, order_status) values (1,'待发货');
 
 Aggregate Key 模型在写入过程中不做任何额外处理，所以写入性能不受影响，与普通的数据导入相同。但是在查询时进行聚合的代价较大，典型的聚合查询性能相比 Unique Key 模型的 Merge-on-Write 实现会有 5-10 倍的下降。
 
-用户无法通过将某个字段由非 NULL 设置为 NULL，写入的 NULL 值在`REPLACE_IF_NOT_NULL`聚合函数的处理中会自动忽略。
+由于 `REPLACE_IF_NOT_NULL` 聚合函数仅在非 NULL 值时才会生效，因此用户无法将某个字段值修改为NULL值。
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/data-operate/update/update-of-aggregate-model.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/data-operate/update/update-of-aggregate-model.md
index fe2fbe3f98..b1ad17ce72 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/data-operate/update/update-of-aggregate-model.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/data-operate/update/update-of-aggregate-model.md
@@ -26,7 +26,7 @@ under the License.
 
 这篇文档主要介绍 Doris 聚合模型上基于导入的更新。
 
-## 所有列更新
+## 整行更新
 
 使用 Doris 支持的 Stream Load，Broker Load，Routine Load，Insert Into 等导入方式，往聚合模型（Agg 模型）中进行数据导入时，都会将新的值与旧的聚合值，根据列的聚合函数产出新的聚合值，这个值可能是插入时产出，也可能是异步 Compaction 时产出，但是用户查询时，都会得到一样的返回值。
 
@@ -50,12 +50,6 @@ DISTRIBUTED BY HASH(order_id) BUCKETS 1
 PROPERTIES (
 "replication_allocation" = "tag.location.default: 1"
 );
-+----------+--------------+--------------+
-| order_id | order_amount | order_status |
-+----------+--------------+--------------+
-| 1        |          100 | 待付款        |
-+----------+--------------+--------------+
-1 row in set (0.01 sec)
 ```
 
 **数据写入**
@@ -84,4 +78,4 @@ INSERT INTO order_tbl (order_id, order_status) values (1,'待发货');
 
 Aggregate Key 模型在写入过程中不做任何额外处理，所以写入性能不受影响，与普通的数据导入相同。但是在查询时进行聚合的代价较大，典型的聚合查询性能相比 Unique Key 模型的 Merge-on-Write 实现会有 5-10 倍的下降。
 
-用户无法通过将某个字段由非 NULL 设置为 NULL，写入的 NULL 值在`REPLACE_IF_NOT_NULL`聚合函数的处理中会自动忽略。
+由于 `REPLACE_IF_NOT_NULL` 聚合函数仅在非 NULL 值时才会生效，因此用户无法将某个字段值修改为NULL值。
diff --git a/versioned_docs/version-2.1/data-operate/update/update-of-aggregate-model.md b/versioned_docs/version-2.1/data-operate/update/update-of-aggregate-model.md
index 1a7dedcad7..3fde759a85 100644
--- a/versioned_docs/version-2.1/data-operate/update/update-of-aggregate-model.md
+++ b/versioned_docs/version-2.1/data-operate/update/update-of-aggregate-model.md
@@ -1,7 +1,7 @@
----
+-
 {
-    "title": "Updating Data on Aggregate Key Model",
-    "language": "en"
+  "title": "Updating Data on Aggregate Key Model",
+  "language": "en"
 }
 ---
 
@@ -24,23 +24,21 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-# Update for Aggregate Load
-
-This guide is about ingestion-based data updates for the Aggregate Key model in Doris.
+This document primarily introduces how to update the Doris Aggregate model based on data load.
 
-## Update all columns
+## Whole Row Update
 
-When importing data into an Aggregate Key model in Doris by methods like Stream Load, Broker Load, Routine Load, and Insert Into, the new values are combined with the old values to produce new aggregated values based on the column's aggregation function. These values might be generated during insertion or produced asynchronously during compaction. However, when querying, users will always receive the same returned values.
+When loading data into the Aggregate model table using Doris-supported methods such as Stream Load, Broker Load, Routine Load, Insert Into, etc., the new values will be aggregated with the old values according to the column's aggregation function to produce new aggregated values. This value may be produced at the time of insertion or during asynchronous compaction, but users will get the same return value when querying.
 
-## Partial column update for Aggregate Key model
+## Partial Column Update of Aggregate Model
 
-Tables in the Aggregate Key model are primarily used in cases with pre-aggregation requirements rather than data updates, but Doris allows partial column updates for them, too. Simply set the aggregation function to `REPLACE_IF_NOT_NULL`.
+The Aggregate table is mainly used in pre-aggregation scenarios rather than data update scenarios, but partial column updates can be achieved by setting the aggregation function to REPLACE_IF_NOT_NULL.
 
-**Create table**
+**Create Table**
 
-For the columns that need to be updated, set the aggregation function to `REPLACE_IF_NOT_NULL`.
+Set the aggregation function of the fields that need to be updated to `REPLACE_IF_NOT_NULL`.
 
-```Plain
+```sql
 CREATE TABLE order_tbl (
   order_id int(11) NULL,
   order_amount int(11) REPLACE_IF_NOT_NULL NULL,
@@ -52,38 +50,32 @@ DISTRIBUTED BY HASH(order_id) BUCKETS 1
 PROPERTIES (
 "replication_allocation" = "tag.location.default: 1"
 );
-+----------+--------------+-----------------+
-| order_id | order_amount | order_status    |
-+----------+--------------+-----------------+
-| 1        |          100 | Pending Payment |
-+----------+--------------+-----------------+
-1 row in set (0.01 sec)
 ```
 
-**Ingest data**
+**Data Insertion**
 
-For Stream Load, Broker Load, Routine Load, or INSERT INTO, you can directly write the updates to the fields.
+Whether it is Stream Load, Broker Load, Routine Load, or `INSERT INTO`, directly write the data of the fields to be updated.
 
 **Example**
 
-Using the same example as above, the corresponding Stream Load command would be (no additional headers required):
+Similar to the previous example, the corresponding Stream Load command is (no additional header required):
 
 ```shell
 $ cat update.csv
 
 1,To be shipped
 
-$ curl  --location-trusted -u root: -H "column_separator:," -H "columns:order_id,order_status" -T /tmp/update.csv http://127.0.0.1:8030/api/db1/order_tbl/_stream_load
+curl  --location-trusted -u root: -H "column_separator:," -H "columns:order_id,order_status" -T /tmp/update.csv http://127.0.0.1:8030/api/db1/order_tbl/_stream_load
 ```
 
-The corresponding `INSERT INTO` statement would be (no additional session variables required):
+The corresponding `INSERT INTO` statement is (no additional session variable settings required):
 
-```Plain
-INSERT INTO order_tbl (order_id, order_status) values (1,'Delivery Pending');
+```sql
+INSERT INTO order_tbl (order_id, order_status) values (1,'Shipped');
 ```
 
-## Note
+## Notes on Partial Column Updates
 
-The Aggregate Key model does not perform additional data processing during data writing, so the writing performance in this model is the same as other models. However, aggregation during queries can result in performance loss. Typical aggregation queries can be 5~10 times slower than queries on Merge-on-Write tables in the Unique Key model.
+The Aggregate Key model does not perform any additional processing during the write process, so the write performance is not affected and is the same as normal data load. However, the cost of aggregation during query is relatively high, and the typical aggregation query performance is 5-10 times lower than the Merge-on-Write implementation of the Unique Key model.
 
-Under this circumstance, users cannot set a field from non-NULL to NULL, because NULL values written will be automatically neglected by the REPLACE_IF_NOT_NULL aggregation function.
+Since the `REPLACE_IF_NOT_NULL` aggregation function only takes effect when the value is not NULL, users cannot change a field value to NULL.
diff --git a/versioned_docs/version-3.0/data-operate/update/update-of-aggregate-model.md b/versioned_docs/version-3.0/data-operate/update/update-of-aggregate-model.md
index 4b5ef90675..3fde759a85 100644
--- a/versioned_docs/version-3.0/data-operate/update/update-of-aggregate-model.md
+++ b/versioned_docs/version-3.0/data-operate/update/update-of-aggregate-model.md
@@ -1,7 +1,7 @@
----
+-
 {
-    "title": "Updating Data on Aggregate Key Model",
-    "language": "en"
+  "title": "Updating Data on Aggregate Key Model",
+  "language": "en"
 }
 ---
 
@@ -24,23 +24,21 @@ specific language governing permissions and limitations
 under the License.
 -->
 
+This document primarily introduces how to update the Doris Aggregate model based on data load.
 
+## Whole Row Update
 
-This guide is about ingestion-based data updates for the Aggregate Key model in Doris.
-
-## Update all columns
-
-When importing data into an Aggregate Key model in Doris by methods like Stream Load, Broker Load, Routine Load, and Insert Into, the new values are combined with the old values to produce new aggregated values based on the column's aggregation function. These values might be generated during insertion or produced asynchronously during compaction. However, when querying, users will always receive the same returned values.
+When loading data into the Aggregate model table using Doris-supported methods such as Stream Load, Broker Load, Routine Load, Insert Into, etc., the new values will be aggregated with the old values according to the column's aggregation function to produce new aggregated values. This value may be produced at the time of insertion or during asynchronous compaction, but users will get the same return value when querying.
 
-## Partial column update for Aggregate Key model
+## Partial Column Update of Aggregate Model
 
-Tables in the Aggregate Key model are primarily used in cases with pre-aggregation requirements rather than data updates, but Doris allows partial column updates for them, too. Simply set the aggregation function to `REPLACE_IF_NOT_NULL`.
+The Aggregate table is mainly used in pre-aggregation scenarios rather than data update scenarios, but partial column updates can be achieved by setting the aggregation function to REPLACE_IF_NOT_NULL.
 
-**Create table**
+**Create Table**
 
-For the columns that need to be updated, set the aggregation function to `REPLACE_IF_NOT_NULL`.
+Set the aggregation function of the fields that need to be updated to `REPLACE_IF_NOT_NULL`.
 
-```Plain
+```sql
 CREATE TABLE order_tbl (
   order_id int(11) NULL,
   order_amount int(11) REPLACE_IF_NOT_NULL NULL,
@@ -52,38 +50,32 @@ DISTRIBUTED BY HASH(order_id) BUCKETS 1
 PROPERTIES (
 "replication_allocation" = "tag.location.default: 1"
 );
-+----------+--------------+-----------------+
-| order_id | order_amount | order_status    |
-+----------+--------------+-----------------+
-| 1        |          100 | Pending Payment |
-+----------+--------------+-----------------+
-1 row in set (0.01 sec)
 ```
 
-**Ingest data**
+**Data Insertion**
 
-For Stream Load, Broker Load, Routine Load, or INSERT INTO, you can directly write the updates to the fields.
+Whether it is Stream Load, Broker Load, Routine Load, or `INSERT INTO`, directly write the data of the fields to be updated.
 
 **Example**
 
-Using the same example as above, the corresponding Stream Load command would be (no additional headers required):
+Similar to the previous example, the corresponding Stream Load command is (no additional header required):
 
 ```shell
 $ cat update.csv
 
 1,To be shipped
 
-$ curl  --location-trusted -u root: -H "column_separator:," -H "columns:order_id,order_status" -T /tmp/update.csv http://127.0.0.1:8030/api/db1/order_tbl/_stream_load
+curl  --location-trusted -u root: -H "column_separator:," -H "columns:order_id,order_status" -T /tmp/update.csv http://127.0.0.1:8030/api/db1/order_tbl/_stream_load
 ```
 
-The corresponding `INSERT INTO` statement would be (no additional session variables required):
+The corresponding `INSERT INTO` statement is (no additional session variable settings required):
 
-```Plain
-INSERT INTO order_tbl (order_id, order_status) values (1,'Delivery Pending');
+```sql
+INSERT INTO order_tbl (order_id, order_status) values (1,'Shipped');
 ```
 
-## Note
+## Notes on Partial Column Updates
 
-The Aggregate Key model does not perform additional data processing during data writing, so the writing performance in this model is the same as other models. However, aggregation during queries can result in performance loss. Typical aggregation queries can be 5~10 times slower than queries on Merge-on-Write tables in the Unique Key model.
+The Aggregate Key model does not perform any additional processing during the write process, so the write performance is not affected and is the same as normal data load. However, the cost of aggregation during query is relatively high, and the typical aggregation query performance is 5-10 times lower than the Merge-on-Write implementation of the Unique Key model.
 
-Under this circumstance, users cannot set a field from non-NULL to NULL, because NULL values written will be automatically neglected by the REPLACE_IF_NOT_NULL aggregation function.
+Since the `REPLACE_IF_NOT_NULL` aggregation function only takes effect when the value is not NULL, users cannot change a field value to NULL.


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org
