This is an automated email from the ASF dual-hosted git repository. luzhijing pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/doris-website.git
The following commit(s) were added to refs/heads/master by this push: new 9a4fea77de2 [blog](update) Update data partition blog (#996) 9a4fea77de2 is described below commit 9a4fea77de2beaf9900e5569b53585ded62d33f4 Author: KassieZ <139741991+kass...@users.noreply.github.com> AuthorDate: Thu Aug 15 10:29:14 2024 +0800 [blog](update) Update data partition blog (#996) --- blog/apache-doris-vs-rockset.md | 2 +- blog/auto-partition-in-apache-doris.md | 483 +++++++++++++++++++++ ...olution-of-the-apache-doris-execution-engine.md | 2 - blog/migrate-lakehouse-from-bigquery-to-doris.md | 2 +- blog/release-note-2.0.14.md | 2 +- src/components/recent-blogs/recent-blogs.data.ts | 10 +- src/constant/newsletter.data.ts | 15 +- static/images/auto-partition-in-apache-doris.jpg | Bin 0 -> 540079 bytes .../images/auto-partition-ingestion-workflow-2.png | Bin 0 -> 85479 bytes .../images/auto-partition-ingestion-workflow.png | Bin 0 -> 125212 bytes static/images/auto-partition-performance-2.png | Bin 0 -> 73728 bytes static/images/auto-partition-performance.png | Bin 0 -> 167793 bytes ...olution-of-partitioning-strategies-in-Doris.png | Bin 0 -> 80474 bytes static/images/performance-comparison.png | Bin 0 -> 136496 bytes 14 files changed, 499 insertions(+), 17 deletions(-) diff --git a/blog/apache-doris-vs-rockset.md b/blog/apache-doris-vs-rockset.md index 79e7e82d98a..a7a443be944 100644 --- a/blog/apache-doris-vs-rockset.md +++ b/blog/apache-doris-vs-rockset.md @@ -7,7 +7,7 @@ 'author': 'Zaki Lu', 'tags': ['Top News'], 'picked': "true", - 'order': "3", + 'order': "4", "image": '/images/doris-vs-rockset.jpeg' } diff --git a/blog/auto-partition-in-apache-doris.md b/blog/auto-partition-in-apache-doris.md new file mode 100644 index 00000000000..ec1be7fe55a --- /dev/null +++ b/blog/auto-partition-in-apache-doris.md @@ -0,0 +1,483 @@ +--- +{ + 'title': 'Automatic and flexible data sharding: Auto Partition in Apache Doris', + 'summary': "Apache Doris 2.1.0 introduces Auto Partition. It supports partitioning data by RANGE or by LIST and further enhances flexibility on top of automatic partitioning.", + 'description': "Apache Doris 2.1.0 introduces Auto Partition. It supports partitioning data by RANGE or by LIST and further enhances flexibility on top of automatic partitioning.", + 'date': '2024-08-14', + 'author': 'Apache Doris', + 'tags': ['Tech Sharing'], + 'picked': "true", + 'order': "1", + "image": '/images/auto-partition-in-apache-doris.jpg' +} + +--- + +<!-- +Licensed to the Apache Software Foundation (ASF) under one +or more contributor license agreements. See the NOTICE file +distributed with this work for additional information +regarding copyright ownership. The ASF licenses this file +to you under the Apache License, Version 2.0 (the +"License"); you may not use this file except in compliance +with the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, +software distributed under the License is distributed on an +"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +KIND, either express or implied. See the License for the +specific language governing permissions and limitations +under the License. +--> + +To handle large datasets, distributed databases introduce strategies like partitioning and bucketing. Data is divided into smaller units based on specific rules and distributed across different nodes, so databases can perform parallel processing for higher performance and data management flexibility. + +Like in many databases, [Apache Doris](https://doris.apache.org) shards data into partitions, and then a partition is further divided into buckets. **Partitions** are typically defined by time or other continuous values. This allows query engines to quickly locate the target data during queries by pruning irrelevant data ranges. + +**Bucketing**, on the other hand, distributes data based on the hash values of one or more columns, which prevents data skew. + +Prior to version [2.1.0](https://doris.apache.org/blog/release-note-2.1.0), there are two way you can create data partitions in Apache Doris: + +- **[Manual Partition](https://doris.apache.org/docs/table-design/data-partition/#manual-partitioning)**: Users specify the partitions in the table creation statement, or modify them through DDL statements afterwards. + +- **[Dynamic Partition](https://doris.apache.org/docs/table-design/data-partition/#dynamic-partition)**: The system automatically maintains partitions within a pre-defined range based on the data ingestion time. + +In Apache Doris 2.1.0, we have introduced **[Auto Partition](https://doris.apache.org/docs/table-design/data-partition/#auto-partition)**. It supports partitioning data by RANGE or by LIST and further enhances flexibility on top of automatic partitioning. + +## Evolution of partitioning strategies in Doris + +In the design of data distribution, we focus more on partition planning, because the choice of partition columns and partition intervals heavily depends on the actual data distribution patterns, and a good partition design can largely improve the query and storage efficiency of the table. + +In Doris, the data table is divided into partitions and then buckets in a hierarchical manner. The data within the same bucket then forms a data **tablet**, which is the minimum physical storage unit in Doris for data replication, inter-cluster data scheduling, and load balancing. + + + + +### Manual Partition + +Doris allows users to manually create data partitions by RANGE and by LIST. + +For time-stamped data like logs and transaction records, users typically create partitions based on the time dimension. Here's an example of the CREATE TABLE statement: + +```sql +CREATE TABLE IF NOT EXISTS example_range_tbl +( + `user_id` LARGEINT NOT NULL COMMENT "User ID", + `date` DATE NOT NULL COMMENT "Data import date", + `timestamp` DATETIME NOT NULL COMMENT "Data import timestamp", + `city` VARCHAR(20) COMMENT "Location of user", + `age` SMALLINT COMMENT "Age of user", + `sex` TINYINT COMMENT "Sex of user", + `last_visit_date` DATETIME REPLACE DEFAULT "1970-01-01 00:00:00" COMMENT "Last visit date of user", + `cost` BIGINT SUM DEFAULT "0" COMMENT "User consumption", + `max_dwell_time` INT MAX DEFAULT "0" COMMENT "Maximum dwell time of user", + `min_dwell_time` INT MIN DEFAULT "99999" COMMENT "Minimum dwell time of user" +) +ENGINE=OLAP +AGGREGATE KEY(`user_id`, `date`, `timestamp`, `city`, `age`, `sex`) +PARTITION BY RANGE(`date`) +( + PARTITION `p201701` VALUES LESS THAN ("2017-02-01"), + PARTITION `p201702` VALUES LESS THAN ("2017-03-01"), + PARTITION `p201703` VALUES LESS THAN ("2017-04-01"), + PARTITION `p2018` VALUES [("2018-01-01"), ("2019-01-01")) +) +DISTRIBUTED BY HASH(`user_id`) BUCKETS 16 +PROPERTIES +( + "replication_num" = "1" +); +``` + +The table is partitioned by the data import date `date`, and 4 partitions have been pre-created. Within each partition, the data is further divided into 16 buckets based on the hash value of the `user_id`. + +With this partitioning and bucketing design, when querying data from 2018 onwards, the system only need to scan the `p2018` partition. This is what the query SQL looks like: + +```sql +mysql> desc select count() from example_range_tbl where date >= '20180101'; ++--------------------------------------------------------------------------------------+ +| Explain String(Nereids Planner) | ++--------------------------------------------------------------------------------------+ +| PLAN FRAGMENT 0 | +| OUTPUT EXPRS: | +| count(*)[#11] | +| PARTITION: UNPARTITIONED | +| | +| ...... | +| | +| 0:VOlapScanNode(193) | +| TABLE: test.example_range_tbl(example_range_tbl), PREAGGREGATION: OFF. | +| PREDICATES: (date[#1] >= '2018-01-01') | +| partitions=1/4 (p2018), tablets=16/16, tabletList=561490,561492,561494 ... | +| cardinality=0, avgRowSize=0.0, numNodes=1 | +| pushAggOp=NONE | +| | ++--------------------------------------------------------------------------------------+ +``` + +If the data is distributed unevenly across partitions, the hash-based bucketing mechanism can further divide the data based on the `user_id`. This helps to avoid load imbalance on some machines during querying and storage. + +However, in real-world business scenarios, one cluster may have tens of thousands of tables, which means it is impossible to manage them manually. + +```sql +CREATE TABLE `DAILY_TRADE_VALUE` +( + `TRADE_DATE` datev2 NOT NULL COMMENT 'Trade date', + `TRADE_ID` varchar(40) NOT NULL COMMENT 'Trade ID', + ...... +) +UNIQUE KEY(`TRADE_DATE`, `TRADE_ID`) +PARTITION BY RANGE(`TRADE_DATE`) +( + PARTITION p_200001 VALUES [('2000-01-01'), ('2000-02-01')), + PARTITION p_200002 VALUES [('2000-02-01'), ('2000-03-01')), + PARTITION p_200003 VALUES [('2000-03-01'), ('2000-04-01')), + PARTITION p_200004 VALUES [('2000-04-01'), ('2000-05-01')), + PARTITION p_200005 VALUES [('2000-05-01'), ('2000-06-01')), + PARTITION p_200006 VALUES [('2000-06-01'), ('2000-07-01')), + PARTITION p_200007 VALUES [('2000-07-01'), ('2000-08-01')), + PARTITION p_200008 VALUES [('2000-08-01'), ('2000-09-01')), + PARTITION p_200009 VALUES [('2000-09-01'), ('2000-10-01')), + PARTITION p_200010 VALUES [('2000-10-01'), ('2000-11-01')), + PARTITION p_200011 VALUES [('2000-11-01'), ('2000-12-01')), + PARTITION p_200012 VALUES [('2000-12-01'), ('2001-01-01')), + PARTITION p_200101 VALUES [('2001-01-01'), ('2001-02-01')), + ...... +) +DISTRIBUTED BY HASH(`TRADE_DATE`) BUCKETS 10 +PROPERTIES ( + ...... +); +``` + +In the above example, data is partitioned on a monthly basis. This requires the database administrator (DBA) to manually add a new partition each month and maintain table schema regularly. Imagine the case of real-time data processing, where you might need to create partitions daily or even hourly, manually doing this is no long a choice. That's why we introduced Dynamic Partition. + +### Dynamic Partition + +By Dynamic Partition, Doris automatically creates and reclaims data partitions as long as the user specifies the partition unit, the number of historical partitions, and the number of future partitions. This functionality relies on a fixed thread on the Doris Frontend. It continuously polls and checks for new partitions to be created or old partitions to be reclaimed, and updates the partition schema of the table. + +This is an example CREATE TABLE statement for a table which is partitioned by day. The `start` and `end` parameters are set to `-7` and `3`, respectively, meaning that data partitions for the next 3 days will be pre-created and the historical partitions that are older than 7 days will be reclaimed. + +```sql +CREATE TABLE `DAILY_TRADE_VALUE` +( + `TRADE_DATE` datev2 NOT NULL COMMENT 'Trade date', + `TRADE_ID` varchar(40) NOT NULL COMMENT 'Trade ID', + ...... +) +UNIQUE KEY(`TRADE_DATE`, `TRADE_ID`) +PARTITION BY RANGE(`TRADE_DATE`) () +DISTRIBUTED BY HASH(`TRADE_DATE`) BUCKETS 10 +PROPERTIES ( + "dynamic_partition.enable" = "true", + "dynamic_partition.time_unit" = "DAY", + "dynamic_partition.start" = "-7", + "dynamic_partition.end" = "3", + "dynamic_partition.prefix" = "p", + "dynamic_partition.buckets" = "10" +); +``` + +Over time, the table will always maintain partitions within the range of `[current date - 7, current date + 3]`. Dynamic Partition is particularly useful for real-time data ingestion scenarios, such as when the ODS (Operational Data Store) layer directly receives data from external sources like Kafka. + +The `start` and `end` parameters define a fixed range for the partitions, allowing the user to manage the partitions only within this range. However, if the user needs to include more historical data, they would have to dial up the `start` value, and that could lead to unnecessary metadata overhead in the cluster. + +Therefore, when applying Dynamic Partition, there is a trade-off between the convenience and efficiency of metadata management. + +## Developers' words + +As the complexity of business adds up, Dynamic Partition becomes inadequate because: + +- It only supports partitioning by RANGE but not by LIST. + +- It can only be applied to the current real-world timestamps. + +- It only supports a single continuous partition range, and cannot accommodate partitions outside of that range. + +Given these functional limitations, we started to plan a new partitioning mechanism that can both automate partition management and simplify data table maintenance. + +We figured out that the ideal partitioning implementation should: + +- Save the need for manually creating partitions after table creation; + +- Be able to accommodate all ingested data in corresponding partitions. + +**The former stands for automation and the latter for flexibility. The essence of realizing them both is associating partition creation with the actual data.** + +Then we started to think about: What if we hold off the creation of partitions until the data is ingested, rather than doing it during table creation or through regular polling. Instead of pre-constructing the partition distribution, we can define the "data-to-partition" mapping rules, so the partitions are created after data arrives. + +Compared to Manual Partition, this whole process would be fully automated, eliminating the need for human maintenance. Compared to Dynamic Partition, it avoids having partitions that are not used, or partitions that are needed but not present. + +## Auto Partition + +With [Apache Doris 2.1.0](https://doris.apache.org/blog/release-note-2.1.0), we bring the above plan into fruition. During data ingestion, Doris creates data partitions based on the configured rules. The Doris Backend nodes that are responsible for data processing and distribution will attempt to find the appropriate partition for each row of data in the DataSink operator of the execution plan. It no longer filters out data that does not fit into any existing partition or reports an erro [...] + +### Auto Partition by RANGE + +Auto Partition by RANGE provides an optimized partitioning solution based on the time dimension. It is more flexible than Dynamic Partition in terms of parameter configuration. The syntax for it is as follows: + +```sql +AUTO PARTITION BY RANGE (FUNC_CALL_EXPR) +() +FUNC_CALL_EXPR ::= DATE_TRUNC ( <partition_column>, '<interval>' ) +``` + +The `<partition_column>` above is the partition column (i.e., the column that the partitioning is based on). `<interval>`specifies the partition unit, which is the desired width of each partition. + +For example, if the partition column is `k0` and you want to partition by month, the partition statement would be `AUTO PARTITION BY RANGE (DATE_TRUNC(k0, 'month'))`. For all the imported data, the system will call `DATE_TRUNC(k0, 'month')` to calculate the left endpoint of the partition, and then the right endpoint by adding one `interval`. + +Now, we can apply Auto Partition to the `DAILY_TRADE_VALUE` table introduced in the previous section on Dynamic Partition. + +```sql +CREATE TABLE DAILY_TRADE_VALUE +( + `TRADE_DATE` DATEV2 NOT NULL COMMENT 'Trade Date', + `TRADE_ID` VARCHAR(40) NOT NULL COMMENT 'Trade ID', + ...... +) +AUTO PARTITION BY RANGE (DATE_TRUNC(`TRADE_DATE`, 'month')) +() +DISTRIBUTED BY HASH(`TRADE_DATE`) BUCKETS 10 +PROPERTIES +( + ...... +); +``` + +After importing some data, these are the partitions we get: + +```sql +mysql> show partitions from DAILY_TRADE_VALUE; +Empty set (0.10 sec) + +mysql> insert into DAILY_TRADE_VALUE values ('2015-01-01', 1), ('2020-01-01', 2), ('2024-03-05', 10000), ('2024-03-06', 10001); +Query OK, 4 rows affected (0.24 sec) +{'label':'label_2a7353a3f991400e_ae731988fa2bc568', 'status':'VISIBLE', 'txnId':'85097'} + +mysql> show partitions from DAILY_TRADE_VALUE; ++-------------+-----------------+----------------+---------------------+--------+--------------+--------------------------------------------------------------------------------+-----------------+---------+----------------+---------------+---------------------+---------------------+--------------------------+----------+------------+-------------------------+-----------+--------------------+--------------+ +| PartitionId | PartitionName | VisibleVersion | VisibleVersionTime | State | PartitionKey | Range | DistributionKey | Buckets | ReplicationNum | StorageMedium | CooldownTime | RemoteStoragePolicy | LastConsistencyCheckTime | DataSize | IsInMemory | ReplicaAllocation | IsMutable | SyncWithBaseTables | UnsyncTables | ++-------------+-----------------+----------------+---------------------+--------+--------------+--------------------------------------------------------------------------------+-----------------+---------+----------------+---------------+---------------------+---------------------+--------------------------+----------+------------+-------------------------+-----------+--------------------+--------------+ +| 588395 | p20150101000000 | 2 | 2024-06-01 19:02:40 | NORMAL | TRADE_DATE | [types: [DATEV2]; keys: [2015-01-01]; ..types: [DATEV2]; keys: [2015-02-01]; ) | TRADE_DATE | 10 | 1 | HDD | 9999-12-31 23:59:59 | | NULL | 0.000 | false | tag.location.default: 1 | true | true | NULL | +| 588437 | p20200101000000 | 2 | 2024-06-01 19:02:40 | NORMAL | TRADE_DATE | [types: [DATEV2]; keys: [2020-01-01]; ..types: [DATEV2]; keys: [2020-02-01]; ) | TRADE_DATE | 10 | 1 | HDD | 9999-12-31 23:59:59 | | NULL | 0.000 | false | tag.location.default: 1 | true | true | NULL | +| 588416 | p20240301000000 | 2 | 2024-06-01 19:02:40 | NORMAL | TRADE_DATE | [types: [DATEV2]; keys: [2024-03-01]; ..types: [DATEV2]; keys: [2024-04-01]; ) | TRADE_DATE | 10 | 1 | HDD | 9999-12-31 23:59:59 | | NULL | 0.000 | false | tag.location.default: 1 | true | true | NULL | ++-------------+-----------------+----------------+---------------------+--------+--------------+--------------------------------------------------------------------------------+-----------------+---------+----------------+---------------+---------------------+---------------------+--------------------------+----------+------------+-------------------------+-----------+--------------------+--------------+ +3 rows in set (0.09 sec) +``` + +As is shown, partitions are automatically created for the imported data, and it doesn't create partitions that are beyond the range of the existing data. + +### Auto Partition by LIST + +Auto Partition by LIST is to shard data based on non-time-based dimensions, such as `region` and `department`. It fills that gap for Dynamic Partition, which does not support data partitioning by LIST. + +Auto Partition by RANGE provides an optimized partitioning solution based on the time dimension. It is more flexible than Dynamic Partition in terms of parameter configuration. The syntax for it is as follows: + +```sql +AUTO PARTITION BY LIST (`partition_col`) +() +``` + +This is an example of Auto Partition by LIST using `city` as the partition column: + +```SQL +mysql> CREATE TABLE `str_table` ( + -> `city` VARCHAR NOT NULL, + -> ...... + -> ) + -> DUPLICATE KEY(`city`) + -> AUTO PARTITION BY LIST (`city`) + -> () + -> DISTRIBUTED BY HASH(`city`) BUCKETS 10 + -> PROPERTIES ( + -> ...... + -> ); +Query OK, 0 rows affected (0.09 sec) + +mysql> insert into str_table values ("Denver"), ("Boston"), ("Los_Angeles"); +Query OK, 3 rows affected (0.25 sec) + +mysql> show partitions from str_table; ++-------------+-----------------+----------------+---------------------+--------+--------------+-------------------------------------------+-----------------+---------+----------------+---------------+---------------------+---------------------+--------------------------+----------+------------+-------------------------+-----------+--------------------+--------------+ +| PartitionId | PartitionName | VisibleVersion | VisibleVersionTime | State | PartitionKey | Range | DistributionKey | Buckets | ReplicationNum | StorageMedium | CooldownTime | RemoteStoragePolicy | LastConsistencyCheckTime | DataSize | IsInMemory | ReplicaAllocation | IsMutable | SyncWithBaseTables | UnsyncTables | ++-------------+-----------------+----------------+---------------------+--------+--------------+-------------------------------------------+-----------------+---------+----------------+---------------+---------------------+---------------------+--------------------------+----------+------------+-------------------------+-----------+--------------------+--------------+ +| 589685 | pDenver7 | 2 | 2024-06-01 20:12:37 | NORMAL | city | [types: [VARCHAR]; keys: [Denver]; ] | city | 10 | 1 | HDD | 9999-12-31 23:59:59 | | NULL | 0.000 | false | tag.location.default: 1 | true | true | NULL | +| 589643 | pLos5fAngeles11 | 2 | 2024-06-01 20:12:37 | NORMAL | city | [types: [VARCHAR]; keys: [Los_Angeles]; ] | city | 10 | 1 | HDD | 9999-12-31 23:59:59 | | NULL | 0.000 | false | tag.location.default: 1 | true | true | NULL | +| 589664 | pBoston8 | 2 | 2024-06-01 20:12:37 | NORMAL | city | [types: [VARCHAR]; keys: [Boston]; ] | city | 10 | 1 | HDD | 9999-12-31 23:59:59 | | NULL | 0.000 | false | tag.location.default: 1 | true | true | NULL | ++-------------+-----------------+----------------+---------------------+--------+--------------+-------------------------------------------+-----------------+---------+----------------+---------------+---------------------+---------------------+--------------------------+----------+------------+-------------------------+-----------+--------------------+--------------+ +3 rows in set (0.10 sec) +``` + +After inserting data for the cities of Denver, Boston, and Los Angeles, the system automatically created corresponding partitions based on the city names. Previously, this type of custom partitioning could only be achieved through manual DDL statements. This is how Auto Partition by LIST simplifies database maintenance. + +### Tips & notes + +**Manually adjust historical partitions** + +For tables that receive both real-time data and occasional historical updates, since Auto Partition does not automatically reclaim historical partitions, we recommend two options: + +- Use Auto Partition, which will automatically create partitions for the occasional historical data updates. + +- Use Auto Partition and manually create a `LESS THAN` partition to accommodate the historical updates. This allows for a clearer separation of historical and real-time data, and makes data management easier. + +```sql +mysql> CREATE TABLE DAILY_TRADE_VALUE + -> ( + -> `TRADE_DATE` DATEV2 NOT NULL COMMENT 'Trade Date', + -> `TRADE_ID` VARCHAR(40) NOT NULL COMMENT 'Trade ID' + -> ) + -> AUTO PARTITION BY RANGE (DATE_TRUNC(`TRADE_DATE`, 'DAY')) + -> ( + -> PARTITION `pHistory` VALUES LESS THAN ("2024-01-01") + -> ) + -> DISTRIBUTED BY HASH(`TRADE_DATE`) BUCKETS 10 + -> PROPERTIES + -> ( + -> "replication_num" = "1" + -> ); +Query OK, 0 rows affected (0.11 sec) + +mysql> insert into DAILY_TRADE_VALUE values ('2015-01-01', 1), ('2020-01-01', 2), ('2024-03-05', 10000), ('2024-03-06', 10001); +Query OK, 4 rows affected (0.25 sec) +{'label':'label_96dc3d20c6974f4a_946bc1a674d24733', 'status':'VISIBLE', 'txnId':'85092'} + +mysql> show partitions from DAILY_TRADE_VALUE; ++-------------+-----------------+----------------+---------------------+--------+--------------+--------------------------------------------------------------------------------+-----------------+---------+----------------+---------------+---------------------+---------------------+--------------------------+----------+------------+-------------------------+-----------+--------------------+--------------+ +| PartitionId | PartitionName | VisibleVersion | VisibleVersionTime | State | PartitionKey | Range | DistributionKey | Buckets | ReplicationNum | StorageMedium | CooldownTime | RemoteStoragePolicy | LastConsistencyCheckTime | DataSize | IsInMemory | ReplicaAllocation | IsMutable | SyncWithBaseTables | UnsyncTables | ++-------------+-----------------+----------------+---------------------+--------+--------------+--------------------------------------------------------------------------------+-----------------+---------+----------------+---------------+---------------------+---------------------+--------------------------+----------+------------+-------------------------+-----------+--------------------+--------------+ +| 577871 | pHistory | 2 | 2024-06-01 08:53:49 | NORMAL | TRADE_DATE | [types: [DATEV2]; keys: [0000-01-01]; ..types: [DATEV2]; keys: [2024-01-01]; ) | TRADE_DATE | 10 | 1 | HDD | 9999-12-31 23:59:59 | | NULL | 0.000 | false | tag.location.default: 1 | true | true | NULL | +| 577940 | p20240305000000 | 2 | 2024-06-01 08:53:49 | NORMAL | TRADE_DATE | [types: [DATEV2]; keys: [2024-03-05]; ..types: [DATEV2]; keys: [2024-03-06]; ) | TRADE_DATE | 10 | 1 | HDD | 9999-12-31 23:59:59 | | NULL | 0.000 | false | tag.location.default: 1 | true | true | NULL | +| 577919 | p20240306000000 | 2 | 2024-06-01 08:53:49 | NORMAL | TRADE_DATE | [types: [DATEV2]; keys: [2024-03-06]; ..types: [DATEV2]; keys: [2024-03-07]; ) | TRADE_DATE | 10 | 1 | HDD | 9999-12-31 23:59:59 | | NULL | 0.000 | false | tag.location.default: 1 | true | true | NULL | ++-------------+-----------------+----------------+---------------------+--------+--------------+--------------------------------------------------------------------------------+-----------------+---------+----------------+---------------+---------------------+---------------------+--------------------------+----------+------------+-------------------------+-----------+--------------------+--------------+ +3 rows in set (0.10 sec) +``` + +**NULL partition** + +With Auto Partition by LIST, Doris supports storing NULL values in NULL partitions. For example: + +```sql +mysql> CREATE TABLE list_nullable + -> ( + -> `str` varchar NULL + -> ) + -> AUTO PARTITION BY LIST (`str`) + -> () + -> DISTRIBUTED BY HASH(`str`) BUCKETS auto + -> PROPERTIES + -> ( + -> "replication_num" = "1" + -> ); +Query OK, 0 rows affected (0.10 sec) + +mysql> insert into list_nullable values ('123'), (''), (NULL); +Query OK, 3 rows affected (0.24 sec) +{'label':'label_f5489769c2f04f0d_bfb65510f9737fff', 'status':'VISIBLE', 'txnId':'85089'} + +mysql> show partitions from list_nullable; ++-------------+---------------+----------------+---------------------+--------+--------------+------------------------------------+-----------------+---------+----------------+---------------+---------------------+---------------------+--------------------------+----------+------------+-------------------------+-----------+--------------------+--------------+ +| PartitionId | PartitionName | VisibleVersion | VisibleVersionTime | State | PartitionKey | Range | DistributionKey | Buckets | ReplicationNum | StorageMedium | CooldownTime | RemoteStoragePolicy | LastConsistencyCheckTime | DataSize | IsInMemory | ReplicaAllocation | IsMutable | SyncWithBaseTables | UnsyncTables | ++-------------+---------------+----------------+---------------------+--------+--------------+------------------------------------+-----------------+---------+----------------+---------------+---------------------+---------------------+--------------------------+----------+------------+-------------------------+-----------+--------------------+--------------+ +| 577297 | pX | 2 | 2024-06-01 08:19:21 | NORMAL | str | [types: [VARCHAR]; keys: [NULL]; ] | str | 10 | 1 | HDD | 9999-12-31 23:59:59 | | NULL | 0.000 | false | tag.location.default: 1 | true | true | NULL | +| 577276 | p0 | 2 | 2024-06-01 08:19:21 | NORMAL | str | [types: [VARCHAR]; keys: []; ] | str | 10 | 1 | HDD | 9999-12-31 23:59:59 | | NULL | 0.000 | false | tag.location.default: 1 | true | true | NULL | +| 577255 | p1233 | 2 | 2024-06-01 08:19:21 | NORMAL | str | [types: [VARCHAR]; keys: [123]; ] | str | 10 | 1 | HDD | 9999-12-31 23:59:59 | | NULL | 0.000 | false | tag.location.default: 1 | true | true | NULL | ++-------------+---------------+----------------+---------------------+--------+--------------+------------------------------------+-----------------+---------+----------------+---------------+---------------------+---------------------+--------------------------+----------+------------+-------------------------+-----------+--------------------+--------------+ +3 rows in set (0.11 sec) +``` + +However, Auto Partition by RANGE does not support NULL partitions, because the NULL values will be stored in the smallest `LESS THAN` partition, and it is impossible to reliably determine the appropriate range for it. If Auto Partition were to create a NULL partition with a range of (-INFINITY, MIN_VALUE), there would be a risk of this partition being inadvertently deleted in production, as the MIN_VALUE boundary may not accurately represent the intended business logic. + +### Summary + +Auto Partition covers most of the use cases of Dynamic Partition, while introducing the benefit of upfront partition rule definition. Once the rules are defined, the bulk of partition creation work is automatically handled by Doris instead of a DBA. + +Before utilizing Auto Partition, it's important to understand the relevant limitations: + +1. Auto Partition by LIST supports partitioning based on **multiple columns**, but each automatically created partition only contains one single value, and the partition name cannot exceed 50 characters in length. Note that the partition names follow specific naming conventions, which have particular implications for metadata management. That means not all of the 50-character space is at the user's disposal. + +2. Auto Partition by RANGE only supports a **single partition column**, which must be of type **DATE** or **DATETIME**. + +3. Auto Partition by LIST supports **NULLABLE** partition column and inserting NULL values. Auto Partition by RANGE does not support NULLABLE partition column. + +4. It is not recommended to use Auto Partition in conjunction with Dynamic Partition after Apache Doris 2.1.3. + +## Performance comparison + +The main functional differences between Auto Partition and Dynamic Partition lie in partition creation and deletion, supported partition types, and their impact on import performance. + +Dynamic Partition uses fixed threads to periodically create and reclaim partitions. It only supports partitioning by RANGE. In contrast, Auto Partition supports both partitioning by RANGE and by LIST. It automatically creates partitions on-demand based on specific rules during data ingestion, providing a higher level of automation and flexibility. + +Dynamic Partition does not slow down data ingestion speed, while Auto Partition causes certain time overheads because it firstly checks for existing partitions and then creates new ones on demand. We will present the performance test results. + + + +## Auto Partition: ingestion workflow + +This part is about how data ingestion is implemented with the Auto Partition mechanism, and we use [Stream Load](https://doris.apache.org/docs/data-operate/import/stream-load-manual) as an example. When Doris initiates a data import, one of the Doris Backend nodes takes on the role of the Coordinator. It is responsible for the initial data processing work and then dispatching the data to the appropriate BE nodes, known as the Executors, for execution. + + + + +In the final Datasink Node of the Coordinator's execution pipeline, the data needs to be routed to the correct partitions, buckets, and Doris Backend node locations before it can be successfully transmitted and stored. + +To enable this data transfer, the Coordinator and Executor nodes establish a communication channels: + +- The sending end is called the Node Channel. + +- The receiving end is called the Tablets Channel. + +This is how Auto Partition comes into play during the process of determining the correct partitions for the data: + + + + +Previously, without Auto Partition, when a table does not have the required partition, the behavior in Doris is for the BE nodes to accumulate errors until a `DATA_QUALITY_ERROR` is reported. Now, with Auto Partition enabled, a request will be initiated to the Doris Frontend to create the necessary partition on-the-fly. After the partition creation transaction is completed, the Doris Frontend responds to the Coordinator, which then opens the corresponding communication channels (Node Cha [...] + +In a real-world cluster environment, the time spent by the Coordinator waiting for the Doris Frontend to complete partition creation can incur large overheads. This is due to the inherent latency of Thrift RPC calls, as well as lock contention on the Frontend under high load conditions. + +To improve the data ingestion efficiency in Auto Partition, Doris has implemented batching to largely reduce the number of RPC calls made to the FE. This brings a notable performance enhancement for data write operations. + +Note that when the FE Master completes the partition creation transaction, the new partition becomes immediately visible. However, if the import process ultimately fails or is canceled, the created partitions are not automatically reclaimed. + +## Auto Partition performance + +We tested the performance and stability of Auto Partition in Doris, covering different use cases: + +**Case 1**: 1 Frontend + 3 Backend; 6 randomly generated datasets, each having 100 million rows and 2,000 partitions; ingested the 6 datasets concurrently into 6 tables + +- **Objective**: Evaluate the performance of Auto Partition under high pressure and check for any performance degradation. + +- **Results**: Auto Partition brings an **average performance loss less than 5%**, with all import transactions running stably. + + + +**Case 2**: 1 Frontend + 3 Backend; ingesting 100 rows per second from Flink by Routine Load; testing with 1, 10, and 20 concurrent transactions (tables), respectively + +- **Objective**: Identify any potential or data backlog issues that could arise with Auto Partition under different concurrency levels. + +- **Results**: With or without Auto Partition enabled, the data ingestion was successful without any backpressure issues across all the concurrency levels tested, even at 20 concurrent transactions when the CPU utilization reached close to 100%. + + + + +To conclude the results of these tests, the impact of enabling Auto Partition on data ingestion performance is minimal. + +## Conclusion and future plans + +Auto Partition has simplified DDL and partition management since Apache Doris 2.1.0. It is useful in large-scale data processing and makes it easy for users to migrate from other database systems to Apache Doris. + +Moreover, we are committed to expanding the capabilities of Auto Partition to support more complex data types. + +Plans for Auto Partition by RANGE: + +- Support numeric values; + +- Allowing users to specify the left and right boundaries of the partition range. + +Plans for Auto Partition by LIST: + +- Allow merging multiple values into the same partition based on specific rules. + +Join [Apache Doris open-source community](https://join.slack.com/t/apachedoriscommunity/shared_invite/zt-2gmq5o30h-455W226d79zP3L96ZhXIoQ) for more information and further guidance. \ No newline at end of file diff --git a/blog/evolution-of-the-apache-doris-execution-engine.md b/blog/evolution-of-the-apache-doris-execution-engine.md index f9d3255e177..e2ae4d63e44 100644 --- a/blog/evolution-of-the-apache-doris-execution-engine.md +++ b/blog/evolution-of-the-apache-doris-execution-engine.md @@ -6,8 +6,6 @@ 'date': '2024-06-18', 'author': 'Apache Doris', 'tags': ['Tech Sharing'], - 'picked': "true", - 'order': "4", "image": '/images/evolution-of-the-apache-doris-execution-engine.jpg' } diff --git a/blog/migrate-lakehouse-from-bigquery-to-doris.md b/blog/migrate-lakehouse-from-bigquery-to-doris.md index dbbe238a521..536ae4727a6 100644 --- a/blog/migrate-lakehouse-from-bigquery-to-doris.md +++ b/blog/migrate-lakehouse-from-bigquery-to-doris.md @@ -7,7 +7,7 @@ 'author': 'Dien, Tran Thanh', 'tags': ['Best Practice'], 'picked': "true", - 'order': "2", + 'order': "3", "image": '/images/migrate-lakehouse-from-bigquery-to-apache-doris.jpg' } diff --git a/blog/release-note-2.0.14.md b/blog/release-note-2.0.14.md index 083809df92d..1e52d95e26e 100644 --- a/blog/release-note-2.0.14.md +++ b/blog/release-note-2.0.14.md @@ -7,7 +7,7 @@ 'author': 'Apache Doris', 'tags': ['Release Notes'], 'picked': "true", - 'order': "1", + 'order': "2", "image": '/images/2.0.14.jpg' } --- diff --git a/src/components/recent-blogs/recent-blogs.data.ts b/src/components/recent-blogs/recent-blogs.data.ts index 3737e728480..36c0a344f39 100644 --- a/src/components/recent-blogs/recent-blogs.data.ts +++ b/src/components/recent-blogs/recent-blogs.data.ts @@ -1,4 +1,8 @@ export const RECENT_BLOGS_POSTS = [ + { + label: 'Apache Doris version 2.0.14 has been released', + link: 'https://doris.apache.org/blog/release-note-2.0.14', + }, { label: 'Migrate data lakehouse from BigQuery to Apache Doris, saving $4,500 per month', link: 'https://doris.apache.org/blog/migrate-lakehouse-from-bigquery-to-doris', @@ -10,9 +14,5 @@ export const RECENT_BLOGS_POSTS = [ { label: `Steps to industry-leading query speed: evolution of the Apache Doris execution engine`, link: 'https://doris.apache.org/blog/evolution-of-the-apache-doris-execution-engine', - }, - { - label: `Another lifesaver for data engineers: Apache Doris Job Scheduler for task automation`, - link: 'https://doris.apache.org/blog/job-scheduler-for-task-automation', - }, + } ]; diff --git a/src/constant/newsletter.data.ts b/src/constant/newsletter.data.ts index 658e18e1359..a063afa1c6e 100644 --- a/src/constant/newsletter.data.ts +++ b/src/constant/newsletter.data.ts @@ -1,4 +1,11 @@ export const NEWSLETTER_DATA = [ + { + tags: ['Tech Sharing'], + title: "Automatic and flexible data sharding: Auto Partition in Apache Doris", + content: `Apache Doris 2.1.0 introduces Auto Partition. It supports partitioning data by RANGE or by LIST and further enhances flexibility on top of automatic partitioning.`, + to: '/blog/auto-partition-in-apache-doris', + image: 'auto-partition-in-apache-doris.jpg', + }, { tags: ['Release Note'], title: "Apache Doris version 2.0.14 has been released", @@ -20,12 +27,6 @@ export const NEWSLETTER_DATA = [ to: '/blog/apache-doris-vs-rockset', image: 'doris-vs-rockset.jpeg', }, - { - tags: ['Recent Events'], - title: "Steps to industry-leading query speed: evolution of the Apache Doris execution engine", - content: `From the Volcano Model to the Pipeline Execution Engine, and now PipelineX, Apache Doris brings its computation efficiency to a higher level with each iteration.`, - to: '/blog/evolution-of-the-apache-doris-execution-engine', - image: 'evolution-of-the-apache-doris-execution-engine.jpg', - }, + ]; diff --git a/static/images/auto-partition-in-apache-doris.jpg b/static/images/auto-partition-in-apache-doris.jpg new file mode 100644 index 00000000000..bc26f576ba0 Binary files /dev/null and b/static/images/auto-partition-in-apache-doris.jpg differ diff --git a/static/images/auto-partition-ingestion-workflow-2.png b/static/images/auto-partition-ingestion-workflow-2.png new file mode 100644 index 00000000000..bda96e58226 Binary files /dev/null and b/static/images/auto-partition-ingestion-workflow-2.png differ diff --git a/static/images/auto-partition-ingestion-workflow.png b/static/images/auto-partition-ingestion-workflow.png new file mode 100644 index 00000000000..bbcee300280 Binary files /dev/null and b/static/images/auto-partition-ingestion-workflow.png differ diff --git a/static/images/auto-partition-performance-2.png b/static/images/auto-partition-performance-2.png new file mode 100644 index 00000000000..e281550a647 Binary files /dev/null and b/static/images/auto-partition-performance-2.png differ diff --git a/static/images/auto-partition-performance.png b/static/images/auto-partition-performance.png new file mode 100644 index 00000000000..0c61cf2c6f7 Binary files /dev/null and b/static/images/auto-partition-performance.png differ diff --git a/static/images/evolution-of-partitioning-strategies-in-Doris.png b/static/images/evolution-of-partitioning-strategies-in-Doris.png new file mode 100644 index 00000000000..d556aa55a55 Binary files /dev/null and b/static/images/evolution-of-partitioning-strategies-in-Doris.png differ diff --git a/static/images/performance-comparison.png b/static/images/performance-comparison.png new file mode 100644 index 00000000000..926702b39cd Binary files /dev/null and b/static/images/performance-comparison.png differ --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org