This is an automated email from the ASF dual-hosted git repository. zykkk pushed a commit to branch branch-2.0 in repository https://gitbox.apache.org/repos/asf/doris.git
The following commit(s) were added to refs/heads/branch-2.0 by this push: new abe04a4ecf0 [branch-2.0] delete auto partition docs (#28001) abe04a4ecf0 is described below commit abe04a4ecf02d41cdae29a5d1f2b63b8e8d77d5a Author: zclllyybb <zhaochan...@selectdb.com> AuthorDate: Tue Dec 5 14:42:03 2023 +0800 [branch-2.0] delete auto partition docs (#28001) --- docs/en/docs/advanced/partition/auto-partition.md | 190 --------------------- docs/sidebars.json | 3 +- .../docs/advanced/partition/auto-partition.md | 190 --------------------- 3 files changed, 1 insertion(+), 382 deletions(-) diff --git a/docs/en/docs/advanced/partition/auto-partition.md b/docs/en/docs/advanced/partition/auto-partition.md deleted file mode 100644 index c718f75454d..00000000000 --- a/docs/en/docs/advanced/partition/auto-partition.md +++ /dev/null @@ -1,190 +0,0 @@ ---- -{ - "title": "Auto Partition", - "language": "en" -} ---- - -<!-- -Licensed to the Apache Software Foundation (ASF) under one -or more contributor license agreements. See the NOTICE file -distributed with this work for additional information -regarding copyright ownership. The ASF licenses this file -to you under the Apache License, Version 2.0 (the -"License"); you may not use this file except in compliance -with the License. You may obtain a copy of the License at - - http://www.apache.org/licenses/LICENSE-2.0 - -Unless required by applicable law or agreed to in writing, -software distributed under the License is distributed on an -"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -KIND, either express or implied. See the License for the -specific language governing permissions and limitations -under the License. ---> - -# AUTO PARTITION - -<version since="2.0.3"> - -</version> - -The Auto Partitioning feature supports automatic detection of whether the corresponding partition exists during the data import process. If it does not exist, the partition will be created automatically and imported normally. - -## Grammer - -When building a table, use the following syntax to populate [CREATE-TABLE](../../sql-manual/sql-reference/Data-Definition-Statements/Create/CREATE-TABLE.md) with the `partition_info` section: - -1. AUTO RANGE PARTITION: - - ```SQL - AUTO PARTITION BY RANGE FUNC_CALL_EXPR - ( - ) - ``` - where - ```SQL - FUNC_CALL_EXPR ::= date_trunc ( <partition_column>, '<interval>' ) - ``` - -2. AUTO LIST PARTITION: - - ```SQL - AUTO PARTITION BY LIST(`partition_col`) - ( - ) - ``` - -### Usage example - -1. AUTO RANGE PARTITION - - ```SQL - CREATE TABLE `${tblDate}` ( - `TIME_STAMP` datev2 NOT NULL COMMENT 'Date of collection' - ) ENGINE=OLAP - DUPLICATE KEY(`TIME_STAMP`) - AUTO PARTITION BY RANGE date_trunc(`TIME_STAMP`, 'month') - ( - ) - DISTRIBUTED BY HASH(`TIME_STAMP`) BUCKETS 10 - PROPERTIES ( - "replication_allocation" = "tag.location.default: 1" - ); - ``` - -2. AUTO LIST PARTITION - - ```SQL - CREATE TABLE `${tblName1}` ( - `str` varchar not null - ) ENGINE=OLAP - DUPLICATE KEY(`str`) - AUTO PARTITION BY LIST (`str`) - ( - ) - DISTRIBUTED BY HASH(`str`) BUCKETS 10 - PROPERTIES ( - "replication_allocation" = "tag.location.default: 1" - ); - ``` - -### Using constraints - -1. Currently the AUTO RANGE PARTITION function supports only one partition column; -2. In AUTO RANGE PARTITION, the partition function supports only `date_trunc` and the partition column supports only `DATEV2` or `DATETIMEV2` format; -3. In AUTO LIST PARTITION, function calls are not supported. Partitioned columns support BOOLEAN, TINYINT, SMALLINT, INT, BIGINT, LARGEINT, DATE, DATETIME, CHAR, VARCHAR datatypes, and partitioned values are enum values. -4. In AUTO LIST PARTITION, a separate new PARTITION is created for each fetch of a partition column for which the corresponding partition does not currently exist. - -## Sample Scenarios - -In the [Dynamic Partitioning](./dynamic-partition.md) feature, we support the automatic creation of new partitions to accommodate real-time data at specific time periods. However, in more complex scenarios, such as processing non-real-time data, the partition columns are independent of the current system time. In this case, if you need to partition the data, you need to manually organise the partitions you belong to and create them before importing the data. This is cumbersome when the n [...] - -For example, we have a table as follows: - -```sql -CREATE TABLE `DAILY_TRADE_VALUE` -( - `TRADE_DATE` datev2 NULL - `TRADE_ID` varchar(40) NULL, - ...... -) -UNIQUE KEY(`TRADE_DATE`, `TRADE_ID`) -PARTITION BY RANGE(`TRADE_DATE`) -( - PARTITION p_2000 VALUES [('2000-01-01'), ('2001-01-01')), - PARTITION p_2001 VALUES [('2001-01-01'), ('2002-01-01')), - PARTITION p_2002 VALUES [('2002-01-01'), ('2003-01-01')), - PARTITION p_2003 VALUES [('2003-01-01'), ('2004-01-01')), - PARTITION p_2004 VALUES [('2004-01-01'), ('2005-01-01')), - PARTITION p_2005 VALUES [('2005-01-01'), ('2006-01-01')), - PARTITION p_2006 VALUES [('2006-01-01'), ('2007-01-01')), - PARTITION p_2007 VALUES [('2007-01-01'), ('2008-01-01')), - PARTITION p_2008 VALUES [('2008-01-01'), ('2009-01-01')), - PARTITION p_2009 VALUES [('2009-01-01'), ('2010-01-01')), - PARTITION p_2010 VALUES [('2010-01-01'), ('2011-01-01')), - PARTITION p_2011 VALUES [('2011-01-01'), ('2012-01-01')), - PARTITION p_2012 VALUES [('2012-01-01'), ('2013-01-01')), - PARTITION p_2013 VALUES [('2013-01-01'), ('2014-01-01')), - PARTITION p_2014 VALUES [('2014-01-01'), ('2015-01-01')), - PARTITION p_2015 VALUES [('2015-01-01'), ('2016-01-01')), - PARTITION p_2016 VALUES [('2016-01-01'), ('2017-01-01')), - PARTITION p_2017 VALUES [('2017-01-01'), ('2018-01-01')), - PARTITION p_2018 VALUES [('2018-01-01'), ('2019-01-01')), - PARTITION p_2019 VALUES [('2019-01-01'), ('2020-01-01')), - PARTITION p_2020 VALUES [('2020-01-01'), ('2021-01-01')), - PARTITION p_2021 VALUES [('2021-01-01'), ('2022-01-01')) -) -DISTRIBUTED BY HASH(`TRADE_DATE`) BUCKETS 10 -PROPERTIES ( - "replication_num" = "1" -); -``` - -The table stores a large amount of business history data, partitioned based on the date the transaction occurred. As you can see when building the table, we need to manually create the partitions in advance. If the data range of the partitioned columns changes, for example, 2022 is added to the above table, we need to create a partition by [ALTER-TABLE-PARTITION](../../sql-manual/sql-reference/Data-Definition-Statements/Alter/ALTER-TABLE-PARTITION.md) to make changes to the table partiti [...] - -```SQL -CREATE TABLE `DAILY_TRADE_VALUE` -( - `TRADE_DATE` datev2 NULL, - `TRADE_ID` varchar(40) NULL, - ...... -) -UNIQUE KEY(`TRADE_DATE`, `TRADE_ID`) -AUTO PARTITION BY RANGE date_trunc(`TRADE_DATE`, 'year') -( -) -DISTRIBUTED BY HASH(`TRADE_DATE`) BUCKETS 10 -PROPERTIES ( - "replication_num" = "1" -); -``` - -At this point the new table does not have a default partition: -```SQL -mysql> show partitions from `DAILY_TRADE_VALUE`; -Empty set (0.12 sec) -``` - -After inserting the data and then viewing it again, we could found that the table has been created with corresponding partitions: -```SQL -mysql> insert into `DAILY_TRADE_VALUE` values ('2012-12-13', 1), ('2008-02-03', 2), ('2014-11-11', 3); -Query OK, 3 rows affected (0.88 sec) -{'label':'insert_754e2a3926a345ea_854793fb2638f0ec', 'status':'VISIBLE', 'txnId':'20014'} - -mysql> show partitions from `DAILY_TRADE_VALUE`; -+-------------+-----------------+----------------+---------------------+--------+--------------+--------------------------------------------------------------------------------+-----------------+---------+----------------+---------------+---------------------+---------------------+--------------------------+----------+------------+-------------------------+-----------+ -| PartitionId | PartitionName | VisibleVersion | VisibleVersionTime | State | PartitionKey | Range | DistributionKey | Buckets | ReplicationNum | StorageMedium | CooldownTime | RemoteStoragePolicy | LastConsistencyCheckTime | DataSize | IsInMemory | ReplicaAllocation | IsMutable | -+-------------+-----------------+----------------+---------------------+--------+--------------+--------------------------------------------------------------------------------+-----------------+---------+----------------+---------------+---------------------+---------------------+--------------------------+----------+------------+-------------------------+-----------+ -| 180060 | p20080101000000 | 2 | 2023-09-18 21:49:29 | NORMAL | TRADE_DATE | [types: [DATEV2]; keys: [2008-01-01]; ..types: [DATEV2]; keys: [2009-01-01]; ) | TRADE_DATE | 10 | 1 | HDD | 9999-12-31 23:59:59 | | NULL | 0.000 | false | tag.location.default: 1 | true | -| 180039 | p20120101000000 | 2 | 2023-09-18 21:49:29 | NORMAL | TRADE_DATE | [types: [DATEV2]; keys: [2012-01-01]; ..types: [DATEV2]; keys: [2013-01-01]; ) | TRADE_DATE | 10 | 1 | HDD | 9999-12-31 23:59:59 | | NULL | 0.000 | false | tag.location.default: 1 | true | -| 180018 | p20140101000000 | 2 | 2023-09-18 21:49:29 | NORMAL | TRADE_DATE | [types: [DATEV2]; keys: [2014-01-01]; ..types: [DATEV2]; keys: [2015-01-01]; ) | TRADE_DATE | 10 | 1 | HDD | 9999-12-31 23:59:59 | | NULL | 0.000 | false | tag.location.default: 1 | true | -+-------------+-----------------+----------------+---------------------+--------+--------------+--------------------------------------------------------------------------------+-----------------+---------+----------------+---------------+---------------------+---------------------+--------------------------+----------+------------+-------------------------+-----------+ -3 rows in set (0.12 sec) -``` - -## caveat - -- If a partition is created during the insertion or importation of data and the process eventually fails, the created partition is not automatically deleted. -- Tables that use AUTO PARTITION only have their partitions created automatically instead of manually. The original use of the table and the partitions it creates is the same as for non-AUTO PARTITION tables or partitions. diff --git a/docs/sidebars.json b/docs/sidebars.json index db0caa5e08e..601c843701f 100644 --- a/docs/sidebars.json +++ b/docs/sidebars.json @@ -142,8 +142,7 @@ "label": "Doris Partition", "items": [ "advanced/partition/dynamic-partition", - "advanced/partition/table-temp-partition", - "advanced/partition/auto-partition" + "advanced/partition/table-temp-partition" ] }, { diff --git a/docs/zh-CN/docs/advanced/partition/auto-partition.md b/docs/zh-CN/docs/advanced/partition/auto-partition.md deleted file mode 100644 index 193a0868b37..00000000000 --- a/docs/zh-CN/docs/advanced/partition/auto-partition.md +++ /dev/null @@ -1,190 +0,0 @@ ---- -{ - "title": "自动分区", - "language": "zh-CN" -} ---- - -<!-- -Licensed to the Apache Software Foundation (ASF) under one -or more contributor license agreements. See the NOTICE file -distributed with this work for additional information -regarding copyright ownership. The ASF licenses this file -to you under the Apache License, Version 2.0 (the -"License"); you may not use this file except in compliance -with the License. You may obtain a copy of the License at - - http://www.apache.org/licenses/LICENSE-2.0 - -Unless required by applicable law or agreed to in writing, -software distributed under the License is distributed on an -"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -KIND, either express or implied. See the License for the -specific language governing permissions and limitations -under the License. ---> - -# 自动分区 - -<version since="2.0.3"> - -</version> - -自动分区功能支持了在导入数据过程中自动检测是否存在对应所属分区。如果不存在,则会自动创建分区并正常进行导入。 - -## 语法 - -建表时,使用以下语法填充[CREATE-TABLE](../../sql-manual/sql-reference/Data-Definition-Statements/Create/CREATE-TABLE.md)时的`partition_info`部分: - -1. AUTO RANGE PARTITION: - - ```SQL - AUTO PARTITION BY RANGE FUNC_CALL_EXPR - ( - ) - ``` - 其中 - ```SQL - FUNC_CALL_EXPR ::= date_trunc ( <partition_column>, '<interval>' ) - ``` - -2. AUTO LIST PARTITION: - - ```SQL - AUTO PARTITION BY LIST(`partition_col`) - ( - ) - ``` - -### 用法示例 - -1. AUTO RANGE PARTITION - - ```SQL - CREATE TABLE `${tblDate}` ( - `TIME_STAMP` datev2 NOT NULL COMMENT '采集日期' - ) ENGINE=OLAP - DUPLICATE KEY(`TIME_STAMP`) - AUTO PARTITION BY RANGE date_trunc(`TIME_STAMP`, 'month') - ( - ) - DISTRIBUTED BY HASH(`TIME_STAMP`) BUCKETS 10 - PROPERTIES ( - "replication_allocation" = "tag.location.default: 1" - ); - ``` - -2. AUTO LIST PARTITION - - ```SQL - CREATE TABLE `${tblName1}` ( - `str` varchar not null - ) ENGINE=OLAP - DUPLICATE KEY(`str`) - AUTO PARTITION BY LIST (`str`) - ( - ) - DISTRIBUTED BY HASH(`str`) BUCKETS 10 - PROPERTIES ( - "replication_allocation" = "tag.location.default: 1" - ); - ``` - -### 约束 - -1. 当前自动分区功能仅支持一个分区列; -2. 在AUTO RANGE PARTITION中,分区函数仅支持`date_trunc`,分区列仅支持`DATEV2`或者`DATETIMEV2`格式; -3. 在AUTO LIST PARTITION中,不支持函数调用,分区列支持 BOOLEAN, TINYINT, SMALLINT, INT, BIGINT, LARGEINT, DATE, DATETIME, CHAR, VARCHAR 数据类型,分区值为枚举值。 -4. 在AUTO LIST PARTITION中,分区列的每个当前不存在对应分区的取值,都会创建一个独立的新PARTITION。 - -## 场景示例 - -在[动态分区](./dynamic-partition.md)功能中,我们支持了按特定时间周期自动创建新分区以容纳实时数据。但在更复杂的场景下,例如处理非实时数据时,分区列与当前系统时间无关。此时如果需要进行数据分区操作,则需要用户手动整理所属分区并在数据导入前进行创建。在分区列基数较大时比较繁琐。自动分区功能解决了这一问题。 - -例如,我们有一张表如下: - -```sql -CREATE TABLE `DAILY_TRADE_VALUE` -( - `TRADE_DATE` datev2 NULL COMMENT '交易日期', - `TRADE_ID` varchar(40) NULL COMMENT '交易编号', - ...... -) -UNIQUE KEY(`TRADE_DATE`, `TRADE_ID`) -PARTITION BY RANGE(`TRADE_DATE`) -( - PARTITION p_2000 VALUES [('2000-01-01'), ('2001-01-01')), - PARTITION p_2001 VALUES [('2001-01-01'), ('2002-01-01')), - PARTITION p_2002 VALUES [('2002-01-01'), ('2003-01-01')), - PARTITION p_2003 VALUES [('2003-01-01'), ('2004-01-01')), - PARTITION p_2004 VALUES [('2004-01-01'), ('2005-01-01')), - PARTITION p_2005 VALUES [('2005-01-01'), ('2006-01-01')), - PARTITION p_2006 VALUES [('2006-01-01'), ('2007-01-01')), - PARTITION p_2007 VALUES [('2007-01-01'), ('2008-01-01')), - PARTITION p_2008 VALUES [('2008-01-01'), ('2009-01-01')), - PARTITION p_2009 VALUES [('2009-01-01'), ('2010-01-01')), - PARTITION p_2010 VALUES [('2010-01-01'), ('2011-01-01')), - PARTITION p_2011 VALUES [('2011-01-01'), ('2012-01-01')), - PARTITION p_2012 VALUES [('2012-01-01'), ('2013-01-01')), - PARTITION p_2013 VALUES [('2013-01-01'), ('2014-01-01')), - PARTITION p_2014 VALUES [('2014-01-01'), ('2015-01-01')), - PARTITION p_2015 VALUES [('2015-01-01'), ('2016-01-01')), - PARTITION p_2016 VALUES [('2016-01-01'), ('2017-01-01')), - PARTITION p_2017 VALUES [('2017-01-01'), ('2018-01-01')), - PARTITION p_2018 VALUES [('2018-01-01'), ('2019-01-01')), - PARTITION p_2019 VALUES [('2019-01-01'), ('2020-01-01')), - PARTITION p_2020 VALUES [('2020-01-01'), ('2021-01-01')), - PARTITION p_2021 VALUES [('2021-01-01'), ('2022-01-01')) -) -DISTRIBUTED BY HASH(`TRADE_DATE`) BUCKETS 10 -PROPERTIES ( - "replication_num" = "1" -); -``` - -该表内存储了大量业务历史数据,依据交易发生的日期进行分区。可以看到在建表时,我们需要预先手动创建分区。如果分区列的数据范围发生变化,例如上表中增加了2022年的数据,则我们需要通过[ALTER-TABLE-PARTITION](../../sql-manual/sql-reference/Data-Definition-Statements/Alter/ALTER-TABLE-PARTITION.md)对表的分区进行更改。在使用AUTO PARTITION后,该表DDL可以改为: - -```SQL -CREATE TABLE `DAILY_TRADE_VALUE` -( - `TRADE_DATE` datev2 NULL COMMENT '交易日期', - `TRADE_ID` varchar(40) NULL COMMENT '交易编号', - ...... -) -UNIQUE KEY(`TRADE_DATE`, `TRADE_ID`) -AUTO PARTITION BY RANGE date_trunc(`TRADE_DATE`, 'year') -( -) -DISTRIBUTED BY HASH(`TRADE_DATE`) BUCKETS 10 -PROPERTIES ( - "replication_num" = "1" -); -``` - -此时新表没有默认分区: -```SQL -mysql> show partitions from `DAILY_TRADE_VALUE`; -Empty set (0.12 sec) -``` - -经过插入数据后再查看,发现该表已经创建了对应的分区: -```SQL -mysql> insert into `DAILY_TRADE_VALUE` values ('2012-12-13', 1), ('2008-02-03', 2), ('2014-11-11', 3); -Query OK, 3 rows affected (0.88 sec) -{'label':'insert_754e2a3926a345ea_854793fb2638f0ec', 'status':'VISIBLE', 'txnId':'20014'} - -mysql> show partitions from `DAILY_TRADE_VALUE`; -+-------------+-----------------+----------------+---------------------+--------+--------------+--------------------------------------------------------------------------------+-----------------+---------+----------------+---------------+---------------------+---------------------+--------------------------+----------+------------+-------------------------+-----------+ -| PartitionId | PartitionName | VisibleVersion | VisibleVersionTime | State | PartitionKey | Range | DistributionKey | Buckets | ReplicationNum | StorageMedium | CooldownTime | RemoteStoragePolicy | LastConsistencyCheckTime | DataSize | IsInMemory | ReplicaAllocation | IsMutable | -+-------------+-----------------+----------------+---------------------+--------+--------------+--------------------------------------------------------------------------------+-----------------+---------+----------------+---------------+---------------------+---------------------+--------------------------+----------+------------+-------------------------+-----------+ -| 180060 | p20080101000000 | 2 | 2023-09-18 21:49:29 | NORMAL | TRADE_DATE | [types: [DATEV2]; keys: [2008-01-01]; ..types: [DATEV2]; keys: [2009-01-01]; ) | TRADE_DATE | 10 | 1 | HDD | 9999-12-31 23:59:59 | | NULL | 0.000 | false | tag.location.default: 1 | true | -| 180039 | p20120101000000 | 2 | 2023-09-18 21:49:29 | NORMAL | TRADE_DATE | [types: [DATEV2]; keys: [2012-01-01]; ..types: [DATEV2]; keys: [2013-01-01]; ) | TRADE_DATE | 10 | 1 | HDD | 9999-12-31 23:59:59 | | NULL | 0.000 | false | tag.location.default: 1 | true | -| 180018 | p20140101000000 | 2 | 2023-09-18 21:49:29 | NORMAL | TRADE_DATE | [types: [DATEV2]; keys: [2014-01-01]; ..types: [DATEV2]; keys: [2015-01-01]; ) | TRADE_DATE | 10 | 1 | HDD | 9999-12-31 23:59:59 | | NULL | 0.000 | false | tag.location.default: 1 | true | -+-------------+-----------------+----------------+---------------------+--------+--------------+--------------------------------------------------------------------------------+-----------------+---------+----------------+---------------+---------------------+---------------------+--------------------------+----------+------------+-------------------------+-----------+ -3 rows in set (0.12 sec) -``` - -## 注意事项 - -- 在数据的插入或导入过程中如果创建了分区,而最终整个过程失败,被创建的分区不会被自动删除。 -- 使用AUTO PARTITION的表,只是分区创建方式上由手动转为了自动。表及其所创建分区的原本使用方法都与非AUTO PARTITION的表或分区相同。 --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org