This is an automated email from the ASF dual-hosted git repository. luzhijing pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/doris-website.git
The following commit(s) were added to refs/heads/master by this push: new 13bcd109472d [docs](doc)Update en doc (#461) 13bcd109472d is described below commit 13bcd109472d8967882fd2bd0f1d8f01230a8841 Author: Hu Yanjun <100749531+httpshir...@users.noreply.github.com> AuthorDate: Sat Mar 23 14:48:57 2024 +0800 [docs](doc)Update en doc (#461) --- static/images/aggregate-key-model-example.png | Bin 0 -> 355467 bytes static/images/create-table-example.png | Bin 0 -> 390782 bytes static/images/duplicate-key-model-example.png | Bin 0 -> 372902 bytes .../version-2.0/admin-manual/data-admin/ccr.md | 4 +- .../admin-manual/query-admin/kill-query.md | 61 ++- .../data-operate/delete/atomicity-replace.md | 43 +- .../data-operate/delete/truncate-manual.md | 32 +- .../data-operate/update/aggregate-load-update.md | 60 ++- .../version-2.0/db-connect/database-connect.md | 6 +- .../cluster-deployment/standard-deployment.md | 48 +- .../install/source-install/compilation-arm.md | 4 +- .../install/source-install/compilation-mac.md | 12 +- .../compilation-with-ldb-toolchain.md | 12 +- .../version-2.0/lakehouse/datalake/hive.md | 525 ++++++++++++--------- .../version-2.0/table-design/best-practice.md | 483 +++++++++++++++---- 15 files changed, 937 insertions(+), 353 deletions(-) diff --git a/static/images/aggregate-key-model-example.png b/static/images/aggregate-key-model-example.png new file mode 100644 index 000000000000..2474a1bcc938 Binary files /dev/null and b/static/images/aggregate-key-model-example.png differ diff --git a/static/images/create-table-example.png b/static/images/create-table-example.png new file mode 100644 index 000000000000..469c79842ec1 Binary files /dev/null and b/static/images/create-table-example.png differ diff --git a/static/images/duplicate-key-model-example.png b/static/images/duplicate-key-model-example.png new file mode 100644 index 000000000000..3935adb30410 Binary files /dev/null and b/static/images/duplicate-key-model-example.png differ diff --git a/versioned_docs/version-2.0/admin-manual/data-admin/ccr.md b/versioned_docs/version-2.0/admin-manual/data-admin/ccr.md index 7f16c6822bb2..3bd8e54a836f 100644 --- a/versioned_docs/version-2.0/admin-manual/data-admin/ccr.md +++ b/versioned_docs/version-2.0/admin-manual/data-admin/ccr.md @@ -562,7 +562,9 @@ download_binlog_rate_limit_kbs=1024 # This configuration limits the size to 1MB. ## IS_BEING_SYNCED -:::tip Doris v2.0 "is_being_synced" = "true" ::: +:::tip +Doris v2.0 "is_being_synced" = "true" +::: During data synchronization using CCR, replica tables (referred to as target tables) are created in the target cluster for the tables within the synchronization scope of the source cluster (referred to as source tables). However, certain functionalities and attributes need to be disabled or cleared when creating replica tables to ensure the correctness of the synchronization process. For example: diff --git a/versioned_docs/version-2.0/admin-manual/query-admin/kill-query.md b/versioned_docs/version-2.0/admin-manual/query-admin/kill-query.md index 532ac3e62629..0e0ec8ecddd4 100644 --- a/versioned_docs/version-2.0/admin-manual/query-admin/kill-query.md +++ b/versioned_docs/version-2.0/admin-manual/query-admin/kill-query.md @@ -24,4 +24,63 @@ specific language governing permissions and limitations under the License. --> -Coming Soon \ No newline at end of file +# Kill Query +## Kill connection + +In Doris, each connection runs in a separate thread. You can terminate a thread using the `KILL processlist_id`statement. 
+ +The `processlist_id` for the thread can be found in the Id column from the SHOW PROCESSLIST output. Or you can use the `SELECT CONNECTION_ID()` command to query the current connection id. + +Syntax: + +```SQL +KILL [CONNECTION] processlist_id +``` + +## Kill query + +You can also terminate the query command under execution based on the processlist_id or the query_id. + +Syntax: + +```SQL +KILL QUERY processlist_id | query_id +``` + +## Example + +1. Check the current connection id. + +```SQL +mysql select connection_id(); ++-----------------+ +| connection_id() | ++-----------------+ +| 48 | ++-----------------+ +1 row in set (0.00 sec) +``` + +2. Check all connection id. + +```SQL +mysql SHOW PROCESSLIST; ++------------------+------+------+--------------------+---------------------+----------+---------+---------+------+-------+-----------------------------------+---------------------------------------------------------------------------------------+ +| CurrentConnected | Id | User | Host | LoginTime | Catalog | Db | Command | Time | State | QueryId | Info | ++------------------+------+------+--------------------+---------------------+----------+---------+---------+------+-------+-----------------------------------+---------------------------------------------------------------------------------------+ +| Yes | 48 | root | 10.16.xx.xx:44834 | 2023-12-29 16:49:47 | internal | test | Query | 0 | OK | e6e4ce9567b04859-8eeab8d6b5513e38 | SHOW PROCESSLIST | +| | 50 | root | 192.168.xx.xx:52837 | 2023-12-29 16:51:34 | internal | | Sleep | 1837 | EOF | deaf13c52b3b4a3b-b25e8254b50ff8cb | SELECT @@session.transaction_isolation | +| | 51 | root | 192.168.xx.xx:52843 | 2023-12-29 16:51:35 | internal | | Sleep | 907 | EOF | 437f219addc0404f-9befe7f6acf9a700 | /* ApplicationName=DBeaver Ultimate 23.1.3 - Metadata */ SHOW STATUS | +| | 55 | root | 192.168.xx.xx:55533 | 2023-12-29 17:09:32 | internal | test | Sleep | 271 | EOF | f02603dc163a4da3-beebbb5d1ced760c | /* ApplicationName=DBeaver Ultimate 23.1.3 - SQLEditor <Console> */ SELECT DATABASE() | +| | 47 | root | 10.16.xx.xx:35678 | 2023-12-29 16:21:56 | internal | test | Sleep | 3528 | EOF | f4944c543dc34a99-b0d0f3986c8f1c98 | select * from test | ++------------------+------+------+--------------------+---------------------+----------+---------+---------+------+-------+-----------------------------------+---------------------------------------------------------------------------------------+ +5 rows in set (0.00 sec) +``` + +3. Kill the currently running query, which will then be displayed as canceled. + +```SQL +mysql kill query 55; +Query OK, 0 rows affected (0.01 sec) +``` + diff --git a/versioned_docs/version-2.0/data-operate/delete/atomicity-replace.md b/versioned_docs/version-2.0/data-operate/delete/atomicity-replace.md index a3da6b10cb0c..d3a2b9a867fc 100644 --- a/versioned_docs/version-2.0/data-operate/delete/atomicity-replace.md +++ b/versioned_docs/version-2.0/data-operate/delete/atomicity-replace.md @@ -24,4 +24,45 @@ specific language governing permissions and limitations under the License. --> -Coming Soon \ No newline at end of file +# Atomicity Replace + +Doris supports atomic table replacement operations for two tables. This is only applicable to OLAP tables. + +## Applicable scenarios + +- Atomic overwrite operations +- In certain cases, users may want to rewrite data in a table. However, the "delete and load" approach causes a data invisibility window. 
To solve that, Doris allows users to create a new table of the same schema using the CREATE TABLE LIKE statement, import the new data into this new table, and then atomically replace the old table with the new table. For atomic replacement at the partition level, please refer to the [temporary partition](https://doris.apache.org/docs/2.0/data-operate/d [...] + +## Syntax + +```Plain +ALTER TABLE [db.]tbl1 REPLACE WITH TABLE tbl2 +[PROPERTIES('swap' = 'true')]; +``` + +Replace table tbl1 with table tbl2. + +If `swap` is `true`, after the replacement, data in `tbl1` will be replaced by that in `tbl2`, while data in `tbl2` will be replaced by that in `tbl1`. In other words, the two tables will swap data. + +If `swap` is `false`, after the replacement, data in `tbl1` will be replaced by that in `tbl2` and `tbl2` will be deleted. + +## Implementation + +In fact, table replacement is to combine the following operations into one atomic operation. + +Assuming that table A is to be replaced with table B, and `swap` is set to `true`. The operations to be implemented are as follows: + +1. Rename table B to table A. +2. Rename table A to table B. + +If `swap` is set to `false`, the operations are as follows: + +1. Delete table A. +2. Rename table B to table A. + +## Note + +- `swap` defaults to `true`, meaning to swap the data between two tables. +- If `swap` is set to `false`, the table being replaced (table A) will be deleted and cannot be recovered. +- The replacement operation can only be implemented between two OLAP tables and it does not check for table schema consistency. +- The replacement operation does not change the existing privilege settings because privilege checks are based on table names. diff --git a/versioned_docs/version-2.0/data-operate/delete/truncate-manual.md b/versioned_docs/version-2.0/data-operate/delete/truncate-manual.md index 9debf3729078..a02bd205f244 100644 --- a/versioned_docs/version-2.0/data-operate/delete/truncate-manual.md +++ b/versioned_docs/version-2.0/data-operate/delete/truncate-manual.md @@ -24,4 +24,34 @@ specific language governing permissions and limitations under the License. --> -Coming Soon \ No newline at end of file +# Truncate + +This statement is used to clear the data of a specified table or partition in Doris + +## Syntax + +```SQL +TRUNCATE TABLE [db.]tbl[ PARTITION(p1, p2, ...)]; +``` + +- This statement only clears the data within a table or partition but preserves the table or partition itself. +- Unlike DELETE, this statement can only clear the specified table or partition as a whole and cannot be added with filter conditions. +- Unlike DELETE, truncating data will not affect query performance. +- The data deleted by this operation is not recoverable. +- When using this command, the table status must be NORMAL, which means that tables undergoing SCHEMA CHANGE can not be truncated. +- This command may cause ongoing imports to fail. + +## Example + +1. Truncate the table `tbl` under `example_db`. + +```SQL +TRUNCATE TABLE example_db.tbl; +``` + +2. Truncate partitions `p1` and `p2` of table `tbl`. 
+ +```SQL +TRUNCATE TABLE tbl PARTITION(p1, p2); +``` + diff --git a/versioned_docs/version-2.0/data-operate/update/aggregate-load-update.md b/versioned_docs/version-2.0/data-operate/update/aggregate-load-update.md index 1514201f4190..e66af6ae004d 100644 --- a/versioned_docs/version-2.0/data-operate/update/aggregate-load-update.md +++ b/versioned_docs/version-2.0/data-operate/update/aggregate-load-update.md @@ -24,4 +24,62 @@ specific language governing permissions and limitations under the License. --> -Coming soon \ No newline at end of file +# Update for Aggregate Load + +This guide is about ingestion-based data updates for the Aggregate Key model in Doris. + +## Update all columns + +When importing data into an Aggregate Key model in Doris by methods like Stream Load, Broker Load, Routine Load, and Insert Into, the new values are combined with the old values to produce new aggregated values based on the column's aggregation function. These values might be generated during insertion or produced asynchronously during compaction. However, when querying, users will always receive the same returned values. + +## Partial column update for Aggregate Key model + +Tables in the Aggregate Key model are primarily used in cases with pre-aggregation requirements rather than data updates, but Doris allows partial column updates for them, too. Simply set the aggregation function to `REPLACE_IF_NOT_NULL`. + +**Create table** + +For the columns that need to be updated, set the aggregation function to `REPLACE_IF_NOT_NULL`. + +```Plain +CREATE TABLE order_tbl ( + order_id int(11) NULL, + order_amount int(11) REPLACE_IF_NOT_NULL NULL, + order_status varchar(100) REPLACE_IF_NOT_NULL NULL +) ENGINE=OLAP +AGGREGATE KEY(order_id) +COMMENT 'OLAP' +DISTRIBUTED BY HASH(order_id) BUCKETS 1 +PROPERTIES ( +"replication_allocation" = "tag.location.default: 1" +); ++----------+--------------+-----------------+ +| order_id | order_amount | order_status | ++----------+--------------+-----------------+ +| 1 | 100 | Pending Payment | ++----------+--------------+-----------------+ +1 row in set (0.01 sec) +``` + +**Ingest data** + +For Stream Load, Broker Load, Routine Load, or INSERT INTO, you can directly write the updates to the fields. + +**Example** + +Using the same example as above, the corresponding Stream Load command would be (no additional headers required): + +```Plain +curl --location-trusted -u root: -H "column_separator:," -H "columns:order_id,order_status" -T /tmp/update.csv http://127.0.0.1:48037/api/db1/order_tbl/_stream_load +``` + +The corresponding `INSERT INTO` statement would be (no additional session variables required): + +```Plain +INSERT INTO order_tbl (order_id, order_status) values (1,'Delivery Pending'); +``` + +## Note + +The Aggregate Key model does not perform additional data processing during data writing, so the writing performance in this model is the same as other models. However, aggregation during queries can result in performance loss. Typical aggregation queries can be 5~10 times slower than queries on Merge-on-Write tables in the Unique Key model. + +Under this circumstance, users cannot set a field from non-NULL to NULL, because NULL values written will be automatically neglected by the REPLACE_IF_NOT_NULL aggregation function. 
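As a sketch of this behavior, using the `order_tbl` example above (and assuming the single row shown earlier is already stored), writing a NULL leaves the previous value untouched:

```SQL
-- Existing row: (1, 100, 'Pending Payment')
-- Attempt to clear order_status by writing NULL
INSERT INTO order_tbl (order_id, order_status) VALUES (1, NULL);

-- REPLACE_IF_NOT_NULL ignores the NULL, so the row still reads
-- (1, 100, 'Pending Payment') when queried.
SELECT * FROM order_tbl WHERE order_id = 1;
```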
diff --git a/versioned_docs/version-2.0/db-connect/database-connect.md b/versioned_docs/version-2.0/db-connect/database-connect.md index 8aa246e2b0f6..1b5bd7cb4ede 100644 --- a/versioned_docs/version-2.0/db-connect/database-connect.md +++ b/versioned_docs/version-2.0/db-connect/database-connect.md @@ -101,6 +101,8 @@ ALTER SYSTEM ADD BACKEND "be_host_ip:heartbeat_service_port";  -:::tip For successful execution of statements that are not related to specific databases/tables in the Playground, it is necessary to randomly select a database from the left-hand database panel. This limitation will be removed later. +:::tip +For successful execution of statements that are not related to specific databases/tables in the Playground, it is necessary to randomly select a database from the left-hand database panel. This limitation will be removed later. -The current built-in web console cannot execute SET type SQL statements. Therefore, the web console does not support statements like SET PASSWORD FOR 'user' = PASSWORD('user_password'). ::: +The current built-in web console cannot execute SET type SQL statements. Therefore, the web console does not support statements like SET PASSWORD FOR 'user' = PASSWORD('user_password'). +::: diff --git a/versioned_docs/version-2.0/install/cluster-deployment/standard-deployment.md b/versioned_docs/version-2.0/install/cluster-deployment/standard-deployment.md index 6b6ea3fa0501..7536e8117180 100644 --- a/versioned_docs/version-2.0/install/cluster-deployment/standard-deployment.md +++ b/versioned_docs/version-2.0/install/cluster-deployment/standard-deployment.md @@ -83,13 +83,14 @@ Doris can run on servers with x86-64 architecture or ARM64 architecture. | Frontend | 8+ Cores | 8+ GB | SSD or SATA, 10+ GB | 1 or 10 GbE | 1 | | Backend | 8+ Cores | 16+ GB | SSD or SATA, 50+ GB | 1 or 10 GbE | 1 | -:::tip Description: - +:::tip +Description: - In a validation testing environment, you can deploy the FE and BE on the same server. - It is generally recommended to deploy only **one BE instance on a single machine**. Meanwhile, you can only deploy **one** **FE** **on one machine**. - If you need three replicas of data, you will need at least three machines and deploy one BE instance on each of them, instead of deploying three BE instances on one single machine. - **The clocks of multiple servers hosting FEs must be synchronized, allowing a maximum clock deviation of 5 seconds.** -- In a testing environment, you can also perform tests with just one BE. In the actual production environment, the number of BE instances directly affects the overall query latency. ::: +- In a testing environment, you can also perform tests with just one BE. In the actual production environment, the number of BE instances directly affects the overall query latency. +::: **Production environment** @@ -98,11 +99,12 @@ Doris can run on servers with x86-64 architecture or ARM64 architecture. | Frontend | 16+ Cores | 64+ GB | SSD or RAID, 100+ GB | 10 GbE | 1 | | Backend | 16+ Cores | 64+ GB | SSD or SATA, 100+ GB | 10 GbE | 3 | -:::tip Description: - +:::tip +Description: - In a production environment, if the FE and BE are co-located, be mindful of resource contention issues. It is recommended to store metadata and data on separate disks. - BE nodes can be configured with multiple disks for storage. You can bind multiple HDDs or SSDs to a single BE instance. -- The performance of the cluster is dependent on the resources of the BE nodes. 
The more BE nodes there are, the better the performance of Doris. Typically, Doris can fully leverage its performance potential with 10 to 100 machines. ::: +- The performance of the cluster is dependent on the resources of the BE nodes. The more BE nodes there are, the better the performance of Doris. Typically, Doris can fully leverage its performance potential with 10 to 100 machines. +::: ### Hard disk space calculation @@ -137,7 +139,9 @@ tmpfs /tmp tmpfs nodev,nosuid 0 0 /dev/sda3 /home ext4 defaults,noatime 0 2 ``` -:::caution It is not recommended to disable swap by setting vm.swappiness = 0 because this parameter has different semantics in different Linux kernel versions. In many cases, this will not completely disable swap. ::: +:::caution +It is not recommended to disable swap by setting vm.swappiness = 0 because this parameter has different semantics in different Linux kernel versions. In many cases, this will not completely disable swap. +::: ### Check and disable system firewall @@ -169,7 +173,9 @@ vi /etc/security/limits.conf * hard nofile 1000000 ``` -:::caution For the changes to take effect, the current user needs to log out of the current session and log back in. ::: +:::caution +For the changes to take effect, the current user needs to log out of the current session and log back in. +::: ### Modify the number of virtual memory areas @@ -326,10 +332,10 @@ ALTER SYSTEM ADD OBSERVER "<fe_ip_address>:<fe_edit_log_port>" ``` :::note - 1. It is recommended to have an odd number of FE Follower nodes (including the Master node), with a suggested number of three to achieve high availability. 2. When FE is deployed in a high availability mode (1 Master node, 2 Follower nodes), it is advisable to increase the scalability of the FE read services by adding an Observer FE. -3. Typically, one FE node can handle around 10-20 BE nodes. It is recommended to keep the total number of FE nodes below 10. ::: +3. Typically, one FE node can handle around 10-20 BE nodes. It is recommended to keep the total number of FE nodes below 10. +::: **Start** **FE** **Follower node** @@ -387,13 +393,13 @@ storage_root_path=/home/disk1/doris,medium:HDD;/home/disk2/doris,medium:SSD ``` :::note - 1. When specifying the storage type for the storage path, at least one path should have the storage type set as HDD. 2. If the storage type for a storage path is not explicitly declared, it defaults to HDD. 3. Specifying the storage type as HDD or SSD is to differentiate the storage types for the paths. It is independent of the physical storage medium. For example, you can designate a directory on an HDD disk as SSD. -4. The keywords for storage types, HDD and SSD, must be capitalized. ::: +4. The keywords for storage types, HDD and SSD, must be capitalized. +::: -5. Bind cluster IP +6. Bind cluster IP In a multi-network interface environment, it is necessary to explicitly configure the "priority_networks". However, in an environment where FQDN is configured, the "priority_networks" can be ignored. @@ -534,17 +540,21 @@ Doris processes listen to network segments represented in CIDR format for IP add This parameter is mainly used to help the system select the correct network card IP as its listening IP. For example, if the required listening IP is 192.168.0.1, you can set `priority_networks=192.168.0.0/24`. The system will automatically scan all IPs on the machine, and only those matching the 192.168.0.0/24 network segment will be used as service listening addresses. 
You can also configure multiple CIDR network segments in this parameter, such as `priority_networks = 10.10.0.0/16; 19 [...] -:::tip **Why use priority_networks to configure listening address segments instead of directly setting the listening IP address in the configuration file?** +:::tip +**Why use priority_networks to configure listening address segments instead of directly setting the listening IP address in the configuration file?** -The main reason is that Doris is a distributed cluster, and the same configuration file will be deployed on multiple nodes. To facilitate deployment, updates, and maintenance, it is desirable to keep all nodes' configuration files consistent. By configuring the listening address segment and then starting the system, the appropriate listening IP can be found based on this network segment, thus each machine can use a single value in this configuration. ::: +The main reason is that Doris is a distributed cluster, and the same configuration file will be deployed on multiple nodes. To facilitate deployment, updates, and maintenance, it is desirable to keep all nodes' configuration files consistent. By configuring the listening address segment and then starting the system, the appropriate listening IP can be found based on this network segment, thus each machine can use a single value in this configuration. +::: ### Why do new BE nodes need to be manually added to the cluster? After the BE node is started, you need to send a command to the FE through the MySQL protocol or the built-in web console to join the BE node to the cluster. -:::tip **How does the FE know which BE nodes constitute the cluster?** +:::tip +**How does the FE know which BE nodes constitute the cluster?** -As a distributed database, Doris generally has many BE nodes. Doris adds BE nodes to the cluster by sending commands to the FE. This is different from the way how BE nodes know the addresses of FE nodes and then actively report connections. The way of manual adding and having the FE actively connect to BE nodes is beneficial to cluster management in many aspects. For example, it can determine which nodes constitute the cluster and it can proactively shut down a BE node that cannot be con [...] +As a distributed database, Doris generally has many BE nodes. Doris adds BE nodes to the cluster by sending commands to the FE. This is different from the way how BE nodes know the addresses of FE nodes and then actively report connections. The way of manual adding and having the FE actively connect to BE nodes is beneficial to cluster management in many aspects. For example, it can determine which nodes constitute the cluster and it can proactively shut down a BE node that cannot be con [...] +::: ### How to quickly detect whether the FE has started successfully? @@ -574,7 +584,9 @@ ALTER SYSTEM ADD BACKEND "be_host_ip:heartbeat_service_port";  -:::tip For successful execution of statements that are not related to specific databases/tables in the Playground, it is necessary to randomly select a database from the left-hand database panel. This limitation will be removed later. :::well-log- +:::tip +For successful execution of statements that are not related to specific databases/tables in the Playground, it is necessary to randomly select a database from the left-hand database panel. This limitation will be removed later. +::: ### Why can't I change the root password via the Web UI? 
diff --git a/versioned_docs/version-2.0/install/source-install/compilation-arm.md b/versioned_docs/version-2.0/install/source-install/compilation-arm.md index 0c45db114453..d60a711f546f 100644 --- a/versioned_docs/version-2.0/install/source-install/compilation-arm.md +++ b/versioned_docs/version-2.0/install/source-install/compilation-arm.md @@ -250,7 +250,9 @@ gcc --version ## Compile -:::tip It is recommended to use LDB Toolchain for compilation with ARM environment. ::: +:::tip +It is recommended to use LDB Toolchain for compilation with ARM environment. +::: When compiling Doris on an ARM platform, please disable the AVX2 and LIBUNWIND third-party libraries. diff --git a/versioned_docs/version-2.0/install/source-install/compilation-mac.md b/versioned_docs/version-2.0/install/source-install/compilation-mac.md index e0493452476a..0a3c7a0b1bb7 100644 --- a/versioned_docs/version-2.0/install/source-install/compilation-mac.md +++ b/versioned_docs/version-2.0/install/source-install/compilation-mac.md @@ -42,7 +42,9 @@ brew install automake autoconf libtool pkg-config texinfo coreutils gnu-getopt \ python@3 cmake ninja ccache bison byacc gettext wget pcre maven llvm@16 openjdk@11 npm ``` -:::tip On MacOS, since Homebrew does not provide an installation package for JDK8, JDK11 is used here instead. Alternatively, you can manually download and install JDK8. ::: +:::tip +On MacOS, since Homebrew does not provide an installation package for JDK8, JDK11 is used here instead. Alternatively, you can manually download and install JDK8. +::: 2. **Compile source code** @@ -50,7 +52,9 @@ python@3 cmake ninja ccache bison byacc gettext wget pcre maven llvm@16 openjdk@ bash build.sh ``` -:::tip The first step of compiling Doris is to download and compile third-party libraries. You can download the pre-compiled versions of third-party libraries provided by the Doris community. Please refer to the instructions below for downloading **pre-compiled third-party libraries** to speed up the build process. ::: +:::tip +The first step of compiling Doris is to download and compile third-party libraries. You can download the pre-compiled versions of third-party libraries provided by the Doris community. Please refer to the instructions below for downloading **pre-compiled third-party libraries** to speed up the build process. +::: ## Start @@ -107,4 +111,6 @@ cd installed/bin ./thrift --version ``` -:::tip When running protoc and thrift, you may encounter an issue where the binary cannot be opened due to developer verification. To resolve this, you can go to "Security & Privacy" settings. In the "General" tab, click on the "Open Anyway" button to confirm your intent to open the binary. Refer to: https://support.apple.com/en-us/102445 ::: +:::tip +When running protoc and thrift, you may encounter an issue where the binary cannot be opened due to developer verification. To resolve this, you can go to "Security & Privacy" settings. In the "General" tab, click on the "Open Anyway" button to confirm your intent to open the binary. Refer to: https://support.apple.com/en-us/102445 +::: diff --git a/versioned_docs/version-2.0/install/source-install/compilation-with-ldb-toolchain.md b/versioned_docs/version-2.0/install/source-install/compilation-with-ldb-toolchain.md index aa3dd0b87612..7526fe78514c 100644 --- a/versioned_docs/version-2.0/install/source-install/compilation-with-ldb-toolchain.md +++ b/versioned_docs/version-2.0/install/source-install/compilation-with-ldb-toolchain.md @@ -28,9 +28,11 @@ under the License. 
This guide is about how to compile Doris using the LDB Toolchain. This method serves as a supplement to the Docker compilation approach to help the developers and users without a Docker environment. The recommended LDB Toolchain version is 0.17, which includes clang-16 and gcc-11. -:::tip LDB Toolchain is fully known as Linux Distribution Based Toolchain Generator. It helps compile modern C++ projects on almost all Linux distributions. ::: +:::tip +LDB Toolchain is fully known as Linux Distribution Based Toolchain Generator. It helps compile modern C++ projects on almost all Linux distributions. Special thanks to [Amos Bird](https://github.com/amosbird) for the contribution. +::: ## Prepare the compilation environment: @@ -40,7 +42,9 @@ This method applies to most Linux distributions (CentOS, Ubuntu, etc.). Download the latest `ldb_toolchain_gen.sh` from [here](https://github.com/amosbird/ldb_toolchain_gen/releases). This script is used to generate ldb toolchain. -:::tip For more information, please visit https://github.com/amosbird/ldb_toolchain_gen ::: +:::tip +For more information, please visit https://github.com/amosbird/ldb_toolchain_gen +::: 2. **Execute the following command to generate ldb toolchain.** @@ -105,7 +109,9 @@ export PATH=/path/to/ldb_toolchain/bin:$PATH ## Compile Doris -:::tip The first step of compiling Doris source code is to first download third-party libraries and compile them. You can refer to the following instructions to download precompiled versions of the third-party libraries. ::: +:::tip +The first step of compiling Doris source code is to first download third-party libraries and compile them. You can refer to the following instructions to download precompiled versions of the third-party libraries. +::: 1. **Enter the Doris source code directory and execute the following command to check if the compilation machine supports the AVX2 instruction set.** diff --git a/versioned_docs/version-2.0/lakehouse/datalake/hive.md b/versioned_docs/version-2.0/lakehouse/datalake/hive.md index 4724d0e903b2..81450b2b7ff3 100644 --- a/versioned_docs/version-2.0/lakehouse/datalake/hive.md +++ b/versioned_docs/version-2.0/lakehouse/datalake/hive.md @@ -26,22 +26,22 @@ under the License. # Hive -By connecting to Hive Metastore, or a metadata service compatible with Hive Metatore, Doris can automatically obtain Hive database table information and perform data queries. +By connecting to Hive Metastore, Doris can automatically obtain database and table information from Hive and perform data queries. -In addition to Hive, many other systems also use the Hive Metastore to store metadata. So through Hive Catalog, we can not only access Hive, but also access systems that use Hive Metastore as metadata storage. Such as Iceberg, Hudi, etc. +In addition to Hive, systems like Iceberg and Hudi also use Hive Metastore for metadata storage. Hive Catalog in Doris allows users to easily integrate with not only Hive but also systems that use Hive Metastore as metadata storage. -## Terms and Conditions +## Note -1. Need to put core-site.xml, hdfs-site.xml and hive-site.xml in the conf directory of FE and BE. First read the hadoop configuration file in the conf directory, and then read the related to the environment variable `HADOOP_CONF_DIR` configuration file. -2. hive supports version 1/2/3. -3. Support Managed Table and External Table and part of Hive View. -4. Can identify hive, iceberg, hudi metadata stored in Hive Metastore. +- Hive 1, Hive 2, and Hive 3 are supported. 
+- Managed Tables, External Tables, and part of the Hive Views are supported. +- It can recognize Hive, Iceberg, and Hudi metadata stored in the Hive Metastore. +- You should place core-site.xml, hdfs-site.xml, and hive-site.xml in the conf directory of both FE and BE. During reading, the hadoop configuration files in the conf directory are read first, followed by the configuration files related to the HADOOP_CONF_DIR environment variable. ## Create Catalog -### Hive On HDFS +### Hive on HDFS -```sql +```SQL CREATE CATALOG hive PROPERTIES ( 'type'='hms', 'hive.metastore.uris' = 'thrift://172.0.0.1:9083', @@ -54,11 +54,11 @@ CREATE CATALOG hive PROPERTIES ( ); ``` -In addition to the two required parameters of `type` and `hive.metastore.uris`, more parameters can be passed to pass the information required for the connection. +In addition to the two required parameters of `type` and `hive.metastore.uris`, you can specify more parameters to pass the required information for the connection. -If HDFS HA information is provided, the example is as follows: +Example to provide HDFS HA information: -```sql +```SQL CREATE CATALOG hive PROPERTIES ( 'type'='hms', 'hive.metastore.uris' = 'thrift://172.0.0.1:9083', @@ -71,33 +71,9 @@ CREATE CATALOG hive PROPERTIES ( ); ``` -Provide HDFS HA information and Kerberos authentication information at the same time, examples are as follows: +### Hive on VIEWFS -```sql -CREATE CATALOG hive PROPERTIES ( - 'type'='hms', - 'hive.metastore.uris' = 'thrift://172.0.0.1:9083', - 'hive.metastore.sasl.enabled' = 'true', - 'hive.metastore.kerberos.principal' = 'your-hms-principal', - 'dfs.nameservices'='your-nameservice', - 'dfs.ha.namenodes.your-nameservice'='nn1,nn2', - 'dfs.namenode.rpc-address.your-nameservice.nn1'='172.21.0.2:8088', - 'dfs.namenode.rpc-address.your-nameservice.nn2'='172.21.0.3:8088', - 'dfs.client.failover.proxy.provider.your-nameservice'='org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider', - 'hadoop.security.authentication' = 'kerberos', - 'hadoop.kerberos.keytab' = '/your-keytab-filepath/your.keytab', - 'hadoop.kerberos.principal' = 'your-princi...@your.com', - 'yarn.resourcemanager.principal' = 'your-rm-principal' -); -``` - -Please place the `krb5.conf` file and `keytab` authentication file under all `BE` and `FE` nodes. The path of the `keytab` authentication file is consistent with the configuration. The `krb5.conf` file is placed in `/etc by default /krb5.conf` path. - -The value of `hive.metastore.kerberos.principal` needs to be consistent with the property of the same name of the connected hive metastore, which can be obtained from `hive-site.xml`. - -### Hive On VIEWFS - -```sql +```SQL CREATE CATALOG hive PROPERTIES ( 'type'='hms', 'hive.metastore.uris' = 'thrift://172.0.0.1:9083', @@ -113,17 +89,17 @@ CREATE CATALOG hive PROPERTIES ( ); ``` -viewfs related parameters can be added to the catalog configuration as above, or added to `conf/core-site.xml`. +You can add viewfs-related parameters to the catalog configuration as above, or add them to `conf/core-site.xml`. 
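For reference, a sketch of what such entries in `conf/core-site.xml` might look like (the mount table name `your-cluster` and the target nameservice are placeholders that must match your Hadoop setup):

```Plain
<!-- Hypothetical ViewFs mount table; values must match your cluster -->
<property>
    <name>fs.defaultFS</name>
    <value>viewfs://your-cluster</value>
</property>
<property>
    <name>fs.viewfs.mounttable.your-cluster.link./ns1</name>
    <value>hdfs://your-nameservice/</value>
</property>
```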
-How viewfs works and parameter configuration, please refer to relevant hadoop documents, for example, https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/ViewFs.html +For details about how viewfs works and the parameter configurations, please refer to relevant hadoop documents, such as https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/ViewFs.html -### Hive On JuiceFS +### Hive on JuiceFS -Data is stored in JuiceFS, examples are as follows: +Example for data stored on JuiceFS: -(Need to put `juicefs-hadoop-x.x.x.jar` under `fe/lib/` and `apache_hdfs_broker/lib/`) +(You should put `juicefs-hadoop-x.x.x.jar` under `fe/lib/` and `apache_hdfs_broker/lib/`.) -```sql +```SQL CREATE CATALOG hive PROPERTIES ( 'type'='hms', 'hive.metastore.uris' = 'thrift://172.0.0.1:9083', @@ -134,9 +110,9 @@ CREATE CATALOG hive PROPERTIES ( ); ``` -### Hive On S3 +### Hive on S3 -```sql +```SQL CREATE CATALOG hive PROPERTIES ( "type"="hms", "hive.metastore.uris" = "thrift://172.0.0.1:9083", @@ -148,15 +124,15 @@ CREATE CATALOG hive PROPERTIES ( ); ``` -Options: +Optional properties: -* s3.connection.maximum: s3 maximum connection number, default 50 -* s3.connection.request.timeout: s3 request timeout, default 3000ms -* s3.connection.timeout: s3 connection timeout, default 1000ms +- s3.connection.maximum: defaults to 50 +- s3.connection.request.timeout: defaults to 3000ms +- s3.connection.timeout: defaults to 1000ms -### Hive On OSS +### Hive on OSS -```sql +```SQL CREATE CATALOG hive PROPERTIES ( "type"="hms", "hive.metastore.uris" = "thrift://172.0.0.1:9083", @@ -166,9 +142,9 @@ CREATE CATALOG hive PROPERTIES ( ); ``` -### Hive On OBS +### Hive on OBS -```sql +```SQL CREATE CATALOG hive PROPERTIES ( "type"="hms", "hive.metastore.uris" = "thrift://172.0.0.1:9083", @@ -178,9 +154,9 @@ CREATE CATALOG hive PROPERTIES ( ); ``` -### Hive On COS +### Hive on COS -```sql +```SQL CREATE CATALOG hive PROPERTIES ( "type"="hms", "hive.metastore.uris" = "thrift://172.0.0.1:9083", @@ -190,9 +166,24 @@ CREATE CATALOG hive PROPERTIES ( ); ``` -### Hive With Glue +### Hive with AWS Glue + +:::tip +When connecting to Glue, if you are in a non-EC2 environment, you need to copy the `~/.aws` directory from the EC2 environment to your current environment. Alternatively, you can download the AWS [CLI](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html) tool for configuration, which will also create the `.aws` directory in the current user's home directory. +::: + +Access to AWS services with Doris supports two authentication methods. -```sql +**Authentication using Catalog properties** + +You can fill in the basic properties of credentials, such as: + +- When accessing S3, you can use `s3.endpoint`, `s3.access_key`, `s3.secret_key`. +- When accessing Glue, you can use `glue.endpoint`, `glue.access_key`, `glue.secret_key`. + +For example, if you are accessing Glue using Iceberg Catalog, you can fill in the following properties to access tables hosted on Glue: + +```SQL CREATE CATALOG hive PROPERTIES ( "type"="hms", "hive.metastore.type" = "glue", @@ -202,22 +193,33 @@ CREATE CATALOG hive PROPERTIES ( ); ``` +**Authentication using system properties** + +This authentication method is used for applications running on AWS resources such as EC2 instances. It avoids hard-coding credentials and enhances data security. + +If you create a Catalog without filling in the Credentials properties, the DefaultAWSCredentialsProviderChain will be used by default. 
It can read properties configured in system environment variables or instance profiles. + +For configurations of environment variables and system properties, refer to [AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-envvars.html). + +- Optional environment variables include: `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `AWS_SESSION_TOKEN`, `AWS_ROLE_ARN`, `AWS_WEB_IDENTITY_TOKEN_FILE`, etc. +- You can follow [AWS Configure](https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-files.html) to directly configure the Credentials information, which will generate a credentials file in the `~/.aws` directory. + ## Metadata Cache & Refresh -For Hive Catalog, 4 types of metadata are cached in Doris: +For Hive Catalog, Doris caches 4 types of metadata: -1. Table structure: cache table column information, etc. -2. Partition value: Cache the partition value information of all partitions of a table. -3. Partition information: Cache the information of each partition, such as partition data format, partition storage location, partition value, etc. -4. File information: Cache the file information corresponding to each partition, such as file path location, etc. +- Table schema: table column information, etc. +- Partition value: partition value information of all partitions in a table. +- Partition information: information of each partition, such as partition data format, partition storage location, partition value, etc. +- File information: file information corresponding to each partition, such as file path location, etc. -The above cache information will not be persisted to Doris, so operations such as restarting Doris's FE node, switching masters, etc. may cause the cache to become invalid. After the cache expires, Doris will directly access the Hive MetaStore to obtain information and refill the cache. +The above cached information will not be persisted to Doris, so operations such as restarting Doris FE node, switching to masters may invalidate the cache. If the cache expires, Doris will directly access the Hive MetaStore to obtain information and refill the cache. -Metadata cache can be updated automatically, manually, or configured with TTL (Time-to-Live) according to user needs. +Metadata cache can be updated automatically, manually, or configured with TTL (Time-to-Live). ### Default behavior and TTL -By default, the metadata cache expires 10 minutes after it is first accessed. This time is determined by the configuration parameter `external_cache_expire_time_minutes_after_access` in fe.conf. (Note that in versions 2.0.1 and earlier, the default value for this parameter was 1 day). +By default, the metadata cache expires 10 minutes after it is first accessed. The TTL is determined by the configuration parameter `external_cache_expire_time_minutes_after_access` in fe.conf. (Note that in versions 2.0.1 and earlier, the default value for this parameter was 1 day). For example, if the user accesses the metadata of table A for the first time at 10:00, then the metadata will be cached and will automatically expire after 10:10. If the user accesses the same metadata again at 10:11, Doris will directly access the Hive MetaStore to obtain information and refill the cache. 
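For example, to lengthen the TTL, you could set this parameter in fe.conf (a sketch; choose a value that fits your workload):

```Plain
# Metadata cache entries expire 60 minutes after they are first accessed (default: 10)
external_cache_expire_time_minutes_after_access = 60
```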
@@ -225,7 +227,7 @@ For example, if the user accesses the metadata of table A for the first time at For the `INSERT INTO OVERWRITE PARTITION` operation commonly used in Hive, you can also timely update the `File Information Cache` by configuring the TTL of the `File Information Cache`: -``` +```Plain CREATE CATALOG hive PROPERTIES ( 'type'='hms', 'hive.metastore.uris' = 'thrift://172.0.0.1:9083', @@ -241,39 +243,37 @@ You can also set this value to 0 to disable file caching, which will fetch file Users need to manually refresh the metadata through the [REFRESH](../../sql-manual/sql-reference/Utility-Statements/REFRESH.md) command. -1. REFRESH CATALOG: Refresh the specified Catalog. - - ``` - REFRESH CATALOG ctl1 PROPERTIES("invalid_cache" = "true"); - ``` +**REFRESH CATALOG: Refresh the specified Catalog.** - This command will refresh the database list, table list, and all cache information of the specified Catalog. - - `invalid_cache` indicates whether to flush the cache. Defaults to true. If it is false, only the database and table list of the catalog will be refreshed, but the cache information will not be refreshed. This parameter is applicable when the user only wants to synchronize newly added or deleted database/table information. +```Shell +REFRESH CATALOG ctl1 PROPERTIES("invalid_cache" = "true"); +``` -2. REFRESH DATABASE: Refresh the specified Database. +- This command refreshes the database list, table list, and all cached information of the specified Catalog. +- `invalid_cache` indicates whether to refresh the cache. It defaults to true. If it is false, it will only refresh the database and table list of the catalog, but not the cached information. This parameter is applicable when the user only wants to synchronize newly added or deleted database/table information. - ``` - REFRESH DATABASE [ctl.]db1 PROPERTIES("invalid_cache" = "true"); - ``` +**REFRESH DATABASE: Refresh the specified Database.** - This command will refresh the table list of the specified Database and all cached information under the Database. +```Shell +REFRESH DATABASE [ctl.]db1 PROPERTIES("invalid_cache" = "true"); +``` - The meaning of the `invalid_cache` attribute is the same as above. Defaults to true. If false, only the Database's table list will be refreshed, not cached information. This parameter is suitable for users who only want to synchronize newly added or deleted table information. +- This command refreshes the table list of the specified database and all cached information under the database. +- `invalid_cache` indicates the same as above. It defaults to true. If false, it will only refresh the table list of the database, but not the cached information. This parameter is applicable when the user only wants to synchronize newly added or deleted table information. -3. REFRESH TABLE: Refresh the specified Table. +**REFRESH TABLE: Refresh the specified Table.** - ``` - REFRESH TABLE [ctl.][db.]tbl1; - ``` +```Shell +REFRESH TABLE [ctl.][db.]tbl1; +``` - This command will refresh all cache information under the specified Table. +- This command refreshes all cached information under the specified table. ### Regular refresh -Users can set the scheduled refresh of the Catalog when creating the Catalog. +Users can schedule the refresh for a Catalog when creating the Catalog. 
-``` +```Plain CREATE CATALOG hive PROPERTIES ( 'type'='hms', 'hive.metastore.uris' = 'thrift://172.0.0.1:9083', @@ -281,41 +281,43 @@ CREATE CATALOG hive PROPERTIES ( ); ``` -In the above example, `metadata_refresh_interval_sec` means refreshing the Catalog every 600 seconds. Equivalent to automatically executing every 600 seconds: +In the above example, `metadata_refresh_interval_sec` means refreshing the Catalog every 600 seconds, which means the following command will be executed automatically every 600 seconds: -`REFRESH CATALOG ctl1 PROPERTIES("invalid_cache" = "true");` +``` +REFRESH CATALOG ctl1 PROPERTIES("invalid_cache" = "true"); +``` -The scheduled refresh interval must not be less than 5 seconds. +Note that the scheduled refresh interval must not be less than 5 seconds. -### Auto Refresh +### Auto refresh Currently, Doris only supports automatic update of metadata in Hive Metastore (HMS). It perceives changes in metadata by the FE node which regularly reads the notification events from HMS. The supported events are as follows: | Event | Corresponding Update Operation | -| :-------------- | :----------------------------------------------------------- | +| --------------- | ------------------------------------------------------------ | | CREATE DATABASE | Create a database in the corresponding catalog. | | DROP DATABASE | Delete a database in the corresponding catalog. | -| ALTER DATABASE | Such alterations mainly include changes in properties, comments, or storage location of databases. They do not affect Doris' queries in External Catalogs so they will not be synchronized. | +| ALTER DATABASE | Such alterations mainly include changes in properties, comments, or storage location of databases. They do not affect Doris' queries in External Catalogs. | | CREATE TABLE | Create a table in the corresponding database. | | DROP TABLE | Delete a table in the corresponding database, and invalidate the cache of that table. | -| ALTER TABLE | If it is a renaming, delete the table of the old name, and then create a new table with the new name; otherwise, invalidate the cache of that table. | +| ALTER TABLE | If it is renaming, delete the table of the old name, and then create a new table with the new name; otherwise, invalidate the cache of that table. | | ADD PARTITION | Add a partition to the cached partition list of the corresponding table. | | DROP PARTITION | Delete a partition from the cached partition list of the corresponding table, and invalidate the cache of that partition. | | ALTER PARTITION | If it is a renaming, delete the partition of the old name, and then create a new partition with the new name; otherwise, invalidate the cache of that partition. | -> After data ingestion, changes in partition tables will follow the `ALTER PARTITION` logic, while those in non-partition tables will follow the `ALTER TABLE` logic. -> -> If changes are conducted on the file system directly instead of through the HMS, the HMS will not generate an event. As a result, such changes will not be perceived by Doris. +After data ingestion, changes in partition tables will follow the `ALTER PARTITION` logic, while those in non-partition tables will follow the `ALTER TABLE` logic. + +If changes are conducted on the file system directly instead of through the HMS, the HMS will not generate an event. As a result, such changes will not be perceived by Doris. The automatic update feature involves the following parameters in fe.conf: -1. 
`enable_hms_events_incremental_sync`: This specifies whether to enable automatic incremental synchronization for metadata, which is disabled by default. -2. `hms_events_polling_interval_ms`: This specifies the interval between two readings, which is set to 10000 by default. (Unit: millisecond) -3. `hms_events_batch_size_per_rpc`: This specifies the maximum number of events that are read at a time, which is set to 500 by default. +- `enable_hms_events_incremental_sync`: This specifies whether to enable automatic incremental synchronization for metadata, which is disabled by default. +- `hms_events_polling_interval_ms`: This specifies the interval between two readings, which is set to 10000 by default. (Unit: millisecond) +- `hms_events_batch_size_per_rpc`: This specifies the maximum number of events that are read at a time, which is set to 500 by default. To enable automatic update(Excluding Huawei MRS), you need to modify the hive-site.xml of HMS and then restart HMS and HiveServer2: -``` +```Plain <property> <name>hive.metastore.event.db.notification.api.auth</name> <value>false</value> @@ -328,43 +330,24 @@ To enable automatic update(Excluding Huawei MRS), you need to modify the hive-si <name>hive.metastore.transactional.event.listeners</name> <value>org.apache.hive.hcatalog.listener.DbNotificationListener</value> </property> - ``` -Huawei's MRS needs to change hivemetastore-site.xml and restart HMS and HiveServer2: +For Huawei MRS, you need to change hivemetastore-site.xml and restart HMS and HiveServer2: -``` +```Plain <property> <name>metastore.transactional.event.listeners</name> <value>org.apache.hive.hcatalog.listener.DbNotificationListener</value> </property> ``` -Note: Value is appended with commas separated from the original value, not overwritten.For example, the default configuration for MRS 3.1.0 is - -``` -<property> - <name>metastore.transactional.event.listeners</name> - <value>com.huawei.bigdata.hive.listener.TableKeyFileManagerListener,org.apache.hadoop.hive.metastore.listener.FileAclListener</value> -</property> -``` - -We need to change to - -``` -<property> - <name>metastore.transactional.event.listeners</name> - <value>com.huawei.bigdata.hive.listener.TableKeyFileManagerListener,org.apache.hadoop.hive.metastore.listener.FileAclListener,org.apache.hive.hcatalog.listener.DbNotificationListener</value> -</property> -``` - -> Note: To enable automatic update, whether for existing Catalogs or newly created Catalogs, all you need is to set `enable_hms_events_incremental_sync` to `true`, and then restart the FE node. You don't need to manually update the metadata before or after the restart. - ## Hive Version -Doris can correctly access the Hive Metastore in different Hive versions. By default, Doris will access the Hive Metastore with a Hive 2.3 compatible interface. You can also specify the hive version when creating the Catalog. If accessing Hive 1.1.0 version: +Doris can correctly access the Hive Metastore in different Hive versions. By default, Doris will access the Hive Metastore with a Hive 2.3 compatible interface. If you encounter the `Invalid method name: 'get_table_req'` error during queries, the root cause is Hive version incompatibility. -```sql +You can specify the Hive version when creating the Catalog. 
For example, to access Hive 1.1.0 version: + +```Plain CREATE CATALOG hive PROPERTIES ( 'type'='hms', 'hive.metastore.uris' = 'thrift://172.0.0.1:9083', @@ -376,126 +359,214 @@ CREATE CATALOG hive PROPERTIES ( For Hive/Iceberge/Hudi -| HMS Type | Doris Type | Comment | -|---|---|---| -| boolean| boolean | | -| tinyint|tinyint | | -| smallint| smallint| | -| int| int | | -| bigint| bigint | | -| date| date| | -| timestamp| datetime| | -| float| float| | -| double| double| | -| char| char | | -| varchar| varchar| | -| decimal| decimal | | -| `array<type>` | `array<type>`| support nested type, for example `array<array<int>>` | -| `map<KeyType, ValueType>` | `map<KeyType, ValueType>` | support nested type, for example `map<string, array<int>>` | -| `struct<col1: Type1, col2: Type2, ...>` | `struct<col1: Type1, col2: Type2, ...>` | support nested type, for example `struct<col1: array<int>, col2: map<int, date>>` | -| other | unsupported | | - -## Whether to truncate char or varchar columns according to the schema of the hive table - -If the variable `truncate_char_or_varchar_columns` is enabled, when the maximum length of the char or varchar column in the schema of the hive table is inconsistent with the schema in the underlying parquet or orc file, it will be truncated according to the maximum length of the hive table column. - -The variable default is false. +| HMS Type | Doris Type | Comment | +| ------------------------------------- | ------------------------------------- | ------------------------------------------------------------ | +| boolean | boolean | | +| tinyint | tinyint | | +| smallint | smallint | | +| int | int | | +| bigint | bigint | | +| date | date | | +| timestamp | datetime | | +| float | float | | +| double | double | | +| char | char | | +| varchar | varchar | | +| decimal | decimal | | +| array<type> | array<type> | Supports nested structure, such as array<map<string, int>> | +| map<KeyType, ValueType> | map<KeyType, ValueType> | Supports nested structure, such as map<string, array<int>> | +| struct<col1: Type1, col2: Type2, ...> | struct<col1: Type1, col2: Type2, ...> | Supports nested structure, such as struct<col1: array<int>, col2: map<int, date>> | +| other | unsupported | | + +:::tip +**Truncate char or varchar columns** + +If `truncate_char_or_varchar_columns` is enabled, and the maximum length of the char or varchar column in the schema of the hive table is inconsistent with the schema in the underlying Parquet or ORC file, the columns will be truncated according to the maximum length of the hive table column. + +The variable defaults to false. +::: ## Access HMS with broker -Add following setting when creating an HMS catalog, file splitting and scanning for Hive external table will be completed by broker named `test_broker` +Add the following setting when creating an HMS catalog, file splitting and scanning for Hive external table will be completed by the broker named `test_broker`. -```sql +```SQL "broker.name" = "test_broker" ``` +Doris supports querying HMS Catalog Iceberg by implementing the Iceberg FileIO interface in the Broker. If needed, you can add the following configuration when creating an HMS Catalog. + +```SQL +"io-impl" = "org.apache.doris.datasource.iceberg.broker.IcebergBrokerIO" +``` + ## Integrate with Apache Ranger Apache Ranger is a security framework for monitoring, enabling services, and comprehensive data security access management on the Hadoop platform. 
-Currently doris supports ranger library, table, and column permissions, but does not support encryption, row permissions, etc. +Doris supports authentication on specified External Hive Catalog using Apache Ranger. + +Currently Doris supports authentication of Ranger libraries, tables, and columns, but does not support encryption, row privileges, and Data Mask. -### Settings +If you need to use Apache Ranger for authentication of the entire Doris cluster, refer to [Integration with Apache Ranger](https://doris.apache.org/docs/dev/admin-manual/privilege-ldap/ranger/). -To connect to the Hive Metastore with Ranger permission verification enabled, you need to add configuration & configuration environment: +### Environment settings + +To connect to the Hive Metastore with Ranger authentication enabled, you need to add configurations and environment settings as follows: 1. When creating a Catalog, add: -```sql +```SQL "access_controller.properties.ranger.service.name" = "hive", "access_controller.class" = "org.apache.doris.catalog.authorizer.RangerHiveAccessControllerFactory", ``` +:::note + `access_controller.properties.ranger.service.name` refers to the type of service, such as `hive`, `hdfs`, etc. It does not correspond to the value of `ranger.plugin.hive.service.name` in the configuration file. +::: + 2. Configure all FE environments: - 1. Copy the configuration files ranger-hive-audit.xml, ranger-hive-security.xml, and ranger-policymgr-ssl.xml under the HMS conf directory to the FE conf directory. - - 2. Modify the properties of ranger-hive-security.xml, the reference configuration is as follows: - - ```sql - <?xml version="1.0" encoding="UTF-8"?> - <?xml-stylesheet type="text/xsl" href="configuration.xsl"?> - <configuration> - #The directory for caching permission data, needs to be writable - <property> - <name>ranger.plugin.hive.policy.cache.dir</name> - <value>/mnt/datadisk0/zhangdong/rangerdata</value> - </property> - #The time interval for periodically pulling permission data - <property> - <name>ranger.plugin.hive.policy.pollIntervalMs</name> - <value>30000</value> - </property> - - <property> - <name>ranger.plugin.hive.policy.rest.client.connection.timeoutMs</name> - <value>60000</value> - </property> - - <property> - <name>ranger.plugin.hive.policy.rest.client.read.timeoutMs</name> - <value>60000</value> - </property> - - <property> - <name>ranger.plugin.hive.policy.rest.ssl.config.file</name> - <value></value> - </property> - - <property> - <name>ranger.plugin.hive.policy.rest.url</name> - <value>http://172.21.0.32:6080</value> - </property> - - <property> - <name>ranger.plugin.hive.policy.source.impl</name> - <value>org.apache.ranger.admin.client.RangerAdminRESTClient</value> - </property> - - <property> - <name>ranger.plugin.hive.service.name</name> - <value>hive</value> - </property> - - <property> - <name>xasecure.hive.update.xapolicies.on.grant.revoke</name> - <value>true</value> - </property> - - </configuration> - ``` - - 3. In order to obtain the log of Ranger authentication itself, add the configuration file log4j.properties in the `<doris_home>/conf` directory. - - 4. Restart FE. - -### Best Practices - -1. Create user user1 on the ranger side and authorize the query permission of db1.table1.col1 - -2. Create role role1 on the ranger side and authorize the query permission of db1.table1.col2 - -3. Create a user user1 with the same name in doris, user1 will directly have the query authority of db1.table1.col1 - -4. 
Create role1 with the same name in doris, and assign role1 to user1, user1 will have the query authority of db1.table1.col1 and col2 at the same time +a. Copy the configuration files ranger-hive-audit.xml, ranger-hive-security.xml, and ranger-policymgr-ssl.xml under the HMS conf directory to the FE conf directory. + +b. Modify the properties in ranger-hive-security.xml, an example is as follows: + +```SQL +<?xml version="1.0" encoding="UTF-8"?> +<?xml-stylesheet type="text/xsl" href="configuration.xsl"?> +<configuration> + #The directory for caching permission data, needs to be writable + <property> + <name>ranger.plugin.hive.policy.cache.dir</name> + <value>/mnt/datadisk0/zhangdong/rangerdata</value> + </property> + #The time interval for periodically pulling permission data + <property> + <name>ranger.plugin.hive.policy.pollIntervalMs</name> + <value>30000</value> + </property> + + <property> + <name>ranger.plugin.hive.policy.rest.client.connection.timeoutMs</name> + <value>60000</value> + </property> + + <property> + <name>ranger.plugin.hive.policy.rest.client.read.timeoutMs</name> + <value>60000</value> + </property> + + <property> + <name>ranger.plugin.hive.policy.rest.ssl.config.file</name> + <value></value> + </property> + + <property> + <name>ranger.plugin.hive.policy.rest.url</name> + <value>http://172.21.0.32:6080</value> + </property> + + <property> + <name>ranger.plugin.hive.policy.source.impl</name> + <value>org.apache.ranger.admin.client.RangerAdminRESTClient</value> + </property> + + <property> + <name>ranger.plugin.hive.service.name</name> + <value>hive</value> + </property> + + <property> + <name>xasecure.hive.update.xapolicies.on.grant.revoke</name> + <value>true</value> + </property> + +</configuration> +``` + +c. In order to obtain the logs of Ranger authentication, add the configuration file log4j.properties in the `<doris_home>/conf` directory. + +d. Restart FE. + +### Best Practice + +1. Create `user1` on the Ranger side and authorize the query permission of db1.table1.col1 +2. Create `role1` on the Ranger side and authorize the query permission of db1.table1.col2 +3. Create the same `user1` in Doris, `user1` will be directly granted with the query permission of db1.table1.col1 +4. Create the same `role1` in Doris, and assign `role1` to `user1`. `user1` will be directly granted with the query permission of db1.table1.col1 and col2 at the same time. +5. The Admin and Root privileges of Doris are not impacted by Apache Ranger. + +## Authentication with Kerberos + +Kerberos is an authentication protocol designed to provide strong identity verification for applications through the use of encryption with private keys. + +### Environment settings + +1. When the services in the cluster are configured with Kerberos authentication, the relevant authentication information is needed in Hive Catalog configuration. + +- `hadoop.kerberos.keytab`: the principal required for authentication. It should be the same as the keytab in the Doris cluster. +- `hadoop.kerberos.principal`: Find the corresponding principal for the hostname in the Doris cluster, such as `doris/``hostn...@hadoop.com`, and verify with `klist -kt` using the keytab. +- `yarn.resourcemanager.principal`: Go to the Yarn Resource Manager node and retrieve this from `yarn-site.xml`. Verify the keytab of Yarn with `klist -kt`. +- `hive.metastore.kerberos.principal`: Go to the Hive metadata service node and retrieve this from `hive-site.xml`. Verify the keytab of Hive with `klist -kt`. 
+- `hadoop.security.authentication`: Set this to `kerberos` to enable Hadoop Kerberos authentication.
+
+Place the `krb5.conf` file and `keytab` authentication file on all BE and FE nodes. Ensure that the `keytab` authentication file uses the same path on every node. By default, the `krb5.conf` file is placed in the `/etc/krb5.conf` path. Also, confirm that the JVM parameter `-Djava.security.krb5.conf` and the environment variable `KRB5_CONFIG` point to the correct path of the `krb5.conf` file.
+
+2. After the configuration is completed, if you cannot locate the issue in the FE and BE logs, you can enable Kerberos debugging.
+
+- On all FE and BE nodes, locate the `conf/fe.conf` and `conf/be.conf` files in the deployment path.
+- After finding the configuration files, set the JVM parameter `-Dsun.security.krb5.debug=true` in the `JAVA_OPTS` variable to enable Kerberos debugging.
+- The FE Kerberos authentication debug information can be found in the FE log path `log/fe.out`, and the BE Kerberos authentication debug information can be found in the BE log path `log/be.out`.
+- For solutions to more common issues, refer to [FAQ](https://doris.apache.org/docs/dev/lakehouse/faq/).
+
+### Best practice
+
+Example:
+
+```SQL
+CREATE CATALOG hive_krb PROPERTIES (
+    'type'='hms',
+    'hive.metastore.uris' = 'thrift://172.0.0.1:9083',
+    'hive.metastore.sasl.enabled' = 'true',
+    'hive.metastore.kerberos.principal' = 'your-hms-principal',
+    'hadoop.security.authentication' = 'kerberos',
+    'hadoop.kerberos.keytab' = '/your-keytab-filepath/your.keytab',
+    'hadoop.kerberos.principal' = 'your-princi...@your.com',
+    'yarn.resourcemanager.principal' = 'your-rm-principal'
+);
+```
+
+Example with both HDFS HA and Kerberos authentication information:
+
+```SQL
+CREATE CATALOG hive_krb_ha PROPERTIES (
+    'type'='hms',
+    'hive.metastore.uris' = 'thrift://172.0.0.1:9083',
+    'hive.metastore.sasl.enabled' = 'true',
+    'hive.metastore.kerberos.principal' = 'your-hms-principal',
+    'hadoop.security.authentication' = 'kerberos',
+    'hadoop.kerberos.keytab' = '/your-keytab-filepath/your.keytab',
+    'hadoop.kerberos.principal' = 'your-princi...@your.com',
+    'yarn.resourcemanager.principal' = 'your-rm-principal',
+    'dfs.nameservices'='your-nameservice',
+    'dfs.ha.namenodes.your-nameservice'='nn1,nn2',
+    'dfs.namenode.rpc-address.your-nameservice.nn1'='172.21.0.2:8088',
+    'dfs.namenode.rpc-address.your-nameservice.nn2'='172.21.0.3:8088',
+    'dfs.client.failover.proxy.provider.your-nameservice'='org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider'
+);
+```
+
+## Hive transactional table
+
+Hive transactional tables are tables that support ACID semantics in Hive. For more details, see: https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions
+
+### Support for Hive transactional tables
+
+| Table Type                      | Supported Operations in Hive                | Hive Table Properties                                             | Supported Hive Versions                                                                     |
+| ------------------------------- | ------------------------------------------- | ------------------------------------------------------------------ | -------------------------------------------------------------------------------------------- |
+| Full-ACID Transactional Table   | Supports Insert, Update, Delete operations  | 'transactional'='true'                                              | 3.x, 2.x. In 2.x, loading can only be performed after major compaction is executed in Hive. |
+| Insert-Only Transactional Table | Only supports Insert operations             | 'transactional'='true', 'transactional_properties'='insert_only'    | 3.x, 2.x                                                                                     |
+
+### Current restrictions
+Currently, scenarios involving Original Files are not supported.
After a table is converted into a transactional table, subsequent newly written data files will follow the schema of Hive Transactional table. However, existing data files will not be converted to the schema of the Transactional table. These files are referred to as Original Files. diff --git a/versioned_docs/version-2.0/table-design/best-practice.md b/versioned_docs/version-2.0/table-design/best-practice.md index 5c4333bd7b8c..216ffcdcffd7 100644 --- a/versioned_docs/version-2.0/table-design/best-practice.md +++ b/versioned_docs/version-2.0/table-design/best-practice.md @@ -27,156 +27,451 @@ under the License. # Best Practices -## 1 tabulation +## Data models -### 1.1 Data Model Selection +> Doris arranges data in three models: DUPLICATE KEY model, UNIQUE KEY model, and AGGREGATE KEY model. -Doris data model is currently divided into three categories: AGGREGATE KEY, UNIQUE KEY, DUPLICATE KEY. Data in all three models are sorted by KEY. +:::tip +**Recommendations** -1.1.1. AGGREGATE KEY +Because the data model is determined and **immutable** during table creation, it is important to select the most suitable data model. -When AGGREGATE KEY is the same, old and new records are aggregated. The aggregation functions currently supported are SUM, MIN, MAX, REPLACE. +1. The Duplicate Key model is a good fit for ad-hoc queries across any dimensions. It cannot leverage the benefits of pre-aggregation, but it is also not constrained by the limitations of an aggregation model, so it can take advantage of columnar storage (only reading the relevant columns without the need to read all key columns). +2. The Aggregate Key model, through pre-aggregation, can largely reduce the amount of data scanned and the computational workload for aggregate queries. It is particularly suitable for reporting queries with fixed patterns. However, this model is not friendly for count(*) queries. Additionally, since the aggregation method on the Value columns are fixed, users should pay extra attention to semantic correctness when performing other types of aggregate queries. +3. The Unique Key model is designed for scenarios that require unique primary key constraints. It can ensure the uniqueness of primary keys. The downside is that it cannot reap the benefits brought by materialization and pre-aggregation. For users with high-performance requirements for aggregate queries, it is recommended to use the Merge-on-Write feature of the Unique Key model since Doris 1.2. +4. Users with partial column update requirements might select from the following data models: + 1. Unique Key model (Merge-on-Write mode) + 2. Aggregate Key model (aggregated by REPLACE_IF_NOT_NULL) +::: -AGGREGATE KEY model can aggregate data in advance and is suitable for reporting and multi-dimensional analysis business. +### DUPLICATE KEY model -``` -CREATE TABLE site_visit + + +When only the sorting columns are specified, rows with the same key will not be merged. + +This is applicable to analytical business scenarios where data does not require pre-aggregation: + +- Analysis of raw data +- Analysis of log or time series data where only new data is appended. + +**Best practice** + +```SQL +-- For example, log analysis that allows only appending new data with replicated KEYs. 
+CREATE TABLE session_data ( -siteid INT, -City: SMALLINT, -username VARCHAR (32), -pv BIGINT SUM DEFAULT '0' + visitorid SMALLINT, + sessionid BIGINT, + visittime DATETIME, + city CHAR(20), + province CHAR(20), + ip varchar(32), + brower CHAR(20), + url VARCHAR(1024) ) -AGGREGATE KEY(siteid, city, username) -DISTRIBUTED BY HASH(siteid) BUCKETS 10; +DUPLICATE KEY(visitorid, sessionid) -- Used solely for specifying sorting columns, rows with the same KEY will not be merged. +DISTRIBUTED BY HASH(sessionid, visitorid) BUCKETS 10; ``` -1.1.2. UNIQUE KEY +### AGGREGATE KEY model -When UNIQUE KEY is the same, the new record covers the old record. Before version 1.2, UNIQUE KEY implements the same REPLACE aggregation method as AGGREGATE KEY, and they are essentially the same. We introduced a new merge-on-write implementation for UNIQUE KEY since version 1.2, which have better performance on many scenarios. Suitable for analytical business with updated requirements. + -``` -CREATE TABLE sales_order +Old and new records with the same AGGREGATE KEY will be aggregated. The currently supported aggregation methods are as follows: + +1. SUM: calculates the sum by accumulating the values of multiple rows; +2. REPLACE: replaces the value in the previously imported rows with the value from the next batch of data; +3. MAX: retains the maximum value; +4. MIN: retains the minimum value; +5. REPLACE_IF_NOT_NULL: replaces non-null values. Unlike REPLACE, it does not replace null values; +6. HLL_UNION: aggregates columns of HLL type using the HyperLogLog algorithm; +7. BITMAP_UNION: aggregates columns of BITMAP type using bitmap union aggregation; + +This is suitable for reporting and multi-dimensional analysis scenarios such as: + +- Website traffic analysis +- Multi-dimensional analysis of data reports. + +**Best practice** + +```SQL +-- Example of website traffic analysis +CREATE TABLE site_visit ( -orderid BIGINT, -status TINYINT, -username VARCHAR (32), -amount BIGINT DEFAULT '0' + siteid INT, + city SMALLINT, + username VARCHAR(32), + pv BIGINT SUM DEFAULT '0' -- PV caculation ) -KEY (orderid) UNIT -DISTRIBUTED BY HASH(orderid) BUCKETS 10; +AGGREGATE KEY(siteid, city, username) -- Rows with the same KEY will be merged, and non-key columns will be aggregated based on the specified aggregation function. +DISTRIBUTED BY HASH(siteid) BUCKETS 10; ``` -1.1.3. DUPLICATE KEY +### UNIQUE KEY model -Only sort columns are specified, and the same rows are not merged. It is suitable for the analysis business where data need not be aggregated in advance. +The new record will replace the old record that has the same UNIQUE KEY with it. Before Doris 1.2, the UNIQUE KEY model was implemented the same way as the REPLACE aggregation of the AGGREGATE KEY model. However, Since Doris 1.2, we have introduced the Merge-on-Write implementation for the UNIQUE KEY model, which provides better performance for aggregate queries. -``` -CREATE TABLE session_data +This is suitable for analytical business scenarios that require updates, such as: + +- Deduplicated order analysis +- Real-time synchronization of inserts, updates, and deletes. + +**Best practice** + +```SQL +-- Example of deduplicated order analysis +CREATE TABLE sales_order ( -visitorid SMALLINT, -sessionid BIGINT, -visit time DATETIME, -City CHAR (20), -province CHAR(20), -ip. 
varchar (32), -brower CHAR(20), -url: VARCHAR (1024) + orderid BIGINT, + status TINYINT, + username VARCHAR(32), + amount BIGINT DEFAULT '0' ) -DUPLICATE KEY (visitor time, session time) -DISTRIBUTED BY HASH(sessionid, visitorid) BUCKETS 10; +UNIQUE KEY(orderid) -- Rows of the same KEY will be merged +DISTRIBUTED BY HASH(orderid) BUCKETS 10; ``` -### 1.2 Wide Table vs. Star Schema - -When the business side builds tables, in order to adapt to the front-end business, they often do not distinguish between dimension information and indicator information, and define the Schema as a large wide table, this operation is actually not so friendly to the database, we recommend users to use the star model. +## Index -* There are many fields in Schema, and there may be more key columns in the aggregation model. The number of columns that need to be sorted in the import process will increase. -* Dimensional information updates are reflected in the whole table, and the frequency of updates directly affects the efficiency of queries. +> Indexes can facilitate quick filtering and searching of data. Currently, Doris supports two types of indexes: +> +> 1. Built-in smart indexes, including prefix index and ZoneMap index. +> 2. User-created secondary indexes, including inverted index, BloomFilter index, Ngram BloomFilter index, and Bitmap index. -In the process of using Star Schema, users are advised to use Star Schema to distinguish dimension tables from indicator tables as much as possible. Frequently updated dimension tables can also be placed in MySQL external tables. If there are only a few updates, they can be placed directly in Doris. When storing dimension tables in Doris, more copies of dimension tables can be set up to improve Join's performance. +### **Prefix index** -### 1.3 Partitioning and Bucketing +Prefix indexes are built-in indexes in the Aggregate, Unique, and Duplicate data models. The underlying data storage is sorted and stored based on the columns specified as the AGGREGATE KEY, UNIQUE KEY, or DUPLICATE KEY in their respective table creation statements. Prefix index, built on top of the sorted data, allows for quick data querying based on given prefix columns. -Doris supports two-level partitioned storage. The first level is partition, which currently supports both RANGE and LIST partition types, and the second layer is HASH bucket. +Prefix indexes are sparse indexes and cannot locate the exact row where a key is present. Instead, they can only roughly identify the range where the key may exist, and then use binary search algorithms to accurately locate the position of the key. -1.3.1. Partitioning +:::tip +**Recommendations** -Partition is used to divide data into different intervals, which can be logically understood as dividing the original table into multiple sub-tables. Data can be easily managed by partition, for example, to delete data more quickly. +1. When creating a table, **the correct column order can greatly improve query efficiency**. + 1. Since the column order is specified during table creation, a table can only have one type of prefix index. However, this may not be efficient enough for queries based on the columns without prefix index. In such cases, users can adjust the column order by creating materialized views. +2. The first field in a prefix index should always be the field of the longest query conditions and should be a high-cardinality field. + 1. 
Bucketing field: It should have relatively even data distribution and be frequently used, preferably a high-cardinality field. + 2. Int(4) + Int(4) + varchar(50): The prefix index length is only 28. + 3. Int(4) + varchar(50) + Int(4): The prefix index length is only 24. + 4. varchar(10) + varchar(50): The prefix index length is only 30. + 5. Prefix index (36 characters): The first field delivers the best query performance. If a varchar field is encountered, the prefix index will automatically truncate it to the first 20 characters. + 6. If possible, include the most frequently used query fields in the prefix index. Otherwise, specify them as the bucketing fields. +3. The field lengths in the prefix index should be as explicit as possible because Doris can only utilize the prefix index for the first 36 bytes. +4. If it is difficult to design a partitioning, bucketing, and prefix index strategy for your data range, consider introducing inverted indexes for acceleration. +::: -1.3.1.1. Range Partitioning +### **ZoneMap index** -In business, most users will choose to partition on time, which has the following advantages: +ZoneMap index is index information automatically maintained on a per-column basis in the columnar storage format. It includes information such as Min/Max values and the number of Null values. During data querying, the ZoneMap index is utilized to select the data range to scan based on the filtered fields using range conditions. -* Differentiable heat and cold data -* Availability of Doris Hierarchical Storage (SSD + SATA) +For example, when filtering the "age" field with the following query statement: -1.3.1.2. List Partitioning +```Shell +SELECT * FROM table WHERE age > 0 and age < 51; +``` -In business,, users can select cities or other enumeration values for partition. +If the Short Key Index is not hit, the ZoneMap index will be used to determine the data range, known as the "ordinary" range, that needs to be scanned based on the query conditions for the "age" field. This reduces the number of pages that need to be scanned. -1.3.2. Hash Bucketing +### **Inverted index** -The data is divided into different buckets according to the hash value. +Doris supports inverted indexes since version 2.0.0. Inverted index can be used for full-text searches on text data and range queries on regular numeric and date types. It enables fast filtering of rows that meet the conditions from massive amounts of data. -* It is suggested that columns with large differentiation should be used as buckets to avoid data skew. -* In order to facilitate data recovery, it is suggested that the size of a single bucket should not be too large and should be kept within 10GB. Therefore, the number of buckets should be considered reasonably when building tables or increasing partitions, among which different partitions can specify different buckets. +**Best practice** -### 1.4 Sparse Index and Bloom Filter +```SQL +-- Inverted index can be specified during table creation or added later. 
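+-- To add an inverted index to an existing table later, Doris 2.0 also provides statements of the following form
+-- (the table, index, and column names here are the same placeholders used in this example):
+--   ALTER TABLE table_name ADD INDEX idx_name1(column_name1) USING INVERTED;
+--   BUILD INDEX idx_name1 ON table_name;  -- builds the new index for data written before the index was added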
This is an example of specifying it during table creation: +CREATE TABLE table_name +( + columns_difinition, + INDEX idx_name1(column_name1) USING INVERTED [PROPERTIES("parser" = "english|unicode|chinese")] [COMMENT 'your comment'] + INDEX idx_name2(column_name2) USING INVERTED [PROPERTIES("parser" = "english|unicode|chinese")] [COMMENT 'your comment'] + INDEX idx_name3(column_name3) USING INVERTED [PROPERTIES("parser" = "chinese", "parser_mode" = "fine_grained|coarse_grained")] [COMMENT 'your comment'] + INDEX idx_name4(column_name4) USING INVERTED [PROPERTIES("parser" = "english|unicode|chinese", "support_phrase" = "true|false")] [COMMENT 'your comment'] + INDEX idx_name5(column_name4) USING INVERTED [PROPERTIES("char_filter_type" = "char_replace", "char_filter_pattern" = "._"), "char_filter_replacement" = " "] [COMMENT 'your comment'] + INDEX idx_name5(column_name4) USING INVERTED [PROPERTIES("char_filter_type" = "char_replace", "char_filter_pattern" = "._")] [COMMENT 'your comment'] +) +table_properties; -Doris stores the data in an orderly manner, and builds a sparse index for Doris on the basis of ordered data. The index granularity is block (1024 rows). +-- Example: keyword matching in full-text searches, implemented by MATCH_ANY MATCH_ALL +SELECT * FROM table_name WHERE column_name MATCH_ANY | MATCH_ALL 'keyword1 ...'; +``` -Sparse index chooses fixed length prefix in schema as index content, and Doris currently chooses 36 bytes prefix as index. +:::tip +**Recommendations** -* When building tables, it is suggested that the common filter fields in queries should be placed in front of Schema. The more distinguishable the query fields are, the more frequent the query fields are. -* One particular feature of this is the varchar type field. The varchar type field can only be used as the last field of the sparse index. The index is truncated at varchar, so if varchar appears in front, the length of the index may be less than 36 bytes. Specifically, you can refer to [data model](./data-model.md), [ROLLUP and query](./hit-the-rollup.md). -* In addition to sparse index, Doris also provides bloomfilter index. Bloomfilter index has obvious filtering effect on columns with high discrimination. If you consider that varchar cannot be placed in a sparse index, you can create a bloomfilter index. +1. If it is difficult to design a partitioning, bucketing, and prefix index strategy for your data range, consider introducing inverted indexes for acceleration. +::: -### 1.5 Rollup -Rollup can essentially be understood as a physical index of the original table. When creating Rollup, only some columns in Base Table can be selected as Schema. The order of fields in Schema can also be different from that in Base Table. +:::caution +**Restrictions** -Rollup can be considered in the following cases: +1. Different data models have different restrictions on inverted index. + 1. Aggregate KEY model: only allows inverted index for Key columns + 2. Unique KEY model: allows inverted index for any column after enabling Merge-on-Write + 3. Duplicate KEY model: allows inverted index for any column +::: -1.5.1. Low ratio of data aggregation in the Base Table +### **BloomFilter index** -This is usually due to the fact that Base Table has more differentiated fields. At this point, you can consider selecting some columns and establishing Rollup. 
+Doris supports adding BloomFilter indexes to fields with high value distinctiveness, making it suitable for scenarios that involve equivalence queries on columns with high cardinality. -For the `site_visit'table: +**Best practice** +```SQL +-- Example: add "bloom_filter_columns"="k1,k2,k3" in the PROPERTIES of the table creation statement. +-- To create BloomFilter index for saler_id and category_id in the table. +CREATE TABLE IF NOT EXISTS sale_detail_bloom ( + sale_date date NOT NULL COMMENT "Sale data", + customer_id int NOT NULL COMMENT "Customer ID", + saler_id int NOT NULL COMMENT "Saler ID", + sku_id int NOT NULL COMMENT "SKU ID", + category_id int NOT NULL COMMENT "Category ID", + sale_count int NOT NULL COMMENT "Sale count", + sale_price DECIMAL(12,2) NOT NULL COMMENT "Sale price", + sale_amt DECIMAL(20,2) COMMENT "Sale amount" +) +Duplicate KEY(sale_date, customer_id,saler_id,sku_id,category_id) +DISTRIBUTED BY HASH(saler_id) BUCKETS 10 +PROPERTIES ( +"bloom_filter_columns"="saler_id,category_id" +); ``` -site -u visit (siteid, city, username, pv) -``` - -Siteid may lead to a low degree of data aggregation. If business parties often base their PV needs on city statistics, they can build a city-only, PV-based rollup: -``` -ALTER TABLE site_visit ADD ROLLUP rollup_city(city, pv); +:::caution +**Restrictions** + +1. BloomFilter indexes are not supported for columns of type Tinyint, Float, and Double. +2. BloomFilter indexes can only accelerate filtering using "in" and "=" operators. +3. BloomFilter indexes must be built on high-cardinality columns (above 5000) in query conditions that involve "in" or "=" operators. + 1. BloomFilter indexes are suitable for non-prefix filtering. + 2. Queries will filter based on the high-frequency values in the column, and the filtering conditions are mostly "in" and "=". + 3. Unlike Bitmap indexes, BloomFilter indexes are suitable for high-cardinality columns, such as UserID. If created on low-cardinality columns like "gender", each block will contain almost all values, rendering the BloomFilter index meaningless. + 4. It is suitable for cases with data cardinality around half of the total range. + 5. For high-cardinality columns with equality (=) queries, such as ID numbers, using Bitmap indexes can greatly accelerate performance. +::: + +### **Ngram BloomFilter index** + +Since 2.0.0, Doris has introduced the NGram BloomFilter index to improve the performance of "**LIKE**" queries. + +**Best practice** + +```SQL +-- Example of creating NGram BloomFilter index in table creation statement +CREATE TABLE `nb_table` ( + `siteid` int(11) NULL DEFAULT "10" COMMENT "", + `citycode` smallint(6) NULL COMMENT "", + `username` varchar(32) NULL DEFAULT "" COMMENT "", + INDEX idx_ngrambf (`username`) USING NGRAM_BF PROPERTIES("gram_size"="3", "bf_size"="256") COMMENT 'username ngram_bf index' +) ENGINE=OLAP +AGGREGATE KEY(`siteid`, `citycode`, `username`) COMMENT "OLAP" +DISTRIBUTED BY HASH(`siteid`) BUCKETS 10; + +-- PROPERTIES("gram_size"="3", "bf_size"="256"), representing the number of grams and the byte size of the BloomFilter +-- The number of grams is determined according to the query cases and is typically set to the length of the majority of query strings. The number of bytes in the BloomFilter can be determined after testing. Generally, a larger number of bytes leads to better filtering results, and it is recommended to start with a value of 256 for testing and evaluating the effectiveness. 
However, it's important to note that a larger number of bytes also increases the storage cost of the index. +-- With high data cardinality, there is no need to set a large BloomFilter size. Conversely, with low data cardinality, increase the BloomFilter size to enhance filtering efficiency. ``` -1.5.2. The prefix index in Base Table cannot be hit +:::caution +**Restrictions** -Generally, the way Base Table is constructed cannot cover all query modes. At this point, you can consider adjusting the column order and establishing Rollup. +1. NGram BloomFilter index only supports string columns. +2. NGram BloomFilter indexes and BloomFilter indexes are mutually exclusive, meaning that only one of them can be set for the same column. +3. The sizes of the NGram and the BloomFilter can both be optimized based on the actual situation. If the NGram size is relatively small, you may increase the BloomFilter size. +4. For data at the scale of billions or above, if there is a need for fuzzy matching, it is recommended to use inverted indexes or NGram BloomFilter. +::: -Database Session +### **Bitmap index** -``` -session -u data (visitorid, sessionid, visittime, city, province, ip, browser, url) -``` +To accelerate data queries, Doris supports users in adding Bitmap indexes to certain fields. This is suitable for scenarios involving equivalence or range queries on columns with lower cardinality. -In addition to visitorid analysis, there are browser and province analysis cases, Rollup can be established separately. +**Best practice** +```SQL +-- Example: create Bitmap index for siteid on bitmap_table +CREATE INDEX [IF NOT EXISTS] bitmap_index_name ON +bitmap_table (siteid) +USING BITMAP COMMENT 'bitmap_siteid'; ``` -ALTER TABLE session_data ADD ROLLUP rollup_browser(browser,province,ip,url) DUPLICATE KEY(browser,province); -``` - -## Schema Change - -Users can modify the Schema of an existing table through the Schema Change operation, currently Doris supports the following modifications: -- Adding and deleting columns -- Modify column types -- Reorder columns -- Adding or removing index +:::caution +**Restrictions** + +1. Bitmap indexes can only be created on a single column. +2. Bitmap indexes can be applied to all columns in the `Duplicate` and `Unique` Key models, as well as the key columns in the `Aggregate` Key model. +3. Bitmap indexes support the following data types: + 1. `TINYINT` + 2. `SMALLINT` + 3. `INT` + 4. `BIGINT` + 5. `CHAR` + 6. `VARCHAR` + 7. `DATE` + 8. `DATETIME` + 9. `LARGEINT` + 10. `DECIMAL` + 11. `BOOL` +4. Bitmap indexes only take effect under Segment V2. The storage format of tables with Bitmap indexes will be automatically converted to V2 format by default. +5. Bitmap indexes should be constructed within a certain cardinality range. It is not suitable for extremely high or low cardinality cases. + 1. It is recommended for columns with a cardinality between 100 and 100,000, such as the occupation field or city field. If the duplication rate is too high, there won't be significant advantages to build Bitmap indexes compared to other types of indexes. If the duplication rate is too low, Bitmap indexes can significantly reduce space efficiency and performance. Specific types of queries, such as count, OR, and AND operations, only require bitwise operations. + 2. Bitmap indexes are more suitable for orthogonal queries. 
+:::
+
+## Field type
+
+Doris supports various field types, including precise deduplication with BITMAP, fuzzy deduplication with HLL, semi-structured data types such as ARRAY/MAP/JSON, as well as common numeric, string, and time types.
+
+:::tip
+**Recommendations**
+
+1. VARCHAR
+   1. Variable-length string with a length range of 1-65533 bytes. It is stored in UTF-8 encoding, where English characters typically occupy 1 byte.
+   2. There is often a misunderstanding about the performance difference between varchar(255) and varchar(65533). If the data stored in both cases is the same, the performance will be the same as well. When creating a table, if you are unsure about the maximum length of the field, it is recommended to use varchar(65533) to prevent import errors caused by excessively long strings.
+2. STRING
+   1. Variable-length string with a default size of 1048576 bytes (1MB), which can be increased to 2147483643 bytes (2GB). It is stored in UTF-8 encoding, where English characters typically occupy 1 byte.
+   2. It can only be used in value columns but not key columns or partitioning columns.
+   3. It is suitable for storing large text content. However, if such a requirement does not exist, it is recommended to use VARCHAR. STRING columns have limitations as they cannot be used in key columns or partitioning columns.
+3. Numeric fields: Choose the appropriate data type based on the required precision. There are no special restrictions on this.
+4. Time fields: Note that if there is a high precision requirement (timestamp accurate to milliseconds), you need to specify the use of datetime(6). Otherwise, such timestamps are not supported by default.
+5. It is recommended to use the JSON data type instead of string type for storing JSON data.
+:::
+
+## Create table
+
+Considerations in creating a table include the setting of data partitions and buckets in addition to data model, index, and field types.
+
+**Best practice**
+
+```SQL
+-- Take Merge-on-Write tables in the Unique Key model as an example:
+-- Merge-on-Write in the Unique Key model is implemented in a different way from the Aggregate Key model. Its performance is similar to that of the Duplicate Key model.
+-- In use cases requiring primary key constraints, the Merge-on-Write Unique Key model can deliver much better query performance than the Aggregate Key model, especially in aggregate queries and queries that involve filtering a large amount of data using indexes.
+
+-- For non-partitioned tables
+CREATE TABLE IF NOT EXISTS tbl_unique_merge_on_write
+(
+    `user_id` LARGEINT NOT NULL COMMENT "User ID",
+    `username` VARCHAR(50) NOT NULL COMMENT "Username",
+    `register_time` DATE COMMENT "User registration time",
+    `city` VARCHAR(20) COMMENT "User city",
+    `age` SMALLINT COMMENT "User age",
+    `sex` TINYINT COMMENT "User gender",
+    `phone` LARGEINT COMMENT "User phone number",
+    `address` VARCHAR(500) COMMENT "User address"
+)
+UNIQUE KEY(`user_id`, `username`)
+-- Data volume of 3~5G
+DISTRIBUTED BY HASH(`user_id`) BUCKETS 10
+PROPERTIES (
+-- In Doris 1.2.0, as a new feature, Merge-on-Write is disabled by default. Users can enable it by adding the following property.
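+-- Note: this property can only be set when the table is created; it cannot be enabled afterwards on an existing table.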
+"enable_unique_key_merge_on_write" = "true" +); + +-- For partitioned tables +CREATE TABLE IF NOT EXISTS tbl_unique_merge_on_write_p +( + `user_id` LARGEINT NOT NULL COMMENT "Use ID", + `username` VARCHAR(50) NOT NULL COMMENT "Username", + `register_time` DATE COMMENT "User registration time", + `city` VARCHAR(20) COMMENT "User city", + `age` SMALLINT COMMENT "User age", + `sex` TINYINT COMMENT "User gender", + `phone` LARGEINT COMMENT "User phone number", + `address` VARCHAR(500) COMMENT "User address" +) +UNIQUE KEY(`user_id`, `username`, `register_time`) +PARTITION BY RANGE(`register_time`) ( + PARTITION p00010101_1899 VALUES [('0001-01-01'), ('1900-01-01')), + PARTITION p19000101 VALUES [('1900-01-01'), ('1900-01-02')), + PARTITION p19000102 VALUES [('1900-01-02'), ('1900-01-03')), + PARTITION p19000103 VALUES [('1900-01-03'), ('1900-01-04')), + PARTITION p19000104_1999 VALUES [('1900-01-04'), ('2000-01-01')), + FROM ("2000-01-01") TO ("2022-01-01") INTERVAL 1 YEAR, + PARTITION p30001231 VALUES [('3000-12-31'), ('3001-01-01')), + PARTITION p99991231 VALUES [('9999-12-31'), (MAXVALUE)) +) +-- Data volume of 3~5G +DISTRIBUTED BY HASH(`user_id`) BUCKETS 10 +PROPERTIES ( +-- In Doris 1.2.0, as a new feature, Merge-on-Write is disabled by default. Users can enable it by adding the following property. +"enable_unique_key_merge_on_write" = "true", +-- The unit for dynamic partition scheduling can be specified as HOUR, DAY, WEEK, MONTH, or YEAR. +"dynamic_partition.time_unit" = "MONTH", +-- The starting offset for dynamic partitioning is specified as a negative number. Depending on the value of the time_unit, it uses the current day (week/month) as the reference point, partitions prior to this offset will be deleted (TTL). If not specified, the default value is -2147483648, indicating that historical partitions will not be deleted. +"dynamic_partition.start" = "-3000", +-- The ending offset for dynamic partitioning is specified as a positive number. Depending on the value of the time_unit, it uses the current day (week/month) as the reference point. Create the corresponding partitions of the specified range in advance. +"dynamic_partition.end" = "10", +-- The prefix for names of the dynamically created partitions (required). +"dynamic_partition.prefix" = "p", +-- The number of buckets corresponding to the dynamically created partitions. +"dynamic_partition.buckets" = "10", +"dynamic_partition.enable" = "true", +-- The following is the number of replicas corresponding to dynamically created partitions. If not specified, the default value will be the replication factor specified when creating the table, which is typically 3. +"dynamic_partition.replication_num" = "3", +"replication_num" = "3" +); + +-- View existing partitions +-- The actual number of created partitions is determined by a combination of dynamic_partition.start, dynamic_partition.end, and the settings of PARTITION BY RANGE. +show partitions from tbl_unique_merge_on_write_p; +``` -For details, please refer to [Schema Change](../advanced/alter-table/schema-change.md) \ No newline at end of file +:::tip +**Recommendations** + +1. The database name should be in lowercase, separated by underscores (_) with a maximum length of 62 bytes. +2. Table names are case-sensitive and should be in lowercase, separated by underscores (_) with a maximum length of 64 bytes. +3. Manual bucketing is preferred over auto bucketing. 
It is recommended to partition and bucket your data based on your specific data volume, as this can improve data writing and query performance. Auto bucketing can result in a large number of tablets and an abundance of small files. +4. For data volumes ranging from 10 million to 200 million rows, you can directly set a bucketing strategy and skip the partition configurations. (Doris internally has default partitions for the table). +5. In **time-series scenarios**, it is recommended to include `compaction_policy" = "time_series` in the table properties when creating the table. This effectively reduces the write amplification of compaction, especially in continuous data ingestion scenarios. Note that it should be used in conjunction with inverted indexes. +::: + + +:::caution +**Restrictions** + +1. The database character set should be specified as UTF-8 since only UTF-8 is supported. +2. The replication factor for tables must be 3 (if not specified, it defaults to 3). +3. The data volume of an individual tablet (**Tablet Count = Partition Count \* Bucket Count \* Replication Factor**) theoretically has no upper or lower bounds, except for small tables (in the range of hundreds of megabytes to one gigabyte), where it should be ensured to be within the range of 1 GB to 10 GB: + 1. If the data volume of an individual tablet is too small, it can result in poor data aggregation performance and increased metadata management overhead. + 2. If the data volume is too large, it hinders replica migration, synchronization, and increases the cost of schema changes or materialization operations (these operations are retried at the granularity of a tablet). +4. For data exceeding 500 million records, **partitioning and bucketing** strategies must be implemented: + 1. **Recommendations for bucketing:** + 1. For large tables, each tablet should be in the range of 1 GB to 10 GB to prevent the generation of too many small files. + 2. For dimension tables of approximately 100 megabytes, the number of tablets should be controlled within the range of 3 to 5. This ensures a certain level of concurrency without generating excessive small files. + 2. If partitioning is not feasible and the data grows rapidly without the possibility of dynamic time-based partitioning, it is advisable to increase the number of buckets to accommodate the data volume based on the data retention period (180 days). It is still recommended to keep the size of each bucket between 1 GB and 10 GB. + 3. Apply salting to the bucketing field and use the same salting strategy for queries in order to leverage bucket pruning capabilities. + 4. Random bucketing: + 1. If an OLAP table does not have fields that need to be updated, setting the data bucketing mode to RANDOM can avoid severe data skew. During data ingestion, each batch of data is randomly assigned to a tablet for writing. + 2. When the bucketing mode for a table is set to RANDOM, since there is no bucketing column, querying the table will scan all buckets in the hit partitions instead of querying specific buckets based on the values of the bucketing column. This setting is suitable for overall aggregation and analysis queries rather than high-concurrency point queries. + 3. If an OLAP table has a random distribution of data, setting the `load_to_single_tablet` parameter to true during data ingestion allows each task to write to a single tablet. This improves concurrency and throughput during large-scale data ingestion. 
It can also reduce the write amplification caused by data ingestion and compaction and ensure cluster stability. + 5. Dimension tables, which grow slowly, can use a single partition and apply bucketing based on commonly used query conditions (where the data distribution of the bucketing field is relatively even). + 6. Fact tables. +5. If the bucketing field exhibits more than 30% data skew, it is recommended to avoid using the Hash bucketing strategy and instead use the RANDOM bucketing strategy. +6. Dynamic partitioning should not be used for tables of a data size less than 20 million rows. Because the dynamic partitioning method automatically creates partitions, and for small tables, it might unnecessarily create lots of partitions and buckets). +7. For scenarios where there is a large amount of historical partitioned data but the historical data is relatively small, unbalanced, or queried infrequently, you can use the following approach to place the data in special partitions. You can create historical partitions for historical data of small sizes (e.g., yearly partitions, monthly partitions). For example, you can create historical partitions for data `FROM ("2000-01-01") TO ("2022-01-01") INTERVAL 1 YEAR`: + 1. ( + + 2. PARTITION p00010101_1899 VALUES [('0001-01-01'), ('1900-01-01')), + + 3. PARTITION p19000101 VALUES [('1900-01-01'), ('1900-01-02')), + + 4. ... + + 5. PARTITION p19000104_1999 VALUES [('1900-01-04'), ('2000-01-01')), + + 6. FROM ("2000-01-01") TO ("2022-01-01") INTERVAL 1 YEAR, + + 7. PARTITION p30001231 VALUES [('3000-12-31'), ('3001-01-01')), + + 8. PARTITION p99991231 VALUES [('9999-12-31'), (MAXVALUE)) + + 9. ) +8. The number of single-table materialized views should not exceed 6: + 1. Single-table materialized views are built in real time. + 2. On the Unique Key model, materialized views can only be used to reorder keys and cannot be used for data aggregation, as the aggregation model for the Unique Key model is Replace. +::: --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org