This is an automated email from the ASF dual-hosted git repository. luzhijing pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/doris-website.git
The following commit(s) were added to refs/heads/master by this push: new aff431f07fb [doc](mtmv) Add some feature docs for 2.1.5 and master (#876) aff431f07fb is described below commit aff431f07fb9cc681ed41d19242ea878140eb8c5 Author: seawinde <149132972+seawi...@users.noreply.github.com> AuthorDate: Fri Jul 19 11:33:13 2024 +0800 [doc](mtmv) Add some feature docs for 2.1.5 and master (#876) Add docs for some feature in 2.1.5 such as - enable nondeterministic function when create materialized view - support more query rewrite --- .../query-async-materialized-view.md | 83 ++++++++++++-- .../Create/CREATE-ASYNC-MATERIALIZED-VIEW.md | 11 +- .../query-async-materialized-view.md | 81 +++++++++++-- .../Create/CREATE-ASYNC-MATERIALIZED-VIEW.md | 9 +- .../query-async-materialized-view.md | 127 +++++++++++++++------ .../Create/CREATE-ASYNC-MATERIALIZED-VIEW.md | 20 +++- .../query-async-materialized-view.md | 83 ++++++++++++-- .../Create/CREATE-ASYNC-MATERIALIZED-VIEW.md | 11 +- 8 files changed, 342 insertions(+), 83 deletions(-) diff --git a/docs/query/view-materialized-view/query-async-materialized-view.md b/docs/query/view-materialized-view/query-async-materialized-view.md index 09a88a8a6ad..7988e76fdb9 100644 --- a/docs/query/view-materialized-view/query-async-materialized-view.md +++ b/docs/query/view-materialized-view/query-async-materialized-view.md @@ -345,6 +345,73 @@ l_shipdate, l_suppkey; ``` + +**Case 3** +Supports transparent rewriting for multidimensional aggregation. That is, if the materialized view does not contain +GROUPING SETS, CUBE, ROLLUP, and there are multidimensional aggregations in the query. Additionally, if the fields +after the group by in the materialized view include all the fields in the multidimensional aggregation in the query, +then transparent rewriting can also be performed. + + +Materialized view definition: +```sql +CREATE MATERIALIZED VIEW mv5_1 +BUILD IMMEDIATE REFRESH AUTO ON SCHEDULE EVERY 1 hour +DISTRIBUTED BY RANDOM BUCKETS 3 +PROPERTIES ('replication_num' = '1') +AS +select o_orderstatus, o_orderdate, o_orderpriority, + sum(o_totalprice) as sum_total, + max(o_totalprice) as max_total, + min(o_totalprice) as min_total, + count(*) as count_all +from orders +group by +o_orderstatus, o_orderdate, o_orderpriority; +``` + +Query statement: +```sql +select o_orderstatus, o_orderdate, o_orderpriority, + sum(o_totalprice), + max(o_totalprice), + min(o_totalprice), + count(*) +from orders +group by +GROUPING SETS ((o_orderstatus, o_orderdate), (o_orderpriority), (o_orderstatus), ()); +``` + + +**Case 4** +When the query contains aggregation and the materialized view does not contain aggregation, if all the columns +used in the query can be obtained from the materialized view, then the rewrite can also be successful. + +Materialized view definition: +```sql +CREATE MATERIALIZED VIEW mv5_2 +BUILD IMMEDIATE REFRESH AUTO ON SCHEDULE EVERY 1 hour +DISTRIBUTED BY RANDOM BUCKETS 3 +PROPERTIES ('replication_num' = '1') +AS +select case when o_shippriority > 1 and o_orderkey IN (4, 5) then o_custkey else o_shippriority end, + o_orderstatus, + bin(o_orderkey) +from orders; +``` + +Query statement: +```sql +select + count(case when o_shippriority > 1 and o_orderkey IN (4, 5) then o_custkey else o_shippriority end), + o_orderstatus, + bin(o_orderkey) +from orders +group by + o_orderstatus, + bin(o_orderkey); +``` + Temporary support for the aggregation roll-up functions is as follows: | Functions in Queries | Functions in Materialized Views | Aggregation Functions After Rewriting | @@ -587,14 +654,14 @@ you can execute the following statement. It will provide a detailed breakdown of ## Relevant Environment Variables -| Switch | Description | -|---------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------| -| SET enable_nereids_planner = true; | Asynchronous materialized views are only supported under the new optimizer, so the new optimizer needs to be enabled. | -| SET enable_materialized_view_rewrite = true; | Enable or disable query transparent rewriting, default is disabled | -| SET materialized_view_rewrite_enable_contain_external_table = true; | Whether materialized views participating in transparent rewriting are allowed to contain external tables, default is not allowed | -| SET materialized_view_rewrite_success_candidate_num = 3; | Transparently rewrites the successful result set, allowing the maximum number of CBO candidates to participate, the default is 3 | -| SET enable_materialized_view_union_rewrite = true; | Whether to allow the union of base table and materialized view using UNION ALL when the partitioned materialized view is insufficient to provide all the data required by a query. Default is enabled. | -| SET enable_materialized_view_nest_rewrite = true; | Whether to allow nested rewriting. Default is disabled. | +| Switch | Description | +|---------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| SET enable_nereids_planner = true; | Asynchronous materialized views are only supported under the new optimizer, so the new optimizer needs to be enabled. | +| SET enable_materialized_view_rewrite = true; | Enable or disable query transparent rewriting, default is enabled | +| SET materialized_view_rewrite_enable_contain_external_table = true; | Whether materialized views participating in transparent rewriting are allowed to contain external tables, default is not allowed | +| SET materialized_view_rewrite_success_candidate_num = 3; | Transparently rewrites the successful result set, allowing the maximum number of CBO candidates to participate, the default is 3 | +| SET enable_materialized_view_union_rewrite = true; | Whether to allow the union of base table and materialized view using UNION ALL when the partitioned materialized view is insufficient to provide all the data required by a query. Default is enabled. | +| SET enable_materialized_view_nest_rewrite = true; | Whether to allow nested rewriting. Default is disabled. | | SET materialized_view_relation_mapping_max_count = 8; | Maximum number of relation mappings allowed during transparent rewrite. If exceeded, truncation will occur. Relation mapping is typically generated by self-joins in tables and the number is usually the Cartesian product, for example, if there are 3 tables, it may generate 8 combinations. Default is 8. | ## Limitations diff --git a/docs/sql-manual/sql-statements/Data-Definition-Statements/Create/CREATE-ASYNC-MATERIALIZED-VIEW.md b/docs/sql-manual/sql-statements/Data-Definition-Statements/Create/CREATE-ASYNC-MATERIALIZED-VIEW.md index 64036ea2d08..4e588733f71 100644 --- a/docs/sql-manual/sql-statements/Data-Definition-Statements/Create/CREATE-ASYNC-MATERIALIZED-VIEW.md +++ b/docs/sql-manual/sql-statements/Data-Definition-Statements/Create/CREATE-ASYNC-MATERIALIZED-VIEW.md @@ -213,14 +213,11 @@ The properties unique to materialized views include: `workload_group`: The name of the workload_group used by the materialized view when performing refresh tasks. This is used to limit the resources used for refreshing data in the materialized view, in order to avoid affecting the operation of other business processes. For details on how to create and use workload_group, refer to [WORKLOAD-GROUP](../../../../admin-manual/workload-group.md) -##### query +`query`: Create a query statement for the materialized view, and the result is the data in the materialized view -Create a query statement for the materialized view, and the result is the data in the materialized view - -Random functions are not supported, for example: -```sql -SELECT random() as dd,k3 FROM user -``` +`enable_nondeterministic_function`: Whether the SQL definition of the materialized view allows containing nondeterministic +functions, such as current_date(), now(), random(), etc. If true, they are allowed; otherwise, they are not allowed. +By default, they are not allowed. ### Example diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/query/view-materialized-view/query-async-materialized-view.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/query/view-materialized-view/query-async-materialized-view.md index b636d78810c..b9574ce36c7 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/query/view-materialized-view/query-async-materialized-view.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/query/view-materialized-view/query-async-materialized-view.md @@ -323,6 +323,69 @@ l_shipdate, l_suppkey; ``` +**用例 3** +支持多维聚合的透明改写,即如果物化视图中没有 GROUPING SETS, CUBE, ROLLUP, 查询中有多维聚合。并且物化视图 group by 后的字段包含查询中多维聚合 +中的所有字段。那么也可以进行透明改写。 + + +mv 定义: +```sql +CREATE MATERIALIZED VIEW mv5_1 +BUILD IMMEDIATE REFRESH AUTO ON SCHEDULE EVERY 1 hour +DISTRIBUTED BY RANDOM BUCKETS 3 +PROPERTIES ('replication_num' = '1') +AS +select o_orderstatus, o_orderdate, o_orderpriority, + sum(o_totalprice) as sum_total, + max(o_totalprice) as max_total, + min(o_totalprice) as min_total, + count(*) as count_all +from orders +group by +o_orderstatus, o_orderdate, o_orderpriority; +``` + +查询语句: +```sql +select o_orderstatus, o_orderdate, o_orderpriority, + sum(o_totalprice), + max(o_totalprice), + min(o_totalprice), + count(*) +from orders +group by +GROUPING SETS ((o_orderstatus, o_orderdate), (o_orderpriority), (o_orderstatus), ()); +``` + + +**用例 4** +当查询中包含聚合,物化视图中不包含聚合,查询中使用的列都可以从物化视图中获取,那么也可以改写成功。 + +mv 定义: +```sql +CREATE MATERIALIZED VIEW mv5_2 +BUILD IMMEDIATE REFRESH AUTO ON SCHEDULE EVERY 1 hour +DISTRIBUTED BY RANDOM BUCKETS 3 +PROPERTIES ('replication_num' = '1') +AS +select case when o_shippriority > 1 and o_orderkey IN (4, 5) then o_custkey else o_shippriority end, + o_orderstatus, + bin(o_orderkey) +from orders; +``` + +查询语句: +```sql +select + count(case when o_shippriority > 1 and o_orderkey IN (4, 5) then o_custkey else o_shippriority end), + o_orderstatus, + bin(o_orderkey) +from orders +group by + o_orderstatus, + bin(o_orderkey); +``` + 暂时目前支持的聚合上卷函数列表如下: | 查询中函数 | 物化视图中函数 | 函数上卷后 | @@ -537,19 +600,19 @@ where o_orderstatus = 'o' ## 相关环境变量 -| 开关 | 说明 | -|---------------------------------------------------------------------|------------------------------------------------------------------------------------------------------| -| SET enable_nereids_planner = true; | 异步物化视图只有在新优化器下才支持,所以需要开启新优化器 | -| SET enable_materialized_view_rewrite = true; | 开启或者关闭查询透明改写,默认关闭 | -| SET materialized_view_rewrite_enable_contain_external_table = true; | 参与透明改写的物化视图是否允许包含外表,默认不允许 | -| SET materialized_view_rewrite_success_candidate_num = 3; | 透明改写成功的结果集合,允许参与到 CBO 候选的最大数量,默认是 3 | -| SET enable_materialized_view_union_rewrite = true; | 当分区物化视图不足以提供查询的全部数据时,是否允许基表和物化视图 union all 来响应查询,默认允许 | -| SET enable_materialized_view_nest_rewrite = true; | 是否允许嵌套改写,默认不允许 | +| 开关 | 说明 | +|---------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------| +| SET enable_nereids_planner = true; | 异步物化视图只有在新优化器下才支持,所以需要开启新优化器 | +| SET enable_materialized_view_rewrite = true; | 开启或者关闭查询透明改写,默认开启 | +| SET materialized_view_rewrite_enable_contain_external_table = true; | 参与透明改写的物化视图是否允许包含外表,默认不允许 | +| SET materialized_view_rewrite_success_candidate_num = 3; | 透明改写成功的结果集合,允许参与到 CBO 候选的最大数量,默认是 3 | +| SET enable_materialized_view_union_rewrite = true; | 当分区物化视图不足以提供查询的全部数据时,是否允许基表和物化视图 union all 来响应查询,默认允许 | +| SET enable_materialized_view_nest_rewrite = true; | 是否允许嵌套改写,默认不允许 | | SET materialized_view_relation_mapping_max_count = 8; | 透明改写过程中,relation mapping最大允许数量,如果超过,进行截取。relation mapping通常由表自关联产生,数量一般会是笛卡尔积,比如3张表,可能会产生 8 种组合。默认是 8 | ## 限制 - 物化视图定义语句中只允许包含 SELECT、FROM、WHERE、JOIN、GROUP BY 语句,JOIN 的输入可以包含简单的 GROUP BY(单表聚合),其中 JOIN 的支持的类型为 - INNER,LEFT OUTER JOIN,RIGHT OUTER JOIN,FULL OUTER JOIN, LEFT SEMI JOIN,RIGHT SEMI JOIN,LEFT ANTI JOIN,RIGHT ANTI JOIN。 + INNER,LEFT OUTER JOIN,RIGHT OUTER JOIN,FULL OUTER JOIN,LEFT SEMI JOIN,RIGHT SEMI JOIN,LEFT ANTI JOIN,RIGHT ANTI JOIN。 - 基于 External Table 的物化视图不保证查询结果强一致。 - 不支持使用非确定性函数来构建物化视图,包括 rand、now、current_time、current_date、random、uuid 等。 - 不支持窗口函数的透明改写。 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-statements/Data-Definition-Statements/Create/CREATE-ASYNC-MATERIALIZED-VIEW.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-statements/Data-Definition-Statements/Create/CREATE-ASYNC-MATERIALIZED-VIEW.md index 3fcfe11e9e4..b786d141065 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-statements/Data-Definition-Statements/Create/CREATE-ASYNC-MATERIALIZED-VIEW.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-statements/Data-Definition-Statements/Create/CREATE-ASYNC-MATERIALIZED-VIEW.md @@ -221,14 +221,11 @@ partition by (date_trunc(`k2`,'month')) `partition_date_format`:分区字段的时间格式,例如"%Y-%m-%d" -`query` +`enable_nondeterministic_function`:物化视图定义 SQL 是否允许包含 nondeterministic 函数,比如 current_date(), now(), random()等,如果 +是 true, 允许包含,否则不允许包含, 默认不允许包含。 -创建物化视图的查询语句,其结果即为物化视图中的数据 +`query`:创建物化视图的查询语句,其结果即为物化视图中的数据 -不支持随机函数,例如: -```sql -SELECT random() as dd,k3 FROM user -``` ### 示例 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/query/view-materialized-view/query-async-materialized-view.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/query/view-materialized-view/query-async-materialized-view.md index 183c62e8c29..a55dbdbcd44 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/query/view-materialized-view/query-async-materialized-view.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/query/view-materialized-view/query-async-materialized-view.md @@ -59,9 +59,9 @@ CREATE TABLE IF NOT EXISTS lineitem ( PROPERTIES ("replication_num" = "1"); insert into lineitem values - (1, 2, 3, 4, 5.5, 6.5, 7.5, 8.5, 'o', 'k', '2023-10-17', '2023-10-17', '2023-10-17', 'a', 'b', 'yyyyyyyyy'), - (2, 4, 3, 4, 5.5, 6.5, 7.5, 8.5, 'o', 'k', '2023-10-18', '2023-10-18', '2023-10-18', 'a', 'b', 'yyyyyyyyy'), - (3, 2, 4, 4, 5.5, 6.5, 7.5, 8.5, 'o', 'k', '2023-10-19', '2023-10-19', '2023-10-19', 'a', 'b', 'yyyyyyyyy'); +(1, 2, 3, 4, 5.5, 6.5, 7.5, 8.5, 'o', 'k', '2023-10-17', '2023-10-17', '2023-10-17', 'a', 'b', 'yyyyyyyyy'), +(2, 4, 3, 4, 5.5, 6.5, 7.5, 8.5, 'o', 'k', '2023-10-18', '2023-10-18', '2023-10-18', 'a', 'b', 'yyyyyyyyy'), +(3, 2, 4, 4, 5.5, 6.5, 7.5, 8.5, 'o', 'k', '2023-10-19', '2023-10-19', '2023-10-19', 'a', 'b', 'yyyyyyyyy'); ``` ```sql CREATE TABLE IF NOT EXISTS orders ( @@ -81,31 +81,31 @@ CREATE TABLE IF NOT EXISTS orders ( DISTRIBUTED BY HASH(o_orderkey) BUCKETS 3 PROPERTIES ("replication_num" = "1"); - insert into orders values - (1, 1, 'o', 9.5, '2023-10-17', 'a', 'b', 1, 'yy'), - (1, 1, 'o', 10.5, '2023-10-18', 'a', 'b', 1, 'yy'), - (2, 1, 'o', 11.5, '2023-10-19', 'a', 'b', 1, 'yy'), - (3, 1, 'o', 12.5, '2023-10-19', 'a', 'b', 1, 'yy'); +insert into orders values +(1, 1, 'o', 9.5, '2023-10-17', 'a', 'b', 1, 'yy'), +(1, 1, 'o', 10.5, '2023-10-18', 'a', 'b', 1, 'yy'), +(2, 1, 'o', 11.5, '2023-10-19', 'a', 'b', 1, 'yy'), +(3, 1, 'o', 12.5, '2023-10-19', 'a', 'b', 1, 'yy'); ``` ```sql - CREATE TABLE IF NOT EXISTS partsupp ( - ps_partkey INTEGER NOT NULL, - ps_suppkey INTEGER NOT NULL, - ps_availqty INTEGER NOT NULL, - ps_supplycost DECIMALV3(15,2) NOT NULL, - ps_comment VARCHAR(199) NOT NULL - ) - DUPLICATE KEY(ps_partkey, ps_suppkey) - DISTRIBUTED BY HASH(ps_partkey) BUCKETS 3 - PROPERTIES ( - "replication_num" = "1" - ); +CREATE TABLE IF NOT EXISTS partsupp ( + ps_partkey INTEGER NOT NULL, + ps_suppkey INTEGER NOT NULL, + ps_availqty INTEGER NOT NULL, + ps_supplycost DECIMALV3(15,2) NOT NULL, + ps_comment VARCHAR(199) NOT NULL +) +DUPLICATE KEY(ps_partkey, ps_suppkey) +DISTRIBUTED BY HASH(ps_partkey) BUCKETS 3 +PROPERTIES ( + "replication_num" = "1" +); - insert into partsupp values - (2, 3, 9, 10.01, 'supply1'), - (4, 3, 10, 11.01, 'supply2'), - (2, 3, 10, 11.01, 'supply3'); +insert into partsupp values +(2, 3, 9, 10.01, 'supply1'), +(4, 3, 10, 11.01, 'supply2'), +(2, 3, 10, 11.01, 'supply3'); ``` ## 直查物化视图 @@ -323,6 +323,69 @@ l_shipdate, l_suppkey; ``` +**用例 3** +支持多维聚合的透明改写,即如果物化视图中没有 GROUPING SETS, CUBE, ROLLUP, 查询中有多维聚合。并且物化视图 group by 后的字段包含查询中多维聚合 +中的所有字段。那么也可以进行透明改写。 + + +mv 定义: +```sql +CREATE MATERIALIZED VIEW mv5_1 +BUILD IMMEDIATE REFRESH AUTO ON SCHEDULE EVERY 1 hour +DISTRIBUTED BY RANDOM BUCKETS 3 +PROPERTIES ('replication_num' = '1') +AS +select o_orderstatus, o_orderdate, o_orderpriority, + sum(o_totalprice) as sum_total, + max(o_totalprice) as max_total, + min(o_totalprice) as min_total, + count(*) as count_all +from orders +group by +o_orderstatus, o_orderdate, o_orderpriority; +``` + +查询语句: +```sql +select o_orderstatus, o_orderdate, o_orderpriority, + sum(o_totalprice), + max(o_totalprice), + min(o_totalprice), + count(*) +from orders +group by +GROUPING SETS ((o_orderstatus, o_orderdate), (o_orderpriority), (o_orderstatus), ()); +``` + + +**用例 4** +当查询中包含聚合,物化视图中不包含聚合,查询中使用的列都可以从物化视图中获取,那么也可以改写成功。 + +mv 定义: +```sql +CREATE MATERIALIZED VIEW mv5_2 +BUILD IMMEDIATE REFRESH AUTO ON SCHEDULE EVERY 1 hour +DISTRIBUTED BY RANDOM BUCKETS 3 +PROPERTIES ('replication_num' = '1') +AS +select case when o_shippriority > 1 and o_orderkey IN (4, 5) then o_custkey else o_shippriority end, + o_orderstatus, + bin(o_orderkey) +from orders; +``` + +查询语句: +```sql +select + count(case when o_shippriority > 1 and o_orderkey IN (4, 5) then o_custkey else o_shippriority end), + o_orderstatus, + bin(o_orderkey) +from orders +group by + o_orderstatus, + bin(o_orderkey); +``` + 暂时目前支持的聚合上卷函数列表如下: | 查询中函数 | 物化视图中函数 | 函数上卷后 | @@ -537,14 +600,14 @@ where o_orderstatus = 'o' ## 相关环境变量 -| 开关 | 说明 | -|---------------------------------------------------------------------|------------------------------------------------------------------------------------------------------| -| SET enable_nereids_planner = true; | 异步物化视图只有在新优化器下才支持,所以需要开启新优化器 | -| SET enable_materialized_view_rewrite = true; | 开启或者关闭查询透明改写,默认关闭 | -| SET materialized_view_rewrite_enable_contain_external_table = true; | 参与透明改写的物化视图是否允许包含外表,默认不允许 | -| SET materialized_view_rewrite_success_candidate_num = 3; | 透明改写成功的结果集合,允许参与到 CBO 候选的最大数量,默认是 3 | -| SET enable_materialized_view_union_rewrite = true; | 当分区物化视图不足以提供查询的全部数据时,是否允许基表和物化视图 union all 来响应查询,默认允许 | -| SET enable_materialized_view_nest_rewrite = true; | 是否允许嵌套改写,默认不允许 | +| 开关 | 说明 | +|---------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------| +| SET enable_nereids_planner = true; | 异步物化视图只有在新优化器下才支持,所以需要开启新优化器 | +| SET enable_materialized_view_rewrite = true; | 开启或者关闭查询透明改写,默认开启 | +| SET materialized_view_rewrite_enable_contain_external_table = true; | 参与透明改写的物化视图是否允许包含外表,默认不允许 | +| SET materialized_view_rewrite_success_candidate_num = 3; | 透明改写成功的结果集合,允许参与到 CBO 候选的最大数量,默认是 3 | +| SET enable_materialized_view_union_rewrite = true; | 当分区物化视图不足以提供查询的全部数据时,是否允许基表和物化视图 union all 来响应查询,默认允许 | +| SET enable_materialized_view_nest_rewrite = true; | 是否允许嵌套改写,默认不允许 | | SET materialized_view_relation_mapping_max_count = 8; | 透明改写过程中,relation mapping最大允许数量,如果超过,进行截取。relation mapping通常由表自关联产生,数量一般会是笛卡尔积,比如3张表,可能会产生 8 种组合。默认是 8 | ## 限制 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/sql-manual/sql-statements/Data-Definition-Statements/Create/CREATE-ASYNC-MATERIALIZED-VIEW.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/sql-manual/sql-statements/Data-Definition-Statements/Create/CREATE-ASYNC-MATERIALIZED-VIEW.md index 7562d2c43bf..9f6fad688bc 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/sql-manual/sql-statements/Data-Definition-Statements/Create/CREATE-ASYNC-MATERIALIZED-VIEW.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/sql-manual/sql-statements/Data-Definition-Statements/Create/CREATE-ASYNC-MATERIALIZED-VIEW.md @@ -207,17 +207,22 @@ partition by (date_trunc(`k2`,'month')) `partition_sync_time_unit`:时间单位,支持 DAY/MONTH/YEAR(默认DAY) - @@ -209,9 +208,9 @@ KEY(k1,k2) -SELECT random() as dd,k3 FROM user -``` +`enable_nondeterministic_function`:物化视图定义 SQL 是否允许包含 nondeterministic 函数,比如 current_date(), now(), random()等,如果 +是 true, 允许包含,否则不允许包含, 默认不允许包含。 + +`query`: 创建物化视图的查询语句,其结果即为物化视图中的数据 + ### 示例 -1. 创建一个立即刷新,之后每周刷新一次的物化视图 mv1,数据源为 Hive Catalog +1. 创建一个立即刷新,之后每周刷新一次的物化视图 `mv1`,数据源为 Hive Catalog ```sql CREATE MATERIALIZED VIEW mv1 BUILD IMMEDIATE REFRESH COMPLETE ON SCHEDULE EVERY 1 WEEK - @@ -222,7 +221,7 @@ SELECT random() as dd,k3 FROM user + DISTRIBUTED BY RANDOM BUCKETS 2 + PROPERTIES ( + "replication_num" = "1" + ) AS SELECT * FROM hive_catalog.db1.user; ``` @@ -225,7 +230,10 @@ SELECT random() as dd,k3 FROM user ```sql CREATE MATERIALIZED VIEW mv1 BUILD IMMEDIATE REFRESH COMPLETE ON SCHEDULE EVERY 1 WEEK - @@ -233,7 +232,6 @@ SELECT random() as dd,k3 FROM user + DISTRIBUTED BY RANDOM BUCKETS 2 + PROPERTIES ( + "replication_num" = "1" + ) AS select user.k1,user.k3,com.k4 from user join com on user.k1=com.k1; ``` diff --git a/versioned_docs/version-2.1/query/view-materialized-view/query-async-materialized-view.md b/versioned_docs/version-2.1/query/view-materialized-view/query-async-materialized-view.md index 8f8382ad01c..c73ea97caa2 100644 --- a/versioned_docs/version-2.1/query/view-materialized-view/query-async-materialized-view.md +++ b/versioned_docs/version-2.1/query/view-materialized-view/query-async-materialized-view.md @@ -345,6 +345,73 @@ l_shipdate, l_suppkey; ``` +**Case 3** +Supports transparent rewriting for multidimensional aggregation. That is, if the materialized view does not contain +GROUPING SETS, CUBE, ROLLUP, and there are multidimensional aggregations in the query. Additionally, if the fields +after the group by in the materialized view include all the fields in the multidimensional aggregation in the query, +then transparent rewriting can also be performed. + + +Materialized view definition: +```sql +CREATE MATERIALIZED VIEW mv5_1 +BUILD IMMEDIATE REFRESH AUTO ON SCHEDULE EVERY 1 hour +DISTRIBUTED BY RANDOM BUCKETS 3 +PROPERTIES ('replication_num' = '1') +AS +select o_orderstatus, o_orderdate, o_orderpriority, + sum(o_totalprice) as sum_total, + max(o_totalprice) as max_total, + min(o_totalprice) as min_total, + count(*) as count_all +from orders +group by +o_orderstatus, o_orderdate, o_orderpriority; +``` + +Query statement: +```sql +select o_orderstatus, o_orderdate, o_orderpriority, + sum(o_totalprice), + max(o_totalprice), + min(o_totalprice), + count(*) +from orders +group by +GROUPING SETS ((o_orderstatus, o_orderdate), (o_orderpriority), (o_orderstatus), ()); +``` + + +**Case 4** +When the query contains aggregation and the materialized view does not contain aggregation, if all the columns +used in the query can be obtained from the materialized view, then the rewrite can also be successful. + +Materialized view definition: +```sql +CREATE MATERIALIZED VIEW mv5_2 +BUILD IMMEDIATE REFRESH AUTO ON SCHEDULE EVERY 1 hour +DISTRIBUTED BY RANDOM BUCKETS 3 +PROPERTIES ('replication_num' = '1') +AS +select case when o_shippriority > 1 and o_orderkey IN (4, 5) then o_custkey else o_shippriority end, + o_orderstatus, + bin(o_orderkey) +from orders; +``` + +Query statement: +```sql +select + count(case when o_shippriority > 1 and o_orderkey IN (4, 5) then o_custkey else o_shippriority end), + o_orderstatus, + bin(o_orderkey) +from orders +group by + o_orderstatus, + bin(o_orderkey); +``` + + Temporary support for the aggregation roll-up functions is as follows: | Functions in Queries | Functions in Materialized Views | Aggregation Functions After Rewriting | @@ -587,14 +654,14 @@ you can execute the following statement. It will provide a detailed breakdown of ## Relevant Environment Variables -| Switch | Description | -|---------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------| -| SET enable_nereids_planner = true; | Asynchronous materialized views are only supported under the new optimizer, so the new optimizer needs to be enabled. | -| SET enable_materialized_view_rewrite = true; | Enable or disable query transparent rewriting, default is disabled | -| SET materialized_view_rewrite_enable_contain_external_table = true; | Whether materialized views participating in transparent rewriting are allowed to contain external tables, default is not allowed | -| SET materialized_view_rewrite_success_candidate_num = 3; | Transparently rewrites the successful result set, allowing the maximum number of CBO candidates to participate, the default is 3 | -| SET enable_materialized_view_union_rewrite = true; | Whether to allow the union of base table and materialized view using UNION ALL when the partitioned materialized view is insufficient to provide all the data required by a query. Default is enabled. | -| SET enable_materialized_view_nest_rewrite = true; | Whether to allow nested rewriting. Default is disabled. | +| Switch | Description | +|---------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| SET enable_nereids_planner = true; | Asynchronous materialized views are only supported under the new optimizer, so the new optimizer needs to be enabled. | +| SET enable_materialized_view_rewrite = true; | Enable or disable query transparent rewriting, default is enabled | +| SET materialized_view_rewrite_enable_contain_external_table = true; | Whether materialized views participating in transparent rewriting are allowed to contain external tables, default is not allowed | +| SET materialized_view_rewrite_success_candidate_num = 3; | Transparently rewrites the successful result set, allowing the maximum number of CBO candidates to participate, the default is 3 | +| SET enable_materialized_view_union_rewrite = true; | Whether to allow the union of base table and materialized view using UNION ALL when the partitioned materialized view is insufficient to provide all the data required by a query. Default is enabled. | +| SET enable_materialized_view_nest_rewrite = true; | Whether to allow nested rewriting. Default is disabled. | | SET materialized_view_relation_mapping_max_count = 8; | Maximum number of relation mappings allowed during transparent rewrite. If exceeded, truncation will occur. Relation mapping is typically generated by self-joins in tables and the number is usually the Cartesian product, for example, if there are 3 tables, it may generate 8 combinations. Default is 8. | ## Limitations diff --git a/versioned_docs/version-2.1/sql-manual/sql-statements/Data-Definition-Statements/Create/CREATE-ASYNC-MATERIALIZED-VIEW.md b/versioned_docs/version-2.1/sql-manual/sql-statements/Data-Definition-Statements/Create/CREATE-ASYNC-MATERIALIZED-VIEW.md index 57088d7f8db..54e129e2cf3 100644 --- a/versioned_docs/version-2.1/sql-manual/sql-statements/Data-Definition-Statements/Create/CREATE-ASYNC-MATERIALIZED-VIEW.md +++ b/versioned_docs/version-2.1/sql-manual/sql-statements/Data-Definition-Statements/Create/CREATE-ASYNC-MATERIALIZED-VIEW.md @@ -214,14 +214,11 @@ The properties unique to materialized views include: `workload_group`: The name of the workload_group used by the materialized view when performing refresh tasks. This is used to limit the resources used for refreshing data in the materialized view, in order to avoid affecting the operation of other business processes. For details on how to create and use workload_group, refer to [WORKLOAD-GROUP](../../../../admin-manual/workload-group.md) -##### query +`query`: Create a query statement for the materialized view, and the result is the data in the materialized view -Create a query statement for the materialized view, and the result is the data in the materialized view - -Random functions are not supported, for example: -```sql -SELECT random() as dd,k3 FROM user -``` +`enable_nondeterministic_function`: Whether the SQL definition of the materialized view allows containing nondeterministic +functions, such as current_date(), now(), random(), etc. If true, they are allowed; otherwise, they are not allowed. +By default, they are not allowed. ### Example --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org