This is an automated email from the ASF dual-hosted git repository.

kassiez pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris-website.git


The following commit(s) were added to refs/heads/master by this push:
     new 80d9f3ad153 add hudi_meta tvf (#1673)
80d9f3ad153 is described below

commit 80d9f3ad15382461af55fbd424840e032e73a43d
Author: Socrates <suxiaogang...@icloud.com>
AuthorDate: Thu Jan 2 15:42:57 2025 +0800

    add hudi_meta tvf (#1673)

    docs about https://github.com/apache/doris/pull/46137

    ## Versions

    - [x] dev
    - [x] 3.0
    - [x] 2.1
    - [ ] 2.0

    ## Languages

    - [x] Chinese
    - [x] English

    ## Docs Checklist

    - [ ] Checked by AI
    - [ ] Test Cases Built

    ---------

    Co-authored-by: KassieZ <139741991+kass...@users.noreply.github.com>
---
 docs/lakehouse/datalake-analytics/hudi.md          |  2 +
 .../table-valued-functions/hudi-meta.md            | 95 ++++++++++++++++++++++
 .../current/lakehouse/datalake-analytics/hudi.md   |  2 +
 .../table-valued-functions/hudi-meta.md            | 91 +++++++++++++++++++++
 .../lakehouse/datalake-analytics/hudi.md           |  2 +
 .../table-valued-functions/hudi-meta.md            | 91 +++++++++++++++++++++
 .../lakehouse/datalake-analytics/hudi.md           |  2 +
 .../table-valued-functions/hudi-meta.md            | 90 ++++++++++++++++++++
 .../lakehouse/datalake-analytics/hudi.md           |  2 +
 .../table-valued-functions/hudi-meta.md            | 95 ++++++++++++++++++++++
 .../lakehouse/datalake-analytics/hudi.md           |  2 +
 .../table-valued-functions/hudi-meta.md            | 95 ++++++++++++++++++++++
 12 files changed, 569 insertions(+)

diff --git a/docs/lakehouse/datalake-analytics/hudi.md b/docs/lakehouse/datalake-analytics/hudi.md
index 955981b97c9..0fb2fe0f3e5 100644
--- a/docs/lakehouse/datalake-analytics/hudi.md
+++ b/docs/lakehouse/datalake-analytics/hudi.md
@@ -107,6 +107,8 @@ You can use the `FOR TIME AS OF` statement, based on the time of the snapshot to

 Hudi table does not support the `FOR VERSION AS OF` statement. Using this syntax to query the Hudi table will throw an error.

+In addition, you can use the [hudi_meta](../../sql-manual/sql-functions/table-valued-functions/hudi-meta.md) table function to query the timeline information of the specified table.
+ ## Incremental Read Incremental Read can query the data changed between startTime and endTime, and the returned result set is the final state of the data at endTime. diff --git a/docs/sql-manual/sql-functions/table-valued-functions/hudi-meta.md b/docs/sql-manual/sql-functions/table-valued-functions/hudi-meta.md new file mode 100644 index 00000000000..5e6698dbdfc --- /dev/null +++ b/docs/sql-manual/sql-functions/table-valued-functions/hudi-meta.md @@ -0,0 +1,95 @@ +--- +{ +"title": "HUDI_META", +"language": "en" +} +--- + +<!-- +Licensed to the Apache Software Foundation (ASF) under one +or more contributor license agreements. See the NOTICE file +distributed with this work for additional information +regarding copyright ownership. The ASF licenses this file +to you under the Apache License, Version 2.0 (the +"License"); you may not use this file except in compliance +with the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, +software distributed under the License is distributed on an +"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +KIND, either express or implied. See the License for the +specific language governing permissions and limitations +under the License. +--> + + +## Description + +hudi_meta table-valued-function(tvf), using for read hudi metadata, operation history, timeline of table, instant state etc. + +## Syntax + +```sql +hudi_meta( + "table" = "ctl.db.tbl", + "query_type" = "timeline" + ... + ); +``` + +**parameter description** + +Each parameter in hudi_meta tvf is a pair of `"key"="value"`. + +Related parameters: +- `table`: (required) Use hudi table name the format `catlog.database.table`. +- `query_type`: (required) The type of hudi metadata. Only `timeline` is currently supported. + +## Example + +Read and access the hudi tabular metadata for timeline. 
+ +```sql +select * from hudi_meta("table" = "ctl.db.tbl", "query_type" = "timeline"); + +``` + +Can be used with `desc function` : + +```sql +desc function hudi_meta("table" = "ctl.db.tbl", "query_type" = "timeline"); +``` + +## Keywords + + hudi_meta, table-valued-function, tvf + +## Best Practice + +Inspect the hudi table timeline : + +```sql +select * from hudi_meta("table" = "hudi_ctl.test_db.test_tbl", "query_type" = "timeline"); ++-------------------+--------+--------------------------+-----------+-----------------------+ +| timestamp | action | file_name | state | state_transition_time | ++-------------------+--------+--------------------------+-----------+-----------------------+ +| 20240724195843565 | commit | 20240724195843565.commit | COMPLETED | 20240724195844269 | +| 20240724195845718 | commit | 20240724195845718.commit | COMPLETED | 20240724195846653 | +| 20240724195848377 | commit | 20240724195848377.commit | COMPLETED | 20240724195849337 | +| 20240724195850799 | commit | 20240724195850799.commit | COMPLETED | 20240724195851676 | ++-------------------+--------+--------------------------+-----------+-----------------------+ +``` + +Filtered by timestamp : + +```sql +select * from hudi_meta("table" = "hudi_ctl.test_db.test_tbl", "query_type" = "timeline") +where timestamp = 20240724195843565; ++-------------------+--------+--------------------------+-----------+-----------------------+ +| timestamp | action | file_name | state | state_transition_time | ++-------------------+--------+--------------------------+-----------+-----------------------+ +| 20240724195843565 | commit | 20240724195843565.commit | COMPLETED | 20240724195844269 | ++-------------------+--------+--------------------------+-----------+-----------------------+ +``` diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/datalake-analytics/hudi.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/datalake-analytics/hudi.md index 00a566c02cc..b7f4b3a0a59 
100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/datalake-analytics/hudi.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/datalake-analytics/hudi.md @@ -105,6 +105,8 @@ SELECT * FROM hudi_tbl FOR TIME AS OF "2022-10-07"; ``` Hudi 表不支持 `FOR VERSION AS OF` 语句,使用该语法查询 Hudi 表将抛错。 +另外, 你可以使用 [hudi_meta](../../sql-manual/sql-functions/table-valued-functions/hudi-meta.md) 表函数查询 Hudi 表的时间线,获取 commitTime 和对应的快照时间。 + ## Incremental Read Incremental Read 可以查询在 startTime 和 endTime 之间变化的数据,返回的结果集是数据在 endTime 的最终状态。 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/table-valued-functions/hudi-meta.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/table-valued-functions/hudi-meta.md new file mode 100644 index 00000000000..aac1d312743 --- /dev/null +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/table-valued-functions/hudi-meta.md @@ -0,0 +1,91 @@ +--- +{ +"title": "HUDI_META", +"language": "zh-CN" +} +--- + +<!-- +Licensed to the Apache Software Foundation (ASF) under one +or more contributor license agreements. See the NOTICE file +distributed with this work for additional information +regarding copyright ownership. The ASF licenses this file +to you under the Apache License, Version 2.0 (the +"License"); you may not use this file except in compliance +with the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, +software distributed under the License is distributed on an +"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +KIND, either express or implied. See the License for the +specific language governing permissions and limitations +under the License. 
+--> + + +## 描述 + +hudi_meta 表函数(table-valued-function,tvf),可以用于读取 hudi 表的各类元数据信息,如操作历史、表的时间线、文件元数据等。 + +## 语法 + +```sql +hudi_meta( + "table" = "ctl.db.tbl", + "query_type" = "timeline" + ... + ); +``` + +**参数说明** + +hudi_meta 表函数 tvf 中的每一个参数都是一个 `"key"="value"` 对。 +相关参数: +- `table`: (必填) 完整的表名,需要按照目录名。库名。表名的格式,填写需要查看的 hudi 表名。 +- `query_type`: (必填) 想要查看的元数据类型,目前仅支持 timeline。 + +## 举例 + +读取并访问 hudi 表格式的 timeline 元数据。 + +```sql +select * from hudi_meta("table" = "ctl.db.tbl", "query_type" = "timeline"); + +``` + +可以配合`desc function`使用 + +```sql +desc function hudi_meta("table" = "ctl.db.tbl", "query_type" = "timeline"); +``` + + +## 最佳实践 + +查看 hudi 表的 timeline + +```sql +select * from hudi_meta("table" = "hudi_ctl.test_db.test_tbl", "query_type" = "timeline"); ++-------------------+--------+--------------------------+-----------+-----------------------+ +| timestamp | action | file_name | state | state_transition_time | ++-------------------+--------+--------------------------+-----------+-----------------------+ +| 20240724195843565 | commit | 20240724195843565.commit | COMPLETED | 20240724195844269 | +| 20240724195845718 | commit | 20240724195845718.commit | COMPLETED | 20240724195846653 | +| 20240724195848377 | commit | 20240724195848377.commit | COMPLETED | 20240724195849337 | +| 20240724195850799 | commit | 20240724195850799.commit | COMPLETED | 20240724195851676 | ++-------------------+--------+--------------------------+-----------+-----------------------+ +``` + +根据 timestamp 字段筛选 + +```sql +select * from hudi_meta("table" = "hudi_ctl.test_db.test_tbl", "query_type" = "timeline") +where timestamp = 20240724195843565; ++-------------------+--------+--------------------------+-----------+-----------------------+ +| timestamp | action | file_name | state | state_transition_time | ++-------------------+--------+--------------------------+-----------+-----------------------+ +| 20240724195843565 | commit | 20240724195843565.commit | COMPLETED | 20240724195844269 
| ++-------------------+--------+--------------------------+-----------+-----------------------+ +``` diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse/datalake-analytics/hudi.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse/datalake-analytics/hudi.md index d9e86b4ae97..4186dc0db66 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse/datalake-analytics/hudi.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse/datalake-analytics/hudi.md @@ -105,6 +105,8 @@ SELECT * FROM hudi_tbl FOR TIME AS OF "2022-10-07"; ``` Hudi 表不支持 `FOR VERSION AS OF` 语句,使用该语法查询 Hudi 表将抛错。 +另外, 你可以使用 [hudi_meta](../../sql-manual/sql-functions/table-valued-functions/hudi-meta.md) 表函数查询 Hudi 表的时间线,获取 commitTime 和对应的快照时间。 + ## Incremental Read Incremental Read 可以查询在 startTime 和 endTime 之间变化的数据,返回的结果集是数据在 endTime 的最终状态。 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/sql-manual/sql-functions/table-valued-functions/hudi-meta.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/sql-manual/sql-functions/table-valued-functions/hudi-meta.md new file mode 100644 index 00000000000..aac1d312743 --- /dev/null +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/sql-manual/sql-functions/table-valued-functions/hudi-meta.md @@ -0,0 +1,91 @@ +--- +{ +"title": "HUDI_META", +"language": "zh-CN" +} +--- + +<!-- +Licensed to the Apache Software Foundation (ASF) under one +or more contributor license agreements. See the NOTICE file +distributed with this work for additional information +regarding copyright ownership. The ASF licenses this file +to you under the Apache License, Version 2.0 (the +"License"); you may not use this file except in compliance +with the License. 
You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, +software distributed under the License is distributed on an +"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +KIND, either express or implied. See the License for the +specific language governing permissions and limitations +under the License. +--> + + +## 描述 + +hudi_meta 表函数(table-valued-function,tvf),可以用于读取 hudi 表的各类元数据信息,如操作历史、表的时间线、文件元数据等。 + +## 语法 + +```sql +hudi_meta( + "table" = "ctl.db.tbl", + "query_type" = "timeline" + ... + ); +``` + +**参数说明** + +hudi_meta 表函数 tvf 中的每一个参数都是一个 `"key"="value"` 对。 +相关参数: +- `table`: (必填) 完整的表名,需要按照目录名。库名。表名的格式,填写需要查看的 hudi 表名。 +- `query_type`: (必填) 想要查看的元数据类型,目前仅支持 timeline。 + +## 举例 + +读取并访问 hudi 表格式的 timeline 元数据。 + +```sql +select * from hudi_meta("table" = "ctl.db.tbl", "query_type" = "timeline"); + +``` + +可以配合`desc function`使用 + +```sql +desc function hudi_meta("table" = "ctl.db.tbl", "query_type" = "timeline"); +``` + + +## 最佳实践 + +查看 hudi 表的 timeline + +```sql +select * from hudi_meta("table" = "hudi_ctl.test_db.test_tbl", "query_type" = "timeline"); ++-------------------+--------+--------------------------+-----------+-----------------------+ +| timestamp | action | file_name | state | state_transition_time | ++-------------------+--------+--------------------------+-----------+-----------------------+ +| 20240724195843565 | commit | 20240724195843565.commit | COMPLETED | 20240724195844269 | +| 20240724195845718 | commit | 20240724195845718.commit | COMPLETED | 20240724195846653 | +| 20240724195848377 | commit | 20240724195848377.commit | COMPLETED | 20240724195849337 | +| 20240724195850799 | commit | 20240724195850799.commit | COMPLETED | 20240724195851676 | ++-------------------+--------+--------------------------+-----------+-----------------------+ +``` + +根据 timestamp 字段筛选 + +```sql +select * from hudi_meta("table" = "hudi_ctl.test_db.test_tbl", 
"query_type" = "timeline") +where timestamp = 20240724195843565; ++-------------------+--------+--------------------------+-----------+-----------------------+ +| timestamp | action | file_name | state | state_transition_time | ++-------------------+--------+--------------------------+-----------+-----------------------+ +| 20240724195843565 | commit | 20240724195843565.commit | COMPLETED | 20240724195844269 | ++-------------------+--------+--------------------------+-----------+-----------------------+ +``` diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/lakehouse/datalake-analytics/hudi.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/lakehouse/datalake-analytics/hudi.md index 00a566c02cc..b7f4b3a0a59 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/lakehouse/datalake-analytics/hudi.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/lakehouse/datalake-analytics/hudi.md @@ -105,6 +105,8 @@ SELECT * FROM hudi_tbl FOR TIME AS OF "2022-10-07"; ``` Hudi 表不支持 `FOR VERSION AS OF` 语句,使用该语法查询 Hudi 表将抛错。 +另外, 你可以使用 [hudi_meta](../../sql-manual/sql-functions/table-valued-functions/hudi-meta.md) 表函数查询 Hudi 表的时间线,获取 commitTime 和对应的快照时间。 + ## Incremental Read Incremental Read 可以查询在 startTime 和 endTime 之间变化的数据,返回的结果集是数据在 endTime 的最终状态。 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/sql-manual/sql-functions/table-valued-functions/hudi-meta.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/sql-manual/sql-functions/table-valued-functions/hudi-meta.md new file mode 100644 index 00000000000..7c84e08fb45 --- /dev/null +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/sql-manual/sql-functions/table-valued-functions/hudi-meta.md @@ -0,0 +1,90 @@ +--- +{ +"title": "HUDI_META", +"language": "zh-CN" +} +--- + +<!-- +Licensed to the Apache Software Foundation (ASF) under one +or more contributor license agreements. 
See the NOTICE file +distributed with this work for additional information +regarding copyright ownership. The ASF licenses this file +to you under the Apache License, Version 2.0 (the +"License"); you may not use this file except in compliance +with the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, +software distributed under the License is distributed on an +"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +KIND, either express or implied. See the License for the +specific language governing permissions and limitations +under the License. +--> + +## 描述 + +hudi_meta 表函数(table-valued-function,tvf),可以用于读取 hudi 表的各类元数据信息,如操作历史、表的时间线、文件元数据等。 + +## 语法 + +```sql +hudi_meta( + "table" = "ctl.db.tbl", + "query_type" = "timeline" + ... + ); +``` + +**参数说明** + +hudi_meta 表函数 tvf 中的每一个参数都是一个 `"key"="value"` 对。 +相关参数: +- `table`: (必填) 完整的表名,需要按照目录名。库名。表名的格式,填写需要查看的 hudi 表名。 +- `query_type`: (必填) 想要查看的元数据类型,目前仅支持 timeline。 + +## 举例 + +读取并访问 hudi 表格式的 timeline 元数据。 + +```sql +select * from hudi_meta("table" = "ctl.db.tbl", "query_type" = "timeline"); + +``` + +可以配合`desc function`使用 + +```sql +desc function hudi_meta("table" = "ctl.db.tbl", "query_type" = "timeline"); +``` + + +## 最佳实践 + +查看 hudi 表的 timeline + +```sql +select * from hudi_meta("table" = "hudi_ctl.test_db.test_tbl", "query_type" = "timeline"); ++-------------------+--------+--------------------------+-----------+-----------------------+ +| timestamp | action | file_name | state | state_transition_time | ++-------------------+--------+--------------------------+-----------+-----------------------+ +| 20240724195843565 | commit | 20240724195843565.commit | COMPLETED | 20240724195844269 | +| 20240724195845718 | commit | 20240724195845718.commit | COMPLETED | 20240724195846653 | +| 20240724195848377 | commit | 20240724195848377.commit | COMPLETED | 20240724195849337 | +| 20240724195850799 | commit | 
20240724195850799.commit | COMPLETED | 20240724195851676 | ++-------------------+--------+--------------------------+-----------+-----------------------+ +``` + +根据 timestamp 字段筛选 + +```sql +select * from hudi_meta("table" = "hudi_ctl.test_db.test_tbl", "query_type" = "timeline") +where timestamp = 20240724195843565; ++-------------------+--------+--------------------------+-----------+-----------------------+ +| timestamp | action | file_name | state | state_transition_time | ++-------------------+--------+--------------------------+-----------+-----------------------+ +| 20240724195843565 | commit | 20240724195843565.commit | COMPLETED | 20240724195844269 | ++-------------------+--------+--------------------------+-----------+-----------------------+ +``` diff --git a/versioned_docs/version-2.1/lakehouse/datalake-analytics/hudi.md b/versioned_docs/version-2.1/lakehouse/datalake-analytics/hudi.md index 955981b97c9..0fb2fe0f3e5 100644 --- a/versioned_docs/version-2.1/lakehouse/datalake-analytics/hudi.md +++ b/versioned_docs/version-2.1/lakehouse/datalake-analytics/hudi.md @@ -107,6 +107,8 @@ You can use the `FOR TIME AS OF` statement, based on the time of the snapshot to Hudi table does not support the `FOR VERSION AS OF` statement. Using this syntax to query the Hudi table will throw an error. +In addition, you can use the [hudi_meta](../../sql-manual/sql-functions/table-valued-functions/hudi-meta.md) table function to query the timeline information of the specified table. + ## Incremental Read Incremental Read can query the data changed between startTime and endTime, and the returned result set is the final state of the data at endTime. 
diff --git a/versioned_docs/version-2.1/sql-manual/sql-functions/table-valued-functions/hudi-meta.md b/versioned_docs/version-2.1/sql-manual/sql-functions/table-valued-functions/hudi-meta.md new file mode 100644 index 00000000000..5e6698dbdfc --- /dev/null +++ b/versioned_docs/version-2.1/sql-manual/sql-functions/table-valued-functions/hudi-meta.md @@ -0,0 +1,95 @@ +--- +{ +"title": "HUDI_META", +"language": "en" +} +--- + +<!-- +Licensed to the Apache Software Foundation (ASF) under one +or more contributor license agreements. See the NOTICE file +distributed with this work for additional information +regarding copyright ownership. The ASF licenses this file +to you under the Apache License, Version 2.0 (the +"License"); you may not use this file except in compliance +with the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, +software distributed under the License is distributed on an +"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +KIND, either express or implied. See the License for the +specific language governing permissions and limitations +under the License. +--> + + +## Description + +hudi_meta table-valued-function(tvf), using for read hudi metadata, operation history, timeline of table, instant state etc. + +## Syntax + +```sql +hudi_meta( + "table" = "ctl.db.tbl", + "query_type" = "timeline" + ... + ); +``` + +**parameter description** + +Each parameter in hudi_meta tvf is a pair of `"key"="value"`. + +Related parameters: +- `table`: (required) Use hudi table name the format `catlog.database.table`. +- `query_type`: (required) The type of hudi metadata. Only `timeline` is currently supported. + +## Example + +Read and access the hudi tabular metadata for timeline. 
+ +```sql +select * from hudi_meta("table" = "ctl.db.tbl", "query_type" = "timeline"); + +``` + +Can be used with `desc function` : + +```sql +desc function hudi_meta("table" = "ctl.db.tbl", "query_type" = "timeline"); +``` + +## Keywords + + hudi_meta, table-valued-function, tvf + +## Best Practice + +Inspect the hudi table timeline : + +```sql +select * from hudi_meta("table" = "hudi_ctl.test_db.test_tbl", "query_type" = "timeline"); ++-------------------+--------+--------------------------+-----------+-----------------------+ +| timestamp | action | file_name | state | state_transition_time | ++-------------------+--------+--------------------------+-----------+-----------------------+ +| 20240724195843565 | commit | 20240724195843565.commit | COMPLETED | 20240724195844269 | +| 20240724195845718 | commit | 20240724195845718.commit | COMPLETED | 20240724195846653 | +| 20240724195848377 | commit | 20240724195848377.commit | COMPLETED | 20240724195849337 | +| 20240724195850799 | commit | 20240724195850799.commit | COMPLETED | 20240724195851676 | ++-------------------+--------+--------------------------+-----------+-----------------------+ +``` + +Filtered by timestamp : + +```sql +select * from hudi_meta("table" = "hudi_ctl.test_db.test_tbl", "query_type" = "timeline") +where timestamp = 20240724195843565; ++-------------------+--------+--------------------------+-----------+-----------------------+ +| timestamp | action | file_name | state | state_transition_time | ++-------------------+--------+--------------------------+-----------+-----------------------+ +| 20240724195843565 | commit | 20240724195843565.commit | COMPLETED | 20240724195844269 | ++-------------------+--------+--------------------------+-----------+-----------------------+ +``` diff --git a/versioned_docs/version-3.0/lakehouse/datalake-analytics/hudi.md b/versioned_docs/version-3.0/lakehouse/datalake-analytics/hudi.md index 955981b97c9..0fb2fe0f3e5 100644 --- 
a/versioned_docs/version-3.0/lakehouse/datalake-analytics/hudi.md +++ b/versioned_docs/version-3.0/lakehouse/datalake-analytics/hudi.md @@ -107,6 +107,8 @@ You can use the `FOR TIME AS OF` statement, based on the time of the snapshot to Hudi table does not support the `FOR VERSION AS OF` statement. Using this syntax to query the Hudi table will throw an error. +In addition, you can use the [hudi_meta](../../sql-manual/sql-functions/table-valued-functions/hudi-meta.md) table function to query the timeline information of the specified table. + ## Incremental Read Incremental Read can query the data changed between startTime and endTime, and the returned result set is the final state of the data at endTime. diff --git a/versioned_docs/version-3.0/sql-manual/sql-functions/table-valued-functions/hudi-meta.md b/versioned_docs/version-3.0/sql-manual/sql-functions/table-valued-functions/hudi-meta.md new file mode 100644 index 00000000000..5e6698dbdfc --- /dev/null +++ b/versioned_docs/version-3.0/sql-manual/sql-functions/table-valued-functions/hudi-meta.md @@ -0,0 +1,95 @@ +--- +{ +"title": "HUDI_META", +"language": "en" +} +--- + +<!-- +Licensed to the Apache Software Foundation (ASF) under one +or more contributor license agreements. See the NOTICE file +distributed with this work for additional information +regarding copyright ownership. The ASF licenses this file +to you under the Apache License, Version 2.0 (the +"License"); you may not use this file except in compliance +with the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, +software distributed under the License is distributed on an +"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +KIND, either express or implied. See the License for the +specific language governing permissions and limitations +under the License. 
+--> + + +## Description + +hudi_meta table-valued-function(tvf), using for read hudi metadata, operation history, timeline of table, instant state etc. + +## Syntax + +```sql +hudi_meta( + "table" = "ctl.db.tbl", + "query_type" = "timeline" + ... + ); +``` + +**parameter description** + +Each parameter in hudi_meta tvf is a pair of `"key"="value"`. + +Related parameters: +- `table`: (required) Use hudi table name the format `catlog.database.table`. +- `query_type`: (required) The type of hudi metadata. Only `timeline` is currently supported. + +## Example + +Read and access the hudi tabular metadata for timeline. + +```sql +select * from hudi_meta("table" = "ctl.db.tbl", "query_type" = "timeline"); + +``` + +Can be used with `desc function` : + +```sql +desc function hudi_meta("table" = "ctl.db.tbl", "query_type" = "timeline"); +``` + +## Keywords + + hudi_meta, table-valued-function, tvf + +## Best Practice + +Inspect the hudi table timeline : + +```sql +select * from hudi_meta("table" = "hudi_ctl.test_db.test_tbl", "query_type" = "timeline"); ++-------------------+--------+--------------------------+-----------+-----------------------+ +| timestamp | action | file_name | state | state_transition_time | ++-------------------+--------+--------------------------+-----------+-----------------------+ +| 20240724195843565 | commit | 20240724195843565.commit | COMPLETED | 20240724195844269 | +| 20240724195845718 | commit | 20240724195845718.commit | COMPLETED | 20240724195846653 | +| 20240724195848377 | commit | 20240724195848377.commit | COMPLETED | 20240724195849337 | +| 20240724195850799 | commit | 20240724195850799.commit | COMPLETED | 20240724195851676 | ++-------------------+--------+--------------------------+-----------+-----------------------+ +``` + +Filtered by timestamp : + +```sql +select * from hudi_meta("table" = "hudi_ctl.test_db.test_tbl", "query_type" = "timeline") +where timestamp = 20240724195843565; 
++-------------------+--------+--------------------------+-----------+-----------------------+ +| timestamp | action | file_name | state | state_transition_time | ++-------------------+--------+--------------------------+-----------+-----------------------+ +| 20240724195843565 | commit | 20240724195843565.commit | COMPLETED | 20240724195844269 | ++-------------------+--------+--------------------------+-----------+-----------------------+ +```
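
A possible follow-up to the timeline examples added in this commit, offered as a sketch rather than documented behavior: the query below keeps only completed commit instants and returns the most recent one. It reuses the `hudi_ctl.test_db.test_tbl` table name and the column names shown in the sample output of the patch. It assumes that `action` and `state` compare as strings and that `ORDER BY` and `LIMIT` apply to the function's result as they do to an ordinary table; neither is confirmed by the patch.

```sql
-- Sketch: most recent completed commit on the Hudi timeline.
-- Assumption: `action` and `state` are string-comparable, as the sample output suggests.
select timestamp, action, state
from hudi_meta("table" = "hudi_ctl.test_db.test_tbl", "query_type" = "timeline")
where action = 'commit'
  and state = 'COMPLETED'
order by timestamp desc
limit 1;
```

The `timestamp` values in the sample output look like Hudi instant times (for example `20240724195843565`). Whether such a value can be passed directly to `FOR TIME AS OF` is not stated in the patch and would need to be checked against the Hudi catalog documentation.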