This is an automated email from the ASF dual-hosted git repository. eldenmoon pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/doris-website.git
The following commit(s) were added to refs/heads/master by this push:
     new 15ab1ce13c modify github events sample download link, and add some FQA (#1110)
15ab1ce13c is described below

commit 15ab1ce13cd17ca8395b560272f6e2aa8f4ed08f
Author: lihangyu <15605149...@163.com>
AuthorDate: Wed Sep 18 09:36:37 2024 +0800

    modify github events sample download link, and add some FQA (#1110)

    # Versions

    - [x] dev
    - [x] 3.0
    - [x] 2.1
    - [ ] 2.0

    # Languages

    - [x] Chinese
    - [x] English
---
 blog/variant-in-apache-doris-2.1.md                       |  5 +++--
 docs/sql-manual/sql-data-types/semi-structured/VARIANT.md |  9 +++++++--
 .../sql-manual/sql-data-types/semi-structured/VARIANT.md  | 10 ++++++++--
 .../sql-manual/sql-data-types/semi-structured/VARIANT.md  |  9 +++++++--
 .../sql-manual/sql-data-types/semi-structured/VARIANT.md  |  7 ++++++-
 .../sql-manual/sql-data-types/semi-structured/VARIANT.md  |  9 +++++++--
 .../sql-manual/sql-data-types/semi-structured/VARIANT.md  |  9 +++++++--
 .../sql-manual/sql-data-types/semi-structured/VARIANT.md  |  9 +++++++--
 .../sql-manual/sql-data-types/semi-structured/VARIANT.md  |  9 +++++++--
 9 files changed, 59 insertions(+), 17 deletions(-)

diff --git a/blog/variant-in-apache-doris-2.1.md b/blog/variant-in-apache-doris-2.1.md
index efe2fce906..3056e69f9f 100644
--- a/blog/variant-in-apache-doris-2.1.md
+++ b/blog/variant-in-apache-doris-2.1.md
@@ -157,7 +157,7 @@ properties("replication_num" = "1");
 Load the `gh_2022-11-07-3.json` file, which is Github Events records of an hour. One formatted row of it looks like this:
 ```JSON
-wget http://doris-build-hk-1308700295.cos.ap-hongkong.myqcloud.com/regression/variant/gh_2022-11-07-3.json
+wget https://qa-build.oss-cn-beijing.aliyuncs.com/regression/variant/gh_2022-11-07-3.json
 curl --location-trusted -u root: -T gh_2022-11-07-3.json -H "read_json_by_line:true" -H "format:json" http://127.0.0.1:18148/api/test_variant/github_events/_stream_load
@@ -286,9 +286,10 @@ mysql> SELECT
 2. Count the number of events containing the keyword `doris`.
 ```sql
+-- implicit cast `payload['comment']['body']` to string type
 mysql> SELECT
     -> count() FROM github_events
-    -> WHERE cast(payload['comment']['body'] as text) MATCH 'doris';
+    -> WHERE payload['comment']['body'] MATCH 'doris';
 +---------+
 | count() |
 +---------+
diff --git a/docs/sql-manual/sql-data-types/semi-structured/VARIANT.md b/docs/sql-manual/sql-data-types/semi-structured/VARIANT.md
index 0b25e2bf47..83280efd71 100644
--- a/docs/sql-manual/sql-data-types/semi-structured/VARIANT.md
+++ b/docs/sql-manual/sql-data-types/semi-structured/VARIANT.md
@@ -171,7 +171,7 @@ properties("replication_num" = "1");
 Importing gh_2022-11-07-3.json, which contains one hour's worth of GitHub events data.
 ``` shell
-wget http://doris-build-hk-1308700295.cos.ap-hongkong.myqcloud.com/regression/variant/gh_2022-11-07-3.json
+wget https://qa-build.oss-cn-beijing.aliyuncs.com/regression/variant/gh_2022-11-07-3.json
 curl --location-trusted -u root: -T gh_2022-11-07-3.json -H "read_json_by_line:true" -H "format:json" http://127.0.0.1:18148/api/test_variant/github_events/_stream_load
@@ -300,9 +300,10 @@ mysql> SELECT
 2. Retrieve the count of comments containing "doris".
 ``` sql
+-- implicit cast `payload['comment']['body']` to string type
 mysql> SELECT
     -> count() FROM github_events
-    -> WHERE cast(payload['comment']['body'] as text) MATCH 'doris';
+    -> WHERE payload['comment']['body'] MATCH 'doris';
 +---------+
 | count() |
 +---------+
@@ -360,6 +361,10 @@ When the above types cannot be compatible, they will be transformed into JSON ty
 - Not supported as primary or sort keys.
 - Queries with filters or aggregations require casting. The storage layer eliminates cast operations based on storage type and the target type of the cast, speeding up queries.
+### FAQ
+1.Streamload Error: [CANCELLED][INTERNAL_ERROR] tablet error: [DATA_QUALITY_ERROR] Reached max column size limit 2048.
+Due to compaction and metadata storage limitations, the VARIANT type imposes a limit on the number of columns, with the default being 2048 columns. You can adjust the BE configuration `variant_max_merged_tablet_schema_size` accordingly, but it is not recommended to exceed 4096 columns.
+
 ### Keywords
 VARIANT
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-data-types/semi-structured/VARIANT.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-data-types/semi-structured/VARIANT.md
index c3996a9f1b..3213e74dd5 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-data-types/semi-structured/VARIANT.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-data-types/semi-structured/VARIANT.md
@@ -171,7 +171,7 @@ properties("replication_num" = "1");
 导入 gh_2022-11-07-3.json,这是 github events 一个小时的数据
 ``` shell
-wget http://doris-build-hk-1308700295.cos.ap-hongkong.myqcloud.com/regression/variant/gh_2022-11-07-3.json
+wget https://qa-build.oss-cn-beijing.aliyuncs.com/regression/variant/gh_2022-11-07-3.json
 curl --location-trusted -u root: -T gh_2022-11-07-3.json -H "read_json_by_line:true" -H "format:json" http://127.0.0.1:18148/api/test_variant/github_events/_stream_load
@@ -304,9 +304,10 @@ mysql> SELECT
 2. 获取评论中包含 doris 的数量
 ``` sql
+-- implicit cast `payload['comment']['body']` to string type
 mysql> SELECT
     -> count() FROM github_events
-    -> WHERE cast(payload['comment']['body'] as text) MATCH 'doris';
+    -> WHERE payload['comment']['body'] MATCH 'doris';
 +---------+
 | count() |
 +---------+
@@ -364,6 +365,11 @@ VARIANT 动态列与预定义静态列几乎一样高效。处理诸如日志之
 - 不支持作为主键或者排序键
 - 查询过滤、聚合需要带 cast,存储层会根据存储类型和 cast 目标类型来消除 cast 操作,加速查询。
+### FAQ
+1. Stream Load 报错: [CANCELLED][INTERNAL_ERROR]tablet error: [DATA_QUALITY_ERROR]Reached max column size limit 2048。
+由于 Compaction 和元信息存储限制, VARIANT 类型会限制列数,默认 2048 列,可以适当调整 BE 配置 `variant_max_merged_tablet_schema_size` , 但是不建议超过 4096
+
+
 ### Keywords
 VARIANT
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/sql-manual/sql-data-types/semi-structured/VARIANT.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/sql-manual/sql-data-types/semi-structured/VARIANT.md
index c3996a9f1b..f6a6260c58 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/sql-manual/sql-data-types/semi-structured/VARIANT.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/sql-manual/sql-data-types/semi-structured/VARIANT.md
@@ -171,7 +171,7 @@ properties("replication_num" = "1");
 导入 gh_2022-11-07-3.json,这是 github events 一个小时的数据
 ``` shell
-wget http://doris-build-hk-1308700295.cos.ap-hongkong.myqcloud.com/regression/variant/gh_2022-11-07-3.json
+wget https://qa-build.oss-cn-beijing.aliyuncs.com/regression/variant/gh_2022-11-07-3.json
 curl --location-trusted -u root: -T gh_2022-11-07-3.json -H "read_json_by_line:true" -H "format:json" http://127.0.0.1:18148/api/test_variant/github_events/_stream_load
@@ -304,9 +304,10 @@ mysql> SELECT
 2. 获取评论中包含 doris 的数量
 ``` sql
+-- implicit cast `payload['comment']['body']` to string type
 mysql> SELECT
     -> count() FROM github_events
-    -> WHERE cast(payload['comment']['body'] as text) MATCH 'doris';
+    -> WHERE payload['comment']['body'] MATCH 'doris';
 +---------+
 | count() |
 +---------+
@@ -364,6 +365,10 @@ VARIANT 动态列与预定义静态列几乎一样高效。处理诸如日志之
 - 不支持作为主键或者排序键
 - 查询过滤、聚合需要带 cast,存储层会根据存储类型和 cast 目标类型来消除 cast 操作,加速查询。
+### FAQ
+1. Stream Load 报错: [CANCELLED][INTERNAL_ERROR]tablet error: [DATA_QUALITY_ERROR]Reached max column size limit 2048。
+由于 Compaction 和元信息存储限制, VARIANT 类型会限制列数,默认 2048 列,可以适当调整 BE 配置 `variant_max_merged_tablet_schema_size` , 但是不建议超过 4096
+
 ### Keywords
 VARIANT
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/sql-manual/sql-data-types/semi-structured/VARIANT.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/sql-manual/sql-data-types/semi-structured/VARIANT.md
index c3996a9f1b..cfc687415b 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/sql-manual/sql-data-types/semi-structured/VARIANT.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/sql-manual/sql-data-types/semi-structured/VARIANT.md
@@ -171,7 +171,7 @@ properties("replication_num" = "1");
 导入 gh_2022-11-07-3.json,这是 github events 一个小时的数据
 ``` shell
-wget http://doris-build-hk-1308700295.cos.ap-hongkong.myqcloud.com/regression/variant/gh_2022-11-07-3.json
+wget https://qa-build.oss-cn-beijing.aliyuncs.com/regression/variant/gh_2022-11-07-3.json
 curl --location-trusted -u root: -T gh_2022-11-07-3.json -H "read_json_by_line:true" -H "format:json" http://127.0.0.1:18148/api/test_variant/github_events/_stream_load
@@ -304,6 +304,7 @@ mysql> SELECT
 2. 获取评论中包含 doris 的数量
 ``` sql
+-- implicit cast `payload['comment']['body']` to string type
 mysql> SELECT
     -> count() FROM github_events
     -> WHERE cast(payload['comment']['body'] as text) MATCH 'doris';
@@ -364,6 +365,10 @@ VARIANT 动态列与预定义静态列几乎一样高效。处理诸如日志之
 - 不支持作为主键或者排序键
 - 查询过滤、聚合需要带 cast,存储层会根据存储类型和 cast 目标类型来消除 cast 操作,加速查询。
+### FAQ
+1. Stream Load 报错: [CANCELLED][INTERNAL_ERROR]tablet error: [DATA_QUALITY_ERROR]Reached max column size limit 2048。
+由于 Compaction 和元信息存储限制, VARIANT 类型会限制列数,默认 2048 列,可以适当调整 BE 配置 `variant_max_merged_tablet_schema_size` , 但是不建议超过 4096
+
 ### Keywords
 VARIANT
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/sql-manual/sql-data-types/semi-structured/VARIANT.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/sql-manual/sql-data-types/semi-structured/VARIANT.md
index c3996a9f1b..f6a6260c58 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/sql-manual/sql-data-types/semi-structured/VARIANT.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/sql-manual/sql-data-types/semi-structured/VARIANT.md
@@ -171,7 +171,7 @@ properties("replication_num" = "1");
 导入 gh_2022-11-07-3.json,这是 github events 一个小时的数据
 ``` shell
-wget http://doris-build-hk-1308700295.cos.ap-hongkong.myqcloud.com/regression/variant/gh_2022-11-07-3.json
+wget https://qa-build.oss-cn-beijing.aliyuncs.com/regression/variant/gh_2022-11-07-3.json
 curl --location-trusted -u root: -T gh_2022-11-07-3.json -H "read_json_by_line:true" -H "format:json" http://127.0.0.1:18148/api/test_variant/github_events/_stream_load
@@ -304,9 +304,10 @@ mysql> SELECT
 2. 获取评论中包含 doris 的数量
 ``` sql
+-- implicit cast `payload['comment']['body']` to string type
 mysql> SELECT
     -> count() FROM github_events
-    -> WHERE cast(payload['comment']['body'] as text) MATCH 'doris';
+    -> WHERE payload['comment']['body'] MATCH 'doris';
 +---------+
 | count() |
 +---------+
@@ -364,6 +365,10 @@ VARIANT 动态列与预定义静态列几乎一样高效。处理诸如日志之
 - 不支持作为主键或者排序键
 - 查询过滤、聚合需要带 cast,存储层会根据存储类型和 cast 目标类型来消除 cast 操作,加速查询。
+### FAQ
+1. Stream Load 报错: [CANCELLED][INTERNAL_ERROR]tablet error: [DATA_QUALITY_ERROR]Reached max column size limit 2048。
+由于 Compaction 和元信息存储限制, VARIANT 类型会限制列数,默认 2048 列,可以适当调整 BE 配置 `variant_max_merged_tablet_schema_size` , 但是不建议超过 4096
+
 ### Keywords
 VARIANT
diff --git a/versioned_docs/version-2.0/sql-manual/sql-data-types/semi-structured/VARIANT.md b/versioned_docs/version-2.0/sql-manual/sql-data-types/semi-structured/VARIANT.md
index 0b25e2bf47..83280efd71 100644
--- a/versioned_docs/version-2.0/sql-manual/sql-data-types/semi-structured/VARIANT.md
+++ b/versioned_docs/version-2.0/sql-manual/sql-data-types/semi-structured/VARIANT.md
@@ -171,7 +171,7 @@ properties("replication_num" = "1");
 Importing gh_2022-11-07-3.json, which contains one hour's worth of GitHub events data.
 ``` shell
-wget http://doris-build-hk-1308700295.cos.ap-hongkong.myqcloud.com/regression/variant/gh_2022-11-07-3.json
+wget https://qa-build.oss-cn-beijing.aliyuncs.com/regression/variant/gh_2022-11-07-3.json
 curl --location-trusted -u root: -T gh_2022-11-07-3.json -H "read_json_by_line:true" -H "format:json" http://127.0.0.1:18148/api/test_variant/github_events/_stream_load
@@ -300,9 +300,10 @@ mysql> SELECT
 2. Retrieve the count of comments containing "doris".
 ``` sql
+-- implicit cast `payload['comment']['body']` to string type
 mysql> SELECT
     -> count() FROM github_events
-    -> WHERE cast(payload['comment']['body'] as text) MATCH 'doris';
+    -> WHERE payload['comment']['body'] MATCH 'doris';
 +---------+
 | count() |
 +---------+
@@ -360,6 +361,10 @@ When the above types cannot be compatible, they will be transformed into JSON ty
 - Not supported as primary or sort keys.
 - Queries with filters or aggregations require casting. The storage layer eliminates cast operations based on storage type and the target type of the cast, speeding up queries.
+### FAQ
+1.Streamload Error: [CANCELLED][INTERNAL_ERROR] tablet error: [DATA_QUALITY_ERROR] Reached max column size limit 2048.
+Due to compaction and metadata storage limitations, the VARIANT type imposes a limit on the number of columns, with the default being 2048 columns. You can adjust the BE configuration `variant_max_merged_tablet_schema_size` accordingly, but it is not recommended to exceed 4096 columns.
+
 ### Keywords
 VARIANT
diff --git a/versioned_docs/version-2.1/sql-manual/sql-data-types/semi-structured/VARIANT.md b/versioned_docs/version-2.1/sql-manual/sql-data-types/semi-structured/VARIANT.md
index 0b25e2bf47..83280efd71 100644
--- a/versioned_docs/version-2.1/sql-manual/sql-data-types/semi-structured/VARIANT.md
+++ b/versioned_docs/version-2.1/sql-manual/sql-data-types/semi-structured/VARIANT.md
@@ -171,7 +171,7 @@ properties("replication_num" = "1");
 Importing gh_2022-11-07-3.json, which contains one hour's worth of GitHub events data.
 ``` shell
-wget http://doris-build-hk-1308700295.cos.ap-hongkong.myqcloud.com/regression/variant/gh_2022-11-07-3.json
+wget https://qa-build.oss-cn-beijing.aliyuncs.com/regression/variant/gh_2022-11-07-3.json
 curl --location-trusted -u root: -T gh_2022-11-07-3.json -H "read_json_by_line:true" -H "format:json" http://127.0.0.1:18148/api/test_variant/github_events/_stream_load
@@ -300,9 +300,10 @@ mysql> SELECT
 2. Retrieve the count of comments containing "doris".
 ``` sql
+-- implicit cast `payload['comment']['body']` to string type
 mysql> SELECT
     -> count() FROM github_events
-    -> WHERE cast(payload['comment']['body'] as text) MATCH 'doris';
+    -> WHERE payload['comment']['body'] MATCH 'doris';
 +---------+
 | count() |
 +---------+
@@ -360,6 +361,10 @@ When the above types cannot be compatible, they will be transformed into JSON ty
 - Not supported as primary or sort keys.
 - Queries with filters or aggregations require casting. The storage layer eliminates cast operations based on storage type and the target type of the cast, speeding up queries.
+### FAQ
+1.Streamload Error: [CANCELLED][INTERNAL_ERROR] tablet error: [DATA_QUALITY_ERROR] Reached max column size limit 2048.
+Due to compaction and metadata storage limitations, the VARIANT type imposes a limit on the number of columns, with the default being 2048 columns. You can adjust the BE configuration `variant_max_merged_tablet_schema_size` accordingly, but it is not recommended to exceed 4096 columns.
+
 ### Keywords
 VARIANT
diff --git a/versioned_docs/version-3.0/sql-manual/sql-data-types/semi-structured/VARIANT.md b/versioned_docs/version-3.0/sql-manual/sql-data-types/semi-structured/VARIANT.md
index 0b25e2bf47..83280efd71 100644
--- a/versioned_docs/version-3.0/sql-manual/sql-data-types/semi-structured/VARIANT.md
+++ b/versioned_docs/version-3.0/sql-manual/sql-data-types/semi-structured/VARIANT.md
@@ -171,7 +171,7 @@ properties("replication_num" = "1");
 Importing gh_2022-11-07-3.json, which contains one hour's worth of GitHub events data.
 ``` shell
-wget http://doris-build-hk-1308700295.cos.ap-hongkong.myqcloud.com/regression/variant/gh_2022-11-07-3.json
+wget https://qa-build.oss-cn-beijing.aliyuncs.com/regression/variant/gh_2022-11-07-3.json
 curl --location-trusted -u root: -T gh_2022-11-07-3.json -H "read_json_by_line:true" -H "format:json" http://127.0.0.1:18148/api/test_variant/github_events/_stream_load
@@ -300,9 +300,10 @@ mysql> SELECT
 2. Retrieve the count of comments containing "doris".
 ``` sql
+-- implicit cast `payload['comment']['body']` to string type
 mysql> SELECT
     -> count() FROM github_events
-    -> WHERE cast(payload['comment']['body'] as text) MATCH 'doris';
+    -> WHERE payload['comment']['body'] MATCH 'doris';
 +---------+
 | count() |
 +---------+
@@ -360,6 +361,10 @@ When the above types cannot be compatible, they will be transformed into JSON ty
 - Not supported as primary or sort keys.
 - Queries with filters or aggregations require casting. The storage layer eliminates cast operations based on storage type and the target type of the cast, speeding up queries.
+### FAQ
+1.Streamload Error: [CANCELLED][INTERNAL_ERROR] tablet error: [DATA_QUALITY_ERROR] Reached max column size limit 2048.
+Due to compaction and metadata storage limitations, the VARIANT type imposes a limit on the number of columns, with the default being 2048 columns. You can adjust the BE configuration `variant_max_merged_tablet_schema_size` accordingly, but it is not recommended to exceed 4096 columns.
+
 ### Keywords
 VARIANT
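
The FAQ entry added in this commit says a Stream Load fails once a VARIANT column would exceed the default limit of 2048 columns. As a rough pre-check before loading a file such as `gh_2022-11-07-3.json`, the distinct JSON paths in a newline-delimited file can be counted locally. The sketch below is not part of the commit and is only an approximation: it assumes each distinct leaf path corresponds to roughly one subcolumn and treats arrays as single leaves, which may not match how Doris actually expands them. The helper names `leaf_paths` and `count_distinct_paths` are hypothetical.

```python
import json
from typing import Any, Iterable, Set

# Default limit quoted in the FAQ text added by this commit.
DEFAULT_MAX_COLUMNS = 2048


def leaf_paths(value: Any, prefix: str = "") -> Set[str]:
    """Return the dotted paths of all non-object leaves under `value`.

    Assumption: only nested JSON objects are expanded into sub-paths;
    arrays and scalars each count as a single leaf.
    """
    if isinstance(value, dict):
        paths: Set[str] = set()
        for key, child in value.items():
            child_prefix = f"{prefix}.{key}" if prefix else key
            paths |= leaf_paths(child, child_prefix)
        return paths
    return {prefix} if prefix else set()


def count_distinct_paths(lines: Iterable[str]) -> int:
    """Count distinct leaf paths across newline-delimited JSON records."""
    seen: Set[str] = set()
    for line in lines:
        line = line.strip()
        if line:  # skip blank lines between records
            seen |= leaf_paths(json.loads(line))
    return len(seen)


if __name__ == "__main__":
    # Two made-up records shaped loosely like GitHub events.
    sample = [
        '{"type": "IssueCommentEvent", "payload": {"comment": {"body": "doris"}}}',
        '{"type": "PushEvent", "payload": {"size": 1, "commits": [{"sha": "abc"}]}}',
    ]
    total = count_distinct_paths(sample)
    print(total, "distinct paths; over default limit:", total > DEFAULT_MAX_COLUMNS)
```

To check a real file, pass an open file handle instead of the sample list, for example `count_distinct_paths(open("gh_2022-11-07-3.json", encoding="utf-8"))`. A count near or above 2048 suggests the load may hit the error quoted in the FAQ, in which case the text above points to the BE setting `variant_max_merged_tablet_schema_size`.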