This is an automated email from the ASF dual-hosted git repository. dataroaring pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/doris-website.git
The following commit(s) were added to refs/heads/master by this push:
     new f5b2d88d1e [doc] Bitmap type can be used in Duplicate tables (#850)
f5b2d88d1e is described below

commit f5b2d88d1e00a434604b948add4f501292440f80
Author: bobhan1 <bh2444151...@outlook.com>
AuthorDate: Fri Jul 12 20:25:46 2024 +0800

    [doc] Bitmap type can be used in Duplicate tables (#850)
---
 docs/table-design/data-type.md | 2 +-
 .../docusaurus-plugin-content-docs/current/table-design/data-type.md | 2 +-
 .../version-2.1/table-design/data-type.md | 2 +-
 versioned_docs/version-2.1/table-design/data-type.md | 2 +-
 4 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/docs/table-design/data-type.md b/docs/table-design/data-type.md
index 0cfb9e8f55..59afea441b 100644
--- a/docs/table-design/data-type.md
+++ b/docs/table-design/data-type.md
@@ -50,7 +50,7 @@ The list of data types supported by Doris is as follows:
 | JSON | Variable Length | Binary JSON type, stored in binary JSON format, access internal JSON fields through JSON function. Supported up to 1048576 bytes (1MB) by default, and can be adjusted to a maximum of 2147483643 bytes (2GB). This limit can be modified through the BE configuration parameter 'jsonb_type_length_soft_limit_bytes'. |
 | VARIANT | Variable Length | The VARIANT data type is dynamically adaptable, specifically designed for semi-structured data like JSON. It can store any JSON object and automatically splits JSON fields into subcolumns for improved storage efficiency and query performance. The length limits and configuration methods are the same as for the STRING type. However, the VARIANT type can only be used in value columns and cannot be used in key columns or partition / bucket columns. |
 | HLL | Variable Length | HLL stands for HyperLogLog, is a fuzzy deduplication. It performs better than Count Distinct when dealing with large datasets. The error rate of HLL is typically around 1%, and sometimes it can reach 2%. HLL cannot be used as a key column, and the aggregation type is HLL_UNION when creating a table. Users do not need to specify the length or default value as it is internally controlled based on the aggregation level of the data. HLL columns can on [...]
-| BITMAP | Variable Length | BITMAP type can be used in Aggregate tables or Unique tables. - When used in a Unique table, BITMAP must be employed as non-key columns. - When used in an Aggregate table, BITMAP must also serve as non-key columns, and the aggregation type must be set to BITMAP_UNION during table creation. Users do not need to specify the length or default value as it is internally controlled based on the aggregation level of the data. BITMAP columns can only be qu [...]
+| BITMAP | Variable Length | BITMAP type can be used in Aggregate tables, Unique tables or Duplicate tables. - When used in a Unique table or a Duplicate table, BITMAP must be employed as non-key columns. - When used in an Aggregate table, BITMAP must also serve as non-key columns, and the aggregation type must be set to BITMAP_UNION during table creation. Users do not need to specify the length or default value as it is internally controlled based on the aggregation level of [...]
 | QUANTILE_STATE | Variable Length | A type used to calculate approximate quantile values. When loading, it performs pre-aggregation for the same keys with different values. When the number of values does not exceed 2048, it records all data in detail. When the number of values is greater than 2048, it employs the TDigest algorithm to aggregate (cluster) the data and store the centroid points after clustering. QUANTILE_STATE cannot be used as a key column and should be paired with the [...]
 | AGG_STATE | Variable Length | Aggregate function can only be used with state/merge/union function combiners. AGG_STATE cannot be used as a key column. When creating a table, the signature of the aggregate function needs to be declared alongside. Users do not need to specify the length or default value. The actual data storage size depends on the function's implementation. |
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/table-design/data-type.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/table-design/data-type.md
index ab2039050a..e9cee2725c 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/table-design/data-type.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/table-design/data-type.md
@@ -52,7 +52,7 @@ Apache Doris 已支持的数据类型列表如下:
 | JSON | 不定长 | 二进制 JSON 类型,采用二进制 JSON 格式存储,通过 JSON 函数访问 JSON 内部字段。长度限制和配置方式与 String 相同 |
 | VARIANT | 不定长 | 动态可变数据类型,专为半结构化数据如 JSON 设计,可以存入任意 JSON,自动将 JSON 中的字段拆分成子列存储,提升存储效率和查询分析性能。长度限制和配置方式与 String 相同。Variant 类型只能用在 Value 列,不能用在 Key 列和分区分桶列。|
 | HLL | 不定长 | HLL 是模糊去重,在数据量大的情况性能优于 Count Distinct。HLL 的误差通常在 1% 左右,有时会达到 2%。HLL 不能作为 Key 列使用,建表时配合聚合类型为 HLL_UNION。<p>用户不需要指定长度和默认值。长度根据数据的聚合程度系统内控制。HLL 列只能通过配套的 hll_union_agg、hll_raw_agg、hll_cardinality、hll_hash 进行查询或使用。</p> |
-| BITMAP | 不定长 | Bitmap 类型的列可以在 Aggregate 表或 Unique 表中使用。在 Unique 表中使用时,其必须作为非 Key 列使用。在 Aggregate 表中使用时,其必须作为非 Key 列使用,且建表时配合的聚合类型为 BITMAP_UNION。<p>用户不需要指定长度和默认值。长度根据数据的聚合程度系统内控制。BITMAP 列只能通过配套的 bitmap_union_count、bitmap_union、bitmap_hash、bitmap_hash64 等函数进行查询或使用。</p> |
+| BITMAP | 不定长 | Bitmap 类型的列可以在 Aggregate 表、Unique 表或 Duplicate 表中使用。在 Unique 表或 Duplicate 表中使用时,其必须作为非 Key 列使用。在 Aggregate 表中使用时,其必须作为非 Key 列使用,且建表时配合的聚合类型为 BITMAP_UNION。<p>用户不需要指定长度和默认值。长度根据数据的聚合程度系统内控制。BITMAP 列只能通过配套的 bitmap_union_count、bitmap_union、bitmap_hash、bitmap_hash64 等函数进行查询或使用。</p> |
 | QUANTILE_STATE | 不定长 | QUANTILE_STATE 是一种计算分位数近似值的类型,在导入时会对相同的 Key,不同 Value 进行预聚合,当 value 数量不超过 2048 时采用明细记录所有数据,当 Value 数量大于 2048 时采用 TDigest 算法,对数据进行聚合(聚类)保存聚类后的质心点。QUANTILE_STATE 不能作为 Key 列使用,建表时配合聚合类型为 QUANTILE_UNION。<p>用户不需要指定长度和默认值。长度根据数据的聚合程度系统内控制。QUANTILE_STATE 列只能通过配套的 QUANTILE_PERCENT、QUANTILE_UNION、TO_QUANTILE_STATE 等函数进行查询或使用。</p> |
 | AGG_STATE | 不定长 | 聚合函数,只能配合 state/merge/union 函数组合器使用。AGG_STATE 不能作为 Key 列使用,建表时需要同时声明聚合函数的签名。用户不需要指定长度和默认值。实际存储的数据大小与函数实现有关。 |
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/table-design/data-type.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/table-design/data-type.md
index ab2039050a..e9cee2725c 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/table-design/data-type.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/table-design/data-type.md
@@ -52,7 +52,7 @@ Apache Doris 已支持的数据类型列表如下:
 | JSON | 不定长 | 二进制 JSON 类型,采用二进制 JSON 格式存储,通过 JSON 函数访问 JSON 内部字段。长度限制和配置方式与 String 相同 |
 | VARIANT | 不定长 | 动态可变数据类型,专为半结构化数据如 JSON 设计,可以存入任意 JSON,自动将 JSON 中的字段拆分成子列存储,提升存储效率和查询分析性能。长度限制和配置方式与 String 相同。Variant 类型只能用在 Value 列,不能用在 Key 列和分区分桶列。|
 | HLL | 不定长 | HLL 是模糊去重,在数据量大的情况性能优于 Count Distinct。HLL 的误差通常在 1% 左右,有时会达到 2%。HLL 不能作为 Key 列使用,建表时配合聚合类型为 HLL_UNION。<p>用户不需要指定长度和默认值。长度根据数据的聚合程度系统内控制。HLL 列只能通过配套的 hll_union_agg、hll_raw_agg、hll_cardinality、hll_hash 进行查询或使用。</p> |
-| BITMAP | 不定长 | Bitmap 类型的列可以在 Aggregate 表或 Unique 表中使用。在 Unique 表中使用时,其必须作为非 Key 列使用。在 Aggregate 表中使用时,其必须作为非 Key 列使用,且建表时配合的聚合类型为 BITMAP_UNION。<p>用户不需要指定长度和默认值。长度根据数据的聚合程度系统内控制。BITMAP 列只能通过配套的 bitmap_union_count、bitmap_union、bitmap_hash、bitmap_hash64 等函数进行查询或使用。</p> |
+| BITMAP | 不定长 | Bitmap 类型的列可以在 Aggregate 表、Unique 表或 Duplicate 表中使用。在 Unique 表或 Duplicate 表中使用时,其必须作为非 Key 列使用。在 Aggregate 表中使用时,其必须作为非 Key 列使用,且建表时配合的聚合类型为 BITMAP_UNION。<p>用户不需要指定长度和默认值。长度根据数据的聚合程度系统内控制。BITMAP 列只能通过配套的 bitmap_union_count、bitmap_union、bitmap_hash、bitmap_hash64 等函数进行查询或使用。</p> |
 | QUANTILE_STATE | 不定长 | QUANTILE_STATE 是一种计算分位数近似值的类型,在导入时会对相同的 Key,不同 Value 进行预聚合,当 value 数量不超过 2048 时采用明细记录所有数据,当 Value 数量大于 2048 时采用 TDigest 算法,对数据进行聚合(聚类)保存聚类后的质心点。QUANTILE_STATE 不能作为 Key 列使用,建表时配合聚合类型为 QUANTILE_UNION。<p>用户不需要指定长度和默认值。长度根据数据的聚合程度系统内控制。QUANTILE_STATE 列只能通过配套的 QUANTILE_PERCENT、QUANTILE_UNION、TO_QUANTILE_STATE 等函数进行查询或使用。</p> |
 | AGG_STATE | 不定长 | 聚合函数,只能配合 state/merge/union 函数组合器使用。AGG_STATE 不能作为 Key 列使用,建表时需要同时声明聚合函数的签名。用户不需要指定长度和默认值。实际存储的数据大小与函数实现有关。 |
diff --git a/versioned_docs/version-2.1/table-design/data-type.md b/versioned_docs/version-2.1/table-design/data-type.md
index 0cfb9e8f55..59afea441b 100644
--- a/versioned_docs/version-2.1/table-design/data-type.md
+++ b/versioned_docs/version-2.1/table-design/data-type.md
@@ -50,7 +50,7 @@ The list of data types supported by Doris is as follows:
 | JSON | Variable Length | Binary JSON type, stored in binary JSON format, access internal JSON fields through JSON function. Supported up to 1048576 bytes (1MB) by default, and can be adjusted to a maximum of 2147483643 bytes (2GB). This limit can be modified through the BE configuration parameter 'jsonb_type_length_soft_limit_bytes'. |
 | VARIANT | Variable Length | The VARIANT data type is dynamically adaptable, specifically designed for semi-structured data like JSON. It can store any JSON object and automatically splits JSON fields into subcolumns for improved storage efficiency and query performance. The length limits and configuration methods are the same as for the STRING type. However, the VARIANT type can only be used in value columns and cannot be used in key columns or partition / bucket columns. |
 | HLL | Variable Length | HLL stands for HyperLogLog, is a fuzzy deduplication. It performs better than Count Distinct when dealing with large datasets. The error rate of HLL is typically around 1%, and sometimes it can reach 2%. HLL cannot be used as a key column, and the aggregation type is HLL_UNION when creating a table. Users do not need to specify the length or default value as it is internally controlled based on the aggregation level of the data. HLL columns can on [...]
-| BITMAP | Variable Length | BITMAP type can be used in Aggregate tables or Unique tables. - When used in a Unique table, BITMAP must be employed as non-key columns. - When used in an Aggregate table, BITMAP must also serve as non-key columns, and the aggregation type must be set to BITMAP_UNION during table creation. Users do not need to specify the length or default value as it is internally controlled based on the aggregation level of the data. BITMAP columns can only be qu [...]
+| BITMAP | Variable Length | BITMAP type can be used in Aggregate tables, Unique tables or Duplicate tables. - When used in a Unique table or a Duplicate table, BITMAP must be employed as non-key columns. - When used in an Aggregate table, BITMAP must also serve as non-key columns, and the aggregation type must be set to BITMAP_UNION during table creation. Users do not need to specify the length or default value as it is internally controlled based on the aggregation level of [...]
 | QUANTILE_STATE | Variable Length | A type used to calculate approximate quantile values. When loading, it performs pre-aggregation for the same keys with different values. When the number of values does not exceed 2048, it records all data in detail. When the number of values is greater than 2048, it employs the TDigest algorithm to aggregate (cluster) the data and store the centroid points after clustering. QUANTILE_STATE cannot be used as a key column and should be paired with the [...]
 | AGG_STATE | Variable Length | Aggregate function can only be used with state/merge/union function combiners. AGG_STATE cannot be used as a key column. When creating a table, the signature of the aggregate function needs to be declared alongside. Users do not need to specify the length or default value. The actual data storage size depends on the function's implementation. |
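
For readers who want the updated BITMAP rule in executable form, the following is a minimal sketch that encodes only what the changed row states: BITMAP is accepted in Aggregate, Unique and Duplicate tables, must be a non-key column, and needs the BITMAP_UNION aggregation type in Aggregate tables. It is not Doris code; the names `Column`, `BITMAP_TABLE_MODELS` and `check_bitmap_column` are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

# Table models that accept BITMAP columns according to the updated doc row.
BITMAP_TABLE_MODELS = {"AGGREGATE", "UNIQUE", "DUPLICATE"}


@dataclass
class Column:
    """Hypothetical column description used only for this sketch."""
    name: str
    type: str
    is_key: bool = False
    aggregation: Optional[str] = None


def check_bitmap_column(table_model: str, column: Column) -> List[str]:
    """Return the documented BITMAP rules that the column violates."""
    if column.type.upper() != "BITMAP":
        return []  # the rules below apply to BITMAP columns only

    problems = []
    model = table_model.upper()
    if model not in BITMAP_TABLE_MODELS:
        problems.append(f"BITMAP is not documented for {table_model} tables")
    if column.is_key:
        problems.append("BITMAP must be a non-key column")
    if model == "AGGREGATE" and column.aggregation != "BITMAP_UNION":
        problems.append("Aggregate tables require aggregation type BITMAP_UNION")
    return problems


# A BITMAP value column in a Duplicate table is accepted after this change.
print(check_bitmap_column("DUPLICATE", Column("user_ids", "BITMAP")))
```

The sketch deliberately checks nothing beyond the sentences in the row, so it says nothing about length, default values, or which query functions a BITMAP column supports.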