This is an automated email from the ASF dual-hosted git repository.

kassiez pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris-website.git

The following commit(s) were added to refs/heads/master by this push:
     new 0faca980f13 [fix] Fix deadlink of sql data type and update blog (#2302)
0faca980f13 is described below

commit 0faca980f13c8705160dcf291df973d81fcd614c
Author: KassieZ <139741991+kass...@users.noreply.github.com>
AuthorDate: Fri Apr 18 16:56:44 2025 +0800

    [fix] Fix deadlink of sql data type and update blog (#2302)

    ## Versions

    - [ ] dev
    - [ ] 3.0
    - [ ] 2.1
    - [ ] 2.0

    ## Languages

    - [ ] Chinese
    - [ ] English

    ## Docs Checklist

    - [ ] Checked by AI
    - [ ] Test Cases Built
---
 blog/doris-compute-storage-decoupled.md            |   2 +-
 ...doris-for-maximum-performance-and-resilience.md |   2 -
 blog/release-note-2.1.9.md                         |   2 +-
 ...tencent-music-migrate-elasticsearch-to-doris.md | 190 +++++++++++++++++++++
 ...is-best-alternatives-for-real-time-analytics.md |   2 +-
 .../sql-data-types/data-type-overview.md           |  38 ++---
 .../sql-data-types/data-type-overview.md           |  38 ++---
 .../sql-data-types/data-type-overview.md           |  39 +++--
 src/components/recent-blogs/recent-blogs.data.ts   |  11 +-
 src/constant/newsletter.data.ts                    |  14 +-
 .../A-big-cost-reduction.png                       | Bin 0 -> 337270 bytes
 .../A-seamless-migration.png                       | Bin 0 -> 235927 bytes
 .../Multi-service-resource-isolation.png           | Bin 0 -> 275545 bytes
 ...rid-solution-Elasticsearch-and-Apache-Doris.png | Bin 0 -> 195418 bytes
 .../a-unified-solution-based-on-Apache-Doris.png   | Bin 0 -> 209413 bytes
 ...encent-music-migrate-elasticsearch-to-doris.jpg | Bin 0 -> 53542 bytes
 16 files changed, 263 insertions(+), 75 deletions(-)

diff --git a/blog/doris-compute-storage-decoupled.md b/blog/doris-compute-storage-decoupled.md
index acdf2157d54..4c860365092 100644
--- a/blog/doris-compute-storage-decoupled.md
+++ b/blog/doris-compute-storage-decoupled.md
@@ -7,7 +7,7 @@
     'author': 'Apache Doris',
     'tags': ['Tech Sharing'],
     'picked': "true",
-    'order': "1",
+    'order': "2",
     "image": '/images/compute-storage-decoupled-banner.jpg'
 }
 ---
diff --git
a/blog/ortege-studio-2-fine-tuning-apache-doris-for-maximum-performance-and-resilience.md b/blog/ortege-studio-2-fine-tuning-apache-doris-for-maximum-performance-and-resilience.md index b35b9e2f3eb..7c376a0d4eb 100644 --- a/blog/ortege-studio-2-fine-tuning-apache-doris-for-maximum-performance-and-resilience.md +++ b/blog/ortege-studio-2-fine-tuning-apache-doris-for-maximum-performance-and-resilience.md @@ -6,8 +6,6 @@ 'date': '2024-11-20', 'author': 'Justin Trollip', 'tags': ['Best Practice'], - 'picked': "true", - 'order': "4", "image": '/images/ortege-2.jpg' } diff --git a/blog/release-note-2.1.9.md b/blog/release-note-2.1.9.md index 81bd338e062..a1bef313559 100644 --- a/blog/release-note-2.1.9.md +++ b/blog/release-note-2.1.9.md @@ -7,7 +7,7 @@ 'author': 'Apache Doris', 'tags': ['Release Notes'], 'picked': "true", - 'order': "2", + 'order': "3", "image": '/images/2.1.9.jpg' } --- diff --git a/blog/tencent-music-migrate-elasticsearch-to-doris.md b/blog/tencent-music-migrate-elasticsearch-to-doris.md new file mode 100644 index 00000000000..9c41ea9fd35 --- /dev/null +++ b/blog/tencent-music-migrate-elasticsearch-to-doris.md @@ -0,0 +1,190 @@ +--- +{ + 'title': 'How Tencent Music saved 80% in costs by migrating from Elasticsearch to Apache Doris', + 'summary': 'Handle full-text search, audience segmentation, and aggregation analysis directly within Apache Doris and slash their storage costs by 80% while boosting write performance by 4x', + 'description': 'Handle full-text search, audience segmentation, and aggregation analysis directly within Apache Doris and slash their storage costs by 80% while boosting write performance by 4x', + 'date': '2025-04-17', + 'author': 'Apache Doris', + 'tags': ['Best Practices'], + 'picked': "true", + 'order': "1", + "image": '/images/tencent-music-migrate-elasticsearch-to-doris.jpg' +} +--- + +<!-- +Licensed to the Apache Software Foundation (ASF) under one +or more contributor license agreements. 
See the NOTICE file +distributed with this work for additional information +regarding copyright ownership. The ASF licenses this file +to you under the Apache License, Version 2.0 (the +"License"); you may not use this file except in compliance +with the License. You may obtain a copy of the License at + http://www.apache.org/licenses/LICENSE-2.0 +Unless required by applicable law or agreed to in writing, +software distributed under the License is distributed on an +"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +KIND, either express or implied. See the License for the +specific language governing permissions and limitations +under the License. +--> + +As a long-time user of Apache Doris, Tencent Music Entertainment (NYSE: TME) has undergone four generations of data platform evolution, with the Doris community actively supporting its transformation. From [replacing ClickHouse as the analytical engine](https://doris.apache.org/blog/Tencent-Data-Engineers-Why-We-Went-from-ClickHouse-to-Apache-Doris) to gradually offloading Elasticsearch's functionalities, TME has now taken a big step—fully replacing Elasticsearch with Doris as its unifie [...] + +## What they do + +The TME content library provides two types of functionality: + +- **Search**: Quickly locate artists, songs, and other textual data based on flexible query conditions. +- **Tag-based segmentation**: Filter data based on specific tags and criteria among billions of records and deliver sub-second query responses + +## A hybrid solution: Elasticsearch + Apache Doris + +TME previously used both Elasticsearch and Apache Doris in its content library platform to leverage the strengths from both: + +- **Elasticsearch** excelled in full-text search. It can quickly match specific keywords or phrases using inverted indexing while supporting indexing of all fields and flexible filtering conditions. 
However, it struggled with data aggregation, lacked support for complex queries like JOINs, and had high storage overhead. +- **Apache Doris** offered efficient OLAP capabilities for complex analytical queries while optimizing storage through high compression rates, but before the release of Apache Doris 2.0, it had limited search capabilities due to the absence of inverted index. + +That's why TME built a hybrid architecture. In this setup, Elasticsearch handled full-text search and tag-based segmentation, while Apache Doris powered OLAP analytics. With Doris' [Elasticsearch catalog](https://doris.apache.org/docs/lakehouse/database/es), data in Elasticsearch can be queried directly through Doris, creating a unified query interface for seamless data retrieval. + + + +Despite the advantages of the hybrid architecture, TME encountered several challenges during its implementation: + +- **High storage costs**: Elasticsearch continued to consume huge storage space. +- **Write performance bottlenecks**: As data volumes grew, the write pressure on the Elasticsearch cluster intensified. Full data writes were taking over 10 hours, nearing the business's operational limits. +- **Architectural complexity**: The multi-component architecture meant complex maintenance, extra costs due to redundant data storage, and higher risk of data inconsistency. + +## A unified solution based on Apache Doris + +In [version 2.0](https://doris.apache.org/blog/release-note-2.0.0), Apache Doris introduced inverted index and started to support full-text search. This release drove TME to consider entrusting Doris with the full scope of full-text search, tag-based segmentation, and aggregation analysis tasks. + +What enables Doris to fully replace Elasticsearch? 
+ +- In terms of **full-text search**, Doris accelerates standard equality and range queries (`=`, `!=`, `>`, `>=`, `<`, `<=`) and supports comprehensive text field searches, including tokenization for English, Chinese and Unicode, multi-keyword searches (`MATCH_ANY`, `MATCH_ALL`), phrase searches (`MATCH_PHRASE`, `MATCH_PHRASE_PREFIX`, `MATCH_PHRASE_REGEXP`), slop in phrase, and multi-field searches (`MULTI_MATCH`). It improves performance by orders of magnitude compared to traditional dat [...] +- As for **inverted index**, Doris implements it directly within the database kernel. Inverted indexing in Doris is seamlessly integrated with SQL syntax and supports any logical combinations for `AND`, `OR`, and `NOT` operations, so it allows for complex filtering and search queries. This is an example query involving five filtering conditions: full-text (`title MATCH 'love' OR description MATCH_PHRASE 'true love'`), date range filtering (`dt BETWEEN '2024-09-10 00:00:00' AND '2024-09-1 [...] + +```sql +SELECT actor, count() as cnt +FROM table1 +WHERE dt BETWEEN '2024-09-10 00:00:00' AND '2024-09-10 23:59:59' + AND (title MATCH 'love' OR description MATCH_PHRASE 'true love') + AND rating > 4 + AND country = 'Canada' +GROUP BY actor +ORDER BY cnt DESC LIMIT 100; +``` + +This is the data platform after TME transitioned from a hybrid Elasticsearch + Doris architecture to a unified Doris solution. + + + +With this upgrade, users can now experience: + +- **A big cost reduction**: Doris now handles both search and analytical workloads, leading to an **80%** reduction in operational costs. For example, a single business' daily full data previously required 697.7 GB in Elasticsearch but now only takes 195.4 GB in Doris. + + + +- **Improved performance**: Data ingestion time was cut down from over 10 hours to under 3 hours, making Doris' write performance **4x faster than Elasticsearch**. 
Additionally, Doris supports complex custom tag-based queries, enabling previously impractical analytics and significantly enhancing user experience.
+- **Simplified architecture**: With a unified Doris-based architecture, TME now maintains a single technology stack and eliminates data inconsistency issues.
+
+The transition to a Doris-only architecture required several key design optimizations. In the following sections, we'll dive deeper into the technical strategies and lessons learned from this migration.
+
+### The game changer: Inverted Index
+
+To optimize storage, TME adopts a dimension table + fact table model to efficiently handle search and analytics workloads:
+
+- **Dimension table**: Built using the [Primary Key model](https://doris.apache.org/docs/3.0/table-design/data-model/unique), dimension tables can be easily updated via the partial column update feature of Doris. These tables are meant for both searching and tag-based segmentation.
+- **Fact table**: Designed with the [Aggregate model](https://doris.apache.org/docs/3.0/table-design/data-model/aggregate), this table stores daily metric data. Given the high data volume and the independence of daily datasets, a new [partition](https://doris.apache.org/docs/3.0/table-design/data-partitioning/data-distribution#partitioning-strategy) is created every day to enhance query performance and manageability.
+
+To ensure a seamless transition from Elasticsearch to Apache Doris, TME designs the table schemas and indexes based on Doris' inverted index [docs](https://doris.apache.org/docs/3.0/table-design/index/inverted-index). The mapping follows these key principles:
+
+- Elasticsearch's `Keyword` type maps to Doris' `Varchar`/`String` type with non-tokenized inverted indexing (`USING INVERTED`).
+- Elasticsearch's `Text` type maps to Doris' `Varchar`/`String` type with tokenized inverted indexing (`USING INVERTED PROPERTIES("parser" = "english/unicode")`).
+
+```sql
+CREATE TABLE `tag_baike_zipper_track_dim_string` (
+    `dayno` date NOT NULL COMMENT 'date',
+    `id` int(11) NOT NULL COMMENT 'id',
+    `a4` varchar(65000) NULL COMMENT 'song_name',
+    `a43` varchar(65000) NULL COMMENT 'zyqk_singer_id',
+    INDEX idx_a4 (`a4`) USING INVERTED PROPERTIES("parser" = "unicode", "support_phrase" = "true") COMMENT '',
+    INDEX idx_a43 (`a43`) USING INVERTED PROPERTIES("parser" = "english") COMMENT ''
+) ENGINE=OLAP
+UNIQUE KEY(`dayno`, `id`)
+COMMENT 'OLAP'
+PARTITION BY RANGE(`dayno`)
+(PARTITION p99991230 VALUES [('9999-12-30'), ('9999-12-31')))
+DISTRIBUTED BY HASH(`id`) BUCKETS auto
+PROPERTIES (
+...
+);
+```
+
+**Before enabling inverted index in Doris:**
+
+Take the following complex query as an example. Without inverted indexing, it was slow and took minutes to return results.
+
+```sql
+-- like (Low performance):
+SELECT * FROM db_tag_pro.tag_track_pro_3 WHERE
+dayno='2024-08-01' AND ( concat('#',a4,'#') like '%#I''m so busy dancing#%'
+or concat('#',a43,'#') like '%#1000#%');
+
+-- explode (Low performance and often triggers ERROR 1105 (HY000)):
+SELECT *
+    FROM (
+        SELECT tab1.*,a4_single,a43_single FROM (
+            SELECT *
+            FROM db_tag_pro.tag_track_pro_3
+            WHERE dayno='2024-08-01'
+        ) tab1
+        lateral view explode_split(a4, '#') tmp1 as a4_single
+        lateral view explode_split(a43, '#') tmp2 as a43_single
+    ) tab2
+    where a4_single='I''m so busy dancing' or a43_single='1000';
+```
+
+**After enabling inverted index in Doris:**
+
+Query response time drops **from minutes to just seconds**. A tip is to set `store_row_column` to enable row-based storage. This optimizes `SELECT *` queries that read all columns from a table.
+
+```sql
+-- Retrieve the corresponding ID from the dimension table
+SELECT id FROM db_tag_pro.tag_baike_zipper_track_dim_string WHERE
+( a4 MATCH_PHRASE 'I''m so busy dancing' OR a43 MATCH_ALL '1000' ) AND dayno ='2024-08-01';
+
+-- Fetch the detailed data from the fact table based on the ID
+SELECT * FROM db_tag_pro.tag_baike_track_pro WHERE id IN ( 563559286 );
+```
+
+Moreover, Apache Doris overcomes a key limitation found in Elasticsearch: **overly long SQL queries that previously failed due to length constraints**. Doris supports longer and more complex queries with ease. Additionally, using Doris as the unified engine means that users can leverage materialized views and the BITMAP data type to further optimize intermediate query results. This eliminates the need for cross-engine synchronization.
+
+### Multi-service resource isolation
+
+To ensure a cost-effective and seamless user experience, TME leverages Doris' resource isolation mechanism for efficient workload management across different business scenarios.
+
+- **Layer 1: Physical isolation (Resource Group)** They divide the cluster into two Resource Groups to serve different workloads: Core and Normal. The Core group is dedicated to mission-critical tasks such as content search and tag-based segmentation, while the Normal group handles general-purpose queries. This node-level physical isolation ensures that high-priority operations remain unaffected by other workloads.
+- **Layer 2: Logical isolation (Workload Group)** Within each physically isolated Resource Group, resources are further divided into Workload Groups. For example, TME creates multiple Workload Groups within the Normal resource group, and assigns a default Workload Group to each user. In this way, they prevent any single user from monopolizing cluster resources.
+
+
+
+These resource isolation mechanisms improve system stability.
**In TME's case, the frequency of alerts has reduced from over 20 times per day to single digits per month**. The team can now focus more on system optimization and performance improvements rather than constant firefighting. + +## A seamless migration + +TME implements the migration via their self-developed [SuperSonic](https://github.com/tencentmusic/supersonic) project, which has a built-in Headless BI feature to simplify the process. All they need is to convert the queries written in Elasticsearch's Domain Specific Language (DSL) into SQL statements, and switch the data sources for pre-defined metrics and tags. + +The idea of Headless BI is to decouple data modeling, management, and consumption. With it, business analysts can define metrics and tags directly on the Headless BI platform without worrying about the underlying data sources. Because Headless BI abstracts away differences between various data storage and analytics engines, users can experience a transparent, frictionless migration without disruptions. + + + +The Headless BI enables seamless data source migration and largely simplifies data management and querying. SuperSonic takes this a step further by integrating Chat BI capabilities with Headless BI, so users can perform unified data governance and data analysis using natural language. Originally developed and battle-tested in-house by TME, the SuperSonic platform is now open source: https://github.com/tencentmusic/supersonic + +## What's next + +The migration from Elasticsearch to Apache Doris has yielded impressive gains. Write performance has improved 4x and storage usage has dropped by 72%, while the overall operational costs have been cut by up to 80%. + +By replacing its Elasticsearch cluster with Doris, TME has unified its content library's search and analytics engines into a single, streamlined platform. The system now supports complex custom tag-based segmentation with sub-second response. 
The next-phase plan of TME is to explore broader use cases of Apache Doris and prepare to adopt the [compute-storage decoupled mode](https://doris.apache.org/docs/3.0/compute-storage-decoupled/overview) to drive even greater cost efficiency. + +For direct communication, real-world insights, and best practices, join [#elasticsearch-to-doris](https://apachedoriscommunity.slack.com/archives/C08CQKX20R5) channel in the [Apache Doris community](https://join.slack.com/t/apachedoriscommunity/shared_invite/zt-2gmq5o30h-455W226d79zP3L96ZhXIoQ). \ No newline at end of file diff --git a/blog/why-apache-doris-is-best-alternatives-for-real-time-analytics.md b/blog/why-apache-doris-is-best-alternatives-for-real-time-analytics.md index 0aebb7094cf..bf4be96caf9 100644 --- a/blog/why-apache-doris-is-best-alternatives-for-real-time-analytics.md +++ b/blog/why-apache-doris-is-best-alternatives-for-real-time-analytics.md @@ -7,7 +7,7 @@ 'author': 'Kang, Apache Doris PMC Member', 'tags': ['Release Notes'], 'picked': "true", - 'order': "3", + 'order': "4", "image": '/images/es-alternatives/Alternative-to-Elasticsearch.jpg' } --- diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/basic-element/sql-data-types/data-type-overview.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/basic-element/sql-data-types/data-type-overview.md index d9ee399810f..4053f2a5cf1 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/basic-element/sql-data-types/data-type-overview.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/basic-element/sql-data-types/data-type-overview.md @@ -32,38 +32,38 @@ under the License. **1. BOOLEAN 类型:** -两种取值,0 代表 false,1 代表 true。更多信息参考 [BOOLEAN 文档](../../sql-manual/sql-data-types/numeric/BOOLEAN.md)。 +两种取值,0 代表 false,1 代表 true。更多信息参考 [BOOLEAN 文档](../../basic-element/sql-data-types/numeric/BOOLEAN.md)。 **2. 
整数类型:** 都是有符号整数,xxINT 的差异是占用字节数和表示范围 -- TINYINT 占 1 字节,范围 [-128, 127], 更多信息参考 [TINYINT 文档](../../sql-manual/sql-data-types/numeric/TINYINT.md)。 +- TINYINT 占 1 字节,范围 [-128, 127], 更多信息参考 [TINYINT 文档](../../basic-element/sql-manual/sql-data-types/numeric/TINYINT.md)。 -- SMALLINT 占 2 字节,范围 [-32768, 32767], 更多信息参考 [SMALLINT 文档](../../sql-manual/sql-data-types/numeric/SMALLINT.md)。 +- SMALLINT 占 2 字节,范围 [-32768, 32767], 更多信息参考 [SMALLINT 文档](../../basic-element/sql-manual/sql-data-types/numeric/SMALLINT.md)。 -- INT 占 4 字节,范围 [-2147483648, 2147483647], 更多信息参考 [INT 文档](../../sql-manual/sql-data-types/numeric/INT.md)。 +- INT 占 4 字节,范围 [-2147483648, 2147483647], 更多信息参考 [INT 文档](../../basic-element/sql-data-types/numeric/INT.md)。 -- BIGINT 占 8 字节,范围 [-9223372036854775808, 9223372036854775807], 更多信息参考 [BIGINT 文档](../../sql-manual/sql-data-types/numeric/BIGINT.md)。 +- BIGINT 占 8 字节,范围 [-9223372036854775808, 9223372036854775807], 更多信息参考 [BIGINT 文档](../../basic-element/sql-data-types/numeric/BIGINT.md)。 -- LARGEINT 占 16 字节,范围 [-2^127, 2^127 - 1], 更多信息参考 [LARGEINT 文档](../../sql-manual/sql-data-types/numeric/LARGEINT.md)。 +- LARGEINT 占 16 字节,范围 [-2^127, 2^127 - 1], 更多信息参考 [LARGEINT 文档](../../basic-element/sql-data-types/numeric/LARGEINT.md)。 **3. 浮点数类型:** -不精确的浮点数类型 FLOAT 和 DOUBLE,和常见编程语言中的 float 和 double 对应。更多信息参考 [FLOAT](../../sql-manual/sql-data-types/numeric/FLOAT.md)、[DOUBLE](../../sql-manual/sql-data-types/numeric/DOUBLE.md) 文档。 +不精确的浮点数类型 FLOAT 和 DOUBLE,和常见编程语言中的 float 和 double 对应。更多信息参考 [FLOAT](../../basic-element/sql-data-types/numeric/FLOAT.md)、[DOUBLE](../../basic-element/sql-data-types/numeric/DOUBLE.md) 文档。 **4. 
定点数类型:** -精确的定点数类型 DECIMAL,用于金融等精度要求严格准确的场景。更多信息参考 [DECIMAL](../../sql-manual/sql-data-types/numeric/DECIMAL.md) 文档。 +精确的定点数类型 DECIMAL,用于金融等精度要求严格准确的场景。更多信息参考 [DECIMAL](../../basic-element/sql-data-types/numeric/DECIMAL.md) 文档。 ## 日期类型 日期类型包括 DATE、TIME 和 DATETIME,DATE 类型只存储日期精确到天,DATETIME 类型存储日期和时间,可以精确到微秒。TIME 类型只存储时间,且**暂时不支持建表存储,只能在查询过程中使用**。 -对日期类型进行计算,或将其转换为数字,请使用类似 [TIME_TO_SEC](../sql-functions/scalar-functions/date-time-functions/time-to-sec), [DATE_DIFF](../sql-functions/scalar-functions/date-time-functions/datediff), [UNIX_TIMESTAMP](../sql-functions/scalar-functions/date-time-functions/unix-timestamp) 等函数,直接将其 CAST 为数字类型的结果不受保证。在未来的版本中,此类 CAST 行为将会被禁止。 +对日期类型进行计算,或将其转换为数字,请使用类似 [TIME_TO_SEC](../../sql-functions/scalar-functions/date-time-functions/time-to-sec), [DATE_DIFF](../../sql-functions/scalar-functions/date-time-functions/datediff), [UNIX_TIMESTAMP](../../sql-functions/scalar-functions/date-time-functions/unix-timestamp) 等函数,直接将其 CAST 为数字类型的结果不受保证。在未来的版本中,此类 CAST 行为将会被禁止。 -更多信息参考 [DATE](../../sql-manual/sql-data-types/date-time/DATE)、[TIME](../../sql-manual/sql-data-types/date-time/TIME) 和 [DATETIME](../../sql-manual/sql-data-types/date-time/DATETIME) 文档。 +更多信息参考 [DATE](../../basic-element/sql-data-types/date-time/DATE)、[TIME](../../basic-element/sql-data-types/date-time/TIME) 和 [DATETIME](../../basic-element/sql-data-types/date-time/DATETIME) 文档。 ## 字符串类型 @@ -80,29 +80,29 @@ under the License. 针对 JSON 半结构化数据,支持 3 类不同场景的半结构化数据类型: -1. 支持嵌套的固定 schema,适合分析的数据类型 **[ARRAY](../../sql-manual/sql-data-types/semi-structured/ARRAY.md)、 [MAP](../../sql-manual/sql-data-types/semi-structured/MAP.md) [STRUCT](../../sql-manual/sql-data-types/semi-structured/STRUCT.md)**:常用于用户行为和画像分析,湖仓一体查询数据湖中 Parquet 等格式的数据等场景。由于 schema 相对固定,没有动态 schema 推断的开销,写入和分析性能很高。 +1. 
支持嵌套的固定 schema,适合分析的数据类型 **[ARRAY](../../basic-element/sql-data-types/semi-structured/ARRAY.md)、 [MAP](../../basic-element/sql-data-types/semi-structured/MAP.md) [STRUCT](../../basic-element/sql-data-types/semi-structured/STRUCT.md)**:常用于用户行为和画像分析,湖仓一体查询数据湖中 Parquet 等格式的数据等场景。由于 schema 相对固定,没有动态 schema 推断的开销,写入和分析性能很高。 -2. 支持嵌套的不固定 schema,适合分析的数据类型 **[VARIANT](../../sql-manual/sql-data-types/semi-structured/VARIANT.md)**:常用于 Log, Trace, IoT 等分析场景,schema 灵活可以写入任何合法的 JSON 数据,并自动展开成子列采用列式存储,存储压缩率高,聚合 过滤 排序等分析性能很好。 +2. 支持嵌套的不固定 schema,适合分析的数据类型 **[VARIANT](../../basic-element/sql-data-types/semi-structured/VARIANT.md)**:常用于 Log, Trace, IoT 等分析场景,schema 灵活可以写入任何合法的 JSON 数据,并自动展开成子列采用列式存储,存储压缩率高,聚合 过滤 排序等分析性能很好。 -3. 支持嵌套的不固定 schema,适合点查的数据类型 **[JSON](../../sql-manual/sql-data-types/semi-structured/JSON.md)**:常用于高并发点查场景,schema 灵活可以写入任何合法的 JSON 数据,采用二进制格式存储,提取字段的性能比普通 JSON String 快 2 倍以上。 +3. 支持嵌套的不固定 schema,适合点查的数据类型 **[JSON](../../basic-element/sql-data-types/semi-structured/JSON.md)**:常用于高并发点查场景,schema 灵活可以写入任何合法的 JSON 数据,采用二进制格式存储,提取字段的性能比普通 JSON String 快 2 倍以上。 ## 聚合类型 聚合类型存储聚合的结果或者中间状态,用于加速聚合查询,包括下面几种: -1. [BITMAP](../../sql-manual/sql-data-types/aggregate/BITMAP.md):用于精确去重,如 UV 统计,人群圈选等场景。配合 bitmap_union、bitmap_union_count、bitmap_hash、bitmap_hash64 等 BITMAP 函数使用。 +1. [BITMAP](../../basic-element/sql-data-types/aggregate/BITMAP.md):用于精确去重,如 UV 统计,人群圈选等场景。配合 bitmap_union、bitmap_union_count、bitmap_hash、bitmap_hash64 等 BITMAP 函数使用。 -2. [HLL](../../sql-manual/sql-data-types/aggregate/HLL.md):用于近似去重,性能优于 COUNT DISTINCT。配合 hll_union_agg、hll_raw_agg、hll_cardinality、hll_hash 等 HLL 函数使用。 +2. [HLL](../../basic-element/sql-data-types/aggregate/HLL.md):用于近似去重,性能优于 COUNT DISTINCT。配合 hll_union_agg、hll_raw_agg、hll_cardinality、hll_hash 等 HLL 函数使用。 -3. [QUANTILE_STATE](../../sql-manual/sql-data-types/aggregate/QUANTILE-STATE.md):用于分位数近似计算,性能优于 PERCENTILE。配合 QUANTILE_PERCENT、QUANTILE_UNION、TO_QUANTILE_STATE 等函数使用。 +3. 
[QUANTILE_STATE](../../basic-element/sql-data-types/aggregate/QUANTILE-STATE.md):用于分位数近似计算,性能优于 PERCENTILE。配合 QUANTILE_PERCENT、QUANTILE_UNION、TO_QUANTILE_STATE 等函数使用。 -4. [AGG_STATE](../../sql-manual/sql-data-types/aggregate/AGG-STATE.md):用于聚合计算加速,配合 state/merge/union 聚合函数组合器使用。 +4. [AGG_STATE](../../basic-element/sql-data-types/aggregate/AGG-STATE.md):用于聚合计算加速,配合 state/merge/union 聚合函数组合器使用。 ## IP 类型 IP 类型以二进制形式存储 IP 地址,比用字符串存储更省空间查询速度更快,支持 2 种类型: -1. [IPv4](../../sql-manual/sql-data-types/ip/IPV4.md):以 4 字节二进制存储 IPv4 地址,配合 ipv4_* 系列函数使用。 +1. [IPv4](../../basic-element/sql-data-types/ip/IPV4.md):以 4 字节二进制存储 IPv4 地址,配合 ipv4_* 系列函数使用。 -2. [IPv6](../../sql-manual/sql-data-types/ip/IPV6.md):以 16 字节二进制存储 IPv6 地址,配合 ipv6_* 系列函数使用。 +2. [IPv6](../../basic-element/sql-data-types/ip/IPV6.md):以 16 字节二进制存储 IPv6 地址,配合 ipv6_* 系列函数使用。 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/sql-manual/basic-element/sql-data-types/data-type-overview.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/sql-manual/basic-element/sql-data-types/data-type-overview.md index d9ee399810f..4053f2a5cf1 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/sql-manual/basic-element/sql-data-types/data-type-overview.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/sql-manual/basic-element/sql-data-types/data-type-overview.md @@ -32,38 +32,38 @@ under the License. **1. BOOLEAN 类型:** -两种取值,0 代表 false,1 代表 true。更多信息参考 [BOOLEAN 文档](../../sql-manual/sql-data-types/numeric/BOOLEAN.md)。 +两种取值,0 代表 false,1 代表 true。更多信息参考 [BOOLEAN 文档](../../basic-element/sql-data-types/numeric/BOOLEAN.md)。 **2. 
整数类型:** 都是有符号整数,xxINT 的差异是占用字节数和表示范围 -- TINYINT 占 1 字节,范围 [-128, 127], 更多信息参考 [TINYINT 文档](../../sql-manual/sql-data-types/numeric/TINYINT.md)。 +- TINYINT 占 1 字节,范围 [-128, 127], 更多信息参考 [TINYINT 文档](../../basic-element/sql-manual/sql-data-types/numeric/TINYINT.md)。 -- SMALLINT 占 2 字节,范围 [-32768, 32767], 更多信息参考 [SMALLINT 文档](../../sql-manual/sql-data-types/numeric/SMALLINT.md)。 +- SMALLINT 占 2 字节,范围 [-32768, 32767], 更多信息参考 [SMALLINT 文档](../../basic-element/sql-manual/sql-data-types/numeric/SMALLINT.md)。 -- INT 占 4 字节,范围 [-2147483648, 2147483647], 更多信息参考 [INT 文档](../../sql-manual/sql-data-types/numeric/INT.md)。 +- INT 占 4 字节,范围 [-2147483648, 2147483647], 更多信息参考 [INT 文档](../../basic-element/sql-data-types/numeric/INT.md)。 -- BIGINT 占 8 字节,范围 [-9223372036854775808, 9223372036854775807], 更多信息参考 [BIGINT 文档](../../sql-manual/sql-data-types/numeric/BIGINT.md)。 +- BIGINT 占 8 字节,范围 [-9223372036854775808, 9223372036854775807], 更多信息参考 [BIGINT 文档](../../basic-element/sql-data-types/numeric/BIGINT.md)。 -- LARGEINT 占 16 字节,范围 [-2^127, 2^127 - 1], 更多信息参考 [LARGEINT 文档](../../sql-manual/sql-data-types/numeric/LARGEINT.md)。 +- LARGEINT 占 16 字节,范围 [-2^127, 2^127 - 1], 更多信息参考 [LARGEINT 文档](../../basic-element/sql-data-types/numeric/LARGEINT.md)。 **3. 浮点数类型:** -不精确的浮点数类型 FLOAT 和 DOUBLE,和常见编程语言中的 float 和 double 对应。更多信息参考 [FLOAT](../../sql-manual/sql-data-types/numeric/FLOAT.md)、[DOUBLE](../../sql-manual/sql-data-types/numeric/DOUBLE.md) 文档。 +不精确的浮点数类型 FLOAT 和 DOUBLE,和常见编程语言中的 float 和 double 对应。更多信息参考 [FLOAT](../../basic-element/sql-data-types/numeric/FLOAT.md)、[DOUBLE](../../basic-element/sql-data-types/numeric/DOUBLE.md) 文档。 **4. 
定点数类型:** -精确的定点数类型 DECIMAL,用于金融等精度要求严格准确的场景。更多信息参考 [DECIMAL](../../sql-manual/sql-data-types/numeric/DECIMAL.md) 文档。 +精确的定点数类型 DECIMAL,用于金融等精度要求严格准确的场景。更多信息参考 [DECIMAL](../../basic-element/sql-data-types/numeric/DECIMAL.md) 文档。 ## 日期类型 日期类型包括 DATE、TIME 和 DATETIME,DATE 类型只存储日期精确到天,DATETIME 类型存储日期和时间,可以精确到微秒。TIME 类型只存储时间,且**暂时不支持建表存储,只能在查询过程中使用**。 -对日期类型进行计算,或将其转换为数字,请使用类似 [TIME_TO_SEC](../sql-functions/scalar-functions/date-time-functions/time-to-sec), [DATE_DIFF](../sql-functions/scalar-functions/date-time-functions/datediff), [UNIX_TIMESTAMP](../sql-functions/scalar-functions/date-time-functions/unix-timestamp) 等函数,直接将其 CAST 为数字类型的结果不受保证。在未来的版本中,此类 CAST 行为将会被禁止。 +对日期类型进行计算,或将其转换为数字,请使用类似 [TIME_TO_SEC](../../sql-functions/scalar-functions/date-time-functions/time-to-sec), [DATE_DIFF](../../sql-functions/scalar-functions/date-time-functions/datediff), [UNIX_TIMESTAMP](../../sql-functions/scalar-functions/date-time-functions/unix-timestamp) 等函数,直接将其 CAST 为数字类型的结果不受保证。在未来的版本中,此类 CAST 行为将会被禁止。 -更多信息参考 [DATE](../../sql-manual/sql-data-types/date-time/DATE)、[TIME](../../sql-manual/sql-data-types/date-time/TIME) 和 [DATETIME](../../sql-manual/sql-data-types/date-time/DATETIME) 文档。 +更多信息参考 [DATE](../../basic-element/sql-data-types/date-time/DATE)、[TIME](../../basic-element/sql-data-types/date-time/TIME) 和 [DATETIME](../../basic-element/sql-data-types/date-time/DATETIME) 文档。 ## 字符串类型 @@ -80,29 +80,29 @@ under the License. 针对 JSON 半结构化数据,支持 3 类不同场景的半结构化数据类型: -1. 支持嵌套的固定 schema,适合分析的数据类型 **[ARRAY](../../sql-manual/sql-data-types/semi-structured/ARRAY.md)、 [MAP](../../sql-manual/sql-data-types/semi-structured/MAP.md) [STRUCT](../../sql-manual/sql-data-types/semi-structured/STRUCT.md)**:常用于用户行为和画像分析,湖仓一体查询数据湖中 Parquet 等格式的数据等场景。由于 schema 相对固定,没有动态 schema 推断的开销,写入和分析性能很高。 +1. 
支持嵌套的固定 schema,适合分析的数据类型 **[ARRAY](../../basic-element/sql-data-types/semi-structured/ARRAY.md)、 [MAP](../../basic-element/sql-data-types/semi-structured/MAP.md) [STRUCT](../../basic-element/sql-data-types/semi-structured/STRUCT.md)**:常用于用户行为和画像分析,湖仓一体查询数据湖中 Parquet 等格式的数据等场景。由于 schema 相对固定,没有动态 schema 推断的开销,写入和分析性能很高。
-2. 支持嵌套的不固定 schema,适合分析的数据类型 **[VARIANT](../../sql-manual/sql-data-types/semi-structured/VARIANT.md)**:常用于 Log, Trace, IoT 等分析场景,schema 灵活可以写入任何合法的 JSON 数据,并自动展开成子列采用列式存储,存储压缩率高,聚合 过滤 排序等分析性能很好。
+2. 支持嵌套的不固定 schema,适合分析的数据类型 **[VARIANT](../../basic-element/sql-data-types/semi-structured/VARIANT.md)**:常用于 Log, Trace, IoT 等分析场景,schema 灵活可以写入任何合法的 JSON 数据,并自动展开成子列采用列式存储,存储压缩率高,聚合 过滤 排序等分析性能很好。
-3. 支持嵌套的不固定 schema,适合点查的数据类型 **[JSON](../../sql-manual/sql-data-types/semi-structured/JSON.md)**:常用于高并发点查场景,schema 灵活可以写入任何合法的 JSON 数据,采用二进制格式存储,提取字段的性能比普通 JSON String 快 2 倍以上。
+3. 支持嵌套的不固定 schema,适合点查的数据类型 **[JSON](../../basic-element/sql-data-types/semi-structured/JSON.md)**:常用于高并发点查场景,schema 灵活可以写入任何合法的 JSON 数据,采用二进制格式存储,提取字段的性能比普通 JSON String 快 2 倍以上。
 ## 聚合类型
 聚合类型存储聚合的结果或者中间状态,用于加速聚合查询,包括下面几种:
-1. [BITMAP](../../sql-manual/sql-data-types/aggregate/BITMAP.md):用于精确去重,如 UV 统计,人群圈选等场景。配合 bitmap_union、bitmap_union_count、bitmap_hash、bitmap_hash64 等 BITMAP 函数使用。
+1. [BITMAP](../../basic-element/sql-data-types/aggregate/BITMAP.md):用于精确去重,如 UV 统计,人群圈选等场景。配合 bitmap_union、bitmap_union_count、bitmap_hash、bitmap_hash64 等 BITMAP 函数使用。
-2. [HLL](../../sql-manual/sql-data-types/aggregate/HLL.md):用于近似去重,性能优于 COUNT DISTINCT。配合 hll_union_agg、hll_raw_agg、hll_cardinality、hll_hash 等 HLL 函数使用。
+2. [HLL](../../basic-element/sql-data-types/aggregate/HLL.md):用于近似去重,性能优于 COUNT DISTINCT。配合 hll_union_agg、hll_raw_agg、hll_cardinality、hll_hash 等 HLL 函数使用。
-3. [QUANTILE_STATE](../../sql-manual/sql-data-types/aggregate/QUANTILE-STATE.md):用于分位数近似计算,性能优于 PERCENTILE。配合 QUANTILE_PERCENT、QUANTILE_UNION、TO_QUANTILE_STATE 等函数使用。
+3. [QUANTILE_STATE](../../basic-element/sql-data-types/aggregate/QUANTILE-STATE.md):用于分位数近似计算,性能优于 PERCENTILE。配合 QUANTILE_PERCENT、QUANTILE_UNION、TO_QUANTILE_STATE 等函数使用。
-4. [AGG_STATE](../../sql-manual/sql-data-types/aggregate/AGG-STATE.md):用于聚合计算加速,配合 state/merge/union 聚合函数组合器使用。
+4. [AGG_STATE](../../basic-element/sql-data-types/aggregate/AGG-STATE.md):用于聚合计算加速,配合 state/merge/union 聚合函数组合器使用。
 ## IP 类型
 IP 类型以二进制形式存储 IP 地址,比用字符串存储更省空间查询速度更快,支持 2 种类型:
-1. [IPv4](../../sql-manual/sql-data-types/ip/IPV4.md):以 4 字节二进制存储 IPv4 地址,配合 ipv4_* 系列函数使用。
+1. [IPv4](../../basic-element/sql-data-types/ip/IPV4.md):以 4 字节二进制存储 IPv4 地址,配合 ipv4_* 系列函数使用。
-2. [IPv6](../../sql-manual/sql-data-types/ip/IPV6.md):以 16 字节二进制存储 IPv6 地址,配合 ipv6_* 系列函数使用。
+2. [IPv6](../../basic-element/sql-data-types/ip/IPV6.md):以 16 字节二进制存储 IPv6 地址,配合 ipv6_* 系列函数使用。
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/sql-manual/basic-element/sql-data-types/data-type-overview.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/sql-manual/basic-element/sql-data-types/data-type-overview.md
index d9ee399810f..d35229c976a 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/sql-manual/basic-element/sql-data-types/data-type-overview.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/sql-manual/basic-element/sql-data-types/data-type-overview.md
@@ -25,45 +25,44 @@ under the License.
 -->
-
 ## 数值类型
 包括以下 4 种:
 **1. BOOLEAN 类型:**
-两种取值,0 代表 false,1 代表 true。更多信息参考 [BOOLEAN 文档](../../sql-manual/sql-data-types/numeric/BOOLEAN.md)。
+两种取值,0 代表 false,1 代表 true。更多信息参考 [BOOLEAN 文档](../../basic-element/sql-data-types/numeric/BOOLEAN.md)。
 **2. 整数类型:**
 都是有符号整数,xxINT 的差异是占用字节数和表示范围
-- TINYINT 占 1 字节,范围 [-128, 127], 更多信息参考 [TINYINT 文档](../../sql-manual/sql-data-types/numeric/TINYINT.md)。
+- TINYINT 占 1 字节,范围 [-128, 127], 更多信息参考 [TINYINT 文档](../../basic-element/sql-manual/sql-data-types/numeric/TINYINT.md)。
-- SMALLINT 占 2 字节,范围 [-32768, 32767], 更多信息参考 [SMALLINT 文档](../../sql-manual/sql-data-types/numeric/SMALLINT.md)。
+- SMALLINT 占 2 字节,范围 [-32768, 32767], 更多信息参考 [SMALLINT 文档](../../basic-element/sql-manual/sql-data-types/numeric/SMALLINT.md)。
-- INT 占 4 字节,范围 [-2147483648, 2147483647], 更多信息参考 [INT 文档](../../sql-manual/sql-data-types/numeric/INT.md)。
+- INT 占 4 字节,范围 [-2147483648, 2147483647], 更多信息参考 [INT 文档](../../basic-element/sql-data-types/numeric/INT.md)。
-- BIGINT 占 8 字节,范围 [-9223372036854775808, 9223372036854775807], 更多信息参考 [BIGINT 文档](../../sql-manual/sql-data-types/numeric/BIGINT.md)。
+- BIGINT 占 8 字节,范围 [-9223372036854775808, 9223372036854775807], 更多信息参考 [BIGINT 文档](../../basic-element/sql-data-types/numeric/BIGINT.md)。
-- LARGEINT 占 16 字节,范围 [-2^127, 2^127 - 1], 更多信息参考 [LARGEINT 文档](../../sql-manual/sql-data-types/numeric/LARGEINT.md)。
+- LARGEINT 占 16 字节,范围 [-2^127, 2^127 - 1], 更多信息参考 [LARGEINT 文档](../../basic-element/sql-data-types/numeric/LARGEINT.md)。
 **3. 浮点数类型:**
-不精确的浮点数类型 FLOAT 和 DOUBLE,和常见编程语言中的 float 和 double 对应。更多信息参考 [FLOAT](../../sql-manual/sql-data-types/numeric/FLOAT.md)、[DOUBLE](../../sql-manual/sql-data-types/numeric/DOUBLE.md) 文档。
+不精确的浮点数类型 FLOAT 和 DOUBLE,和常见编程语言中的 float 和 double 对应。更多信息参考 [FLOAT](../../basic-element/sql-data-types/numeric/FLOAT.md)、[DOUBLE](../../basic-element/sql-data-types/numeric/DOUBLE.md) 文档。
 **4. 定点数类型:**
-精确的定点数类型 DECIMAL,用于金融等精度要求严格准确的场景。更多信息参考 [DECIMAL](../../sql-manual/sql-data-types/numeric/DECIMAL.md) 文档。
+精确的定点数类型 DECIMAL,用于金融等精度要求严格准确的场景。更多信息参考 [DECIMAL](../../basic-element/sql-data-types/numeric/DECIMAL.md) 文档。
 ## 日期类型
 日期类型包括 DATE、TIME 和 DATETIME,DATE 类型只存储日期精确到天,DATETIME 类型存储日期和时间,可以精确到微秒。TIME 类型只存储时间,且**暂时不支持建表存储,只能在查询过程中使用**。
-对日期类型进行计算,或将其转换为数字,请使用类似 [TIME_TO_SEC](../sql-functions/scalar-functions/date-time-functions/time-to-sec), [DATE_DIFF](../sql-functions/scalar-functions/date-time-functions/datediff), [UNIX_TIMESTAMP](../sql-functions/scalar-functions/date-time-functions/unix-timestamp) 等函数,直接将其 CAST 为数字类型的结果不受保证。在未来的版本中,此类 CAST 行为将会被禁止。
+对日期类型进行计算,或将其转换为数字,请使用类似 [TIME_TO_SEC](../../sql-functions/scalar-functions/date-time-functions/time-to-sec), [DATE_DIFF](../../sql-functions/scalar-functions/date-time-functions/datediff), [UNIX_TIMESTAMP](../../sql-functions/scalar-functions/date-time-functions/unix-timestamp) 等函数,直接将其 CAST 为数字类型的结果不受保证。在未来的版本中,此类 CAST 行为将会被禁止。
-更多信息参考 [DATE](../../sql-manual/sql-data-types/date-time/DATE)、[TIME](../../sql-manual/sql-data-types/date-time/TIME) 和 [DATETIME](../../sql-manual/sql-data-types/date-time/DATETIME) 文档。
+更多信息参考 [DATE](../../basic-element/sql-data-types/date-time/DATE)、[TIME](../../basic-element/sql-data-types/date-time/TIME) 和 [DATETIME](../../basic-element/sql-data-types/date-time/DATETIME) 文档。
 ## 字符串类型
@@ -80,29 +79,29 @@ under the License.
 针对 JSON 半结构化数据,支持 3 类不同场景的半结构化数据类型:
-1. 支持嵌套的固定 schema,适合分析的数据类型 **[ARRAY](../../sql-manual/sql-data-types/semi-structured/ARRAY.md)、 [MAP](../../sql-manual/sql-data-types/semi-structured/MAP.md) [STRUCT](../../sql-manual/sql-data-types/semi-structured/STRUCT.md)**:常用于用户行为和画像分析,湖仓一体查询数据湖中 Parquet 等格式的数据等场景。由于 schema 相对固定,没有动态 schema 推断的开销,写入和分析性能很高。
+1. 支持嵌套的固定 schema,适合分析的数据类型 **[ARRAY](../../basic-element/sql-data-types/semi-structured/ARRAY.md)、 [MAP](../../basic-element/sql-data-types/semi-structured/MAP.md) [STRUCT](../../basic-element/sql-data-types/semi-structured/STRUCT.md)**:常用于用户行为和画像分析,湖仓一体查询数据湖中 Parquet 等格式的数据等场景。由于 schema 相对固定,没有动态 schema 推断的开销,写入和分析性能很高。
-2. 支持嵌套的不固定 schema,适合分析的数据类型 **[VARIANT](../../sql-manual/sql-data-types/semi-structured/VARIANT.md)**:常用于 Log, Trace, IoT 等分析场景,schema 灵活可以写入任何合法的 JSON 数据,并自动展开成子列采用列式存储,存储压缩率高,聚合 过滤 排序等分析性能很好。
+2. 支持嵌套的不固定 schema,适合分析的数据类型 **[VARIANT](../../basic-element/sql-data-types/semi-structured/VARIANT.md)**:常用于 Log, Trace, IoT 等分析场景,schema 灵活可以写入任何合法的 JSON 数据,并自动展开成子列采用列式存储,存储压缩率高,聚合 过滤 排序等分析性能很好。
-3. 支持嵌套的不固定 schema,适合点查的数据类型 **[JSON](../../sql-manual/sql-data-types/semi-structured/JSON.md)**:常用于高并发点查场景,schema 灵活可以写入任何合法的 JSON 数据,采用二进制格式存储,提取字段的性能比普通 JSON String 快 2 倍以上。
+3. 支持嵌套的不固定 schema,适合点查的数据类型 **[JSON](../../basic-element/sql-data-types/semi-structured/JSON.md)**:常用于高并发点查场景,schema 灵活可以写入任何合法的 JSON 数据,采用二进制格式存储,提取字段的性能比普通 JSON String 快 2 倍以上。
 ## 聚合类型
 聚合类型存储聚合的结果或者中间状态,用于加速聚合查询,包括下面几种:
-1. [BITMAP](../../sql-manual/sql-data-types/aggregate/BITMAP.md):用于精确去重,如 UV 统计,人群圈选等场景。配合 bitmap_union、bitmap_union_count、bitmap_hash、bitmap_hash64 等 BITMAP 函数使用。
+1. [BITMAP](../../basic-element/sql-data-types/aggregate/BITMAP.md):用于精确去重,如 UV 统计,人群圈选等场景。配合 bitmap_union、bitmap_union_count、bitmap_hash、bitmap_hash64 等 BITMAP 函数使用。
-2. [HLL](../../sql-manual/sql-data-types/aggregate/HLL.md):用于近似去重,性能优于 COUNT DISTINCT。配合 hll_union_agg、hll_raw_agg、hll_cardinality、hll_hash 等 HLL 函数使用。
+2. [HLL](../../basic-element/sql-data-types/aggregate/HLL.md):用于近似去重,性能优于 COUNT DISTINCT。配合 hll_union_agg、hll_raw_agg、hll_cardinality、hll_hash 等 HLL 函数使用。
-3. [QUANTILE_STATE](../../sql-manual/sql-data-types/aggregate/QUANTILE-STATE.md):用于分位数近似计算,性能优于 PERCENTILE。配合 QUANTILE_PERCENT、QUANTILE_UNION、TO_QUANTILE_STATE 等函数使用。
+3. [QUANTILE_STATE](../../basic-element/sql-data-types/aggregate/QUANTILE-STATE.md):用于分位数近似计算,性能优于 PERCENTILE。配合 QUANTILE_PERCENT、QUANTILE_UNION、TO_QUANTILE_STATE 等函数使用。
-4. [AGG_STATE](../../sql-manual/sql-data-types/aggregate/AGG-STATE.md):用于聚合计算加速,配合 state/merge/union 聚合函数组合器使用。
+4. [AGG_STATE](../../basic-element/sql-data-types/aggregate/AGG-STATE.md):用于聚合计算加速,配合 state/merge/union 聚合函数组合器使用。
 ## IP 类型
 IP 类型以二进制形式存储 IP 地址,比用字符串存储更省空间查询速度更快,支持 2 种类型:
-1. [IPv4](../../sql-manual/sql-data-types/ip/IPV4.md):以 4 字节二进制存储 IPv4 地址,配合 ipv4_* 系列函数使用。
+1. [IPv4](../../basic-element/sql-data-types/ip/IPV4.md):以 4 字节二进制存储 IPv4 地址,配合 ipv4_* 系列函数使用。
-2. [IPv6](../../sql-manual/sql-data-types/ip/IPV6.md):以 16 字节二进制存储 IPv6 地址,配合 ipv6_* 系列函数使用。
+2. [IPv6](../../basic-element/sql-data-types/ip/IPV6.md):以 16 字节二进制存储 IPv6 地址,配合 ipv6_* 系列函数使用。
diff --git a/src/components/recent-blogs/recent-blogs.data.ts b/src/components/recent-blogs/recent-blogs.data.ts
index 967120e019b..710503a5d9d 100644
--- a/src/components/recent-blogs/recent-blogs.data.ts
+++ b/src/components/recent-blogs/recent-blogs.data.ts
@@ -1,8 +1,12 @@
 export const RECENT_BLOGS_POSTS = [
     {
-        label: `Apache Doris 3.0.4 Released`,
+        label: `Apache Doris 2.1.9 Released`,
         link: 'https://doris.apache.org/blog/release-note-3.0.4',
     },
+    {
+        label: 'Why Apache Doris is a Better Alternative to Elasticsearch for Real-Time Analytics',
+        link: 'https://doris.apache.org/blog/why-apache-doris-is-best-alternatives-for-real-time-analytics',
+    },
     {
         label: 'Automatic and flexible data sharding: Auto Partition in Apache Doris',
         link: 'https://doris.apache.org/blog/auto-partition-in-apache-doris',
@@ -11,9 +15,6 @@ export const RECENT_BLOGS_POSTS = [
         label: 'Migrate data lakehouse from BigQuery to Apache Doris, saving $4,500 per month',
         link: 'https://doris.apache.org/blog/migrate-lakehouse-from-bigquery-to-doris',
     },
-    {
-        label: 'Why Apache Doris is the Best Open Source Alternative to Rockset',
-        link: 'https://doris.apache.org/blog/apache-doris-vs-rockset',
-    },
+
 ];
diff --git a/src/constant/newsletter.data.ts b/src/constant/newsletter.data.ts
index 4d115713ac5..ba29be3cfe5 100644
--- a/src/constant/newsletter.data.ts
+++ b/src/constant/newsletter.data.ts
@@ -1,4 +1,11 @@
 export const NEWSLETTER_DATA = [
+    {
+        tags: ['Best Practice'],
+        title: "How Tencent Music saved 80% in costs by migrating from Elasticsearch to Apache Doris",
+        content: `Handle full-text search, audience segmentation, and aggregation analysis directly within Apache Doris and slash their storage costs by 80% while boosting write performance by 4x`,
+        to: '/blog/tencent-music-migrate-elasticsearch-to-doris',
+        image: 'tencent-music-migrate-elasticsearch-to-doris.jpg',
+    },
     {
         tags: ['Tech Sharing'],
         title: "Slash your cost by 90% with Apache Doris Compute-Storage Decoupled Mode",
@@ -19,13 +26,6 @@ export const NEWSLETTER_DATA = [
         to: '/blog/why-apache-doris-is-best-alternatives-for-real-time-analytics',
         image: 'es-alternatives/Alternative-to-Elasticsearch.jpg',
     },
-    {
-        tags: ['Best Practice'],
-        title: "Fine-tuning Apache Doris for maximum performance and resilience: a deep dive into fe.conf",
-        content: `Ortege handles massive volumes of blockchain data to power its analytics platform, Ortege Studio. Apache Doris forms the backbone of its Lakehouse v2, enabling it to process billions of records and deliver real-time insights.`,
-        to: '/blog/ortege-studio-2-fine-tuning-apache-doris-for-maximum-performance-and-resilience',
-        image: 'ortege-2.jpg',
-    },
 ];
diff --git a/static/images/blog-tencent-alternative-es/A-big-cost-reduction.png b/static/images/blog-tencent-alternative-es/A-big-cost-reduction.png
new file mode 100644
index 00000000000..9783b8da323
Binary files /dev/null and b/static/images/blog-tencent-alternative-es/A-big-cost-reduction.png differ
diff --git a/static/images/blog-tencent-alternative-es/A-seamless-migration.png b/static/images/blog-tencent-alternative-es/A-seamless-migration.png
new file mode 100644
index 00000000000..dc5ab055edb
Binary files /dev/null and b/static/images/blog-tencent-alternative-es/A-seamless-migration.png differ
diff --git a/static/images/blog-tencent-alternative-es/Multi-service-resource-isolation.png b/static/images/blog-tencent-alternative-es/Multi-service-resource-isolation.png
new file mode 100644
index 00000000000..1e49bfe54ae
Binary files /dev/null and b/static/images/blog-tencent-alternative-es/Multi-service-resource-isolation.png differ
diff --git a/static/images/blog-tencent-alternative-es/a-hybrid-solution-Elasticsearch-and-Apache-Doris.png b/static/images/blog-tencent-alternative-es/a-hybrid-solution-Elasticsearch-and-Apache-Doris.png
new file mode 100644
index 00000000000..8836edebeda
Binary files /dev/null and b/static/images/blog-tencent-alternative-es/a-hybrid-solution-Elasticsearch-and-Apache-Doris.png differ
diff --git a/static/images/blog-tencent-alternative-es/a-unified-solution-based-on-Apache-Doris.png b/static/images/blog-tencent-alternative-es/a-unified-solution-based-on-Apache-Doris.png
new file mode 100644
index 00000000000..e96ad506bd4
Binary files /dev/null and b/static/images/blog-tencent-alternative-es/a-unified-solution-based-on-Apache-Doris.png differ
diff --git a/static/images/tencent-music-migrate-elasticsearch-to-doris.jpg b/static/images/tencent-music-migrate-elasticsearch-to-doris.jpg
new file mode 100644
index 00000000000..8615f9cabf4
Binary files /dev/null and b/static/images/tencent-music-migrate-elasticsearch-to-doris.jpg differ