This is an automated email from the ASF dual-hosted git repository.

morningman pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris-website.git
The following commit(s) were added to refs/heads/master by this push:
     new c1b13f35b88 [lakehouse] add lakehouse overview (#2043)
c1b13f35b88 is described below

commit c1b13f35b88514b5fe492a78eefc0979f41a64ad
Author: Mingyu Chen (Rayner) <morning...@163.com>
AuthorDate: Mon Feb 17 20:27:48 2025 +0800

    [lakehouse] add lakehouse overview (#2043)

    ## Versions

    - [x] dev
    - [ ] 3.0
    - [ ] 2.1
    - [ ] 2.0

    ## Languages

    - [x] Chinese
    - [x] English

    ## Docs Checklist

    - [x] Checked by AI
    - [ ] Test Cases Built
---
 docs/lakehouse/lakehouse-overview.md               | 150 +++++++++++++++++++-
 .../current/lakehouse/lakehouse-overview.md        | 151 ++++++++++++++++++++-
 .../images/Lakehouse/compute-storage-decouple.png  | Bin 0 -> 430809 bytes
 static/images/Lakehouse/data-management.png        | Bin 0 -> 38700 bytes
 static/images/Lakehouse/federation-query.png       | Bin 0 -> 56776 bytes
 static/images/Lakehouse/lakehouse-arch-1.png       | Bin 0 -> 288586 bytes
 static/images/Lakehouse/performance.png            | Bin 0 -> 82113 bytes
 static/images/Lakehouse/query-acceleration.png     | Bin 0 -> 53713 bytes
 static/images/Lakehouse/tpcds1000.png              | Bin 0 -> 44750 bytes
 9 files changed, 299 insertions(+), 2 deletions(-)

diff --git a/docs/lakehouse/lakehouse-overview.md b/docs/lakehouse/lakehouse-overview.md
index 1a7ee83bda8..abadb9bd0b2 100644
--- a/docs/lakehouse/lakehouse-overview.md
+++ b/docs/lakehouse/lakehouse-overview.md
@@ -24,5 +24,153 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-The document is under development, please refer to versioned doc 2.1 or 3.0
+**The lakehouse is a modern big data solution that combines the advantages of data lakes and data warehouses**. It integrates the low cost and high scalability of data lakes with the high performance and strong data governance capabilities of data warehouses, enabling efficient, secure, and quality-controlled storage, processing, and analysis of all kinds of data in the big data era. Through standardized open data formats and metadata management, it unifies **real-time** and **historical** data [...]
+## Doris Lakehouse Solution
+
+Doris provides an excellent lakehouse solution for users through an extensible connector framework, a compute-storage decoupled architecture, a high-performance data processing engine, and an open data ecosystem.
+
+
+
+### Flexible Data Access
+
+Doris supports access to mainstream data systems and data formats through an extensible connector framework and provides unified SQL-based data analysis capabilities, so users can query and analyze data across platforms without moving existing data. For details, refer to [Catalog Overview](./catalog-overview.md).
+
+### Data Source Connectors
+
+Whether it is Hive, Iceberg, Hudi, Paimon, or a database system that supports the JDBC protocol, Doris can connect to it easily and access its data efficiently.
+
+For lakehouse systems, Doris can obtain the structure and distribution information of data tables from metadata services such as Hive Metastore, AWS Glue, and Unity Catalog, perform reasonable query planning, and use its MPP architecture for distributed computing.
+
+For details, refer to the individual catalog documents, such as [Iceberg Catalog](./catalogs/iceberg-catalog.md).
+
+#### Extensible Connector Framework
+
+Doris provides an extensible framework that helps developers quickly connect to enterprise-specific data sources and exchange data with them.
+
+Doris defines a standard three-level hierarchy of Catalog, Database, and Table, which developers can map onto the hierarchy of the target data source. Doris also provides standard interfaces for accessing metadata services and storage services; developers only need to implement these interfaces to complete a data source connection.
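+
+As a rough illustration of the Catalog, Database, and Table hierarchy from the user's side, the statements below walk the three levels. This is only a sketch: `hive`, `db`, and `hive_table` are placeholder names, and it assumes a catalog called `hive` has already been created.
+
+```sql
+-- List the catalogs known to this Doris cluster.
+SHOW CATALOGS;
+-- Switch to the (assumed) external catalog and list its databases.
+SWITCH hive;
+SHOW DATABASES;
+-- Select a database, then query a table by its short name.
+USE db;
+SELECT * FROM hive_table LIMIT 10;
+-- The same table addressed by its fully qualified three-level name.
+SELECT * FROM hive.db.hive_table LIMIT 10;
+```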
+
+Doris is compatible with Trino Connector plugins: a Trino plugin package can be deployed directly to a Doris cluster, and the corresponding data source becomes accessible with minimal configuration. Doris has already completed integration with data sources such as [Kudu](./catalogs/kudu-catalog.md), [BigQuery](./catalogs/bigquery-catalog.md), and [Delta Lake](./catalogs/delta-lake-catalog.md). You can also [adapt new plugins yourself](https://doris.apache.org/community/how-to- [...]
+
+#### Convenient Cross-Source Data Processing
+
+Doris supports creating multiple data catalogs at runtime and using SQL to run federated queries across these data sources. For example, users can join fact table data in Hive with dimension table data in MySQL:
+
+```sql
+SELECT h.id, m.name
+FROM hive.db.hive_table h JOIN mysql.db.mysql_table m
+ON h.id = m.id;
+```
+
+Combined with Doris's built-in [job scheduling](../admin-manual/workload-management/job-scheduler.md) capabilities, you can also create scheduled tasks to further reduce system complexity. For example, users can run the above query as a routine job every hour and write each result into an Iceberg table:
+
+```sql
+CREATE JOB schedule_load
+ON SCHEDULE EVERY 1 HOUR DO
+INSERT INTO iceberg.db.ice_table
+SELECT h.id, m.name
+FROM hive.db.hive_table h JOIN mysql.db.mysql_table m
+ON h.id = m.id;
+```
+
+### High-Performance Data Processing
+
+As an analytical data warehouse, Doris has made numerous optimizations in lakehouse data processing and computation and provides rich query acceleration features:
+
+* Execution Engine
+
+    The Doris execution engine is based on the MPP execution framework and the Pipeline data processing model, and can quickly process massive data in a multi-machine, multi-core distributed environment. Thanks to fully vectorized execution operators, Doris leads in computing performance on standard benchmark datasets such as TPC-DS.
+
+* Query Optimizer
+
+    Doris automatically optimizes and processes complex SQL requests through its query optimizer. The optimizer is deeply optimized for complex SQL operators such as multi-table joins, aggregation, sorting, and pagination. It uses cost models and relational algebra transformations to automatically obtain better or optimal logical and physical execution plans, which greatly reduces the difficulty of writing SQL and improves usability and performance.
+
+* Data Cache and IO Optimization
+
+    Access to external data sources usually goes over the network, which can mean high latency and poor stability. Apache Doris provides rich caching mechanisms and has made numerous optimizations in cache types, timeliness, and strategies, fully utilizing memory and local high-speed disks to improve analysis performance on hot data. Additionally, Doris has made targeted optimizations for network IO characteristics such as high throughput, low IOPS, and high latency, providing external d [...]
+
+* Materialized Views and Transparent Acceleration
+
+    Doris provides rich materialized view update strategies, supporting full and partition-level incremental refresh to reduce construction costs and improve timeliness. In addition to manual refresh, Doris also supports scheduled refresh and data-driven refresh, further reducing maintenance costs and improving data consistency. Materialized views also have transparent acceleration capabilities, allowing the query optimizer to automatically route to appropriate materialized views for sea [...]
+
+As shown below, on a 1 TB TPC-DS standard test set based on the Iceberg table format, Doris's total execution time for the 99 queries is only 1/3 of Trino's.
+
+
+
+In actual user scenarios, Doris reduces average query latency by 20% and 95th percentile latency by 50% compared to Presto while using half the resources, significantly reducing resource costs while improving the user experience.
+
+
+
+### Convenient Service Migration
+
+When integrating multiple data sources and moving to a lakehouse, migrating SQL queries to Doris is a challenge because SQL dialects differ across systems in syntax and function support. Without a suitable migration plan, business teams may need to make significant modifications to adapt to the new system's SQL syntax.
+
+To address this issue, Doris provides a [SQL Dialect Conversion Service](sql-convertor/sql-convertor-overview.md), allowing users to directly use SQL dialects from other systems for data queries. The conversion service converts these SQL dialects into Doris SQL, greatly reducing user migration costs. Currently, Doris supports SQL dialect conversion for common query engines such as Presto/Trino, Hive, PostgreSQL, and ClickHouse, achieving a compatibility of over 99% in some actual user sc [...]
+
+### Modern Deployment Architecture
+
+Since version 3.0, Doris has supported a cloud-native [compute-storage decoupled architecture](../compute-storage-decoupled/overview.md). With its low cost and high elasticity, this architecture effectively improves resource utilization and enables independent scaling of compute and storage.
+
+
+
+The diagram above shows Doris's compute-storage decoupled system architecture. Compute nodes no longer store primary data, and the underlying shared storage layer (HDFS and object storage) serves as the unified primary data storage space, supporting independent scaling of compute and storage resources. This architecture brings significant advantages to the lakehouse solution:
+
+* **Low-Cost Storage**: Storage and compute resources can be scaled independently, allowing enterprises to increase storage capacity without increasing compute resources. Additionally, by using cloud object storage, enterprises can benefit from lower storage costs and higher availability, while still using local high-speed disks to cache the relatively small proportion of hot data.
+
+* **Single Source of Truth**: All data is stored in a unified storage layer, so the same data can be accessed and processed by different compute clusters. This ensures data consistency and integrity and reduces the complexity of data synchronization and duplicate storage.
+
+* **Workload Diversity**: Users can dynamically allocate compute resources based on different workload needs, supporting various application scenarios such as batch processing, real-time analysis, and machine learning. By separating storage and compute, enterprises can optimize resource usage more flexibly, ensuring efficient operation under different loads.
+
+In addition, under the compute-storage coupled architecture, [elastic compute nodes](./compute-node.md) can still be used to provide elastic computing capabilities in lakehouse data query scenarios.
+
+### Openness
+
+Doris not only supports access to open lake table formats but also keeps its own stored data open. Doris provides an open storage API and [implements a high-speed data link based on the Arrow Flight SQL protocol](../db-connect/arrow-flight-sql-connect.md), offering the speed advantages of Arrow Flight and the ease of use of JDBC/ODBC. Based on this interface, users can access data stored in Doris using ADBC clients for Python, Java, Spark, and Flink.
+
+Compared to open file formats, the open storage API abstracts away the specific implementation of the underlying file format, allowing Doris to accelerate data access through advanced features of its storage format, such as rich indexing mechanisms. Additionally, upper-layer compute engines do not need to adapt to changes or new features in the underlying storage format, so all supported compute engines benefit from new features at the same time.
+
+## Lakehouse Best Practices
+
+In the lakehouse solution, Doris is mainly used for **lakehouse query acceleration**, **multi-source federated analysis**, and **lakehouse data processing**.
+
+### Lakehouse Query Acceleration
+
+In this scenario, Doris acts as a **compute engine**, accelerating query analysis on lakehouse data.
+
+
+
+#### Cache Acceleration
+
+For lakehouse systems like Hive and Iceberg, users can configure local disk caching. The local disk cache automatically stores the data files involved in queries in local cache directories and manages cache eviction using an LRU strategy. For details, refer to the [Data Cache](./data-cache.md) document.
+
+#### Materialized Views and Transparent Rewrite
+
+Doris supports creating materialized views for external data sources. A materialized view stores the pre-computed result of its SQL definition in Doris internal table format. Additionally, Doris's query optimizer supports a transparent rewrite algorithm based on the SPJG (SELECT-PROJECT-JOIN-GROUP-BY) pattern. This algorithm can analyze the structure information of SQL, automatically find suitable materialized views for transparent rewrite, and select the optimal materialized vi [...]
+
+This feature can significantly improve query performance by reducing runtime computation. Transparent rewrite also lets queries use the data in materialized views without any changes on the business side. For details, refer to the [Materialized Views](../query-acceleration/materialized-view/async-materialized-view/overview.md) document.
+
+### Multi-Source Federated Analysis
+
+Doris can act as a **unified SQL query engine**, connecting different data sources for federated analysis and eliminating data silos.
+
+
+
+Users can dynamically create multiple catalogs in Doris to connect to different data sources and then use SQL statements to run arbitrary join queries across them. For details, refer to the [Catalog Overview](catalog-overview.md).
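+
+A minimal sketch of creating two such catalogs is shown below. The catalog names match the `hive` and `mysql` names used in the earlier join example; the addresses, credentials, and driver file name are placeholders, and the exact properties required depend on the data source and Doris version, so check the corresponding catalog document.
+
+```sql
+-- Hive catalog backed by a Hive Metastore (placeholder address).
+CREATE CATALOG hive PROPERTIES (
+    'type' = 'hms',
+    'hive.metastore.uris' = 'thrift://127.0.0.1:9083'
+);
+-- MySQL catalog over JDBC (placeholder credentials and driver jar).
+CREATE CATALOG mysql PROPERTIES (
+    'type' = 'jdbc',
+    'user' = 'doris_reader',
+    'password' = 'change_me',
+    'jdbc_url' = 'jdbc:mysql://127.0.0.1:3306',
+    'driver_url' = 'mysql-connector-j-8.3.0.jar',
+    'driver_class' = 'com.mysql.cj.jdbc.Driver'
+);
+```
+
+Once both catalogs exist, tables are addressed as `catalog.database.table`, for example `hive.db.hive_table` and `mysql.db.mysql_table`.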
+
+### Lakehouse Data Processing
+
+In this scenario, **Doris acts as a data processing engine**, processing lakehouse data.
+
+
+
+#### Task Scheduling
+
+Doris introduces the Job Scheduler feature, which enables efficient and flexible task scheduling and reduces dependency on external systems. Combined with data source connectors, users can periodically process external data and store the results. For details, refer to the [Job Scheduler](../admin-manual/workload-management/job-scheduler.md).
+
+#### Data Modeling
+
+Users typically use data lakes to store raw data and perform layered data processing on top of it, making different layers of data available to different business needs. Doris's materialized view feature supports creating materialized views for external data sources and supports further processing based on materialized views, reducing system complexity and improving data processing efficiency.
+
+#### Data Write-Back
+
+The data write-back feature closes the loop on Doris's lakehouse data processing capabilities. Users can create databases and tables in external data sources directly through Doris and write data to them. Currently, JDBC, Hive, and Iceberg data sources are supported, with more data sources to be added in the future. For details, refer to the documentation of the corresponding data source.
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/lakehouse-overview.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/lakehouse-overview.md
index 78b437aab2f..f2d1d45ff9b 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/lakehouse-overview.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/lakehouse-overview.md
@@ -24,5 +24,154 @@ specific language governing permissions and limitations
 under the License.
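 -->
 
-This article is being updated. Please refer to the 2.1/3.0 versioned docs for now.
+**The lakehouse is a modern big data solution that combines the advantages of data lakes and data warehouses**. It integrates the low cost and high scalability of data lakes with the high performance and strong data governance capabilities of data warehouses, enabling efficient, secure, and quality-controlled storage, processing, and analysis of all kinds of data in the big data era. Through standardized data formats and metadata management, it also unifies real-time and historical data as well as batch and stream processing, and is gradually becoming the new standard for enterprise big data solutions.
+
+## Doris Lakehouse Solution
+
+Through an extensible connector framework, a compute-storage decoupled architecture, a high-performance data processing engine, and an open data ecosystem, Doris provides users with an excellent lakehouse solution.
+
+
+
+### Flexible Data Access
+
+Through an extensible connector framework, Doris supports access to mainstream data systems and data formats and provides unified SQL-based data analysis capabilities, so users can easily query and analyze data across platforms without moving existing data. For details, refer to [Catalog Overview](./catalog-overview.md).
+
+### Data Source Connectors
+
+Whether it is Hive, Iceberg, Hudi, Paimon, or a database system that supports the JDBC protocol, Doris can connect to it easily and access its data efficiently.
+
+For lakehouse systems, Doris can obtain the structure and distribution information of data tables from metadata services such as Hive Metastore, AWS Glue, and Unity Catalog, perform reasonable query planning, and use its MPP architecture for distributed computing.
+
+For details, refer to the individual catalog documents, such as [Iceberg Catalog](./catalogs/iceberg-catalog.md).
+
+#### Extensible Connector Framework
+
+Doris provides an extensible framework that helps developers quickly connect to enterprise-specific data sources and exchange data with them.
+
+Doris defines a standard three-level hierarchy of Catalog, Database, and Table, which developers can easily map onto the levels of the data source they are connecting. Doris also provides standard interfaces for the metadata service and the data reading service; developers only need to implement the access logic defined by these interfaces to complete a data source connection.
+
+Doris is compatible with Trino Connector plugins: a Trino plugin package can be deployed directly to a Doris cluster, and the corresponding data source becomes accessible with a small amount of configuration. Doris has already completed integration with data sources such as [Kudu](./catalogs/kudu-catalog.md), [BigQuery](./catalogs/bigquery-catalog.md), and [Delta Lake](./catalogs/delta-lake-catalog.md). You can also [adapt new plugins yourself](https://doris.apache.org/community/how-to-contribute/trino-connector-developer-guide).
+
+#### Convenient Cross-Source Data Processing
+
+Doris supports creating multiple data source connectors directly at runtime and using SQL to run federated queries across these data sources. For example, users can join fact table data in Hive with dimension table data in MySQL:
+
+```sql
+SELECT h.id, m.name
+FROM hive.db.hive_table h JOIN mysql.db.mysql_table m
+ON h.id = m.id;
+```
+
+Combined with Doris's built-in [job scheduling](../admin-manual/workload-management/job-scheduler.md) capabilities, you can also create scheduled tasks to further reduce system complexity. For example, users can run the above query as a routine job every hour and write each result into an Iceberg table:
+
+```sql
+CREATE JOB schedule_load
+ON SCHEDULE EVERY 1 HOUR DO
+INSERT INTO iceberg.db.ice_table
+SELECT h.id, m.name
+FROM hive.db.hive_table h JOIN mysql.db.mysql_table m
+ON h.id = m.id;
+```
+
+### High-Performance Data Processing
+
+As an analytical data warehouse, Doris has made numerous optimizations in lakehouse data processing and computation and provides rich query acceleration features:
+
+* Execution Engine
+
+    The Doris execution engine is based on the MPP execution framework and the Pipeline data processing model, and can quickly process massive data in a multi-machine, multi-core distributed environment. Thanks to fully vectorized execution operators, Doris also leads in computing performance on standard benchmark datasets such as TPC-DS.
+
+* Query Optimizer
+
+    Doris automatically optimizes and processes complex SQL requests through its query optimizer. The optimizer is deeply optimized for complex SQL operators such as multi-table joins, aggregation, sorting, and pagination. It uses cost models and relational algebra transformations to automatically obtain better or optimal logical and physical execution plans, which greatly reduces the difficulty of writing SQL and improves usability and performance.
+
+* Cache Acceleration and IO Optimization
+
+    Access to external data sources usually goes over the network, so it suffers from high latency and poor stability. Apache Doris provides rich caching mechanisms and has made numerous optimizations in cache types, timeliness, and strategies, fully utilizing memory and local high-speed disks to improve analysis performance on hot data. Doris has also made targeted optimizations for the high-throughput, low-IOPS, high-latency characteristics of network IO, and can deliver external data source access performance comparable to that of local data.
+
+* Materialized Views and Transparent Acceleration
+
+    Doris provides rich materialized view update strategies, supporting full and partition-level incremental refresh to reduce construction costs and improve timeliness. In addition to manual refresh, Doris also supports scheduled refresh and data-driven refresh, further reducing maintenance costs and improving data consistency. Materialized views also provide transparent acceleration: the query optimizer can automatically route queries to suitable materialized views for seamless query acceleration. In addition, Doris materialized views use a high-performance storage format that provides efficient data access through columnar storage, compression, and intelligent indexing, so they can serve as an alternative to data caching and improve query efficiency.
+
+As shown below, on a 1 TB TPC-DS standard test set based on the Iceberg table format, Doris's total run time for the 99 queries is only 1/3 of Trino's.
+
+
+
+In actual user scenarios, Doris reduces average query latency by 20% and 95th percentile latency by 50% compared to Presto while using half the resources, greatly reducing resource costs while improving the user experience.
+
+
+
+### Convenient Service Migration
+
+When an enterprise integrates multiple data sources and moves to a lakehouse, migrating business SQL queries to Doris is a challenge because SQL dialects differ across systems in syntax and function support. Without a suitable migration plan, business teams may need to make extensive changes to adapt to the new system's SQL syntax.
+
+To solve this problem, Doris provides a [SQL Dialect Conversion Service](sql-convertor/sql-convertor-overview.md) that allows users to query data directly in the SQL dialects of other systems. The conversion service converts these dialects into Doris SQL, greatly reducing migration costs. Currently, Doris supports SQL dialect conversion for common query engines such as Presto/Trino, Hive, PostgreSQL, and ClickHouse, and compatibility can exceed 99% in some actual user scenarios.
+
+### Modern Deployment Architecture
+
+Since version 3.0, Doris has supported a cloud-native [compute-storage decoupled architecture](../compute-storage-decoupled/overview.md). With its low cost and high elasticity, this architecture effectively improves resource utilization and enables independent scaling of compute and storage.
+
+
+
+The diagram above shows Doris's compute-storage decoupled system architecture. Compute and storage are decoupled: compute nodes no longer store primary data, and the underlying shared storage layer (HDFS and object storage) serves as the unified primary data storage space, with compute and storage resources scaling independently. This architecture brings significant advantages to the lakehouse solution:
+
+* **Low-Cost Storage**: Storage and compute resources can be scaled independently, so enterprises can increase storage capacity as needed without increasing compute resources. By using cloud object storage, enterprises can also benefit from lower storage costs and higher availability, while still using local high-speed disks to cache the relatively small proportion of hot data.
+
+* **Single Source of Truth**: All data is stored in a unified storage layer, and the same data can be accessed and processed by different compute clusters. This ensures data consistency and integrity and reduces the complexity of data synchronization and duplicate storage.
+
+* **Workload Diversity**: Compute resources can be allocated dynamically based on different workload needs, supporting application scenarios such as batch processing, real-time analysis, and machine learning. By separating storage and compute, enterprises can optimize resource usage more flexibly and ensure efficient operation under different loads.
+
+In addition, under the compute-storage coupled architecture, [elastic compute nodes](./compute-node.md) can still be used to provide elastic computing capabilities in lakehouse data query scenarios.
+
+### Openness
+
+Doris not only supports access to open lake table formats but also keeps its own stored data open. Doris provides an open storage API and [implements a high-speed data link based on the Arrow Flight SQL protocol](../db-connect/arrow-flight-sql-connect.md), offering the speed advantages of Arrow Flight and the ease of use of JDBC/ODBC. Based on this interface, users can access data stored in Doris using ADBC clients for Python, Java, Spark, and Flink.
+
+Compared to open file formats, the open storage API hides the specific implementation of the underlying file format, so Doris can accelerate data access through advanced features of its own storage format, such as rich indexing mechanisms. Upper-layer compute engines also do not need to adapt to changes or new features in the underlying storage format, and all compute engines that support this protocol benefit from new features at the same time.
+
+## Lakehouse Best Practices
+
+In the lakehouse solution, Doris is mainly used for **lakehouse query acceleration**, **multi-source federated analysis**, and **lakehouse data processing**.
+
+### Lakehouse Query Acceleration
+
+In this scenario, Doris acts as a **compute engine**, accelerating query analysis on lakehouse data.
+
+
+
+#### Cache Acceleration
+
+For lakehouse systems such as Hive and Iceberg, users can configure local disk caching. The local disk cache automatically stores the data files involved in queries in local cache directories and manages cache eviction using an LRU strategy. For details, refer to the [Data Cache](./data-cache.md) document.
+
+#### Materialized Views and Transparent Rewrite
+
+Doris supports creating materialized views for external data sources. A materialized view stores the pre-computed result of its SQL definition in Doris internal table format. Doris's query optimizer also supports a transparent rewrite algorithm based on the SPJG (SELECT-PROJECT-JOIN-GROUP-BY) pattern. This algorithm can analyze the structure information of SQL, automatically find suitable materialized views for transparent rewrite, and select the optimal materialized view to answer the query.
+
+This feature can significantly improve query performance by reducing runtime computation. Transparent rewrite also lets queries use the data in materialized views without any changes on the business side. For details, refer to the [Materialized Views](../query-acceleration/materialized-view/async-materialized-view/overview.md) document.
+
+### Multi-Source Federated Analysis
+
+Doris can act as a **unified SQL query engine**, connecting different data sources for federated analysis and eliminating data silos.
+
+
+
+Users can dynamically create multiple catalogs in Doris to connect to different data sources and then use SQL statements to run arbitrary join queries across them. For details, refer to the [Catalog Overview](catalog-overview.md).
+
+### Lakehouse Data Processing
+
+In this scenario, **Doris acts as a data processing engine**, processing lakehouse data.
+
+
+
+#### Scheduled Tasks
+
+With the Job Scheduler feature, Doris enables efficient and flexible task scheduling and reduces dependency on external systems. Combined with data source connectors, users can periodically process external data and load the results. For details, refer to the [Job Scheduler](../admin-manual/workload-management/job-scheduler.md).
+
+#### Layered Data Processing
+
+Enterprises typically use data lakes to store raw data and perform layered data processing on top of it, making different layers of data available to different business consumers. Doris's materialized view feature supports creating materialized views for external data sources and further processing on top of existing materialized views, which reduces the system complexity of layered processing and improves data processing efficiency.
+
+#### Data Write-Back
+
+The data write-back feature closes the loop on Doris's lakehouse data processing capabilities. Users can create databases and tables in external data sources directly through Doris and write data to them. Currently, JDBC, Hive, and Iceberg data sources are supported, with more data sources to be added in the future. For details, refer to the documentation of the corresponding data source.
diff --git a/static/images/Lakehouse/compute-storage-decouple.png b/static/images/Lakehouse/compute-storage-decouple.png
new file mode 100644
index 00000000000..94f6a7b1b69
Binary files /dev/null and b/static/images/Lakehouse/compute-storage-decouple.png differ
diff --git a/static/images/Lakehouse/data-management.png b/static/images/Lakehouse/data-management.png
new file mode 100644
index 00000000000..ebfc9755e52
Binary files /dev/null and b/static/images/Lakehouse/data-management.png differ
diff --git a/static/images/Lakehouse/federation-query.png b/static/images/Lakehouse/federation-query.png
new file mode 100644
index 00000000000..1a94a452286
Binary files /dev/null and b/static/images/Lakehouse/federation-query.png differ
diff --git a/static/images/Lakehouse/lakehouse-arch-1.png b/static/images/Lakehouse/lakehouse-arch-1.png
new file mode 100644
index 00000000000..aef267c7278
Binary files /dev/null and b/static/images/Lakehouse/lakehouse-arch-1.png differ
diff --git a/static/images/Lakehouse/performance.png b/static/images/Lakehouse/performance.png
new file mode 100644
index 00000000000..31d2137c708
Binary files /dev/null and b/static/images/Lakehouse/performance.png differ
diff --git a/static/images/Lakehouse/query-acceleration.png b/static/images/Lakehouse/query-acceleration.png
new file mode 100644
index 00000000000..a322a14006c
Binary files /dev/null and b/static/images/Lakehouse/query-acceleration.png differ
diff --git a/static/images/Lakehouse/tpcds1000.png b/static/images/Lakehouse/tpcds1000.png
new file mode 100644
index 00000000000..780a00de078
Binary files /dev/null and b/static/images/Lakehouse/tpcds1000.png differ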