This is an automated email from the ASF dual-hosted git repository.

morningman pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris-website.git


The following commit(s) were added to refs/heads/master by this push:
     new d10f50602a7 [quick-start] refactor quick start (#857)
d10f50602a7 is described below

commit d10f50602a7f3ba5c5523da9f422399c1b28b35d
Author: Mingyu Chen <morning...@163.com>
AuthorDate: Wed Jul 17 00:17:37 2024 +0800

    [quick-start] refactor quick start (#857)
    
    ```
    快速开始
      -- 快速体验
        -- 快速开始
        -- Apache Doris & Hudi 快速开始
    ```
    
    ```
    Getting Started
      -- Quick Start
        -- Quick Start
        -- Apache Doris & Hudi Quick Start
    ```
---
 docs/get-starting/quick-start/doris-hudi.md        | 313 ++++++++++++++++++++
 docs/get-starting/{ => quick-start}/quick-start.md |   0
 docs/lakehouse/datalake-analytics/hudi.md          |   1 +
 docusaurus.config.js                               |   8 +-
 .../docusaurus-plugin-content-docs/current.json    |   6 +-
 .../current/get-starting/quick-start/doris-hudi.md | 314 +++++++++++++++++++++
 .../get-starting/{ => quick-start}/quick-start.md  |   8 +-
 .../current/lakehouse/datalake-analytics/hudi.md   |   2 +
 .../version-2.0.json                               |   4 +
 .../get-starting/quick-start}/quick-start.md       |  48 ++--
 .../version-2.1.json                               |   4 +
 .../get-starting/quick-start/doris-hudi.md         | 314 +++++++++++++++++++++
 .../get-starting/quick-start}/quick-start.md       |  59 ++--
 .../lakehouse/datalake-analytics/hudi.md           |   2 +
 sidebars.json                                      |  11 +-
 src/pages/learning/index.tsx                       |   2 +-
 static/images/quick-start/lakehouse-arch.PNG       | Bin 0 -> 207255 bytes
 .../version-2.0/get-starting/quick-start.md        | 266 -----------------
 .../get-starting/quick-start}/quick-start.md       |   0
 .../version-2.1/get-starting/quick-start.md        | 266 -----------------
 .../get-starting/quick-start/doris-hudi.md         | 313 ++++++++++++++++++++
 .../get-starting/quick-start}/quick-start.md       |   0
 .../lakehouse/datalake-analytics/hudi.md           |   1 +
 versioned_sidebars/version-2.0-sidebars.json       |   8 +-
 versioned_sidebars/version-2.1-sidebars.json       |   9 +-
 25 files changed, 1363 insertions(+), 596 deletions(-)

diff --git a/docs/get-starting/quick-start/doris-hudi.md 
b/docs/get-starting/quick-start/doris-hudi.md
new file mode 100644
index 00000000000..f7426e3a3fc
--- /dev/null
+++ b/docs/get-starting/quick-start/doris-hudi.md
@@ -0,0 +1,313 @@
+---
+{
+    "title": "Apache Doris & Hudi Quick Start",
+    "language": "en"
+}
+
+---
+
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+As a new open data management architecture, the Data Lakehouse integrates the 
high performance and real-time capabilities of data warehouses with the low 
cost and flexibility of data lakes, helping users more conveniently meet 
various data processing and analysis needs. It has been increasingly applied in 
enterprise big data systems.
+
+In recent versions, Apache Doris has deepened its integration with data lakes 
and has evolved a mature Data Lakehouse solution.
+
+- Since version 0.15, Apache Doris has introduced Hive and Iceberg external 
tables, exploring the capabilities of combining with Apache Iceberg for data 
lakes.
+- Starting from version 1.2, Apache Doris officially introduced the 
Multi-Catalog feature, enabling automatic metadata mapping and data access for 
various data sources, along with numerous performance optimizations for 
external data reading and query execution. It now fully possesses the ability 
to build a high-speed and user-friendly Lakehouse architecture.
+- In version 2.1, Apache Doris's Data Lakehouse architecture was significantly enhanced, improving the reading and writing capabilities for mainstream data lake formats (Hudi, Iceberg, Paimon, etc.), introducing compatibility with multiple SQL dialects, and enabling seamless migration from existing systems to Apache Doris. For data science and large-scale data reading scenarios, Doris integrated the Arrow Flight high-speed reading interface, achieving a 100-fold increase in data transfer efficiency.
+
+![](/images/quick-start/lakehouse-arch.PNG)
+
+## Apache Doris & Hudi
+
+[Apache Hudi](https://hudi.apache.org/) is currently one of the most popular 
open data lake formats and a transactional data lake management platform, 
supporting various mainstream query engines including Apache Doris.
+
+Apache Doris has also enhanced its ability to read Apache Hudi data tables:
+
+- Supports Copy on Write Table: Snapshot Query
+- Supports Merge on Read Table: Snapshot Queries, Read Optimized Queries
+- Supports Time Travel
+- Supports Incremental Read
+
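+On the Doris side, these capabilities map to query forms that are demonstrated step by step later in this guide. As a quick preview, the statements look like the following (table names and timestamps are the example values used in the later sections):
+
+```
+-- Snapshot Query on a Hudi table (COW or MOR)
+select * from customer_cow where c_custkey = 100;
+
+-- Time Travel: read the table as of a specific Hudi commit time
+select * from customer_cow for time as of '20240603015018572' where c_custkey = 32;
+
+-- Incremental Read: read the changes since a given commit time
+select * from customer_cow@incr('beginTime'='20240603015018572');
+```
+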
+With Apache Doris's high-performance query execution and Apache Hudi's 
real-time data management capabilities, efficient, flexible, and cost-effective 
data querying and analysis can be achieved. It also provides robust data 
lineage, auditing, and incremental processing functionalities. The combination 
of Apache Doris and Apache Hudi has been validated and promoted in real 
business scenarios by multiple community users:
+
+- Real-time data analysis and processing: Common scenarios such as real-time 
data updates and query analysis in industries like finance, advertising, and 
e-commerce require real-time data processing. Hudi enables real-time data 
updates and management while ensuring data consistency and reliability. Doris 
efficiently handles large-scale data query requests in real-time, meeting the 
demands of real-time data analysis and processing effectively when combined.
+- Data lineage and auditing: For industries with high requirements for data 
security and accuracy like finance and healthcare, data lineage and auditing 
are crucial functionalities. Hudi offers Time Travel functionality for viewing 
historical data states, combined with Apache Doris's efficient querying 
capabilities, enabling quick analysis of data at any point in time for precise 
lineage and auditing.
+- Incremental data reading and analysis: Large-scale data analysis often faces 
challenges of large data volumes and frequent updates. Hudi supports 
incremental data reading, allowing users to process only the changed data 
without full data updates. Additionally, Apache Doris's Incremental Read 
feature enhances this process, significantly improving data processing and 
analysis efficiency.
+- Cross-data source federated queries: Many enterprises have complex data 
sources stored in different databases. Doris's Multi-Catalog feature supports 
automatic mapping and synchronization of various data sources, enabling 
federated queries across data sources. This greatly shortens the data flow path 
and enhances work efficiency for enterprises needing to retrieve and integrate 
data from multiple sources for analysis.
+
+This article shows readers how to quickly set up a test and demonstration environment for Apache Doris + Apache Hudi using Docker, and walks through various operations to help readers get started quickly.
+
+For more information, please refer to [Hudi 
Catalog](../../lakehouse/datalake-analytics/hudi.md)
+
+## User Guide
+
+All scripts and code mentioned in this article can be obtained from this 
address: 
[https://github.com/apache/doris/tree/master/samples/datalake/hudi](https://github.com/apache/doris/tree/master/samples/datalake/hudi)
+
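+To get a local copy of these scripts, you can clone the Doris repository and switch to the sample directory (this assumes `git` is installed; a recent clone of the master branch should contain the sample):
+
+```
+git clone https://github.com/apache/doris.git
+cd doris/samples/datalake/hudi
+```
+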
+### 01 Environment Preparation
+
+This article uses Docker Compose for deployment, with the following components 
and versions:
+
+| Component | Version |
+| --- | --- |
+| Apache Doris | Default 2.1.4, can be modified |
+| Apache Hudi | 0.14 |
+| Apache Spark | 3.4.2 |
+| Apache Hive | 2.1.3 |
+| MinIO | 2022-05-26T05-48-41Z |
+
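+Docker and Docker Compose are assumed to be available on the host. A quick sanity check (exact versions are not critical) is:
+
+```
+sudo docker --version
+# Compose v2 ships as a Docker plugin; for the standalone v1 binary use `docker-compose --version` instead
+sudo docker compose version
+```
+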
+### 02 Environment Deployment
+
+1. Create a Docker network
+
+       `sudo docker network create -d bridge hudi-net`
+
+2. Start all components
+
+       `sudo ./start-hudi-compose.sh`
+       
+       > Note: Before starting, you can modify `DORIS_PACKAGE` and `DORIS_DOWNLOAD_URL` in `start-hudi-compose.sh` to point to the desired Doris version (see the example after this list). It is recommended to use version 2.1.4 or higher.
+
+3. After starting, you can use the following scripts to log in to the Spark or Doris command line:
+
+       ```
+       -- Spark
+       sudo ./login-spark.sh
+       
+       -- Doris
+       sudo ./login-doris.sh
+       ```
+
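+For illustration only, a hypothetical edit to `start-hudi-compose.sh` might set the two variables mentioned in the note above as shown below; the package name and download URL are assumptions based on the release naming used elsewhere in these docs, not values taken from the script:
+
+```
+# Hypothetical values -- adjust to the Doris release you want to test
+DORIS_PACKAGE=apache-doris-2.1.4-bin-x64
+DORIS_DOWNLOAD_URL=https://apache-doris-releases.oss-accelerate.aliyuncs.com/apache-doris-2.1.4-bin-x64.tar.gz
+```
+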
+### 03 Data Preparation
+
+Next, generate Hudi data through Spark. As shown in the code below, there is 
already a Hive table named `customer` in the cluster. You can create a Hudi 
table using this Hive table:
+
+```
+-- ./login-spark.sh
+spark-sql> use default;
+
+-- create a COW table
+spark-sql> CREATE TABLE customer_cow
+USING hudi
+TBLPROPERTIES (
+  type = 'cow',
+  primaryKey = 'c_custkey',
+  preCombineField = 'c_name'
+)
+PARTITIONED BY (c_nationkey)
+AS SELECT * FROM customer;
+
+-- create a MOR table
+spark-sql> CREATE TABLE customer_mor
+USING hudi
+TBLPROPERTIES (
+  type = 'mor',
+  primaryKey = 'c_custkey',
+  preCombineField = 'c_name'
+)
+PARTITIONED BY (c_nationkey)
+AS SELECT * FROM customer;
+```
+
+### 04 Data Query
+
+As shown below, a Catalog named `hive` has already been created in the Doris cluster (it can be viewed using `SHOW CATALOGS`). The following is the creation statement for this Catalog:
+
+```
+-- Already created, no need to execute again
+CREATE CATALOG `hive` PROPERTIES (
+    "type"="hms",
+    'hive.metastore.uris' = 'thrift://hive-metastore:9083',
+    "s3.access_key" = "minio",
+    "s3.secret_key" = "minio123",
+    "s3.endpoint" = "http://minio:9000";,
+    "s3.region" = "us-east-1",
+    "use_path_style" = "true"
+);
+```
+
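+Before refreshing, you can confirm that the Catalog is visible; the `hive` Catalog created by the statement above should appear in the output:
+
+```
+doris> SHOW CATALOGS;
+```
+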
+1. Manually refresh this Catalog to synchronize the created Hudi table:
+
+       ```
+       -- ./login-doris.sh
+       doris> REFRESH CATALOG hive;
+       ```
+
+2. Operations on data in Hudi using Spark are immediately visible in Doris 
without the need to refresh the Catalog. We insert a row of data into both COW 
and MOR tables using Spark:
+
+       ```
+       spark-sql> insert into customer_cow values (100, "Customer#000000100", 
"jD2xZzi", "25-430-914-2194", 3471.59, "BUILDING", "cial ideas. final, furious 
requests", 25);
+       spark-sql> insert into customer_mor values (100, "Customer#000000100", 
"jD2xZzi", "25-430-914-2194", 3471.59, "BUILDING", "cial ideas. final, furious 
requests", 25);
+       ```
+
+3. Through Doris, you can directly query the latest inserted data:
+
+       ```
+       doris> use hive.default;
+       doris> select * from customer_cow where c_custkey = 100;
+       doris> select * from customer_mor where c_custkey = 100;
+       ```
+
+4. Use Spark to insert a row whose `c_custkey=32` already exists, thus overwriting the existing data:
+
+       ```
+       spark-sql> insert into customer_cow values (32, 
"Customer#000000032_update", "jD2xZzi", "25-430-914-2194", 3471.59, "BUILDING", 
"cial ideas. final, furious requests", 15);
+       spark-sql> insert into customer_mor values (32, 
"Customer#000000032_update", "jD2xZzi", "25-430-914-2194", 3471.59, "BUILDING", 
"cial ideas. final, furious requests", 15);
+       ```
+
+5. With Doris, you can query the updated data:
+
+       ```
+       doris> select * from customer_cow where c_custkey = 32;
+       
+-----------+---------------------------+-----------+-----------------+-----------+--------------+-------------------------------------+-------------+
+       | c_custkey | c_name                    | c_address | c_phone         | 
c_acctbal | c_mktsegment | c_comment                           | c_nationkey |
+       
+-----------+---------------------------+-----------+-----------------+-----------+--------------+-------------------------------------+-------------+
+       |        32 | Customer#000000032_update | jD2xZzi   | 25-430-914-2194 | 
  3471.59 | BUILDING     | cial ideas. final, furious requests |          15 |
+       
+-----------+---------------------------+-----------+-----------------+-----------+--------------+-------------------------------------+-------------+
+       doris> select * from customer_mor where c_custkey = 32;
+       
+-----------+---------------------------+-----------+-----------------+-----------+--------------+-------------------------------------+-------------+
+       | c_custkey | c_name                    | c_address | c_phone         | 
c_acctbal | c_mktsegment | c_comment                           | c_nationkey |
+       
+-----------+---------------------------+-----------+-----------------+-----------+--------------+-------------------------------------+-------------+
+       |        32 | Customer#000000032_update | jD2xZzi   | 25-430-914-2194 | 
  3471.59 | BUILDING     | cial ideas. final, furious requests |          15 |
+       
+-----------+---------------------------+-----------+-----------------+-----------+--------------+-------------------------------------+-------------+
+       ```
+
+### 05 Incremental Read
+
+Incremental Read is one of the features provided by Hudi. With Incremental Read, users can obtain the data that changed within a specified time range, enabling incremental processing of data. Here, Doris can query the data that changed after the `c_custkey=100` row was inserted; as shown below, the returned change is the updated `c_custkey=32` row:
+
+```
+doris> select * from customer_cow@incr('beginTime'='20240603015018572');
++-----------+---------------------------+-----------+-----------------+-----------+--------------+-------------------------------------+-------------+
+| c_custkey | c_name                    | c_address | c_phone         | 
c_acctbal | c_mktsegment | c_comment                           | c_nationkey |
++-----------+---------------------------+-----------+-----------------+-----------+--------------+-------------------------------------+-------------+
+|        32 | Customer#000000032_update | jD2xZzi   | 25-430-914-2194 |   
3471.59 | BUILDING     | cial ideas. final, furious requests |          15 |
++-----------+---------------------------+-----------+-----------------+-----------+--------------+-------------------------------------+-------------+
+spark-sql> select * from hudi_table_changes('customer_cow', 'latest_state', 
'20240603015018572');
+
+doris> select * from customer_mor@incr('beginTime'='20240603015058442');
++-----------+---------------------------+-----------+-----------------+-----------+--------------+-------------------------------------+-------------+
+| c_custkey | c_name                    | c_address | c_phone         | 
c_acctbal | c_mktsegment | c_comment                           | c_nationkey |
++-----------+---------------------------+-----------+-----------------+-----------+--------------+-------------------------------------+-------------+
+|        32 | Customer#000000032_update | jD2xZzi   | 25-430-914-2194 |   
3471.59 | BUILDING     | cial ideas. final, furious requests |          15 |
++-----------+---------------------------+-----------+-----------------+-----------+--------------+-------------------------------------+-------------+
+spark-sql> select * from hudi_table_changes('customer_mor', 'latest_state', 
'20240603015058442');
+```
+
+### 06 TimeTravel
+
+Doris supports querying specific snapshot versions of Hudi data, thereby 
enabling Time Travel functionality for data. First, you can query the commit 
history of two Hudi tables using Spark:
+
+```
+spark-sql> call show_commits(table => 'customer_cow', limit => 10);
+20240603033556094        20240603033558249        commit        448833        
0        1        1        183        0        0
+20240603015444737        20240603015446588        commit        450238        
0        1        1        202        1        0
+20240603015018572        20240603015020503        commit        436692        
1        0        1        1        0        0
+20240603013858098        20240603013907467        commit        44902033       
 100        0        25        18751        0        0
+
+spark-sql> call show_commits(table => 'customer_mor', limit => 10);
+20240603033745977        20240603033748021        deltacommit        1240      
  0        1        1        0        0        0
+20240603015451860        20240603015453539        deltacommit        1434      
  0        1        1        1        1        0
+20240603015058442        20240603015100120        deltacommit        436691    
    1        0        1        1        0        0
+20240603013918515        20240603013922961        deltacommit        44904040  
      100        0        25        18751        0        0
+```
+
+Next, using Doris, you can query the data snapshot taken before the `c_custkey=32` update was inserted. As shown below, the data with `c_custkey=32` has not been updated yet:
+
+> Note: Time Travel syntax is currently not supported by the new optimizer. 
You need to first execute `set enable_nereids_planner=false;` to disable the 
new optimizer. This issue will be fixed in future versions.
+
+```
+doris> select * from customer_cow for time as of '20240603015018572' where 
c_custkey = 32 or c_custkey = 100;
++-----------+--------------------+---------------------------------------+-----------------+-----------+--------------+--------------------------------------------------+-------------+
+| c_custkey | c_name             | c_address                             | 
c_phone         | c_acctbal | c_mktsegment | c_comment                          
              | c_nationkey |
++-----------+--------------------+---------------------------------------+-----------------+-----------+--------------+--------------------------------------------------+-------------+
+|        32 | Customer#000000032 | jD2xZzi UmId,DCtNBLXKj9q0Tlp2iQ6ZcO3J | 
25-430-914-2194 |   3471.53 | BUILDING     | cial ideas. final, furious 
requests across the e |          15 |
+|       100 | Customer#000000100 | jD2xZzi                               | 
25-430-914-2194 |   3471.59 | BUILDING     | cial ideas. final, furious 
requests              |          25 |
++-----------+--------------------+---------------------------------------+-----------------+-----------+--------------+--------------------------------------------------+-------------+
+-- compare with spark-sql
+spark-sql> select * from customer_mor timestamp as of '20240603015018572' 
where c_custkey = 32 or c_custkey = 100;
+
+doris> select * from customer_mor for time as of '20240603015058442' where 
c_custkey = 32 or c_custkey = 100;
++-----------+--------------------+---------------------------------------+-----------------+-----------+--------------+--------------------------------------------------+-------------+
+| c_custkey | c_name             | c_address                             | 
c_phone         | c_acctbal | c_mktsegment | c_comment                          
              | c_nationkey |
++-----------+--------------------+---------------------------------------+-----------------+-----------+--------------+--------------------------------------------------+-------------+
+|       100 | Customer#000000100 | jD2xZzi                               | 
25-430-914-2194 |   3471.59 | BUILDING     | cial ideas. final, furious 
requests              |          25 |
+|        32 | Customer#000000032 | jD2xZzi UmId,DCtNBLXKj9q0Tlp2iQ6ZcO3J | 
25-430-914-2194 |   3471.53 | BUILDING     | cial ideas. final, furious 
requests across the e |          15 |
++-----------+--------------------+---------------------------------------+-----------------+-----------+--------------+--------------------------------------------------+-------------+
+spark-sql> select * from customer_mor timestamp as of '20240603015058442' 
where c_custkey = 32 or c_custkey = 100;
+```
+
+## Query Optimization
+
+Data in Apache Hudi can be roughly divided into two categories - baseline data 
and incremental data. Baseline data is typically merged Parquet files, while 
incremental data refers to data increments generated by INSERT, UPDATE, or 
DELETE operations. Baseline data can be read directly, while incremental data 
needs to be read through Merge on Read.
+
+For querying Hudi COW tables or Read Optimized queries on MOR tables, the data 
belongs to baseline data and can be directly read using Doris's native Parquet 
Reader, providing fast query responses. For incremental data, Doris needs to 
access Hudi's Java SDK through JNI calls. To achieve optimal query performance, 
Apache Doris divides the data in a query into baseline and incremental data 
parts and reads them using the aforementioned methods.
+
+To verify this optimization approach, we can use the EXPLAIN statement to see how much of the query example below is baseline data and how much is incremental data. For the COW table, all 101 data shards are baseline data (`hudiNativeReadSplits=101/101`), so the COW table can be read entirely with Doris's native Parquet Reader, delivering the best query performance. For the MOR table, most data shards are baseline data (`hudiNativeReadSplits=100/101`), with one shard being incremental data, which still yields good query performance overall.
+
+```
+-- COW table is read natively
+doris> explain select * from customer_cow where c_custkey = 32;
+|   0:VHUDI_SCAN_NODE(68)                                        |
+|      table: customer_cow                                       |
+|      predicates: (c_custkey[#5] = 32)                          |
+|      inputSplitNum=101, totalFileSize=45338886, scanRanges=101 |
+|      partition=26/26                                           |
+|      cardinality=1, numNodes=1                                 |
+|      pushdown agg=NONE                                         |
+|      hudiNativeReadSplits=101/101                              |
+
+-- MOR table: only the split whose log file contains the updated `c_custkey = 32` row needs merging, so 100 splits are read natively, while the split with the log file is read by JNI.
+doris> explain select * from customer_mor where c_custkey = 32;
+|   0:VHUDI_SCAN_NODE(68)                                        |
+|      table: customer_mor                                       |
+|      predicates: (c_custkey[#5] = 32)                          |
+|      inputSplitNum=101, totalFileSize=45340731, scanRanges=101 |
+|      partition=26/26                                           |
+|      cardinality=1, numNodes=1                                 |
+|      pushdown agg=NONE                                         |
+|      hudiNativeReadSplits=100/101                              |
+```
+
+You can further observe the changes in Hudi baseline data and incremental data 
by performing some deletion operations using Spark:
+
+```
+-- Use delete statement to see more differences
+spark-sql> delete from customer_cow where c_custkey = 64;
+doris> explain select * from customer_cow where c_custkey = 64;
+
+spark-sql> delete from customer_mor where c_custkey = 64;
+doris> explain select * from customer_mor where c_custkey = 64;
+```
+
+Additionally, you can reduce the data volume further by using partition 
conditions for partition pruning to improve query speed. In the example below, 
partition pruning is done using the partition condition `c_nationkey=15`, 
allowing the query request to access data from only one partition 
(`partition=1/26`).
+
+```
+-- customer_xxx is partitioned by c_nationkey, we can use the partition column 
to prune data
+doris> explain select * from customer_mor where c_custkey = 64 and c_nationkey 
= 15;
+|   0:VHUDI_SCAN_NODE(68)                                        |
+|      table: customer_mor                                       |
+|      predicates: (c_custkey[#5] = 64), (c_nationkey[#12] = 15) |
+|      inputSplitNum=4, totalFileSize=1798186, scanRanges=4      |
+|      partition=1/26                                            |
+|      cardinality=1, numNodes=1                                 |
+|      pushdown agg=NONE                                         |
+|      hudiNativeReadSplits=3/4                                  |
+```
diff --git a/docs/get-starting/quick-start.md 
b/docs/get-starting/quick-start/quick-start.md
similarity index 100%
copy from docs/get-starting/quick-start.md
copy to docs/get-starting/quick-start/quick-start.md
diff --git a/docs/lakehouse/datalake-analytics/hudi.md 
b/docs/lakehouse/datalake-analytics/hudi.md
index 985f1aa363f..83ea52476fb 100644
--- a/docs/lakehouse/datalake-analytics/hudi.md
+++ b/docs/lakehouse/datalake-analytics/hudi.md
@@ -24,6 +24,7 @@ specific language governing permissions and limitations
 under the License.
 -->
 
+[Apache Doris & Hudi Quick Start](../../get-starting/quick-start/doris-hudi.md)
 
 ## Usage
 
diff --git a/docusaurus.config.js b/docusaurus.config.js
index a8ecf6ee5af..4ad270771b3 100644
--- a/docusaurus.config.js
+++ b/docusaurus.config.js
@@ -156,11 +156,11 @@ const config = {
                     // /docs/oldDoc -> /docs/newDoc
                     {
                         from: '/docs/dev/summary/basic-summary',
-                        to: '/docs/dev/get-starting/quick-start',
+                        to: '/docs/dev/get-starting/quick-start/',
                     },
                     {
-                        from: '/docs/dev/get-starting',
-                        to: '/docs/dev/get-starting/quick-start',
+                        from: '/docs/dev/get-starting/',
+                        to: '/docs/dev/get-starting/quick-start/',
                     },
                 ],
             },
@@ -247,7 +247,7 @@ const config = {
                     {
                         position: 'left',
                         label: 'Docs',
-                        to: '/docs/get-starting/quick-start',
+                        to: '/docs/get-starting/quick-start/',
                     },
                     { to: '/blog', label: 'Blog', position: 'left' },
                     { to: '/users', label: 'Users', position: 'left' },
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current.json 
b/i18n/zh-CN/docusaurus-plugin-content-docs/current.json
index bb5fe224c59..adab98febf2 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/current.json
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current.json
@@ -7,6 +7,10 @@
     "message": "快速开始",
     "description": "The label for category Getting Started in sidebar docs"
   },
+  "sidebar.docs.category.Quick Start": {
+    "message": "快速体验",
+    "description": "The label for category Quick Start in sidebar docs"
+  },
   "sidebar.docs.category.Installation and Deployment": {
     "message": "安装部署",
     "description": "The label for category Installation and Deployment in 
sidebar docs"
@@ -383,4 +387,4 @@
     "message": "实践教程",
     "description": "The label for category Practical Guide in sidebar docs"
   }
-}
\ No newline at end of file
+}
diff --git 
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/get-starting/quick-start/doris-hudi.md
 
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/get-starting/quick-start/doris-hudi.md
new file mode 100644
index 00000000000..28ddfda61cf
--- /dev/null
+++ 
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/get-starting/quick-start/doris-hudi.md
@@ -0,0 +1,314 @@
+---
+{
+    "title": "Apache Doris & Hudi 快速开始",
+    "language": "zh-CN"
+}
+
+---
+
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+作为一种全新的开放式的数据管理架构,湖仓一体(Data 
Lakehouse)融合了数据仓库的高性能、实时性以及数据湖的低成本、灵活性等优势,帮助用户更加便捷地满足各种数据处理分析的需求,在企业的大数据体系中已经得到越来越多的应用。
+
+在过去多个版本中,Apache Doris 持续加深与数据湖的融合,当前已演进出一套成熟的湖仓一体解决方案。
+
+- 自 0.15 版本起,Apache Doris 引入 Hive 和 Iceberg 外部表,尝试在 Apache Iceberg 
之上探索与数据湖的能力结合。
+- 自 1.2 版本起,Apache Doris 正式引入 Multi-Catalog 
功能,实现了多种数据源的自动元数据映射和数据访问、并对外部数据读取和查询执行等方面做了诸多性能优化,完全具备了构建极速易用 Lakehouse 架构的能力。
+- 在 2.1 版本中,Apache Doris 湖仓一体架构得到全面加强,不仅增强了主流数据湖格式(Hudi、Iceberg、Paimon 
等)的读取和写入能力,还引入了多 SQL 方言兼容、可从原有系统无缝切换至 Apache Doris。在数据科学及大规模数据读取场景上, Doris 集成了 
Arrow Flight 高速读取接口,使得数据传输效率实现 100 倍的提升。
+
+![](/images/quick-start/lakehouse-arch.PNG)
+
+## Apache Doris & Hudi
+
+[Apache Hudi](https://hudi.apache.org/) 是目前最主流的开放数据湖格式之一,也是事务性的数据湖管理平台,支持包括 
Apache Doris 在内的多种主流查询引擎。
+
+Apache Doris 同样对 Apache Hudi 数据表的读取能力进行了增强:
+
+- 支持 Copy on Write Table:Snapshot Query
+- 支持 Merge on Read Table:Snapshot Queries, Read Optimized Queries
+- 支持 Time Travel
+- 支持 Incremental Read
+
+凭借 Apache Doris 的高性能查询执行以及 Apache Hudi 
的实时数据管理能力,可以实现高效、灵活、低成本的数据查询和分析,同时也提供了强大的数据回溯、审计和增量处理功能,当前基于 Apache Doris 和 
Apache Hudi 的组合已经在多个社区用户的真实业务场景中得到验证和推广:
+
+- 实时数据分析与处理:比如金融行业交易分析、广告行业实时点击流分析、电商行业用户行为分析等常见场景下,都要求实时的数据更新及查询分析。Hudi 
能够实现对数据的实时更新和管理,并保证数据的一致性和可靠性,Doris 则能够实时高效处理大规模数据查询请求,二者结合能够充分满足实时数据分析与处理的需求。
+- 数据回溯与审计:对于金融、医疗等对数据安全和准确性要求极高的行业来说,数据回溯和审计是非常重要的功能。Hudi 提供了时间旅行(Time 
Travel)功能,允许用户查看历史数据状态,结合 Apache Doris 高效查询能力,可快速查找分析任何时间点的数据,实现精确的回溯和审计。
+- 增量数据读取与分析:在进行大数据分析时往往面临着数据规模庞大、更新频繁的问题,Hudi 
支持增量数据读取,这使得用户可以只需处理变化的数据,不必进行全量数据更新;同时 Apache Doris 的 Incremental Read 
功能也可使这一过程更加高效,显著提升了数据处理和分析的效率。
+- 跨数据源联邦查询:许多企业数据来源复杂,数据可能存储在不同的数据库中。Doris 的 Multi-Catalog 
功能支持多种数据源的自动映射与同步,支持跨数据源的联邦查询。这对于需要从多个数据源中获取和整合数据进行分析的企业来说,极大地缩短了数据流转路径,提升了工作效率。
+
+本文将在 Docker 环境下,为读者介绍如何快速搭建 Apache Doris + Apache Hudi 
的测试及演示环境,并对各功能操作进行演示,帮助读者快速入门。
+
+关于更多说明,请参阅 [Hudi Catalog](../../lakehouse/datalake-analytics/hudi.md)
+
+## 使用指南
+
+本文涉及所有脚本和代码可以从该地址获取:[https://github.com/apache/doris/tree/master/samples/datalake/hudi](https://github.com/apache/doris/tree/master/samples/datalake/hudi)
+
+### 01 环境准备
+
+本文示例采用 Docker Compose 部署,组件及版本号如下:
+
+| 组件名称 | 版本 |
+| --- | --- |
+| Apache Doris | 默认 2.1.4,可修改 |
+| Apache Hudi | 0.14|
+| Apache Spark | 3.4.2|
+| Apache Hive | 2.1.3|
+| MinIO | 2022-05-26T05-48-41Z|
+
+
+### 02 环境部署
+
+1. 创建 Docker 网络
+
+       `sudo docker network create -d bridge hudi-net`
+
+2. 启动所有组件
+
+       `sudo ./start-hudi-compose.sh`
+       
+       > 注:启动前,可将 `start-hudi-compose.sh` 中的 `DORIS_PACKAGE` 和 
`DORIS_DOWNLOAD_URL` 修改成需要的 Doris 版本。建议使用 2.1.4 或更高版本。
+
+3. 启动后,可以使用如下脚本,登录 Spark 命令行或 Doris 命令行:
+
+       ```
+       -- Spark
+       sudo ./login-spark.sh
+       
+       -- Doris
+       sudo ./login-doris.sh
+       ```
+
+### 03 数据准备
+
+接下来先通过 Spark 生成 Hudi 的数据。如下方代码所示,集群中已经包含一张名为 `customer` 的 Hive 表,可以通过这张 Hive 
表,创建一个 Hudi 表:
+
+```
+-- ./login-spark.sh
+spark-sql> use default;
+
+-- create a COW table
+spark-sql> CREATE TABLE customer_cow
+USING hudi
+TBLPROPERTIES (
+  type = 'cow',
+  primaryKey = 'c_custkey',
+  preCombineField = 'c_name'
+)
+PARTITIONED BY (c_nationkey)
+AS SELECT * FROM customer;
+
+-- create a MOR table
+spark-sql> CREATE TABLE customer_mor
+USING hudi
+TBLPROPERTIES (
+  type = 'mor',
+  primaryKey = 'c_custkey',
+  preCombineField = 'c_name'
+)
+PARTITIONED BY (c_nationkey)
+AS SELECT * FROM customer;
+```
+
+### 04 数据查询
+
+如下所示,Doris 集群中已经创建了名为 `hive` 的 Catalog(可通过 `SHOW CATALOGS` 查看)。以下为该 Catalog 的创建语句:
+
+```
+-- 已经创建,无需再次执行
+CREATE CATALOG `hive` PROPERTIES (
+    "type"="hms",
+    'hive.metastore.uris' = 'thrift://hive-metastore:9083',
+    "s3.access_key" = "minio",
+    "s3.secret_key" = "minio123",
+    "s3.endpoint" = "http://minio:9000";,
+    "s3.region" = "us-east-1",
+    "use_path_style" = "true"
+);
+```
+
+1. 手动刷新该 Catalog,对创建的 Hudi 表进行同步: 
+
+       ```
+       -- ./login-doris.sh
+       doris> REFRESH CATALOG hive;
+       ```
+
+2. 使用 Spark 操作 Hudi 中的数据,都可以在 Doris 中实时可见,不需要再次刷新 Catalog。我们通过 Spark 分别给 COW 和 
MOR 表插入一行数据:
+
+       ```
+       spark-sql> insert into customer_cow values (100, "Customer#000000100", 
"jD2xZzi", "25-430-914-2194", 3471.59, "BUILDING", "cial ideas. final, furious 
requests", 25);
+       spark-sql> insert into customer_mor values (100, "Customer#000000100", 
"jD2xZzi", "25-430-914-2194", 3471.59, "BUILDING", "cial ideas. final, furious 
requests", 25);
+       ```
+
+3. 通过 Doris 可以直接查询到最新插入的数据:
+
+       ```
+       doris> use hive.default;
+       doris> select * from customer_cow where c_custkey = 100;
+       doris> select * from customer_mor where c_custkey = 100;
+       ```
+
+4. 再通过 Spark 插入 c_custkey=32 已经存在的数据,即覆盖已有数据:
+
+       ```
+       spark-sql> insert into customer_cow values (32, 
"Customer#000000032_update", "jD2xZzi", "25-430-914-2194", 3471.59, "BUILDING", 
"cial ideas. final, furious requests", 15);
+       spark-sql> insert into customer_mor values (32, 
"Customer#000000032_update", "jD2xZzi", "25-430-914-2194", 3471.59, "BUILDING", 
"cial ideas. final, furious requests", 15);
+       ```
+
+5. 通过 Doris 可以查询更新后的数据:
+
+       ```
+       doris> select * from customer_cow where c_custkey = 32;
+       
+-----------+---------------------------+-----------+-----------------+-----------+--------------+-------------------------------------+-------------+
+       | c_custkey | c_name                    | c_address | c_phone         | 
c_acctbal | c_mktsegment | c_comment                           | c_nationkey |
+       
+-----------+---------------------------+-----------+-----------------+-----------+--------------+-------------------------------------+-------------+
+       |        32 | Customer#000000032_update | jD2xZzi   | 25-430-914-2194 | 
  3471.59 | BUILDING     | cial ideas. final, furious requests |          15 |
+       
+-----------+---------------------------+-----------+-----------------+-----------+--------------+-------------------------------------+-------------+
+       doris> select * from customer_mor where c_custkey = 32;
+       
+-----------+---------------------------+-----------+-----------------+-----------+--------------+-------------------------------------+-------------+
+       | c_custkey | c_name                    | c_address | c_phone         | 
c_acctbal | c_mktsegment | c_comment                           | c_nationkey |
+       
+-----------+---------------------------+-----------+-----------------+-----------+--------------+-------------------------------------+-------------+
+       |        32 | Customer#000000032_update | jD2xZzi   | 25-430-914-2194 | 
  3471.59 | BUILDING     | cial ideas. final, furious requests |          15 |
+       
+-----------+---------------------------+-----------+-----------------+-----------+--------------+-------------------------------------+-------------+
+       ```
+
+### 05 Incremental Read
+
+Incremental Read 是 Hudi 提供的功能特性之一,通过 Incremental 
Read,用户可以获取指定时间范围的增量数据,从而实现对数据的增量处理。对此,Doris 可对插入 `c_custkey=100` 
后的变更数据进行查询。如下所示,我们插入了一条 `c_custkey=32` 的数据:
+
+```
+doris> select * from customer_cow@incr('beginTime'='20240603015018572');
++-----------+---------------------------+-----------+-----------------+-----------+--------------+-------------------------------------+-------------+
+| c_custkey | c_name                    | c_address | c_phone         | 
c_acctbal | c_mktsegment | c_comment                           | c_nationkey |
++-----------+---------------------------+-----------+-----------------+-----------+--------------+-------------------------------------+-------------+
+|        32 | Customer#000000032_update | jD2xZzi   | 25-430-914-2194 |   
3471.59 | BUILDING     | cial ideas. final, furious requests |          15 |
++-----------+---------------------------+-----------+-----------------+-----------+--------------+-------------------------------------+-------------+
+spark-sql> select * from hudi_table_changes('customer_cow', 'latest_state', 
'20240603015018572');
+
+doris> select * from customer_mor@incr('beginTime'='20240603015058442');
++-----------+---------------------------+-----------+-----------------+-----------+--------------+-------------------------------------+-------------+
+| c_custkey | c_name                    | c_address | c_phone         | 
c_acctbal | c_mktsegment | c_comment                           | c_nationkey |
++-----------+---------------------------+-----------+-----------------+-----------+--------------+-------------------------------------+-------------+
+|        32 | Customer#000000032_update | jD2xZzi   | 25-430-914-2194 |   
3471.59 | BUILDING     | cial ideas. final, furious requests |          15 |
++-----------+---------------------------+-----------+-----------------+-----------+--------------+-------------------------------------+-------------+
+spark-sql> select * from hudi_table_changes('customer_mor', 'latest_state', 
'20240603015058442');
+```
+
+### 06 TimeTravel
+
+Doris 支持查询指定快照版本的 Hudi 数据,从而实现对数据的 Time Travel 功能。首先,可以通过 Spark 查询两张 Hudi 
表的提交历史:
+
+```
+spark-sql> call show_commits(table => 'customer_cow', limit => 10);
+20240603033556094        20240603033558249        commit        448833        
0        1        1        183        0        0
+20240603015444737        20240603015446588        commit        450238        
0        1        1        202        1        0
+20240603015018572        20240603015020503        commit        436692        
1        0        1        1        0        0
+20240603013858098        20240603013907467        commit        44902033       
 100        0        25        18751        0        0
+
+spark-sql> call show_commits(table => 'customer_mor', limit => 10);
+20240603033745977        20240603033748021        deltacommit        1240      
  0        1        1        0        0        0
+20240603015451860        20240603015453539        deltacommit        1434      
  0        1        1        1        1        0
+20240603015058442        20240603015100120        deltacommit        436691    
    1        0        1        1        0        0
+20240603013918515        20240603013922961        deltacommit        44904040  
      100        0        25        18751        0        0
+```
+
+接着,可通过 Doris 查询 `c_custkey=32` 这条数据更新之前的数据快照。如下可看到 `c_custkey=32` 的数据还未更新:
+
+> 注:Time Travel 语法暂时不支持新优化器,需要先执行 `set enable_nereids_planner=false;` 关闭新优化器,该问题将会在后续版本中修复。
+
+```
+doris> select * from customer_cow for time as of '20240603015018572' where 
c_custkey = 32 or c_custkey = 100;
++-----------+--------------------+---------------------------------------+-----------------+-----------+--------------+--------------------------------------------------+-------------+
+| c_custkey | c_name             | c_address                             | 
c_phone         | c_acctbal | c_mktsegment | c_comment                          
              | c_nationkey |
++-----------+--------------------+---------------------------------------+-----------------+-----------+--------------+--------------------------------------------------+-------------+
+|        32 | Customer#000000032 | jD2xZzi UmId,DCtNBLXKj9q0Tlp2iQ6ZcO3J | 
25-430-914-2194 |   3471.53 | BUILDING     | cial ideas. final, furious 
requests across the e |          15 |
+|       100 | Customer#000000100 | jD2xZzi                               | 
25-430-914-2194 |   3471.59 | BUILDING     | cial ideas. final, furious 
requests              |          25 |
++-----------+--------------------+---------------------------------------+-----------------+-----------+--------------+--------------------------------------------------+-------------+
+-- compare with spark-sql
+spark-sql> select * from customer_mor timestamp as of '20240603015018572' 
where c_custkey = 32 or c_custkey = 100;
+
+doris> select * from customer_mor for time as of '20240603015058442' where 
c_custkey = 32 or c_custkey = 100;
++-----------+--------------------+---------------------------------------+-----------------+-----------+--------------+--------------------------------------------------+-------------+
+| c_custkey | c_name             | c_address                             | 
c_phone         | c_acctbal | c_mktsegment | c_comment                          
              | c_nationkey |
++-----------+--------------------+---------------------------------------+-----------------+-----------+--------------+--------------------------------------------------+-------------+
+|       100 | Customer#000000100 | jD2xZzi                               | 
25-430-914-2194 |   3471.59 | BUILDING     | cial ideas. final, furious 
requests              |          25 |
+|        32 | Customer#000000032 | jD2xZzi UmId,DCtNBLXKj9q0Tlp2iQ6ZcO3J | 
25-430-914-2194 |   3471.53 | BUILDING     | cial ideas. final, furious 
requests across the e |          15 |
++-----------+--------------------+---------------------------------------+-----------------+-----------+--------------+--------------------------------------------------+-------------+
+spark-sql> select * from customer_mor timestamp as of '20240603015058442' 
where c_custkey = 32 or c_custkey = 100;
+```
+
+## 查询优化
+
+Apache Hudi 中的数据大致可以分为两类 —— 基线数据和增量数据。基线数据通常是已经经过合并的 Parquet 文件,而增量数据是指由 
INSERT、UPDATE 或 DELETE 产生的数据增量。基线数据可以直接读取,增量数据需要通过 Merge on Read 的方式进行读取。
+
+对于 Hudi COW 表的查询或者 MOR 表的 Read Optimized 查询而言,其数据都属于基线数据,可直接通过 Doris 原生的 
Parquet Reader 读取数据文件,且可获得极速的查询响应。而对于增量数据,Doris 需要通过 JNI 调用 Hudi 的 Java SDK 
进行访问。为了达到最优的查询性能,Apache Doris 在查询时,会将一个查询中的数据分为基线和增量数据两部分,并分别使用上述方式进行读取。
+
+为验证该优化思路,我们通过 EXPLAIN 语句来查看下方示例的查询中,分别有多少基线数据和增量数据。对于 COW 表来说,所有 101 个数据分片均为基线数据(`hudiNativeReadSplits=101/101`),因此 COW 表全部可直接通过 Doris Parquet Reader 进行读取,可获得最佳的查询性能。对于 MOR 表,大部分数据分片是基线数据(`hudiNativeReadSplits=100/101`),仅有一个分片为增量数据,基本也能够获得较好的查询性能。
+
+```
+-- COW table is read natively
+doris> explain select * from customer_cow where c_custkey = 32;
+|   0:VHUDI_SCAN_NODE(68)                                        |
+|      table: customer_cow                                       |
+|      predicates: (c_custkey[#5] = 32)                          |
+|      inputSplitNum=101, totalFileSize=45338886, scanRanges=101 |
+|      partition=26/26                                           |
+|      cardinality=1, numNodes=1                                 |
+|      pushdown agg=NONE                                         |
+|      hudiNativeReadSplits=101/101                              |
+
+-- MOR table: because only the base file contains `c_custkey = 32` that is 
updated, 100 splits are read natively, while the split with log file is read by 
JNI.
+doris> explain select * from customer_mor where c_custkey = 32;
+|   0:VHUDI_SCAN_NODE(68)                                        |
+|      table: customer_mor                                       |
+|      predicates: (c_custkey[#5] = 32)                          |
+|      inputSplitNum=101, totalFileSize=45340731, scanRanges=101 |
+|      partition=26/26                                           |
+|      cardinality=1, numNodes=1                                 |
+|      pushdown agg=NONE                                         |
+|      hudiNativeReadSplits=100/101                              |
+```
+
+可以通过 Spark 进行一些删除操作,进一步观察 Hudi 基线数据和增量数据的变化:
+
+```
+-- Use delete statement to see more differences
+spark-sql> delete from customer_cow where c_custkey = 64;
+doris> explain select * from customer_cow where c_custkey = 64;
+
+spark-sql> delete from customer_mor where c_custkey = 64;
+doris> explain select * from customer_mor where c_custkey = 64;
+```
+
+此外,还可以通过分区条件进行分区裁剪,从而进一步减少数据量,以提升查询速度。如下示例中,通过分区条件 `c_nationkey=15` 
进行分区裁减,使得查询请求只需要访问一个分区(`partition=1/26`)的数据即可。
+
+```
+-- customer_xxx is partitioned by c_nationkey, we can use the partition column 
to prune data
+doris> explain select * from customer_mor where c_custkey = 64 and c_nationkey 
= 15;
+|   0:VHUDI_SCAN_NODE(68)                                        |
+|      table: customer_mor                                       |
+|      predicates: (c_custkey[#5] = 64), (c_nationkey[#12] = 15) |
+|      inputSplitNum=4, totalFileSize=1798186, scanRanges=4      |
+|      partition=1/26                                            |
+|      cardinality=1, numNodes=1                                 |
+|      pushdown agg=NONE                                         |
+|      hudiNativeReadSplits=3/4                                  |
+```
diff --git 
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/get-starting/quick-start.md 
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/get-starting/quick-start/quick-start.md
similarity index 98%
rename from 
i18n/zh-CN/docusaurus-plugin-content-docs/current/get-starting/quick-start.md
rename to 
i18n/zh-CN/docusaurus-plugin-content-docs/current/get-starting/quick-start/quick-start.md
index 953d5bdf95c..4cbb3bcaad6 100644
--- 
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/get-starting/quick-start.md
+++ 
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/get-starting/quick-start/quick-start.md
@@ -1,6 +1,6 @@
 ---
 {
-    "title": "快速体验 Apache Doris",
+    "title": "快速开始",
     "language": "zh-CN"
 }
 
@@ -41,13 +41,13 @@ under the License.
 
 ```Bash
 # 下载 Apache Doris 二进制安装包
-server1:~ doris$ wget 
https://apache-doris-releases.oss-accelerate.aliyuncs.com/apache-doris-2.0.3-bin-x64.tar.gz
+server1:~ doris$ wget 
https://apache-doris-releases.oss-accelerate.aliyuncs.com/apache-doris-2.0.12-bin-x64.tar.gz
 
 # 解压安装包
-server1:~ doris$ tar zxf apache-doris-2.0.3-bin-x64.tar.gz
+server1:~ doris$ tar zxf apache-doris-2.0.12-bin-x64.tar.gz
 
 # 目录重命名为更为简单的 apache-doris 
-server1:~ doris$ mv apache-doris-2.0.3-bin-x64 apache-doris
+server1:~ doris$ mv apache-doris-2.0.12-bin-x64 apache-doris
 ```
 
 ## 安装 Doris
diff --git 
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/datalake-analytics/hudi.md
 
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/datalake-analytics/hudi.md
index 8aa44d5c712..81088d03c9d 100644
--- 
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/datalake-analytics/hudi.md
+++ 
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/datalake-analytics/hudi.md
@@ -24,6 +24,8 @@ specific language governing permissions and limitations
 under the License.
 -->
 
+快速体验 [Apache Doris & Hudi](../../get-starting/quick-start/doris-hudi.md)
+
 ## 使用限制
 
 1. Hudi 表支持的查询类型如下:
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0.json 
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0.json
index cd5fda2c047..1f7db81ccd2 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0.json
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0.json
@@ -7,6 +7,10 @@
     "message": "快速开始",
     "description": "The label for category Getting Started in sidebar docs"
   },
+  "sidebar.docs.category.Quick Start": {
+    "message": "快速体验",
+    "description": "The label for category Quick Start in sidebar docs"
+  },
   "sidebar.docs.category.Installation and Deployment": {
     "message": "安装部署",
     "description": "The label for category Installation and Deployment in 
sidebar docs"
diff --git 
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/get-starting/quick-start.md
 
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/get-starting/quick-start/quick-start.md
similarity index 86%
rename from 
i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/get-starting/quick-start.md
rename to 
i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/get-starting/quick-start/quick-start.md
index 53f7f5b92d9..49915990e02 100644
--- 
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/get-starting/quick-start.md
+++ 
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/get-starting/quick-start/quick-start.md
@@ -1,6 +1,6 @@
 ---
 {
-    "title": "快速体验 Apache Doris",
+    "title": "快速开始",
     "language": "zh-CN"
 }
 ---
@@ -24,32 +24,32 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-本指南将告诉你如何下载 Apache Doris 最新的稳定版本,在单节点上安装并运行,包括创建数据库、数据表、导入数据及查询等。
+这个简短的指南将告诉你如何下载 Apache Doris 最新稳定版本,在单节点上安装并运行它,包括创建数据库、数据表、导入数据及查询等。
 
 ## 环境准备
 
 -   选择一个 x86-64 上的主流 Linux 环境,推荐 CentOS 7.1 或者 Ubuntu 16.04 
以上版本。更多运行环境请参考安装部署部分。
 
--   Java 8 运行环境(非 Oracle JDK 商业授权用户,建议使用免费的 Oracle JDK 
8u202,[立即下载](https://www.oracle.com/java/technologies/javase/javase8-archive-downloads.html#license-lightbox)。
+-   Java 8 运行环境(非 Oracle JDK 商业授权用户,建议使用免费的 Oracle JDK 
8u202,[立即下载](https://www.oracle.com/java/technologies/javase/javase8-archive-downloads.html#license-lightbox))。
 
 -   建议在 Linux 上新建一个 Doris 用户。请避免使用 Root 用户,以防对操作系统误操作。
 
 ## 下载二进制包
 
-从 doris.apache.org 下载相应的 Doris 安装包,并且解压。
+从 doris.apache.org 下载相应的 Apache Doris 安装包,并且解压。
 
 ```Bash
-# 下载 Doris 二进制安装包
-server1:~ doris$ wget 
https://apache-doris-releases.oss-accelerate.aliyuncs.com/apache-doris-2.0.3-bin-x64.tar.gz
+# 下载 Apache Doris 二进制安装包
+server1:~ doris$ wget 
https://apache-doris-releases.oss-accelerate.aliyuncs.com/apache-doris-2.0.12-bin-x64.tar.gz
 
 # 解压安装包
-server1:~ doris$ tar zxf apache-doris-2.0.3-bin-x64.tar.gz
+server1:~ doris$ tar zxf apache-doris-2.0.12-bin-x64.tar.gz
 
 # 目录重命名为更为简单的 apache-doris 
-server1:~ doris$ mv apache-doris-2.0.3-bin-x64 apache-doris
+server1:~ doris$ mv apache-doris-2.0.12-bin-x64 apache-doris
 ```
 
-## 安装 Apache Doris
+## 安装 Doris
 
 ### 配置 FE
 
@@ -87,7 +87,7 @@ JAVA_HOME=/home/doris/jdk8
 # priority_networks =
 
 # BE 数据存放的目录,默认是在 DORIS_HOME 下的 storage 下,默认已经创建,可以更改为你的数据存储路径
-# storage_Root_path = ${DORIS_HOME}/storage
+# storage_root_path = ${DORIS_HOME}/storage
 ```
 
 ### 启动 BE
@@ -99,22 +99,24 @@ JAVA_HOME=/home/doris/jdk8
 server1:apache-doris/be doris$ ./bin/start_be.sh --daemon
 ```
 
-### 连接 Doris FE
+### 连接 Apache Doris FE
 
-通过 MySQL 客户端来连接 Doris FE,下载免安装的 [MySQL 
客户端](https://dev.mysql.com/downloads/mysql/)。
+通过 MySQL 客户端来连接 Apache Doris FE,下载免安装的 [MySQL 
客户端](https://dev.mysql.com/downloads/mysql/)。
 
 解压刚才下载的 MySQL 客户端,在 `bin/` 目录下可以找到 `mysql` 命令行工具。然后执行下面的命令连接 Apache Doris。
 
 ```Bash
-mysql -uRoot -P9030 -h127.0.0.1
+mysql -uroot -P9030 -h127.0.0.1
 ```
 
-注意:
+:::caution 注意
 
--   这里使用的 Root 用户是 Doris 内置的超级管理员用户,具体的用户权限查看 
[认证和鉴权](../admin-manual/auth/authentication-and-authorization.md)
+-   这里使用的 Root 用户是 Apache Doris 内置的超级管理员用户,具体的用户权限查看 
[认证和鉴权](../../../admin-manual/auth/authentication-and-authorization.md)
 -   -P:这里是我们连接 Apache Doris 的查询端口,默认端口是 9030,对应的是 fe.conf 里的 `query_port`
 -   -h:这里是我们连接的 FE IP 地址,如果你的客户端和 FE 安装在同一个节点可以使用 127.0.0.1。
 
+:::
+
 ### 将 BE 节点添加到集群
 
 在 MySQL 客户端执行类似下面的 SQL,将 BE 添加到集群中
@@ -123,7 +125,7 @@ mysql -uRoot -P9030 -h127.0.0.1
  ALTER SYSTEM ADD BACKEND "be_host_ip:heartbeat_service_port";
 ```
 
-注意:
+:::caution 注意
 
 1.  be_host_ip:要添加 BE 的 IP 地址
 
@@ -131,9 +133,11 @@ mysql -uRoot -P9030 -h127.0.0.1
 
 3.  通过 show backends 语句可以查看新添加的 BE 节点。
 
-### 修改 Root 和 Admin 的密码
+:::
+
+### 修改 Root 用户和 Admin 用户的密码
 
-在 MySQL 客户端,执行类似下面的 SQL,为 Root 和 Admin 用户设置新密码
+在 MySQL 客户端,执行类似下面的 SQL,为 Root 用户和 Admin 用户设置新密码
 
 ```sql
 mysql> SET PASSWORD FOR 'root' = PASSWORD('doris-root-password');              
                                                                                
                                                                                
     
@@ -144,16 +148,16 @@ Query OK, 0 rows affected (0.00 sec)
 ```
 
 :::tip
-Root 和 Admin 用户的区别
+Root 用户和 Admin 用户的区别
 
-Root 和 Admin 用户都属于 Doris 安装完默认存在的 2 个账户。其中 Root 
拥有整个集群的超级权限,可以对集群完成各种管理操作,比如添加节点,去除节点。Admin 用户没有管理权限,是集群中的 
Superuser,拥有除集群管理相关以外的所有权限。建议只有在需要对集群进行运维管理超级权限时才使用 Root 权限。
+Root 用户和 Admin 用户都属于 Apache Doris 安装完默认存在的 2 个账户。其中 Root 
用户拥有整个集群的超级权限,可以对集群完成各种管理操作,比如添加节点,去除节点。Admin 用户没有管理权限,是集群中的 
Superuser,拥有除集群管理相关以外的所有权限。建议只有在需要对集群进行运维管理超级权限时才使用 Root 权限。
 :::
 
 ## 建库建表
 
-### 连接 Doris
+### 连接 Apache Doris
 
-使用 Admin 账户连接 Doris FE。
+使用 Admin 账户连接 Apache Doris FE。
 
 ```Bash
 mysql -uadmin -P9030 -h127.0.0.1
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1.json 
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1.json
index 49c2559ca0d..d4b7dfe95c6 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1.json
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1.json
@@ -7,6 +7,10 @@
     "message": "快速开始",
     "description": "The label for category Getting Started in sidebar docs"
   },
+  "sidebar.docs.category.Quick Start": {
+    "message": "快速体验",
+    "description": "The label for category Quick Start in sidebar docs"
+  },
   "sidebar.docs.category.Installation and Deployment": {
     "message": "安装部署",
     "description": "The label for category Install and Deploy in sidebar docs"
diff --git 
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/get-starting/quick-start/doris-hudi.md
 
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/get-starting/quick-start/doris-hudi.md
new file mode 100644
index 00000000000..28ddfda61cf
--- /dev/null
+++ 
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/get-starting/quick-start/doris-hudi.md
@@ -0,0 +1,314 @@
+---
+{
+    "title": "Apache Doris & Hudi 快速开始",
+    "language": "zh-CN"
+}
+
+---
+
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+作为一种全新的开放式的数据管理架构,湖仓一体(Data 
Lakehouse)融合了数据仓库的高性能、实时性以及数据湖的低成本、灵活性等优势,帮助用户更加便捷地满足各种数据处理分析的需求,在企业的大数据体系中已经得到越来越多的应用。
+
+在过去多个版本中,Apache Doris 持续加深与数据湖的融合,当前已演进出一套成熟的湖仓一体解决方案。
+
+- 自 0.15 版本起,Apache Doris 引入 Hive 和 Iceberg 外部表,尝试在 Apache Iceberg 
之上探索与数据湖的能力结合。
+- 自 1.2 版本起,Apache Doris 正式引入 Multi-Catalog 
功能,实现了多种数据源的自动元数据映射和数据访问、并对外部数据读取和查询执行等方面做了诸多性能优化,完全具备了构建极速易用 Lakehouse 架构的能力。
+- 在 2.1 版本中,Apache Doris 湖仓一体架构得到全面加强,不仅增强了主流数据湖格式(Hudi、Iceberg、Paimon 
等)的读取和写入能力,还引入了多 SQL 方言兼容、可从原有系统无缝切换至 Apache Doris。在数据科学及大规模数据读取场景上, Doris 集成了 
Arrow Flight 高速读取接口,使得数据传输效率实现 100 倍的提升。
+
+![](/images/quick-start/lakehouse-arch.PNG)
+
+## Apache Doris & Hudi
+
+[Apache Hudi](https://hudi.apache.org/) 是目前最主流的开放数据湖格式之一,也是事务性的数据湖管理平台,支持包括 
Apache Doris 在内的多种主流查询引擎。
+
+Apache Doris 同样对 Apache Hudi 数据表的读取能力进行了增强:
+
+- 支持 Copy on Write Table:Snapshot Query
+- 支持 Merge on Read Table:Snapshot Queries, Read Optimized Queries
+- 支持 Time Travel
+- 支持 Incremental Read
+
+凭借 Apache Doris 的高性能查询执行以及 Apache Hudi 
的实时数据管理能力,可以实现高效、灵活、低成本的数据查询和分析,同时也提供了强大的数据回溯、审计和增量处理功能,当前基于 Apache Doris 和 
Apache Hudi 的组合已经在多个社区用户的真实业务场景中得到验证和推广:
+
+- 实时数据分析与处理:比如金融行业交易分析、广告行业实时点击流分析、电商行业用户行为分析等常见场景下,都要求实时的数据更新及查询分析。Hudi 
能够实现对数据的实时更新和管理,并保证数据的一致性和可靠性,Doris 则能够实时高效处理大规模数据查询请求,二者结合能够充分满足实时数据分析与处理的需求。
+- 数据回溯与审计:对于金融、医疗等对数据安全和准确性要求极高的行业来说,数据回溯和审计是非常重要的功能。Hudi 提供了时间旅行(Time 
Travel)功能,允许用户查看历史数据状态,结合 Apache Doris 高效查询能力,可快速查找分析任何时间点的数据,实现精确的回溯和审计。
+- 增量数据读取与分析:在进行大数据分析时往往面临着数据规模庞大、更新频繁的问题,Hudi 
支持增量数据读取,这使得用户可以只需处理变化的数据,不必进行全量数据更新;同时 Apache Doris 的 Incremental Read 
功能也可使这一过程更加高效,显著提升了数据处理和分析的效率。
+- 跨数据源联邦查询:许多企业数据来源复杂,数据可能存储在不同的数据库中。Doris 的 Multi-Catalog 
功能支持多种数据源的自动映射与同步,支持跨数据源的联邦查询。这对于需要从多个数据源中获取和整合数据进行分析的企业来说,极大地缩短了数据流转路径,提升了工作效率。
+
+本文将在 Docker 环境下,为读者介绍如何快速搭建 Apache Doris + Apache Hudi 
的测试及演示环境,并对各功能操作进行演示,帮助读者快速入门。
+
+关于更多说明,请参阅 [Hudi Catalog](../../lakehouse/datalake-analytics/hudi.md)
+
+## 使用指南
+
+本文涉及所有脚本和代码可以从该地址获取:[https://github.com/apache/doris/tree/master/samples/datalake/hudi](https://github.com/apache/doris/tree/master/samples/datalake/hudi)
+
+### 01 环境准备
+
+本文示例采用 Docker Compose 部署,组件及版本号如下:
+
+| 组件名称 | 版本 |
+| --- | --- |
+| Apache Doris | 默认 2.1.4,可修改 |
+| Apache Hudi | 0.14|
+| Apache Spark | 3.4.2|
+| Apache Hive | 2.1.3|
+| MinIO | 2022-05-26T05-48-41Z|
+
+
+### 02 环境部署
+
+1. 创建 Docker 网络
+
+       `sudo docker network create -d bridge hudi-net`
+
+2. 启动所有组件
+
+       `sudo ./start-hudi-compose.sh`
+       
+       > 注:启动前,可将 `start-hudi-compose.sh` 中的 `DORIS_PACKAGE` 和 `DORIS_DOWNLOAD_URL` 修改成需要的 Doris 版本,建议使用 2.1.4 或更高版本(修改方式可参考本节末尾的示例)。
+
+3. 启动后,可以使用如下脚本,登录 Spark 命令行或 Doris 命令行:
+
+       ```
+       -- Spark
+       sudo ./login-spark.sh
+       
+       -- Doris
+       sudo ./login-doris.sh
+       ```
+
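+以下是一个修改示例草图(其中的包名与下载地址仅为假设的示例值,实际请以 `start-hudi-compose.sh` 中已有的变量为准):
+
+```
+# start-hudi-compose.sh 中与 Doris 版本相关的两个变量(示例值,仅供参考)
+DORIS_PACKAGE=apache-doris-2.1.4-bin-x64
+DORIS_DOWNLOAD_URL=https://apache-doris-releases.oss-accelerate.aliyuncs.com
+```
+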
+### 03 数据准备
+
+接下来先通过 Spark 生成 Hudi 的数据。如下方代码所示,集群中已经包含一张名为 `customer` 的 Hive 表,可以通过这张 Hive 
表,创建一个 Hudi 表:
+
+```
+-- ./login-spark.sh
+spark-sql> use default;
+
+-- create a COW table
+spark-sql> CREATE TABLE customer_cow
+USING hudi
+TBLPROPERTIES (
+  type = 'cow',
+  primaryKey = 'c_custkey',
+  preCombineField = 'c_name'
+)
+PARTITIONED BY (c_nationkey)
+AS SELECT * FROM customer;
+
+-- create a MOR table
+spark-sql> CREATE TABLE customer_mor
+USING hudi
+TBLPROPERTIES (
+  type = 'mor',
+  primaryKey = 'c_custkey',
+  preCombineField = 'c_name'
+)
+PARTITIONED BY (c_nationkey)
+AS SELECT * FROM customer;
+```
+
+### 04 数据查询
+
+如下所示,Doris 集群中已经创建了名为 `hive` 的 Catalog(可通过 `SHOW CATALOGS` 查看)。以下为该 Catalog 的创建语句:
+
+```
+-- 已经创建,无需再次执行
+CREATE CATALOG `hive` PROPERTIES (
+    "type"="hms",
+    'hive.metastore.uris' = 'thrift://hive-metastore:9083',
+    "s3.access_key" = "minio",
+    "s3.secret_key" = "minio123",
+    "s3.endpoint" = "http://minio:9000";,
+    "s3.region" = "us-east-1",
+    "use_path_style" = "true"
+);
+```
+
+1. 手动刷新该 Catalog,对创建的 Hudi 表进行同步: 
+
+       ```
+       -- ./login-doris.sh
+       doris> REFRESH CATALOG hive;
+       ```
+
+2. 通过 Spark 对 Hudi 数据进行的操作,都可以在 Doris 中实时可见,不需要再次刷新 Catalog。下面我们通过 Spark 分别给 COW 和 MOR 表插入一行数据:
+
+       ```
+       spark-sql> insert into customer_cow values (100, "Customer#000000100", 
"jD2xZzi", "25-430-914-2194", 3471.59, "BUILDING", "cial ideas. final, furious 
requests", 25);
+       spark-sql> insert into customer_mor values (100, "Customer#000000100", 
"jD2xZzi", "25-430-914-2194", 3471.59, "BUILDING", "cial ideas. final, furious 
requests", 25);
+       ```
+
+3. 通过 Doris 可以直接查询到最新插入的数据:
+
+       ```
+       doris> use hive.default;
+       doris> select * from customer_cow where c_custkey = 100;
+       doris> select * from customer_mor where c_custkey = 100;
+       ```
+
+4. 再通过 Spark 插入 c_custkey=32 已经存在的数据,即覆盖已有数据:
+
+       ```
+       spark-sql> insert into customer_cow values (32, 
"Customer#000000032_update", "jD2xZzi", "25-430-914-2194", 3471.59, "BUILDING", 
"cial ideas. final, furious requests", 15);
+       spark-sql> insert into customer_mor values (32, 
"Customer#000000032_update", "jD2xZzi", "25-430-914-2194", 3471.59, "BUILDING", 
"cial ideas. final, furious requests", 15);
+       ```
+
+5. 通过 Doris 可以查询更新后的数据:
+
+       ```
+       doris> select * from customer_cow where c_custkey = 32;
+       
+-----------+---------------------------+-----------+-----------------+-----------+--------------+-------------------------------------+-------------+
+       | c_custkey | c_name                    | c_address | c_phone         | 
c_acctbal | c_mktsegment | c_comment                           | c_nationkey |
+       
+-----------+---------------------------+-----------+-----------------+-----------+--------------+-------------------------------------+-------------+
+       |        32 | Customer#000000032_update | jD2xZzi   | 25-430-914-2194 | 
  3471.59 | BUILDING     | cial ideas. final, furious requests |          15 |
+       
+-----------+---------------------------+-----------+-----------------+-----------+--------------+-------------------------------------+-------------+
+       doris> select * from customer_mor where c_custkey = 32;
+       
+-----------+---------------------------+-----------+-----------------+-----------+--------------+-------------------------------------+-------------+
+       | c_custkey | c_name                    | c_address | c_phone         | 
c_acctbal | c_mktsegment | c_comment                           | c_nationkey |
+       
+-----------+---------------------------+-----------+-----------------+-----------+--------------+-------------------------------------+-------------+
+       |        32 | Customer#000000032_update | jD2xZzi   | 25-430-914-2194 | 
  3471.59 | BUILDING     | cial ideas. final, furious requests |          15 |
+       
+-----------+---------------------------+-----------+-----------------+-----------+--------------+-------------------------------------+-------------+
+       ```
+
+### 05 Incremental Read
+
+Incremental Read 是 Hudi 提供的功能特性之一。通过 Incremental Read,用户可以获取指定时间范围内的增量数据,从而实现对数据的增量处理。借助该能力,Doris 可以查询插入 `c_custkey=100` 之后发生变更的数据。如下所示,查询到的正是上一步更新的那条 `c_custkey=32` 数据:
+
+```
+doris> select * from customer_cow@incr('beginTime'='20240603015018572');
++-----------+---------------------------+-----------+-----------------+-----------+--------------+-------------------------------------+-------------+
+| c_custkey | c_name                    | c_address | c_phone         | 
c_acctbal | c_mktsegment | c_comment                           | c_nationkey |
++-----------+---------------------------+-----------+-----------------+-----------+--------------+-------------------------------------+-------------+
+|        32 | Customer#000000032_update | jD2xZzi   | 25-430-914-2194 |   
3471.59 | BUILDING     | cial ideas. final, furious requests |          15 |
++-----------+---------------------------+-----------+-----------------+-----------+--------------+-------------------------------------+-------------+
+spark-sql> select * from hudi_table_changes('customer_cow', 'latest_state', 
'20240603015018572');
+
+doris> select * from customer_mor@incr('beginTime'='20240603015058442');
++-----------+---------------------------+-----------+-----------------+-----------+--------------+-------------------------------------+-------------+
+| c_custkey | c_name                    | c_address | c_phone         | 
c_acctbal | c_mktsegment | c_comment                           | c_nationkey |
++-----------+---------------------------+-----------+-----------------+-----------+--------------+-------------------------------------+-------------+
+|        32 | Customer#000000032_update | jD2xZzi   | 25-430-914-2194 |   
3471.59 | BUILDING     | cial ideas. final, furious requests |          15 |
++-----------+---------------------------+-----------+-----------------+-----------+--------------+-------------------------------------+-------------+
+spark-sql> select * from hudi_table_changes('customer_mor', 'latest_state', 
'20240603015058442');
+```
+
+### 06 Time Travel
+
+Doris 支持查询指定快照版本的 Hudi 数据,从而实现对数据的 Time Travel 功能。首先,可以通过 Spark 查询两张 Hudi 
表的提交历史:
+
+```
+spark-sql> call show_commits(table => 'customer_cow', limit => 10);
+20240603033556094        20240603033558249        commit        448833        
0        1        1        183        0        0
+20240603015444737        20240603015446588        commit        450238        
0        1        1        202        1        0
+20240603015018572        20240603015020503        commit        436692        
1        0        1        1        0        0
+20240603013858098        20240603013907467        commit        44902033       
 100        0        25        18751        0        0
+
+spark-sql> call show_commits(table => 'customer_mor', limit => 10);
+20240603033745977        20240603033748021        deltacommit        1240      
  0        1        1        0        0        0
+20240603015451860        20240603015453539        deltacommit        1434      
  0        1        1        1        1        0
+20240603015058442        20240603015100120        deltacommit        436691    
    1        0        1        1        0        0
+20240603013918515        20240603013922961        deltacommit        44904040  
      100        0        25        18751        0        0
+```
+
+接着,可通过 Doris 查询 `c_custkey=32` 这条数据插入之前的数据快照。如下可看到,此时 `c_custkey=32` 的数据还未更新:
+
+> 注:Time Travel 语法暂时不支持新优化器,需要先执行 `set enable_nereids_planner=false;` 关闭新优化器,该问题将会在后续版本中修复。
+
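+下面是一个最小操作示例(假设仍处于前文 `./login-doris.sh` 打开的会话中),先关闭新优化器,再执行后续的 Time Travel 查询:
+
+```
+-- 关闭新优化器后,才能使用下文的 FOR TIME AS OF 语法(该限制将在后续版本修复)
+doris> set enable_nereids_planner=false;
+```
+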
+```
+doris> select * from customer_cow for time as of '20240603015018572' where 
c_custkey = 32 or c_custkey = 100;
++-----------+--------------------+---------------------------------------+-----------------+-----------+--------------+--------------------------------------------------+-------------+
+| c_custkey | c_name             | c_address                             | 
c_phone         | c_acctbal | c_mktsegment | c_comment                          
              | c_nationkey |
++-----------+--------------------+---------------------------------------+-----------------+-----------+--------------+--------------------------------------------------+-------------+
+|        32 | Customer#000000032 | jD2xZzi UmId,DCtNBLXKj9q0Tlp2iQ6ZcO3J | 
25-430-914-2194 |   3471.53 | BUILDING     | cial ideas. final, furious 
requests across the e |          15 |
+|       100 | Customer#000000100 | jD2xZzi                               | 
25-430-914-2194 |   3471.59 | BUILDING     | cial ideas. final, furious 
requests              |          25 |
++-----------+--------------------+---------------------------------------+-----------------+-----------+--------------+--------------------------------------------------+-------------+
+-- compare with spark-sql
+spark-sql> select * from customer_cow timestamp as of '20240603015018572' where c_custkey = 32 or c_custkey = 100;
+
+doris> select * from customer_mor for time as of '20240603015058442' where 
c_custkey = 32 or c_custkey = 100;
++-----------+--------------------+---------------------------------------+-----------------+-----------+--------------+--------------------------------------------------+-------------+
+| c_custkey | c_name             | c_address                             | 
c_phone         | c_acctbal | c_mktsegment | c_comment                          
              | c_nationkey |
++-----------+--------------------+---------------------------------------+-----------------+-----------+--------------+--------------------------------------------------+-------------+
+|       100 | Customer#000000100 | jD2xZzi                               | 
25-430-914-2194 |   3471.59 | BUILDING     | cial ideas. final, furious 
requests              |          25 |
+|        32 | Customer#000000032 | jD2xZzi UmId,DCtNBLXKj9q0Tlp2iQ6ZcO3J | 
25-430-914-2194 |   3471.53 | BUILDING     | cial ideas. final, furious 
requests across the e |          15 |
++-----------+--------------------+---------------------------------------+-----------------+-----------+--------------+--------------------------------------------------+-------------+
+spark-sql> select * from customer_mor timestamp as of '20240603015058442' 
where c_custkey = 32 or c_custkey = 100;
+```
+
+## 查询优化
+
+Apache Hudi 中的数据大致可以分为两类 —— 基线数据和增量数据。基线数据通常是已经经过合并的 Parquet 文件,而增量数据是指由 
INSERT、UPDATE 或 DELETE 产生的数据增量。基线数据可以直接读取,增量数据需要通过 Merge on Read 的方式进行读取。
+
+对于 Hudi COW 表的查询或者 MOR 表的 Read Optimized 查询而言,其数据都属于基线数据,可直接通过 Doris 原生的 
Parquet Reader 读取数据文件,且可获得极速的查询响应。而对于增量数据,Doris 需要通过 JNI 调用 Hudi 的 Java SDK 
进行访问。为了达到最优的查询性能,Apache Doris 在查询时,会将一个查询中的数据分为基线和增量数据两部分,并分别使用上述方式进行读取。
+
+为验证该优化思路,我们通过 EXPLAIN 语句来查看下方示例查询中分别有多少基线数据和增量数据。对于 COW 表来说,所有 101 个数据分片均为基线数据(`hudiNativeReadSplits=101/101`),全部可直接通过 Doris Parquet Reader 进行读取,因此可获得最佳的查询性能。对于 MOR 表,大部分数据分片是基线数据(`hudiNativeReadSplits=100/101`),仅有一个分片为增量数据,基本也能够获得较好的查询性能。
+
+```
+-- COW table is read natively
+doris> explain select * from customer_cow where c_custkey = 32;
+|   0:VHUDI_SCAN_NODE(68)                                        |
+|      table: customer_cow                                       |
+|      predicates: (c_custkey[#5] = 32)                          |
+|      inputSplitNum=101, totalFileSize=45338886, scanRanges=101 |
+|      partition=26/26                                           |
+|      cardinality=1, numNodes=1                                 |
+|      pushdown agg=NONE                                         |
+|      hudiNativeReadSplits=101/101                              |
+
+-- MOR table: because only the base file contains `c_custkey = 32` that is 
updated, 100 splits are read natively, while the split with log file is read by 
JNI.
+doris> explain select * from customer_mor where c_custkey = 32;
+|   0:VHUDI_SCAN_NODE(68)                                        |
+|      table: customer_mor                                       |
+|      predicates: (c_custkey[#5] = 32)                          |
+|      inputSplitNum=101, totalFileSize=45340731, scanRanges=101 |
+|      partition=26/26                                           |
+|      cardinality=1, numNodes=1                                 |
+|      pushdown agg=NONE                                         |
+|      hudiNativeReadSplits=100/101                              |
+```
+
+可以通过 Spark 进行一些删除操作,进一步观察 Hudi 基线数据和增量数据的变化:
+
+```
+-- Use delete statement to see more differences
+spark-sql> delete from customer_cow where c_custkey = 64;
+doris> explain select * from customer_cow where c_custkey = 64;
+
+spark-sql> delete from customer_mor where c_custkey = 64;
+doris> explain select * from customer_mor where c_custkey = 64;
+```
+
+此外,还可以通过分区条件进行分区裁剪,从而进一步减少数据量,以提升查询速度。如下示例中,通过分区条件 `c_nationkey=15` 
进行分区裁减,使得查询请求只需要访问一个分区(`partition=1/26`)的数据即可。
+
+```
+-- customer_xxx is partitioned by c_nationkey, we can use the partition column 
to prune data
+doris> explain select * from customer_mor where c_custkey = 64 and c_nationkey 
= 15;
+|   0:VHUDI_SCAN_NODE(68)                                        |
+|      table: customer_mor                                       |
+|      predicates: (c_custkey[#5] = 64), (c_nationkey[#12] = 15) |
+|      inputSplitNum=4, totalFileSize=1798186, scanRanges=4      |
+|      partition=1/26                                            |
+|      cardinality=1, numNodes=1                                 |
+|      pushdown agg=NONE                                         |
+|      hudiNativeReadSplits=3/4                                  |
+```
diff --git 
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/get-starting/quick-start.md
 
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/get-starting/quick-start/quick-start.md
similarity index 80%
rename from 
i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/get-starting/quick-start.md
rename to 
i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/get-starting/quick-start/quick-start.md
index e6bbff247d8..4cbb3bcaad6 100644
--- 
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/get-starting/quick-start.md
+++ 
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/get-starting/quick-start/quick-start.md
@@ -1,6 +1,6 @@
 ---
 {
-    "title": "快速体验 Apache Doris",
+    "title": "快速开始",
     "language": "zh-CN"
 }
 
@@ -25,29 +25,29 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-这个简短的指南将告诉你如何下载 Doris 最新稳定版本,在单节点上安装并运行它,包括创建数据库、数据表、导入数据及查询等。
+这个简短的指南将告诉你如何下载 Apache Doris 最新稳定版本,在单节点上安装并运行它,包括创建数据库、数据表、导入数据及查询等。
 
 ## 环境准备
 
--   选择一个 X86-64 上的主流 Linux 环境,推荐 CentOS 7.1 或者 Ubuntu 16.04 
以上版本。更多运行环境请参考安装部署部分。
+-   选择一个 x86-64 上的主流 Linux 环境,推荐 CentOS 7.1 或者 Ubuntu 16.04 
以上版本。更多运行环境请参考安装部署部分。
 
 -   Java 8 运行环境(非 Oracle JDK 商业授权用户,建议使用免费的 Oracle JDK 
8u202,[立即下载](https://www.oracle.com/java/technologies/javase/javase8-archive-downloads.html#license-lightbox))。
 
--   建议在 Linux 上新建一个 Doris 用户(避免使用 root 用户,以防对操作系统误操作)
+-   建议在 Linux 上新建一个 Doris 用户。请避免使用 Root 用户,以防对操作系统误操作。
 
 ## 下载二进制包
 
-从 doris.apache.org 下载相应的 Doris 安装包,并且解压。
+从 doris.apache.org 下载相应的 Apache Doris 安装包,并且解压。
 
 ```Bash
-# 下载 Doris 二进制安装包
-server1:~ doris$ wget 
https://apache-doris-releases.oss-accelerate.aliyuncs.com/apache-doris-2.0.3-bin-x64.tar.gz
+# 下载 Apache Doris 二进制安装包
+server1:~ doris$ wget 
https://apache-doris-releases.oss-accelerate.aliyuncs.com/apache-doris-2.0.12-bin-x64.tar.gz
 
 # 解压安装包
-server1:~ doris$ tar zxf apache-doris-2.0.3-bin-x64.tar.gz
+server1:~ doris$ tar zxf apache-doris-2.0.12-bin-x64.tar.gz
 
 # 目录重命名为更为简单的 apache-doris 
-server1:~ doris$ mv apache-doris-2.0.3-bin-x64 apache-doris
+server1:~ doris$ mv apache-doris-2.0.12-bin-x64 apache-doris
 ```
 
 ## 安装 Doris
@@ -60,7 +60,7 @@ FE 的配置文件为 apache-doris/fe/conf/fe.conf。下面是一些需要关注
 # 增加 JAVA_HOME 配置,指向 JDK8 的运行环境。假如我们 JDK8 位于 /home/doris/jdk8, 则设置如下
 JAVA_HOME=/home/doris/jdk8
 
-# FE 监听 IP 的 CIDR 网段。默认设置为空,有 Doris 启动时自动选择一个可用网段。如有多个网段,需要指定一个网段,可以类似设置 
priority_networks=92.168.0.0/24
+# FE 监听 IP 的 CIDR 网段。默认设置为空,由 Apache Doris 启动时自动选择一个可用网段。如有多个网段,需要指定一个网段,可以类似设置 priority_networks=192.168.0.0/24
 # priority_networks =
 
 # FE 元数据存放的目录,默认是在 DORIS_HOME 下的 doris-meta 目录。已经创建,可以更改为你的元数据存储路径。
@@ -84,7 +84,7 @@ BE 的配置文件为 apache-doris/be/conf/be.conf。下面是一些需要关注
 # 增加 JAVA_HOME 配置,指向 JDK8 的运行环境。假如我们 JDK8 位于 /home/doris/jdk8, 则设置如下
 JAVA_HOME=/home/doris/jdk8
 
-# BE 监听 IP 的 CIDR 网段。默认设置为空,有 Doris 启动时自动选择一个可用网段。如有多个网段,需要指定一个网段,可以类似设置 
priority_networks=192.168.0.0/24
+# BE 监听 IP 的 CIDR 网段。默认设置为空,由 Apache Doris 启动时自动选择一个可用网段。如有多个网段,需要指定一个网段,可以类似设置 priority_networks=192.168.0.0/24
 # priority_networks =
 
 # BE 数据存放的目录,默认是在 DORIS_HOME 下的 storage 下,默认已经创建,可以更改为你的数据存储路径
@@ -100,23 +100,24 @@ JAVA_HOME=/home/doris/jdk8
 server1:apache-doris/be doris$ ./bin/start_be.sh --daemon
 ```
 
-### 连接 Doris FE
+### 连接 Apache Doris FE
 
-通过 MySQL 客户端来连接 Doris FE,下载免安装的 [MySQL 
客户端](https://dev.mysql.com/downloads/mysql/)。
+通过 MySQL 客户端来连接 Apache Doris FE,下载免安装的 [MySQL 
客户端](https://dev.mysql.com/downloads/mysql/)。
 
-解压刚才下载的 MySQL 客户端,在 `bin/` 目录下可以找到 `mysql` 命令行工具。然后执行下面的命令连接 Doris。
+解压刚才下载的 MySQL 客户端,在 `bin/` 目录下可以找到 `mysql` 命令行工具。然后执行下面的命令连接 Apache Doris。
 
 ```Bash
 mysql -uroot -P9030 -h127.0.0.1
 ```
 
-注意:
+:::caution 注意
 
--   这里使用的 root 用户是 Doris 内置的超级管理员用户,具体的用户权限查看 
[权限管理](../admin-manual/privilege-ldap/user-privilege)
-
--   -P:这里是我们连接 Doris 的查询端口,默认端口是 9030,对应的是 fe.conf 里的 `query_port`
+-   这里使用的 Root 用户是 Apache Doris 内置的超级管理员用户,具体的用户权限查看 
[认证和鉴权](../../../admin-manual/auth/authentication-and-authorization.md)
+-   -P:这里是我们连接 Apache Doris 的查询端口,默认端口是 9030,对应的是 fe.conf 里的 `query_port`
 -   -h:这里是我们连接的 FE IP 地址,如果你的客户端和 FE 安装在同一个节点可以使用 127.0.0.1。
 
+:::
+
 ### 将 BE 节点添加到集群
 
 在 MySQL 客户端执行类似下面的 SQL,将 BE 添加到集群中
@@ -125,7 +126,7 @@ mysql -uroot -P9030 -h127.0.0.1
  ALTER SYSTEM ADD BACKEND "be_host_ip:heartbeat_service_port";
 ```
 
-注意:
+:::caution 注意
 
 1.  be_host_ip:要添加 BE 的 IP 地址
 
@@ -133,9 +134,11 @@ mysql -uroot -P9030 -h127.0.0.1
 
 3.  通过 show backends 语句可以查看新添加的 BE 节点。
 
-### 修改 root 和 admin 的密码
+:::
+
+### 修改 Root 用户和 Admin 用户的密码
 
-在 MySQL 客户端,执行类似下面的 SQL,为 root 和 admin 用户设置新密码
+在 MySQL 客户端,执行类似下面的 SQL,为 Root 用户和 Admin 用户设置新密码
 
 ```sql
 mysql> SET PASSWORD FOR 'root' = PASSWORD('doris-root-password');              
                                                                                
                                                                                
     
@@ -146,16 +149,16 @@ Query OK, 0 rows affected (0.00 sec)
 ```
 
 :::tip
-root 和 admin 用户的区别
+Root 用户和 Admin 用户的区别
 
-root 和 admin 用户都属于 Doris 安装完默认存在的 2 个账户。其中 root 
拥有整个集群的超级权限,可以对集群完成各种管理操作,比如添加节点,去除节点。admin 用户没有管理权限,是集群中的 
Superuser,拥有除集群管理相关以外的所有权限。建议只有在需要对集群进行运维管理超级权限时才使用 root 权限。
+Root 用户和 Admin 用户都属于 Apache Doris 安装完默认存在的 2 个账户。其中 Root 
用户拥有整个集群的超级权限,可以对集群完成各种管理操作,比如添加节点,去除节点。Admin 用户没有管理权限,是集群中的 
Superuser,拥有除集群管理相关以外的所有权限。建议只有在需要对集群进行运维管理超级权限时才使用 Root 权限。
 :::
 
 ## 建库建表
 
-### 连接 Doris
+### 连接 Apache Doris
 
-使用 admin 账户连接 Doris FE。
+使用 Admin 账户连接 Apache Doris FE。
 
 ```Bash
 mysql -uadmin -P9030 -h127.0.0.1
@@ -202,7 +205,7 @@ curl  --location-trusted -u admin:admin_password -T 
data.csv -H "column_separato
 
 -   -T data.csv : 要导入的数据文件名
 
--   -u admin:admin_password:  admin 账户与密码
+-   -u admin:admin_password:  Admin 账户与密码
 
 -   127.0.0.1:8030 : 分别是 FE 的 IP 和 http_port
 
@@ -253,7 +256,7 @@ mysql> select * from mytable;
 4 rows in set (0.01 sec)       
 ```
 
-## 停止 Doris
+## 停止 Apache Doris
 
 ### 停止 FE
 
@@ -269,4 +272,4 @@ server1:apache-doris/fe doris$ ./bin/stop_fe.sh
 
 ```Bash
 server1:apache-doris/be doris$ ./bin/stop_be.sh
-```
\ No newline at end of file
+```
diff --git 
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse/datalake-analytics/hudi.md
 
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse/datalake-analytics/hudi.md
index 8aa44d5c712..81088d03c9d 100644
--- 
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse/datalake-analytics/hudi.md
+++ 
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse/datalake-analytics/hudi.md
@@ -24,6 +24,8 @@ specific language governing permissions and limitations
 under the License.
 -->
 
+快速体验 [Apache Doris & Hudi](../../get-starting/quick-start/doris-hudi.md)
+
 ## 使用限制
 
 1. Hudi 表支持的查询类型如下:
diff --git a/sidebars.json b/sidebars.json
index e83dfae1947..d1bcbbb3054 100644
--- a/sidebars.json
+++ b/sidebars.json
@@ -5,7 +5,14 @@
             "label": "Getting Started",
             "items": [
                 "get-starting/what-is-apache-doris",
-                "get-starting/quick-start"
+                {
+                    "type": "category",
+                    "label": "Quick Start",
+                    "items": [
+                        "get-starting/quick-start/quick-start",
+                        "get-starting/quick-start/doris-hudi"
+                    ]
+                }
             ]
         },
         {
@@ -1585,4 +1592,4 @@
             ]
         }
     ]
-}
\ No newline at end of file
+}
diff --git a/src/pages/learning/index.tsx b/src/pages/learning/index.tsx
index 21b880d638d..4e1e80bc161 100644
--- a/src/pages/learning/index.tsx
+++ b/src/pages/learning/index.tsx
@@ -18,7 +18,7 @@ const sitemapList = [
             },
             {
                 title: <Translate>Get Started</Translate>,
-                link: '/docs/dev/get-starting/quick-start',
+                link: '/docs/dev/get-starting/quick-start/quick-start',
             },
             {
                 title: <Translate>Installation and deployment</Translate>,
diff --git a/static/images/quick-start/lakehouse-arch.PNG 
b/static/images/quick-start/lakehouse-arch.PNG
new file mode 100644
index 00000000000..153613c2e5b
Binary files /dev/null and b/static/images/quick-start/lakehouse-arch.PNG differ
diff --git a/versioned_docs/version-2.0/get-starting/quick-start.md 
b/versioned_docs/version-2.0/get-starting/quick-start.md
deleted file mode 100644
index 73af4a0f205..00000000000
--- a/versioned_docs/version-2.0/get-starting/quick-start.md
+++ /dev/null
@@ -1,266 +0,0 @@
----
-{
-    "title": "Quick Start",
-    "language": "en"
-}
-
----
-
-<!-- 
-Licensed to the Apache Software Foundation (ASF) under one
-or more contributor license agreements.  See the NOTICE file
-distributed with this work for additional information
-regarding copyright ownership.  The ASF licenses this file
-to you under the Apache License, Version 2.0 (the
-"License"); you may not use this file except in compliance
-with the License.  You may obtain a copy of the License at
-
-  http://www.apache.org/licenses/LICENSE-2.0
-
-Unless required by applicable law or agreed to in writing,
-software distributed under the License is distributed on an
-"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
-KIND, either express or implied.  See the License for the
-specific language governing permissions and limitations
-under the License.
--->
-
-# Quick Start
-
-This guide is about how to download the latest stable version of Doris, 
install it on a single node, and get it running, including steps for creating a 
database, data tables, importing data, and performing queries.
-
-## Prerequisite
-
-- A mainstream Linux X86-64 environment. CentOS 7.1 or Ubuntu 16.04 or later 
versions are recommended. See the "Install and Deploy" section of the doc for 
guides on more environments.
-- Install Java 8 runtime environment. (If you are not an Oracle JDK commercial 
license user, we suggest using the free Oracle JDK 8u202. [Download 
now](https://www.oracle.com/java/technologies/javase/javase8-archive-downloads.html#license-lightbox).)
-- It is recommended to create a new user for Doris on Linux (avoid using the 
root user to prevent accidental operations on the operating system).
-
-## Download binary package
-
-Download the Doris installation package from doris.apache.org and proceed with 
the following steps.
-
-```Bash
-# Download the binary installation package of Doris
-server1:~ doris$ wget 
https://apache-doris-releases.oss-accelerate.aliyuncs.com/apache-doris-2.0.3-bin-x64.tar.gz
-
-# Extract the installation package
-server1:~ doris$ tar zxf apache-doris-2.0.3-bin-x64.tar.gz
-
-# Rename the directory to apache-doris for simplicity
-server1:~ doris$ mv apache-doris-2.0.3-bin-x64 apache-doris
-```
-
-## Install Doris
-
-### Configure FE
-
-Go to the `apache-doris/fe/conf/fe.conf` file for FE configuration. Below are 
some key configurations to pay attention to. Add JAVA_HOME manually and point 
it to your JDK8 runtime environment. For other configurations, you can go with 
the default values for a quick single-machine experience.
-
-```Shell
-# Add JAVA_HOME and point it to your JDK8 runtime environment. Suppose your 
JDK8 is at /home/doris/jdk8, set it as follows:
-JAVA_HOME=/home/doris/jdk8
-
-# The CIDR network segment of FE listening IP is empty by default. When 
started, Doris will automatically select an available network segment. If you 
need to specify a segment, you can set priority_networks=92.168.0.0/24, for 
example.
-# priority_networks =
-
-# By default, FE metadata is stored in the doris-meta directory under 
DORIS_HOME. It is created already. You can change it to your specified path.
-# meta_dir = ${DORIS_HOME}/doris-meta
-```
-
-### Start FE
-
-Run the following command under apache-doris/fe to start FE.
-
-```Bash
-# Start FE in the background to ensure that the process continues running even 
after exiting the terminal.
-server1:apache-doris/fe doris$ ./bin/start_fe.sh --daemon
-```
-
-### Configure BE
-
-Go to the `apache-doris/be/conf/be.conf` file for BE configuration. Below are 
some key configurations to pay attention to. Add JAVA_HOME manually and point 
it to your JDK8 runtime environment. For other configurations, you can go with 
the default values for a quick single-machine experience.
-
-```Shell
-# Add JAVA_HOME and point it to your JDK8 runtime environment. Suppose your 
JDK8 is at /home/doris/jdk8, set it as follows:
-JAVA_HOME=/home/doris/jdk8
-
-# The CIDR network segment of BE listening IP is empty by default. When 
started, Doris will automatically select an available network segment. If you 
need to specify a segment, you can set priority_networks=192.168.0.0/24, for 
example.
-# priority_networks =
-
-# By default, BE data is stored in the storage directory under DORIS_HOME. It 
is created already. You can change it to your specified path.
-# storage_root_path = ${DORIS_HOME}/storage
-```
-
-### Start BE
-
-Run the following command under apache-doris/be to start BE.
-
-```Bash
-# Start BE in the background to ensure that the process continues running even 
after exiting the terminal.
-server1:apache-doris/be doris$ ./bin/start_be.sh --daemon
-```
-
-### Connect to Doris FE
-
-Download the [portable MySQL client](https://dev.mysql.com/downloads/mysql/) 
to connect to Doris FE.
-
-Unpack the client, find the `mysql` command-line tool in the `bin/` directory. 
Then execute the following command to connect to Doris.
-
-```Bash
-mysql -uroot -P9030 -h127.0.0.1
-```
-
-Note:
-
-- The root user here is the built-in super admin user of Doris. See 
[Permission 
Management](https://doris.apache.org/docs/2.0/admin-manual/privilege-ldap/user-privilege/)
 for more information.
-- -P: This specifies the query port that is connected to. The default port is 
9030. It corresponds to the `query_port`setting in fe.conf.
-- -h: This specifies the IP address of the FE that is connected to. If your 
client and FE are installed on the same node, you can use 127.0.0.1.
-
-### Add BE nodes to cluster
-
-An example SQL to execute in the MySQL client to add BE nodes to the cluster:
-
-```SQL
- ALTER SYSTEM ADD BACKEND "be_host_ip:heartbeat_service_port";
-```
-
-Note:
-
-1. be_host_ip: the IP address of the BE node to be added
-2. heartbeat_service_port: the heartbeat reporting port of the BE node to be 
added, which can be found in `be.conf`under `heartbeat_service_port`, set as 
`9050` by default
-3. You can use the "show backends" statement to view the newly added BE nodes.
-
-### Modify passwords for root and admin
-
-Example SQLs to execute in the MySQL client to set new passwords for root and 
admin users:
-
-```SQL
-mysql> SET PASSWORD FOR 'root' = PASSWORD('doris-root-password');              
                                                                                
                                                                                
     
-Query OK, 0 rows affected (0.01 sec)                                           
                                                                                
                                                                            
-                                                                               
                                                                                
                                                                            
-mysql> SET PASSWORD FOR 'admin' = PASSWORD('doris-admin-password');            
                                                                                
                                                                                
     
-Query OK, 0 rows affected (0.00 sec)        
-```
-
-:::tip
-Difference between root and admin users
-
-The root and admin users are two default accounts that are automatically 
created after Doris installation. The root user has superuser privileges for 
the entire cluster and can perform various management operations, such as 
adding or removing nodes. The admin user does not have administrative 
privileges but is a superuser within the cluster, possessing all permissions 
except those related to cluster management. It is recommended to use the root 
privileges only when necessary for cluster  [...]
-:::
-
-## Create database and table
-
-### Connect to Doris
-
-Use admin account to connect to Doris FE.
-
-```Bash
-mysql -uadmin -P9030 -h127.0.0.1
-```
-
-:::tip
-If the MySQL client connecting to 127.0.0.1 is on the same machine as FE, no 
password will be required.
-:::
-
-### Create database and table
-
-```SQL
-create database demo;
-
-use demo; 
-create table mytable
-(
-    k1 TINYINT,
-    k2 DECIMAL(10, 2) DEFAULT "10.05",    
-    k3 CHAR(10) COMMENT "string column",    
-    k4 INT NOT NULL DEFAULT "1" COMMENT "int column"
-) 
-COMMENT "my first table"
-DISTRIBUTED BY HASH(k1) BUCKETS 1
-PROPERTIES ('replication_num' = '1');
-```
-
-### Ingest data
-
-Save the following example data to the local "data.csv" file:
-
-```Plaintext
-1,0.14,a1,20
-2,1.04,b2,21
-3,3.14,c3,22
-4,4.35,d4,23
-```
-
-Load the data from "data.csv" into the newly created table using the Stream 
Load method.
-
-```Bash
-curl  --location-trusted -u admin:admin_password -T data.csv -H 
"column_separator:," http://127.0.0.1:8030/api/demo/mytable/_stream_load
-```
-
-- -T data.csv: data file name
-- -u admin:admin_password: admin account and password
-- 127.0.0.1:8030: IP and http_port of FE
-
-Once it is executed successfully, a message like the following will be 
returned: 
-
-```Bash
-{                                                     
-    "TxnId": 30,                                  
-    "Label": "a56d2861-303a-4b50-9907-238fea904363",        
-    "Comment": "",                                       
-    "TwoPhaseCommit": "false",                           
-    "Status": "Success",                                 
-    "Message": "OK",                                    
-    "NumberTotalRows": 4,                                
-    "NumberLoadedRows": 4,                               
-    "NumberFilteredRows": 0,                             
-    "NumberUnselectedRows": 0,                          
-    "LoadBytes": 52,                                     
-    "LoadTimeMs": 206,                                    
-    "BeginTxnTimeMs": 13,                                
-    "StreamLoadPutTimeMs": 141,                           
-    "ReadDataTimeMs": 0,                                 
-    "WriteDataTimeMs": 7,                                
-    "CommitAndPublishTimeMs": 42                         
-} 
-```
-
-- `NumberLoadedRows`: the number of rows that have been loaded
-- `NumberTotalRows`: the total number of rows to be loaded
-- `Status`: "Success" means data has been loaded successfully.
-
-### Query data
-
-Execute the following SQL in the MySQL client to query the loaded data:
-
-```SQL
-mysql> select * from mytable;                                                  
                                                                                
                                                                            
-+------+------+------+------+                                                  
                                                                                
                                                                            
-| k1   | k2   | k3   | k4   |                                                  
                                                                                
                                                                            
-+------+------+------+------+                                                  
                                                                                
                                                                            
-|    1 | 0.14 | a1   |   20 |                                                  
                                                                                
                                                                            
-|    2 | 1.04 | b2   |   21 |                                                  
                                                                                
                                                                            
-|    3 | 3.14 | c3   |   22 |                                                  
                                                                                
                                                                            
-|    4 | 4.35 | d4   |   23 |                                                  
                                                                                
                                                                            
-+------+------+------+------+                                                  
                                                                                
                                                                            
-4 rows in set (0.01 sec)       
-```
-
-## Stop Doris
-
-### Stop FE
-
-Execute the following command under apache-doris/fe to stop FE.
-
-```Bash
-server1:apache-doris/fe doris$ ./bin/stop_fe.sh
-```
-
-### Stop BE
-
-Execute the following command under apache-doris/be to stop BE.
-
-```Bash
-server1:apache-doris/be doris$ ./bin/stop_be.sh
-```
-
diff --git a/docs/get-starting/quick-start.md 
b/versioned_docs/version-2.0/get-starting/quick-start/quick-start.md
similarity index 100%
copy from docs/get-starting/quick-start.md
copy to versioned_docs/version-2.0/get-starting/quick-start/quick-start.md
diff --git a/versioned_docs/version-2.1/get-starting/quick-start.md 
b/versioned_docs/version-2.1/get-starting/quick-start.md
deleted file mode 100644
index ad3b892ad1f..00000000000
--- a/versioned_docs/version-2.1/get-starting/quick-start.md
+++ /dev/null
@@ -1,266 +0,0 @@
----
-{
-    "title": "Quick Start",
-    "language": "en"
-}
-
----
-
-<!-- 
-Licensed to the Apache Software Foundation (ASF) under one
-or more contributor license agreements.  See the NOTICE file
-distributed with this work for additional information
-regarding copyright ownership.  The ASF licenses this file
-to you under the Apache License, Version 2.0 (the
-"License"); you may not use this file except in compliance
-with the License.  You may obtain a copy of the License at
-
-  http://www.apache.org/licenses/LICENSE-2.0
-
-Unless required by applicable law or agreed to in writing,
-software distributed under the License is distributed on an
-"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
-KIND, either express or implied.  See the License for the
-specific language governing permissions and limitations
-under the License.
--->
-
-# Quick Start
-
-This guide is about how to download the latest stable version of Apache Doris, 
install it on a single node, and get it running, including steps for creating a 
database, data tables, importing data, and performing queries.
-
-## Environment requirements
-
-- A mainstream Linux x86-64 environment. CentOS 7.1 or Ubuntu 16.04 or later 
versions are recommended. See the "Install and Deploy" section of the doc for 
guides on more environments.
-- Install Java 8 runtime environment. (If you are not an Oracle JDK commercial 
license user, we suggest using the free Oracle JDK 8u202. [Download 
now](https://www.oracle.com/java/technologies/javase/javase8-archive-downloads.html#license-lightbox).)
-- It is recommended to create a new user for Doris on Linux (avoid using the 
root user to prevent accidental operations on the operating system).
-
-## Download binary package
-
-Download the Apache Doris installation package from doris.apache.org and 
proceed with the following steps.
-
-```Bash
-# Download the binary installation package of Doris
-server1:~ doris$ wget 
https://apache-doris-releases.oss-accelerate.aliyuncs.com/apache-doris-2.0.3-bin-x64.tar.gz
-
-# Extract the installation package
-server1:~ doris$ tar zxf apache-doris-2.0.3-bin-x64.tar.gz
-
-# Rename the directory to apache-doris for simplicity
-server1:~ doris$ mv apache-doris-2.0.3-bin-x64 apache-doris
-```
-
-## Install Apache Doris
-
-### Configure FE
-
-Go to the `apache-doris/fe/conf/fe.conf` file for FE configuration. Below are 
some key configurations to pay attention to. Add JAVA_HOME manually and point 
it to your JDK8 runtime environment. For other configurations, you can go with 
the default values for a quick single-machine experience.
-
-```Shell
-# Add JAVA_HOME and point it to your JDK8 runtime environment. Suppose your 
JDK8 is at /home/doris/jdk8, set it as follows:
-JAVA_HOME=/home/doris/jdk8
-
-# The CIDR network segment of FE listening IP is empty by default. When 
started, Apache Doris will automatically select an available network segment. 
If you need to specify a segment, you can set priority_networks=92.168.0.0/24, 
for example.
-# priority_networks =
-
-# By default, FE metadata is stored in the doris-meta directory under 
DORIS_HOME. It is created already. You can change it to your specified path.
-# meta_dir = ${DORIS_HOME}/doris-meta
-```
-
-### Start FE
-
-Run the following command under apache-doris/fe to start FE.
-
-```Bash
-# Start FE in the background to ensure that the process continues running even 
after exiting the terminal.
-server1:apache-doris/fe doris$ ./bin/start_fe.sh --daemon
-```
-
-### Configure BE
-
-Go to the `apache-doris/be/conf/be.conf` file for BE configuration. Below are 
some key configurations to pay attention to. Add JAVA_HOME manually and point 
it to your JDK8 runtime environment. For other configurations, you can go with 
the default values for a quick single-machine experience.
-
-```Shell
-# Add JAVA_HOME and point it to your JDK8 runtime environment. Suppose your 
JDK8 is at /home/doris/jdk8, set it as follows:
-JAVA_HOME=/home/doris/jdk8
-
-# The CIDR network segment of BE listening IP is empty by default. When 
started, Doris will automatically select an available network segment. If you 
need to specify a segment, you can set priority_networks=192.168.0.0/24, for 
example.
-# priority_networks =
-
-# By default, BE data is stored in the storage directory under DORIS_HOME. It 
is created already. You can change it to your specified path.
-# storage_root_path = ${DORIS_HOME}/storage
-```
-
-### Start BE
-
-Run the following command under apache-doris/be to start BE.
-
-```Bash
-# Start BE in the background to ensure that the process continues running even 
after exiting the terminal.
-server1:apache-doris/be doris$ ./bin/start_be.sh --daemon
-```
-
-### Connect to Apache Doris FE
-
-Download the [portable MySQL client](https://dev.mysql.com/downloads/mysql/) 
to connect to Doris FE.
-
-Unpack the client, find the `mysql` command-line tool in the `bin/` directory. 
Then execute the following command to connect to Apache Doris.
-
-```Bash
-mysql -uroot -P9030 -h127.0.0.1
-```
-
-Note:
-
-- The root user here is the built-in super admin user of Apache Doris. See 
[Authentication and 
Authorization](../admin-manual/auth/authentication-and-authorization.md) for 
more information.
-- -P: This specifies the query port that is connected to. The default port is 
9030. It corresponds to the `query_port`setting in fe.conf.
-- -h: This specifies the IP address of the FE that is connected to. If your 
client and FE are installed on the same node, you can use 127.0.0.1.
-
-### Add BE nodes to cluster
-
-An example SQL to execute in the MySQL client to add BE nodes to the cluster:
-
-```SQL
- ALTER SYSTEM ADD BACKEND "be_host_ip:heartbeat_service_port";
-```
-
-Note:
-
-1. be_host_ip: the IP address of the BE node to be added
-2. heartbeat_service_port: the heartbeat reporting port of the BE node to be 
added, which can be found in `be.conf`under `heartbeat_service_port`, set as 
`9050` by default
-3. You can use the "show backends" statement to view the newly added BE nodes.
-
-### Modify passwords for root and admin
-
-Example SQLs to execute in the MySQL client to set new passwords for root and 
admin users:
-
-```SQL
-mysql> SET PASSWORD FOR 'root' = PASSWORD('doris-root-password');              
                                                                                
                                                                                
     
-Query OK, 0 rows affected (0.01 sec)                                           
                                                                                
                                                                            
-                                                                               
                                                                                
                                                                            
-mysql> SET PASSWORD FOR 'admin' = PASSWORD('doris-admin-password');            
                                                                                
                                                                                
     
-Query OK, 0 rows affected (0.00 sec)        
-```
-
-:::tip
-Difference between root and admin users
-
-The root and admin users are two default accounts that are automatically 
created after Apache Doris installation. The root user has superuser privileges 
for the entire cluster and can perform various management operations, such as 
adding or removing nodes. The admin user does not have administrative 
privileges but is a superuser within the cluster, possessing all permissions 
except those related to cluster management. It is recommended to use the root 
privileges only when necessary for c [...]
-:::
-
-## Create database and table
-
-### Connect to Apache Doris
-
-Use admin account to connect to Apache Doris FE.
-
-```Bash
-mysql -uadmin -P9030 -h127.0.0.1
-```
-
-:::tip
-If the MySQL client connecting to 127.0.0.1 is on the same machine as FE, no 
password will be required.
-:::
-
-### Create database and table
-
-```SQL
-create database demo;
-
-use demo; 
-create table mytable
-(
-    k1 TINYINT,
-    k2 DECIMAL(10, 2) DEFAULT "10.05",    
-    k3 CHAR(10) COMMENT "string column",    
-    k4 INT NOT NULL DEFAULT "1" COMMENT "int column"
-) 
-COMMENT "my first table"
-DISTRIBUTED BY HASH(k1) BUCKETS 1
-PROPERTIES ('replication_num' = '1');
-```
-
-### Ingest data
-
-Save the following example data to the local "data.csv" file:
-
-```Plaintext
-1,0.14,a1,20
-2,1.04,b2,21
-3,3.14,c3,22
-4,4.35,d4,23
-```
-
-Load the data from "data.csv" into the newly created table using the Stream 
Load method.
-
-```Bash
-curl  --location-trusted -u admin:admin_password -T data.csv -H 
"column_separator:," http://127.0.0.1:8030/api/demo/mytable/_stream_load
-```
-
-- -T data.csv: data file name
-- -u admin:admin_password: admin account and password
-- 127.0.0.1:8030: IP and http_port of FE
-
-Once it is executed successfully, a message like the following will be 
returned: 
-
-```Bash
-{                                                     
-    "TxnId": 30,                                  
-    "Label": "a56d2861-303a-4b50-9907-238fea904363",        
-    "Comment": "",                                       
-    "TwoPhaseCommit": "false",                           
-    "Status": "Success",                                 
-    "Message": "OK",                                    
-    "NumberTotalRows": 4,                                
-    "NumberLoadedRows": 4,                               
-    "NumberFilteredRows": 0,                             
-    "NumberUnselectedRows": 0,                          
-    "LoadBytes": 52,                                     
-    "LoadTimeMs": 206,                                    
-    "BeginTxnTimeMs": 13,                                
-    "StreamLoadPutTimeMs": 141,                           
-    "ReadDataTimeMs": 0,                                 
-    "WriteDataTimeMs": 7,                                
-    "CommitAndPublishTimeMs": 42                         
-} 
-```
-
-- `NumberLoadedRows`: the number of rows that have been loaded
-- `NumberTotalRows`: the total number of rows to be loaded
-- `Status`: "Success" means data has been loaded successfully.
-
-### Query data
-
-Execute the following SQL in the MySQL client to query the loaded data:
-
-```SQL
-mysql> select * from mytable;                                                  
                                                                                
                                                                            
-+------+------+------+------+                                                  
                                                                                
                                                                            
-| k1   | k2   | k3   | k4   |                                                  
                                                                                
                                                                            
-+------+------+------+------+                                                  
                                                                                
                                                                            
-|    1 | 0.14 | a1   |   20 |                                                  
                                                                                
                                                                            
-|    2 | 1.04 | b2   |   21 |                                                  
                                                                                
                                                                            
-|    3 | 3.14 | c3   |   22 |                                                  
                                                                                
                                                                            
-|    4 | 4.35 | d4   |   23 |                                                  
                                                                                
                                                                            
-+------+------+------+------+                                                  
                                                                                
                                                                            
-4 rows in set (0.01 sec)       
-```
-
-## Stop Apache Doris
-
-### Stop FE
-
-Execute the following command under apache-doris/fe to stop FE.
-
-```Bash
-server1:apache-doris/fe doris$ ./bin/stop_fe.sh
-```
-
-### Stop BE
-
-Execute the following command under apache-doris/be to stop BE.
-
-```Bash
-server1:apache-doris/be doris$ ./bin/stop_be.sh
-```
-
diff --git a/versioned_docs/version-2.1/get-starting/quick-start/doris-hudi.md 
b/versioned_docs/version-2.1/get-starting/quick-start/doris-hudi.md
new file mode 100644
index 00000000000..f7426e3a3fc
--- /dev/null
+++ b/versioned_docs/version-2.1/get-starting/quick-start/doris-hudi.md
@@ -0,0 +1,313 @@
+---
+{
+    "title": "Apache Doris & Hudi Quick Start",
+    "language": "en"
+}
+
+---
+
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+As a new open data management architecture, the Data Lakehouse integrates the 
high performance and real-time capabilities of data warehouses with the low 
cost and flexibility of data lakes, helping users more conveniently meet 
various data processing and analysis needs. It has been increasingly applied in 
enterprise big data systems.
+
+In recent versions, Apache Doris has deepened its integration with data lakes 
and has evolved a mature Data Lakehouse solution.
+
+- Since version 0.15, Apache Doris has introduced Hive and Iceberg external 
tables, exploring the capabilities of combining with Apache Iceberg for data 
lakes.
+- Starting from version 1.2, Apache Doris officially introduced the 
Multi-Catalog feature, enabling automatic metadata mapping and data access for 
various data sources, along with numerous performance optimizations for 
external data reading and query execution. It now fully possesses the ability 
to build a high-speed and user-friendly Lakehouse architecture.
+- In version 2.1, Apache Doris's Data Lakehouse architecture was significantly 
enhanced, improving the reading and writing capabilities of mainstream data 
lake formats (Hudi, Iceberg, Paimon, etc.), introducing compatibility with 
multiple SQL dialects, and seamless migration from existing systems to Apache 
Doris. For data science and large-scale data reading scenarios, Doris 
integrated the Arrow Flight high-speed reading interface, achieving a 100-fold 
increase in data transfer efficiency.
+
+![](/images/quick-start/lakehouse-arch.PNG)
+
+## Apache Doris & Hudi
+
+[Apache Hudi](https://hudi.apache.org/) is currently one of the most popular 
open data lake formats and a transactional data lake management platform, 
supporting various mainstream query engines including Apache Doris.
+
+Apache Doris has also enhanced its ability to read Apache Hudi data tables:
+
+- Supports Copy on Write Table: Snapshot Query
+- Supports Merge on Read Table: Snapshot Queries, Read Optimized Queries
+- Supports Time Travel
+- Supports Incremental Read
+
+With Apache Doris's high-performance query execution and Apache Hudi's 
real-time data management capabilities, efficient, flexible, and cost-effective 
data querying and analysis can be achieved. It also provides robust data 
lineage, auditing, and incremental processing functionalities. The combination 
of Apache Doris and Apache Hudi has been validated and promoted in real 
business scenarios by multiple community users:
+
+- Real-time data analysis and processing: Common scenarios such as real-time 
data updates and query analysis in industries like finance, advertising, and 
e-commerce require real-time data processing. Hudi enables real-time data 
updates and management while ensuring data consistency and reliability. Doris 
efficiently handles large-scale data query requests in real-time, meeting the 
demands of real-time data analysis and processing effectively when combined.
+- Data lineage and auditing: For industries with high requirements for data 
security and accuracy like finance and healthcare, data lineage and auditing 
are crucial functionalities. Hudi offers Time Travel functionality for viewing 
historical data states, combined with Apache Doris's efficient querying 
capabilities, enabling quick analysis of data at any point in time for precise 
lineage and auditing.
+- Incremental data reading and analysis: Large-scale data analysis often faces 
challenges of large data volumes and frequent updates. Hudi supports 
incremental data reading, allowing users to process only the changed data 
without full data updates. Additionally, Apache Doris's Incremental Read 
feature enhances this process, significantly improving data processing and 
analysis efficiency.
+- Cross-data source federated queries: Many enterprises have complex data 
sources stored in different databases. Doris's Multi-Catalog feature supports 
automatic mapping and synchronization of various data sources, enabling 
federated queries across data sources. This greatly shortens the data flow path 
and enhances work efficiency for enterprises needing to retrieve and integrate 
data from multiple sources for analysis.
+
+This article introduces how to quickly set up a test and demonstration environment for Apache Doris + Apache Hudi with Docker, and walks through the main operations to help readers get started quickly.
+
+For more information, please refer to [Hudi 
Catalog](../../lakehouse/datalake-analytics/hudi.md)
+
+## User Guide
+
+All scripts and code mentioned in this article can be obtained from this 
address: 
[https://github.com/apache/doris/tree/master/samples/datalake/hudi](https://github.com/apache/doris/tree/master/samples/datalake/hudi)
+
+### 01 Environment Preparation
+
+This article uses Docker Compose for deployment, with the following components 
and versions:
+
+| Component | Version |
+| --- | --- |
+| Apache Doris | Default 2.1.4, can be modified |
+| Apache Hudi | 0.14 |
+| Apache Spark | 3.4.2 |
+| Apache Hive | 2.1.3 |
+| MinIO | 2022-05-26T05-48-41Z |
+
+### 02 Environment Deployment
+
+1. Create a Docker network
+
+       `sudo docker network create -d bridge hudi-net`
+
+2. Start all components
+
+       `sudo ./start-hudi-compose.sh`
+       
+       > Note: Before starting, you can modify the `DORIS_PACKAGE` and `DORIS_DOWNLOAD_URL` in `start-hudi-compose.sh` to the desired Doris version. It is recommended to use version 2.1.4 or higher. (See the sketch at the end of this section for an example.)
+
+3. After starting, you can use the following scripts to log in to the Spark or Doris command line:
+
+       ```
+       -- Spark
+       sudo ./login-spark.sh
+       
+       -- Doris
+       sudo ./login-doris.sh
+       ```
+
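+Below is a minimal sketch of that change (the package name and download URL are assumed example values; check the actual variables in `start-hudi-compose.sh`):
+
+```
+# Version-related variables in start-hudi-compose.sh (example values only)
+DORIS_PACKAGE=apache-doris-2.1.4-bin-x64
+DORIS_DOWNLOAD_URL=https://apache-doris-releases.oss-accelerate.aliyuncs.com
+```
+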
+### 03 Data Preparation
+
+Next, generate Hudi data through Spark. As shown in the code below, there is 
already a Hive table named `customer` in the cluster. You can create a Hudi 
table using this Hive table:
+
+```
+-- ./login-spark.sh
+spark-sql> use default;
+
+-- create a COW table
+spark-sql> CREATE TABLE customer_cow
+USING hudi
+TBLPROPERTIES (
+  type = 'cow',
+  primaryKey = 'c_custkey',
+  preCombineField = 'c_name'
+)
+PARTITIONED BY (c_nationkey)
+AS SELECT * FROM customer;
+
+-- create a MOR table
+spark-sql> CREATE TABLE customer_mor
+USING hudi
+TBLPROPERTIES (
+  type = 'mor',
+  primaryKey = 'c_custkey',
+  preCombineField = 'c_name'
+)
+PARTITIONED BY (c_nationkey)
+AS SELECT * FROM customer;
+```
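+
+To confirm that the CTAS statements above produced data, you can run a quick check in the same Spark session (row counts are a property of the generated dataset, so they are not shown here):
+
+```
+-- Hedged sketch: verify that the newly created Hudi tables are populated.
+spark-sql> select count(*) from customer_cow;
+spark-sql> select count(*) from customer_mor;
+```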
+
+### 04 Data Query
+
+As shown below, a Catalog named `hive` has already been created in the Doris cluster (it can be viewed with `SHOW CATALOGS`). The following is the creation statement for this Catalog:
+
+```
+-- Already created, no need to execute again
+CREATE CATALOG `hive` PROPERTIES (
+    "type" = "hms",
+    "hive.metastore.uris" = "thrift://hive-metastore:9083",
+    "s3.access_key" = "minio",
+    "s3.secret_key" = "minio123",
+    "s3.endpoint" = "http://minio:9000",
+    "s3.region" = "us-east-1",
+    "use_path_style" = "true"
+);
+```
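+
+To confirm that the Catalog is visible before querying, you can list the catalogs from the Doris CLI (the exact output depends on your environment):
+
+```
+-- Hedged sketch: the hive Catalog created above should appear in the list.
+doris> SHOW CATALOGS;
+```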
+
+1. Manually refresh this Catalog to synchronize the newly created Hudi tables:
+
+       ```
+       -- ./login-doris.sh
+       doris> REFRESH CATALOG hive;
+       ```
+
+2. Data changes made to Hudi tables through Spark are immediately visible in Doris, with no need to refresh the Catalog again. The following inserts one row into both the COW and MOR tables using Spark:
+
+       ```
+       spark-sql> insert into customer_cow values (100, "Customer#000000100", 
"jD2xZzi", "25-430-914-2194", 3471.59, "BUILDING", "cial ideas. final, furious 
requests", 25);
+       spark-sql> insert into customer_mor values (100, "Customer#000000100", 
"jD2xZzi", "25-430-914-2194", 3471.59, "BUILDING", "cial ideas. final, furious 
requests", 25);
+       ```
+
+3. Through Doris, you can directly query the latest inserted data:
+
+       ```
+       doris> use hive.default;
+       doris> select * from customer_cow where c_custkey = 100;
+       doris> select * from customer_mor where c_custkey = 100;
+       ```
+
+4. Using Spark, insert a row whose key `c_custkey=32` already exists, thereby overwriting the existing data:
+
+       ```
+       spark-sql> insert into customer_cow values (32, 
"Customer#000000032_update", "jD2xZzi", "25-430-914-2194", 3471.59, "BUILDING", 
"cial ideas. final, furious requests", 15);
+       spark-sql> insert into customer_mor values (32, 
"Customer#000000032_update", "jD2xZzi", "25-430-914-2194", 3471.59, "BUILDING", 
"cial ideas. final, furious requests", 15);
+       ```
+
+5. With Doris, you can query the updated data:
+
+       ```
+       doris> select * from customer_cow where c_custkey = 32;
+       
+-----------+---------------------------+-----------+-----------------+-----------+--------------+-------------------------------------+-------------+
+       | c_custkey | c_name                    | c_address | c_phone         | 
c_acctbal | c_mktsegment | c_comment                           | c_nationkey |
+       
+-----------+---------------------------+-----------+-----------------+-----------+--------------+-------------------------------------+-------------+
+       |        32 | Customer#000000032_update | jD2xZzi   | 25-430-914-2194 | 
  3471.59 | BUILDING     | cial ideas. final, furious requests |          15 |
+       
+-----------+---------------------------+-----------+-----------------+-----------+--------------+-------------------------------------+-------------+
+       doris> select * from customer_mor where c_custkey = 32;
+       
+-----------+---------------------------+-----------+-----------------+-----------+--------------+-------------------------------------+-------------+
+       | c_custkey | c_name                    | c_address | c_phone         | 
c_acctbal | c_mktsegment | c_comment                           | c_nationkey |
+       
+-----------+---------------------------+-----------+-----------------+-----------+--------------+-------------------------------------+-------------+
+       |        32 | Customer#000000032_update | jD2xZzi   | 25-430-914-2194 | 
  3471.59 | BUILDING     | cial ideas. final, furious requests |          15 |
+       
+-----------+---------------------------+-----------+-----------------+-----------+--------------+-------------------------------------+-------------+
+       ```
+
+### 05 Incremental Read
+
+Incremental Read is one of the features provided by Hudi. With Incremental Read, users can obtain the incremental data within a specified time range, enabling incremental processing of data. Here, Doris can query the data that changed after the `c_custkey=100` row was inserted; as shown below, that is the updated row with `c_custkey=32`:
+
+```
+doris> select * from customer_cow@incr('beginTime'='20240603015018572');
++-----------+---------------------------+-----------+-----------------+-----------+--------------+-------------------------------------+-------------+
+| c_custkey | c_name                    | c_address | c_phone         | 
c_acctbal | c_mktsegment | c_comment                           | c_nationkey |
++-----------+---------------------------+-----------+-----------------+-----------+--------------+-------------------------------------+-------------+
+|        32 | Customer#000000032_update | jD2xZzi   | 25-430-914-2194 |   
3471.59 | BUILDING     | cial ideas. final, furious requests |          15 |
++-----------+---------------------------+-----------+-----------------+-----------+--------------+-------------------------------------+-------------+
+spark-sql> select * from hudi_table_changes('customer_cow', 'latest_state', 
'20240603015018572');
+
+doris> select * from customer_mor@incr('beginTime'='20240603015058442');
++-----------+---------------------------+-----------+-----------------+-----------+--------------+-------------------------------------+-------------+
+| c_custkey | c_name                    | c_address | c_phone         | 
c_acctbal | c_mktsegment | c_comment                           | c_nationkey |
++-----------+---------------------------+-----------+-----------------+-----------+--------------+-------------------------------------+-------------+
+|        32 | Customer#000000032_update | jD2xZzi   | 25-430-914-2194 |   
3471.59 | BUILDING     | cial ideas. final, furious requests |          15 |
++-----------+---------------------------+-----------+-----------------+-----------+--------------+-------------------------------------+-------------+
+spark-sql> select * from hudi_table_changes('customer_mor', 'latest_state', 
'20240603015058442');
+```
+
+### 06 TimeTravel
+
+Doris supports querying specific snapshot versions of Hudi data, thereby providing Time Travel over the data. First, you can query the commit history of the two Hudi tables using Spark:
+
+```
+spark-sql> call show_commits(table => 'customer_cow', limit => 10);
+20240603033556094        20240603033558249        commit        448833        
0        1        1        183        0        0
+20240603015444737        20240603015446588        commit        450238        
0        1        1        202        1        0
+20240603015018572        20240603015020503        commit        436692        
1        0        1        1        0        0
+20240603013858098        20240603013907467        commit        44902033       
 100        0        25        18751        0        0
+
+spark-sql> call show_commits(table => 'customer_mor', limit => 10);
+20240603033745977        20240603033748021        deltacommit        1240      
  0        1        1        0        0        0
+20240603015451860        20240603015453539        deltacommit        1434      
  0        1        1        1        1        0
+20240603015058442        20240603015100120        deltacommit        436691    
    1        0        1        1        0        0
+20240603013918515        20240603013922961        deltacommit        44904040  
      100        0        25        18751        0        0
+```
+
+Next, with Doris you can query the data snapshot taken before the `c_custkey=32` row was updated. As shown below, the data with `c_custkey=32` has not been updated yet:
+
+> Note: Time Travel syntax is currently not supported by the new optimizer. 
You need to first execute `set enable_nereids_planner=false;` to disable the 
new optimizer. This issue will be fixed in future versions.
+
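+For example, before running the Time Travel queries below you would execute:
+
+```
+-- Required before the Time Travel queries below (per the note above);
+-- the setting can be re-enabled afterwards if desired.
+doris> set enable_nereids_planner=false;
+```
+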
+```
+doris> select * from customer_cow for time as of '20240603015018572' where 
c_custkey = 32 or c_custkey = 100;
++-----------+--------------------+---------------------------------------+-----------------+-----------+--------------+--------------------------------------------------+-------------+
+| c_custkey | c_name             | c_address                             | 
c_phone         | c_acctbal | c_mktsegment | c_comment                          
              | c_nationkey |
++-----------+--------------------+---------------------------------------+-----------------+-----------+--------------+--------------------------------------------------+-------------+
+|        32 | Customer#000000032 | jD2xZzi UmId,DCtNBLXKj9q0Tlp2iQ6ZcO3J | 
25-430-914-2194 |   3471.53 | BUILDING     | cial ideas. final, furious 
requests across the e |          15 |
+|       100 | Customer#000000100 | jD2xZzi                               | 
25-430-914-2194 |   3471.59 | BUILDING     | cial ideas. final, furious 
requests              |          25 |
++-----------+--------------------+---------------------------------------+-----------------+-----------+--------------+--------------------------------------------------+-------------+
+-- compare with spark-sql
+spark-sql> select * from customer_mor timestamp as of '20240603015018572' 
where c_custkey = 32 or c_custkey = 100;
+
+doris> select * from customer_mor for time as of '20240603015058442' where 
c_custkey = 32 or c_custkey = 100;
++-----------+--------------------+---------------------------------------+-----------------+-----------+--------------+--------------------------------------------------+-------------+
+| c_custkey | c_name             | c_address                             | 
c_phone         | c_acctbal | c_mktsegment | c_comment                          
              | c_nationkey |
++-----------+--------------------+---------------------------------------+-----------------+-----------+--------------+--------------------------------------------------+-------------+
+|       100 | Customer#000000100 | jD2xZzi                               | 
25-430-914-2194 |   3471.59 | BUILDING     | cial ideas. final, furious 
requests              |          25 |
+|        32 | Customer#000000032 | jD2xZzi UmId,DCtNBLXKj9q0Tlp2iQ6ZcO3J | 
25-430-914-2194 |   3471.53 | BUILDING     | cial ideas. final, furious 
requests across the e |          15 |
++-----------+--------------------+---------------------------------------+-----------------+-----------+--------------+--------------------------------------------------+-------------+
+spark-sql> select * from customer_mor timestamp as of '20240603015058442' 
where c_custkey = 32 or c_custkey = 100;
+```
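+
+Similarly, you can point a Time Travel query at any other commit from the `show_commits` output above. As a sketch, using the first commit of `customer_cow` should return only the original `c_custkey=32` row, since `c_custkey=100` was inserted in a later commit (output omitted):
+
+```
+-- Hedged sketch: time travel to the initial bulk-load commit of customer_cow.
+doris> select * from customer_cow for time as of '20240603013858098' where c_custkey = 32 or c_custkey = 100;
+```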
+
+## Query Optimization
+
+Data in Apache Hudi can be roughly divided into two categories - baseline data 
and incremental data. Baseline data is typically merged Parquet files, while 
incremental data refers to data increments generated by INSERT, UPDATE, or 
DELETE operations. Baseline data can be read directly, while incremental data 
needs to be read through Merge on Read.
+
+For querying Hudi COW tables or Read Optimized queries on MOR tables, the data 
belongs to baseline data and can be directly read using Doris's native Parquet 
Reader, providing fast query responses. For incremental data, Doris needs to 
access Hudi's Java SDK through JNI calls. To achieve optimal query performance, 
Apache Doris divides the data in a query into baseline and incremental data 
parts and reads them using the aforementioned methods.
+
+To verify this optimization approach, you can use the EXPLAIN statement to see how many baseline and incremental splits are involved in the query example below. For the COW table, all 101 data shards are baseline data (`hudiNativeReadSplits=101/101`), so the COW table can be read entirely with Doris's native Parquet Reader, giving the best query performance. For the MOR table, most data shards are baseline data (`hudiNativeReadSplits=100/101`), with one shard being incremental data, which [...]
+
+```
+-- COW table is read natively
+doris> explain select * from customer_cow where c_custkey = 32;
+|   0:VHUDI_SCAN_NODE(68)                                        |
+|      table: customer_cow                                       |
+|      predicates: (c_custkey[#5] = 32)                          |
+|      inputSplitNum=101, totalFileSize=45338886, scanRanges=101 |
+|      partition=26/26                                           |
+|      cardinality=1, numNodes=1                                 |
+|      pushdown agg=NONE                                         |
+|      hudiNativeReadSplits=101/101                              |
+
+-- MOR table: only one split contains the updated `c_custkey = 32` in its log file, so 100 splits are read natively, while the split with the log file is read by JNI.
+doris> explain select * from customer_mor where c_custkey = 32;
+|   0:VHUDI_SCAN_NODE(68)                                        |
+|      table: customer_mor                                       |
+|      predicates: (c_custkey[#5] = 32)                          |
+|      inputSplitNum=101, totalFileSize=45340731, scanRanges=101 |
+|      partition=26/26                                           |
+|      cardinality=1, numNodes=1                                 |
+|      pushdown agg=NONE                                         |
+|      hudiNativeReadSplits=100/101                              |
+```
+
+You can further observe the changes in Hudi baseline data and incremental data 
by performing some deletion operations using Spark:
+
+```
+-- Use delete statement to see more differences
+spark-sql> delete from customer_cow where c_custkey = 64;
+doris> explain select * from customer_cow where c_custkey = 64;
+
+spark-sql> delete from customer_mor where c_custkey = 64;
+doris> explain select * from customer_mor where c_custkey = 64;
+```
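+
+You can also confirm from Doris that the deleted key is no longer visible; both queries below should return an empty result set:
+
+```
+-- Hedged sketch: verify the deletes are reflected in Doris query results.
+doris> select * from customer_cow where c_custkey = 64;
+doris> select * from customer_mor where c_custkey = 64;
+```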
+
+Additionally, you can further reduce the amount of data scanned by using partition conditions for partition pruning, which improves query speed. In the example below, partition pruning is applied through the partition condition `c_nationkey=15`, so the query only needs to access data from a single partition (`partition=1/26`).
+
+```
+-- customer_xxx is partitioned by c_nationkey, we can use the partition column 
to prune data
+doris> explain select * from customer_mor where c_custkey = 64 and c_nationkey 
= 15;
+|   0:VHUDI_SCAN_NODE(68)                                        |
+|      table: customer_mor                                       |
+|      predicates: (c_custkey[#5] = 64), (c_nationkey[#12] = 15) |
+|      inputSplitNum=4, totalFileSize=1798186, scanRanges=4      |
+|      partition=1/26                                            |
+|      cardinality=1, numNodes=1                                 |
+|      pushdown agg=NONE                                         |
+|      hudiNativeReadSplits=3/4                                  |
+```
diff --git a/docs/get-starting/quick-start.md 
b/versioned_docs/version-2.1/get-starting/quick-start/quick-start.md
similarity index 100%
rename from docs/get-starting/quick-start.md
rename to versioned_docs/version-2.1/get-starting/quick-start/quick-start.md
diff --git a/versioned_docs/version-2.1/lakehouse/datalake-analytics/hudi.md 
b/versioned_docs/version-2.1/lakehouse/datalake-analytics/hudi.md
index 985f1aa363f..83ea52476fb 100644
--- a/versioned_docs/version-2.1/lakehouse/datalake-analytics/hudi.md
+++ b/versioned_docs/version-2.1/lakehouse/datalake-analytics/hudi.md
@@ -24,6 +24,7 @@ specific language governing permissions and limitations
 under the License.
 -->
 
+[Apache Doris & Hudi Quick Start](../../get-starting/quick-start/doris-hudi.md)
 
 ## Usage
 
diff --git a/versioned_sidebars/version-2.0-sidebars.json 
b/versioned_sidebars/version-2.0-sidebars.json
index 8e272ae23ba..94bedc0a35e 100644
--- a/versioned_sidebars/version-2.0-sidebars.json
+++ b/versioned_sidebars/version-2.0-sidebars.json
@@ -5,7 +5,13 @@
             "label": "Getting Started",
             "items": [
                 "get-starting/what-is-apache-doris",
-                "get-starting/quick-start"
+                {
+                    "type": "category",
+                    "label": "Quick Start",
+                    "items": [
+                        "get-starting/quick-start/quick-start"
+                    ]
+                }
             ]
         },
         {
diff --git a/versioned_sidebars/version-2.1-sidebars.json 
b/versioned_sidebars/version-2.1-sidebars.json
index e895f5dd237..72016b03266 100644
--- a/versioned_sidebars/version-2.1-sidebars.json
+++ b/versioned_sidebars/version-2.1-sidebars.json
@@ -5,7 +5,14 @@
             "label": "Getting Started",
             "items": [
                 "get-starting/what-is-apache-doris",
-                "get-starting/quick-start"
+                {
+                    "type": "category",
+                    "label": "Quick Start",
+                    "items": [
+                        "get-starting/quick-start/quick-start",
+                        "get-starting/quick-start/doris-hudi"
+                    ]
+                }
             ]
         },
         {


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org

