This is an automated email from the ASF dual-hosted git repository.

morningman pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris-website.git


The following commit(s) were added to refs/heads/master by this push:
     new 04a7420020c (add)[hive] add hive build doc (#574)
04a7420020c is described below

commit 04a7420020c62626b8b894caa82f85df53fcc91e
Author: Mingyu Chen <morning...@163.com>
AuthorDate: Sun Apr 28 16:12:54 2024 +0800

    (add)[hive] add hive build doc (#574)
    
    This PR mainly changes:
    
    1. Refactored the document classification of the Lakehouse section.
    
            - Renamed the original `datalake` directory to 
`datalake-analytics`, mainly to place documents related to data lake analytics.
            - Added a new `datalake-building` directory for placing documents 
related to building the data lake.
            - Moved the `cloud-auth` document up one level in the hierarchy, 
making it a top-level directory under lakehouse.
            - Changed the title of the TVF document to "File Analysis".
    
    2. The above changes involve the dev and 2.1 documentation branches.
    
    3. The final document structure
    
            ```
            ├── lakehouse-overview.md
            ├── datalake-analytics
            │   ├── dlf.md
            │   ├── hive.md
            │   ├── hudi.md
            │   ├── iceberg.md
            │   └── paimon.md
            ├── datalake-building
            │   └── hive-build.md
            ├── database
            │   ├── es.md
            │   ├── jdbc.md
            │   └── max-compute.md
            ├── file.md
            ├── filecache.md
            ├── external-statistics.md
            ├── sql-dialect.md
            └── cloud-auth.md
            ```
---
 community/how-to-contribute/pull-request.md        |   2 +-
 docs/lakehouse/{datalake => }/cloud-auth.md        |   0
 .../{datalake => datalake-analytics}/dlf.md        |   0
 .../{datalake => datalake-analytics}/hive.md       |   0
 .../{datalake => datalake-analytics}/hudi.md       |   0
 .../{datalake => datalake-analytics}/iceberg.md    |   0
 .../{datalake => datalake-analytics}/paimon.md     |   0
 docs/lakehouse/datalake-building/hive-build.md     | 278 ++++++++++++++++++++
 docs/lakehouse/file.md                             |   6 +-
 .../current/how-to-contribute/pull-request.md      |   2 +-
 .../docusaurus-plugin-content-docs/current.json    |   6 +-
 .../current/lakehouse/{datalake => }/cloud-auth.md |   0
 .../{datalake => datalake-analytics}/dlf.md        |   0
 .../{datalake => datalake-analytics}/hive.md       |   0
 .../{datalake => datalake-analytics}/hudi.md       |   0
 .../{datalake => datalake-analytics}/iceberg.md    |   0
 .../{datalake => datalake-analytics}/paimon.md     |   0
 .../lakehouse/datalake-building/hive-build.md      | 283 +++++++++++++++++++++
 .../current/lakehouse/file.md                      |   2 +-
 .../version-2.1.json                               |   6 +-
 .../lakehouse/{datalake => }/cloud-auth.md         |   0
 .../{datalake => datalake-analytics}/dlf.md        |   0
 .../{datalake => datalake-analytics}/hive.md       |   0
 .../{datalake => datalake-analytics}/hudi.md       |   0
 .../{datalake => datalake-analytics}/iceberg.md    |   0
 .../{datalake => datalake-analytics}/paimon.md     |   0
 .../lakehouse/datalake-building/hive-build.md      | 283 +++++++++++++++++++++
 .../version-2.1/lakehouse/file.md                  |   2 +-
 sidebars.json                                      |  21 +-
 .../lakehouse/{datalake => }/cloud-auth.md         |   0
 .../{datalake => datalake-analytics}/dlf.md        |   0
 .../{datalake => datalake-analytics}/hive.md       |   0
 .../{datalake => datalake-analytics}/hudi.md       |   0
 .../{datalake => datalake-analytics}/iceberg.md    |   0
 .../{datalake => datalake-analytics}/paimon.md     |   0
 .../lakehouse/datalake-building/hive-build.md      | 278 ++++++++++++++++++++
 versioned_docs/version-2.1/lakehouse/file.md       |   4 +-
 versioned_sidebars/version-2.1-sidebars.json       |  21 +-
 38 files changed, 1168 insertions(+), 26 deletions(-)

diff --git a/community/how-to-contribute/pull-request.md 
b/community/how-to-contribute/pull-request.md
index 2c9d381148b..8fd907be632 100644
--- a/community/how-to-contribute/pull-request.md
+++ b/community/how-to-contribute/pull-request.md
@@ -80,7 +80,7 @@ git commit -a -m "<you_commit_message>"
 git push origin <your_branch_name>
 ```
 
-For more git usage, please visit: [git 
usage](https://www.atlassian.com/git/tutorials/set-up-a-repository), not to 
mention here.
+For more git usage, please visit: [git 
usage](https://docs.github.com/en/repositories/creating-and-managing-repositories/quickstart-for-repositories),
 not to mention here.
 
 ## 3. Create PR
 
diff --git a/docs/lakehouse/datalake/cloud-auth.md 
b/docs/lakehouse/cloud-auth.md
similarity index 100%
rename from docs/lakehouse/datalake/cloud-auth.md
rename to docs/lakehouse/cloud-auth.md
diff --git a/docs/lakehouse/datalake/dlf.md 
b/docs/lakehouse/datalake-analytics/dlf.md
similarity index 100%
rename from docs/lakehouse/datalake/dlf.md
rename to docs/lakehouse/datalake-analytics/dlf.md
diff --git a/docs/lakehouse/datalake/hive.md 
b/docs/lakehouse/datalake-analytics/hive.md
similarity index 100%
rename from docs/lakehouse/datalake/hive.md
rename to docs/lakehouse/datalake-analytics/hive.md
diff --git a/docs/lakehouse/datalake/hudi.md 
b/docs/lakehouse/datalake-analytics/hudi.md
similarity index 100%
rename from docs/lakehouse/datalake/hudi.md
rename to docs/lakehouse/datalake-analytics/hudi.md
diff --git a/docs/lakehouse/datalake/iceberg.md 
b/docs/lakehouse/datalake-analytics/iceberg.md
similarity index 100%
rename from docs/lakehouse/datalake/iceberg.md
rename to docs/lakehouse/datalake-analytics/iceberg.md
diff --git a/docs/lakehouse/datalake/paimon.md 
b/docs/lakehouse/datalake-analytics/paimon.md
similarity index 100%
rename from docs/lakehouse/datalake/paimon.md
rename to docs/lakehouse/datalake-analytics/paimon.md
diff --git a/docs/lakehouse/datalake-building/hive-build.md 
b/docs/lakehouse/datalake-building/hive-build.md
new file mode 100644
index 00000000000..e4988794dbe
--- /dev/null
+++ b/docs/lakehouse/datalake-building/hive-build.md
@@ -0,0 +1,278 @@
+---
+{
+    "title": "Hive",
+    "language": "en"
+}
+---
+
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+Starting from version 2.1.3, Doris supports DDL and DML operations for Hive. Users can directly create databases and tables in Hive through Doris and write data into Hive tables. With this feature, users can perform complete query and write operations on Hive through Doris, further simplifying the integrated lakehouse architecture.
+
+This article introduces Hive operations supported in Doris, including syntax 
and usage notes.
+
+## Metadata Creation and Deletion
+
+### Catalog
+
+- Create
+
+       ```
+       CREATE CATALOG [IF NOT EXISTS] hive PROPERTIES (
+           "type"="hms",
+           "hive.metastore.uris" = "thrift://172.21.16.47:7004",
+           "hadoop.username" = "hadoop",
+           "fs.defaultFS" = "hdfs://172.21.16.47:4007"
+       );
+       ```
+
+       Note that if you need to create Hive tables or write data through Doris, you must explicitly include the `fs.defaultFS` property in the Catalog properties. If the Catalog is created only for querying, this parameter can be omitted.
+
+       For more parameters, please refer to [Hive Catalog](../datalake-analytics/hive.md).
+
+- Drop
+
+       ```
+       DROP CATALOG [IF EXISTS] hive;
+       ```
+
+       Deleting a Catalog does not delete any database or table information in 
Hive. It merely removes the mapping to this Hive cluster in Doris.
+
+### Database
+
+- Create
+
+       You can switch to the corresponding Catalog and execute the `CREATE 
DATABASE` statement:
+
+       ```
+       SWITCH hive;
+       CREATE DATABASE [IF NOT EXISTS] hive_db;
+       ```
+
+       You can also create the database using its fully qualified name, or specify a location, for example:
+
+       ```
+       CREATE DATABASE [IF NOT EXISTS] hive.hive_db;
+
+       CREATE DATABASE [IF NOT EXISTS] hive.hive_db
+       PROPERTIES ('location'='hdfs://172.21.16.47:4007/path/to/db/');
+       ```
+
+       Later, you can view the Database's Location information using the `SHOW 
CREATE DATABASE` command:
+
+       ```
+       mysql> SHOW CREATE DATABASE hive_db;
+       
+----------+---------------------------------------------------------------------------------------------+
+       | Database | Create Database                                            
                                 |
+       
+----------+---------------------------------------------------------------------------------------------+
+       | hive_db  | CREATE DATABASE `hive_db` LOCATION 
'hdfs://172.21.16.47:4007/usr/hive/warehouse/hive_db.db' |
+       
+----------+---------------------------------------------------------------------------------------------+
+       ```
+
+- Drop
+
+       ```
+       DROP DATABASE [IF EXISTS] hive.hive_db;
+       ```
+
+       Note that for a Hive Database, all tables within the Database must be deleted first; otherwise, an error will occur. This operation also deletes the corresponding Database in Hive.
+
+### Table
+
+- Create
+
+       Doris supports creating partitioned or non-partitioned tables in Hive.
+
+       ```
+       -- Create unpartitioned hive table
+       CREATE TABLE unpartitioned_table (
+         `col1` BOOLEAN COMMENT 'col1',
+         `col2` INT COMMENT 'col2',
+         `col3` BIGINT COMMENT 'col3',
+         `col4` CHAR(10) COMMENT 'col4',
+         `col5` FLOAT COMMENT 'col5',
+         `col6` DOUBLE COMMENT 'col6',
+         `col7` DECIMAL(9,4) COMMENT 'col7',
+         `col8` VARCHAR(11) COMMENT 'col8',
+         `col9` STRING COMMENT 'col9'
+       )  ENGINE=hive
+       PROPERTIES (
+         'file_format'='parquet'
+       );
+
+       -- Create partitioned hive table
+       -- The partition columns must be in table's column definition list
+       CREATE TABLE partition_table (
+         `col1` BOOLEAN COMMENT 'col1',
+         `col2` INT COMMENT 'col2',
+         `col3` BIGINT COMMENT 'col3',
+         `col4` DECIMAL(2,1) COMMENT 'col4',
+         `pt1` VARCHAR COMMENT 'pt1',
+         `pt2` VARCHAR COMMENT 'pt2'
+       )  ENGINE=hive
+       PARTITION BY LIST (pt1, pt2) ()
+       PROPERTIES (
+         'file_format'='orc'
+       );
+       ```
+
+       After creation, you can view the Hive table creation statement using 
the `SHOW CREATE TABLE` command.
+
+       Note that, unlike Hive's table creation statement, in Doris the partition columns of a Hive partitioned table must also be included in the Table's Schema.
+
+- Drop
+
+       You can drop a Hive table using the `DROP TABLE` statement. Currently, 
deleting the table also removes the data, including partition data.
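+
+       For example (the fully qualified table name here is illustrative, following the pattern of the examples above):
+
+       ```
+       DROP TABLE [IF EXISTS] hive.hive_db.hive_tbl;
+       ```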
+
+- Column Types
+
+       The column types used when creating Hive tables in Doris correspond to 
those in Hive as follows:
+
+       | Doris | Hive |
+       |---|---|
+       | BOOLEAN    | BOOLEAN |
+       | TINYINT    | TINYINT |
+       | SMALLINT   | SMALLINT |
+       | INT        | INT |
+       | BIGINT     | BIGINT |
+       | DATE     | DATE |
+       | DATETIME | TIMESTAMP |
+       | FLOAT      | FLOAT |
+       | DOUBLE     | DOUBLE |
+       | CHAR       | CHAR |
+       | VARCHAR    | STRING |
+       | STRING     | STRING |
+       | DECIMAL  | DECIMAL |
+       | ARRAY      | ARRAY |
+       | MAP        | MAP |
+       | STRUCT     | STRUCT |
+
+       > - Column types can only be nullable (the default); NOT NULL is not supported.
+
+       > - Hive 3.0 supports setting default values. If you need to set default values, you need to explicitly add `"hive.version" = "3.0.0"` in the Catalog properties, as sketched after these notes.
+       
+       > - When inserting data, if a value is not compatible with the target type, such as `'abc'` being inserted into a numeric column, it will be converted to null before insertion.
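+
+       As a minimal sketch (reusing the Catalog example from above; only the added `hive.version` property is new):
+
+       ```
+       CREATE CATALOG [IF NOT EXISTS] hive PROPERTIES (
+           "type"="hms",
+           "hive.metastore.uris" = "thrift://172.21.16.47:7004",
+           "hadoop.username" = "hadoop",
+           "fs.defaultFS" = "hdfs://172.21.16.47:4007",
+           "hive.version" = "3.0.0"
+       );
+       ```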
+
+- Partitions
+
+       The partition types in Hive correspond to the List partition in Doris. 
Therefore, when creating a Hive partitioned table in Doris, you need to use the 
List partition table creation statement, but there is no need to explicitly 
enumerate each partition. When writing data, Doris will automatically create 
the corresponding Hive partition based on the values of the data.
+
+       Supports creating single-column or multi-column partitioned tables.
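+
+       For example (an illustrative sketch using the `partition_table` defined above), writing a row whose `pt1`/`pt2` values do not yet exist automatically creates the corresponding Hive partition:
+
+       ```
+       INSERT INTO partition_table VALUES (true, 1, 10, 1.5, "beijing", "2023-12-12");
+       ```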
+
+- File Formats
+
+       - Parquet
+       - ORC (default format)
+
+- Compression Formats
+
+       TODO
+
+- Storage Medium
+
+       - Currently, only HDFS is supported; future versions will support object storage.
+
+## Data Operations
+
+Data can be written into Hive tables through INSERT statements.
+
+Writing is supported to Hive tables created by Doris, or to existing Hive tables whose formats are supported.
+
+For partitioned tables, data will automatically be written to the 
corresponding partition or new partitions will be created.
+
+Currently, writing to specific partitions is not supported.
+
+### INSERT
+
+The INSERT operation appends data to the target table.
+
+```
+INSERT INTO hive_tbl values (val1, val2, val3, val4);
+INSERT INTO hive.hive_db.hive_tbl SELECT col1, col2 FROM internal.db1.tbl1;
+
+INSERT INTO hive_tbl(col1, col2) values (val1, val2);
+INSERT INTO hive_tbl(col1, col2, partition_col1, partition_col2) values (1, 2, 
"beijing", "2023-12-12");
+```
+
+### INSERT OVERWRITE
+
+The INSERT OVERWRITE operation completely overwrites the existing data in the 
table with new data.
+
+```
+INSERT OVERWRITE TABLE hive_tbl VALUES (val1, val2, val3, val4);
+INSERT OVERWRITE TABLE hive.hive_db.hive_tbl(col1, col2) SELECT col1, col2 
FROM internal.db1.tbl1;
+```
+
+### CTAS (CREATE TABLE AS SELECT)
+
+A Hive table can be created and populated with data using the `CTAS (CREATE 
TABLE AS SELECT)` statement:
+
+```
+CREATE TABLE hive_ctas ENGINE=hive AS SELECT * FROM other_table;
+```
+
+CTAS supports specifying file formats, partitioning methods, and other 
information, such as:
+
+```
+CREATE TABLE hive_ctas ENGINE=hive
+PARTITION BY LIST (pt1, pt2) ()
+AS SELECT col1,pt1,pt2 FROM part_ctas_src WHERE col1>0;
+
+CREATE TABLE hive.hive_db.hive_ctas (col1,col2,pt1) ENGINE=hive
+PARTITION BY LIST (pt1) ()
+PROPERTIES (
+"file_format"="parquet",
+"parquet.compression"="zstd"
+)
+AS SELECT col1,pt1 as col2,pt2 as pt1 FROM test_ctas.part_ctas_src WHERE 
col1>0;
+```
+
+## Exception Data and Data Transformation
+
+TODO
+
+## Transaction Mechanism
+
+Write operations to Hive are placed in a separate transaction. Until the 
transaction is committed, the data is not visible externally. Only after 
committing the transaction do the table's related operations become visible to 
others.
+
+Transactions ensure the atomicity of operations—all operations within a 
transaction either succeed completely or fail altogether.
+
+Transactions do not fully guarantee isolation of operations; they strive to 
minimize the inconsistency window by separating file system operations from 
metadata operations on the Hive Metastore.
+
+For example, consider a transaction that modifies multiple partitions of a Hive table and is divided into two batches. When the first batch has completed but the second has not yet started, the partitions from the first batch are already visible externally and can be read, while the partitions from the second batch cannot.
+
+If any anomalies occur during the transaction commit process, the transaction 
will be directly rolled back, including modifications to HDFS files and 
metadata in the Hive Metastore, without requiring further action from the user.
+
+## Relevant Parameters
+
+### FE
+
+TODO
+
+### BE
+
+| Parameter Name | Description | Default Value |
+| --- | --- | --- |
+| `hive_sink_max_file_size` | Maximum file size for data files. When the 
volume of written data exceeds this size, the current file is closed, and a new 
file is opened for continued writing. | 1GB |
+| `table_sink_partition_write_max_partition_nums_per_writer` | Maximum number 
of partitions that can be written by each Instance on a BE node. |  128 |
+| `table_sink_non_partition_write_scaling_data_processed_threshold` | 
Threshold of data volume for starting scaling-write in non-partitioned tables. 
For every increase of 
`table_sink_non_partition_write_scaling_data_processed_threshold` in data 
volume, a new writer (instance) will be engaged for writing. The scaling-write 
mechanism aims to use a different number of writers (instances) based on the 
volume of data to increase the throughput of concurrent writing. When the 
volume of data is [...]
+| `table_sink_partition_write_min_data_processed_rebalance_threshold` | 
Minimum data volume threshold for triggering rebalance in partitioned tables. 
If `current accumulated data volume` - `data volume accumulated since the last 
rebalance or from the start` >= 
`table_sink_partition_write_min_data_processed_rebalance_threshold`, 
rebalancing is triggered. If there is a significant difference in the final 
file sizes, you can reduce this threshold to increase balance. However, too 
small a th [...]
+| 
`table_sink_partition_write_min_partition_data_processed_rebalance_threshold` | 
Minimum data volume threshold per partition for rebalancing in partitioned 
tables. If `current partition's data volume` >= `threshold` * `number of tasks 
allocated to the current partition`, rebalancing for that partition begins. If 
there is a significant difference in the final file sizes, you can reduce this 
threshold to increase balance. However, too small a threshold may increase the 
cost of rebalancing [...]
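+
+These are BE configuration items. As a minimal sketch (assuming the usual `conf/be.conf` key=value format; the values below are purely illustrative, not recommendations):
+
+```
+# conf/be.conf
+hive_sink_max_file_size = 1073741824
+table_sink_partition_write_max_partition_nums_per_writer = 128
+```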
+
diff --git a/docs/lakehouse/file.md b/docs/lakehouse/file.md
index d5b4b7fdc22..b4f7ab24870 100644
--- a/docs/lakehouse/file.md
+++ b/docs/lakehouse/file.md
@@ -1,6 +1,6 @@
 ---
 {
-    "title": "Table Value Function (TVF)",
+    "title": "File Analytics",
     "language": "en"
 }
 ---
@@ -24,8 +24,6 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-
-
 With the Table Value Function feature, Doris is able to query files in object 
storage or HDFS as simply as querying Tables. In addition, it supports 
automatic column type inference.
 
 ## Usage
@@ -197,4 +195,4 @@ FROM s3(
 
 1. If the URI specified by the `S3 / HDFS` TVF is not matched with the file, 
or all the matched files are empty files, then the` S3 / HDFS` TVF will return 
to the empty result set. In this case, using the `DESC FUNCTION` to view the 
schema of this file, you will get a dummy column` __dummy_col`, which can be 
ignored.
 
-2. If the format of the TVF is specified to `CSV`, and the read file is not a 
empty file but the first line of this file is empty, then it will prompt the 
error `The first line is empty, can not parse column numbers`. This is because 
the schema cannot be parsed from the first line of the file
\ No newline at end of file
+2. If the format of the TVF is specified to `CSV`, and the read file is not a 
empty file but the first line of this file is empty, then it will prompt the 
error `The first line is empty, can not parse column numbers`. This is because 
the schema cannot be parsed from the first line of the file
diff --git 
a/i18n/zh-CN/docusaurus-plugin-content-docs-community/current/how-to-contribute/pull-request.md
 
b/i18n/zh-CN/docusaurus-plugin-content-docs-community/current/how-to-contribute/pull-request.md
index 24063be28ec..b354df9ddd1 100644
--- 
a/i18n/zh-CN/docusaurus-plugin-content-docs-community/current/how-to-contribute/pull-request.md
+++ 
b/i18n/zh-CN/docusaurus-plugin-content-docs-community/current/how-to-contribute/pull-request.md
@@ -82,7 +82,7 @@ git commit -a -m "<you_commit_message>"
 git push origin <your_branch_name>
 ```
 
-更多 git 使用方法请访问:[git 
使用](https://www.atlassian.com/git/tutorials/setting-up-a-repository),这里不赘述。
+更多 git 使用方法请访问:[git 
使用](https://docs.github.com/en/repositories/creating-and-managing-repositories/quickstart-for-repositories),这里不赘述。
 
 ## 3. 创建 PR
 
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current.json 
b/i18n/zh-CN/docusaurus-plugin-content-docs/current.json
index 4fd6f570831..132a179f7ef 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/current.json
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current.json
@@ -67,6 +67,10 @@
     "message": "数据湖分析",
     "description": "The label for category Data Lake Analytics in sidebar docs"
   },
+  "sidebar.docs.category.Data Lake Building": {
+    "message": "数据湖构建",
+    "description": "The label for category Data Lake Building in sidebar docs"
+  },
   "sidebar.docs.category.Database Analytics": {
     "message": "数据库分析",
     "description": "The label for category Database Analytics.Multi Catalog in 
sidebar docs"
@@ -367,4 +371,4 @@
     "message": "存算分离",
     "description": "Separation of Storage and Compute"
   }
-}
\ No newline at end of file
+}
diff --git 
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/datalake/cloud-auth.md
 b/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/cloud-auth.md
similarity index 100%
rename from 
i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/datalake/cloud-auth.md
rename to 
i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/cloud-auth.md
diff --git 
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/datalake/dlf.md 
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/datalake-analytics/dlf.md
similarity index 100%
rename from 
i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/datalake/dlf.md
rename to 
i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/datalake-analytics/dlf.md
diff --git 
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/datalake/hive.md 
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/datalake-analytics/hive.md
similarity index 100%
rename from 
i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/datalake/hive.md
rename to 
i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/datalake-analytics/hive.md
diff --git 
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/datalake/hudi.md 
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/datalake-analytics/hudi.md
similarity index 100%
rename from 
i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/datalake/hudi.md
rename to 
i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/datalake-analytics/hudi.md
diff --git 
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/datalake/iceberg.md
 
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/datalake-analytics/iceberg.md
similarity index 100%
rename from 
i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/datalake/iceberg.md
rename to 
i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/datalake-analytics/iceberg.md
diff --git 
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/datalake/paimon.md
 
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/datalake-analytics/paimon.md
similarity index 100%
rename from 
i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/datalake/paimon.md
rename to 
i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/datalake-analytics/paimon.md
diff --git 
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/datalake-building/hive-build.md
 
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/datalake-building/hive-build.md
new file mode 100644
index 00000000000..ebfeda38fcb
--- /dev/null
+++ 
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/datalake-building/hive-build.md
@@ -0,0 +1,283 @@
+---
+{
+    "title": "Hive",
+    "language": "zh-CN"
+}
+---
+
+<!-- 
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+自 2.1.3 版本开始,Doris 支持对 Hive 的 DDL 和 DML 操作。用户可以直接通过 Doris 在 Hive 中创建库表,并将数据写入到 
Hive 表中。通过该功能,用户可以通过 Doris 对 Hive 进行完整的数据查询和写入操作,进一步帮助用户简化湖仓一体架构。
+
+本文介绍在 Doris 中支持的 Hive 操作,语法和使用须知。
+
+## 元数据创建与删除
+
+### Catalog
+
+- 创建
+
+       ```
+       CREATE CATALOG [IF NOT EXISTS] hive PROPERTIES (
+           "type"="hms",
+           "hive.metastore.uris" = "thrift://172.21.16.47:7004",
+           "hadoop.username" = "hadoop",
+           "fs.defaultFS" = "hdfs://172.21.16.47:4007"
+       );
+       ```
+               
+       注意,如果需要通过 Doris 创建 Hive 表或写入数据,需要在 Catalog 属性中显式增加 `fs.defaultFS` 属性。如果创建 Catalog 仅用于查询,则该参数可以省略。
+       
+       更多参数,请参阅 [Hive Catalog](../datalake-analytics/hive.md)
+
+- 删除
+
+       ```
+       DROP CATALOG [IF EXISTS] hive;
+       ```
+       
+       删除 Catalog 并不会删除 hive 中的任何库表信息。仅仅是在 Doris 中移除了对这个 Hive 集群的映射。
+       
+### Database
+
+- 创建
+
+       可以通过 `SWITCH` 语句切换到对应的 Catalog 下,执行 `CREATE DATABASE` 语句:
+               
+       ```
+       SWITCH hive;
+       CREATE DATABASE [IF NOT EXISTS] hive_db;
+       ```
+               
+       也可以使用全限定名创建,或指定 location,如:
+               
+       ```
+       CREATE DATABASE [IF NOT EXISTS] hive.hive_db;
+               
+       CREATE DATABASE [IF NOT EXISTS] hive.hive_db
+       PROPERTIES ('location'='hdfs://172.21.16.47:4007/path/to/db/');
+       ```
+               
+       之后可以通过 `SHOW CREATE DATABASE` 命令可以查看 Database 的 Location 信息:
+               
+       ```
+       mysql> SHOW CREATE DATABASE hive_db;
+       
+----------+---------------------------------------------------------------------------------------------+
+       | Database | Create Database                                            
                                 |
+       
+----------+---------------------------------------------------------------------------------------------+
+       | hive_db  | CREATE DATABASE `hive_db` LOCATION 
'hdfs://172.21.16.47:4007/usr/hive/warehouse/hive_db.db' |
+       
+----------+---------------------------------------------------------------------------------------------+
+       ```
+
+- 删除
+
+       ```
+       DROP DATABASE [IF EXISTS] hive.hive_db;
+       ```
+               
+       注意,对于 Hive Database,必须先删除这个 Database 下的所有表后,才能删除 
Database,否则会报错。这个操作会同步删除 Hive 中对应的 Database。
+
+       
+### Table
+
+- 创建
+
+       Doris 支持在 Hive 中创建分区或非分区表。
+       
+       ```
+       -- Create unpartitioned hive table
+       CREATE TABLE unpartitioned_table (
+         `col1` BOOLEAN COMMENT 'col1',
+         `col2` INT COMMENT 'col2',
+         `col3` BIGINT COMMENT 'col3',
+         `col4` CHAR(10) COMMENT 'col4',
+         `col5` FLOAT COMMENT 'col5',
+         `col6` DOUBLE COMMENT 'col6',
+         `col7` DECIMAL(9,4) COMMENT 'col7',
+         `col8` VARCHAR(11) COMMENT 'col8',
+         `col9` STRING COMMENT 'col9'
+       )  ENGINE=hive
+       PROPERTIES (
+         'file_format'='parquet'
+       );
+       
+       -- Create partitioned hive table
+       -- The partition columns must be in table's column definition list
+       CREATE TABLE partition_table (
+         `col1` BOOLEAN COMMENT 'col1',
+         `col2` INT COMMENT 'col2',
+         `col3` BIGINT COMMENT 'col3',
+         `col4` DECIMAL(2,1) COMMENT 'col4',
+         `pt1` VARCHAR COMMENT 'pt1',
+         `pt2` VARCHAR COMMENT 'pt2'
+       )  ENGINE=hive
+       PARTITION BY LIST (pt1, pt2) ()
+       PROPERTIES (
+         'file_format'='orc'
+       );
+       ```
+       
+       创建后,可以通过 `SHOW CREATE TABLE` 命令查看 Hive 的建表语句。
+       
+       注意,不同于 Hive 中的建表语句。在 Doris 中创建 Hive 分区表时,分区列也必须写到 Table 的 Schema 中。
+
+- 删除
+
+       可以通过 `DROP TABLE` 语句删除一个 Hive 表。当前删除表后,会同时删除数据,包括分区数据。
+       
+- 列类型
+
+       在 Doris 中创建 Hive 表所使用的列类型,和 Hive 中的列类型对应关系如下
+       
+       | Doris | Hive |
+       |---|---|
+       | BOOLEAN    | BOOLEAN |
+       | TINYINT    | TINYINT |
+       | SMALLINT   | SMALLINT |
+       | INT        | INT |
+       | BIGINT     | BIGINT |
+       | DATE     | DATE |
+       | DATETIME | TIMESTAMP |
+       | FLOAT      | FLOAT |
+       | DOUBLE     | DOUBLE |
+       | CHAR       | CHAR |
+       | VARCHAR    | STRING |
+       | STRING     | STRING |
+       | DECIMAL  | DECIMAL |
+       | ARRAY      | ARRAY |
+       | MAP        | MAP |
+       | STRUCT     | STRUCT |
+       
+       - 列类型只能为默认的 nullable,不支持  NOT NULL。
+       - Hive 3.0 支持设置默认值。如果需要设置默认值,则需要在 Catalog 属性中显式地添加 `"hive.version" = "3.0.0"`。
+       - 插入数据时,如果类型不能够兼容,例如 `'abc'` 插入到数值类型,则会转为 null 值后插入。
+
+- 分区
+
+       Hive 中的分区类型对应 Doris 中的 List 分区。因此,在 Doris 中创建 Hive 分区表,需使用 List 分区的建表语句,但无需显式地枚举各个分区。在写入数据时,Doris 会根据数据的值,自动创建对应的 Hive 分区。
+
+       支持创建单列或多列分区表。
+       
+- 文件格式
+
+       - Parquet
+       - ORC(默认格式)
+
+- 压缩格式
+
+       TODO
+
+- 存储介质
+
+       - 目前仅支持 HDFS,后续版本将支持对象存储。
+
+## 数据操作
+
+可以通过 INSERT 语句将数据写入到 Hive 表中。
+
+支持写入到由 Doris 创建的 Hive 表,或者 Hive 中已存在的且格式支持的表。
+
+对于分区表,会根据数据,自动写入到对应分区,或者创建新的分区。
+
+目前不支持指定分区写入。
+
+### INSERT
+
+INSERT 操作会将数据以追加的方式写入到目标表中。
+
+```
+INSERT INTO hive_tbl values (val1, val2, val3, val4);
+INSERT INTO hive.hive_db.hive_tbl SELECT col1, col2 FROM internal.db1.tbl1;
+
+INSERT INTO hive_tbl(col1, col2) values (val1, val2);
+INSERT INTO hive_tbl(col1, col2, partition_col1, partition_col2) values (1, 2, 
"beijing", "2023-12-12");
+```
+
+### INSERT OVERWRITE
+
+INSERT OVERWRITE 会使用新的数据完全覆盖原有表中的数据。
+
+```
+INSERT OVERWRITE TABLE hive_tbl VALUES (val1, val2, val3, val4);
+INSERT OVERWRITE TABLE hive.hive_db.hive_tbl(col1, col2) SELECT col1, col2 
FROM internal.db1.tbl1;
+```
+
+### CTAS(CREATE TABLE AS SELECT)
+       
+可以通过 `CTAS(CREATE TABLE AS SELECT)` 语句创建 Hive 表并写入数据:
+       
+```
+CREATE TABLE hive_ctas ENGINE=hive AS SELECT * FROM other_table;
+```
+       
+CTAS 支持指定文件格式、分区方式等信息,如:
+       
+```
+CREATE TABLE hive_ctas ENGINE=hive
+PARTITION BY LIST (pt1, pt2) ()
+AS SELECT col1,pt1,pt2 FROM part_ctas_src WHERE col1>0;
+       
+CREATE TABLE hive.hive_db.hive_ctas (col1,col2,pt1) ENGINE=hive
+PARTITION BY LIST (pt1) ()
+PROPERTIES (
+       "file_format"="parquet",
+       "parquet.compression"="zstd"
+)
+AS SELECT col1,pt1 as col2,pt2 as pt1 FROM test_ctas.part_ctas_src WHERE 
col1>0;
+```
+
+## 异常数据和数据转换
+
+TODO
+
+## 事务机制
+
+对 Hive 的写入操作会被放在一个单独的事务里,在事务提交前,数据对外不可见。只有当提交该事务后,表的相关操作才对其他人可见。
+
+事务能保证操作的原子性,事务内的所有操作,要么全部成功,要么全部失败。
+
+事务不能完全保证操作的隔离性,只能尽力而为,通过分离文件系统操作和对 Hive Metastore 的元数据操作来尽量减少不一致的时间窗口。
+
+比如在一个事务中,需要修改 Hive 
表的多个分区。假设这个任务分成两批进行操作,在第一批操作已经完成、第二批操作还未完成时,第一批分区已经对外可见,外部可以读取到第一批分区,但读不到第二批分区。
+
+在事务提交过程中出现任何异常,都会直接回退该事务,包括对 HDFS 文件的修改、以及对 Hive Metastore 元数据的修改,不需要用户做其他处理。
+
+## 相关参数
+
+### FE
+
+TODO
+
+### BE
+
+| 参数名称 | 描述 | 默认值 |
+| --- | --- | --- |
+| `hive_sink_max_file_size` | 最大的数据文件大小。当写入数据量超过该大小后会关闭当前文件,滚动产生一个新文件继续写入。| 
1GB |
+| `table_sink_partition_write_max_partition_nums_per_writer` | BE 节点上每个 
Instance 最大写入的分区数目。 |  128 |
+| `table_sink_non_partition_write_scaling_data_processed_threshold` | 非分区表开始 
scaling-write 的数据量阈值。每增加 
`table_sink_non_partition_write_scaling_data_processed_threshold` 数据就会发送给一个新的 
writer(instance) 进行写入。scaling-write 机制主要是为了根据数据量来使用不同数目的 writer(instance) 
来进行写入,会随着数据量的增加而增大写入的 writer(instance) 
数目,从而提高并发写入的吞吐。当数据量比较少的时候也会节省资源,并且尽可能地减少产生的文件数目。 | 25MB |
+| `table_sink_partition_write_min_data_processed_rebalance_threshold` | 分区表开始触发重平衡的最少数据量阈值。如果 `当前累积的数据量` - `自从上次触发重平衡或者最开始累积的数据量` >= `table_sink_partition_write_min_data_processed_rebalance_threshold`,就开始触发重平衡机制。如果发现最终生成的文件大小差异过大,可以调小该阈值来增加均衡度。当然过小的阈值会导致重平衡的成本增加,可能会影响性能。 | 25MB |
+| `table_sink_partition_write_min_partition_data_processed_rebalance_threshold` | 分区表开始进行重平衡时的最少的分区数据量阈值。如果 `当前分区的数据量` >= `阈值` * `当前分区已经分配的 task 数目`,就开始对该分区进行重平衡。如果发现最终生成的文件大小差异过大,可以调小该阈值来增加均衡度。当然过小的阈值会导致重平衡的成本增加,可能会影响性能。 | 15MB |
+
+
+
+
+
+
+
diff --git 
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/file.md 
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/file.md
index 97470c6ea67..0eb31aef3b5 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/file.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/file.md
@@ -1,6 +1,6 @@
 ---
 {
-    "title": "TVF",
+    "title": "文件分析",
     "language": "zh-CN"
 }
 ---
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1.json 
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1.json
index f3074fdd56c..f4b6eb47b95 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1.json
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1.json
@@ -67,6 +67,10 @@
     "message": "数据湖分析",
     "description": "The label for category Data Lake Analytics in sidebar docs"
   },
+  "sidebar.docs.category.Data Lake Building": {
+    "message": "数据湖构建",
+    "description": "The label for category Data Lake Building in sidebar docs"
+  },
   "sidebar.docs.category.Database Analytics": {
     "message": "数据库分析",
     "description": "The label for category Database Analytics.Multi Catalog in 
sidebar docs"
@@ -363,4 +367,4 @@
     "message": "外部表",
     "description": "The label for category Lakehouse.External Table in sidebar 
docs"
   }
-}
\ No newline at end of file
+}
diff --git 
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse/datalake/cloud-auth.md
 b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse/cloud-auth.md
similarity index 100%
rename from 
i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse/datalake/cloud-auth.md
rename to 
i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse/cloud-auth.md
diff --git 
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse/datalake/dlf.md
 
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse/datalake-analytics/dlf.md
similarity index 100%
rename from 
i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse/datalake/dlf.md
rename to 
i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse/datalake-analytics/dlf.md
diff --git 
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse/datalake/hive.md
 
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse/datalake-analytics/hive.md
similarity index 100%
rename from 
i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse/datalake/hive.md
rename to 
i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse/datalake-analytics/hive.md
diff --git 
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse/datalake/hudi.md
 
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse/datalake-analytics/hudi.md
similarity index 100%
rename from 
i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse/datalake/hudi.md
rename to 
i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse/datalake-analytics/hudi.md
diff --git 
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse/datalake/iceberg.md
 
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse/datalake-analytics/iceberg.md
similarity index 100%
rename from 
i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse/datalake/iceberg.md
rename to 
i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse/datalake-analytics/iceberg.md
diff --git 
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse/datalake/paimon.md
 
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse/datalake-analytics/paimon.md
similarity index 100%
rename from 
i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse/datalake/paimon.md
rename to 
i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse/datalake-analytics/paimon.md
diff --git 
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse/datalake-building/hive-build.md
 
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse/datalake-building/hive-build.md
new file mode 100644
index 00000000000..ebfeda38fcb
--- /dev/null
+++ 
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse/datalake-building/hive-build.md
@@ -0,0 +1,283 @@
+---
+{
+    "title": "Hive",
+    "language": "zh-CN"
+}
+---
+
+<!-- 
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+自 2.1.3 版本开始,Doris 支持对 Hive 的 DDL 和 DML 操作。用户可以直接通过 Doris 在 Hive 中创建库表,并将数据写入到 
Hive 表中。通过该功能,用户可以通过 Doris 对 Hive 进行完整的数据查询和写入操作,进一步帮助用户简化湖仓一体架构。
+
+本文介绍在 Doris 中支持的 Hive 操作,语法和使用须知。
+
+## 元数据创建与删除
+
+### Catalog
+
+- 创建
+
+       ```
+       CREATE CATALOG [IF NOT EXISTS] hive PROPERTIES (
+           "type"="hms",
+           "hive.metastore.uris" = "thrift://172.21.16.47:7004",
+           "hadoop.username" = "hadoop",
+           "fs.defaultFS" = "hdfs://172.21.16.47:4007"
+       );
+       ```
+               
+       注意,如果需要通过 Doris 创建 Hive 表或写入数据,需要在 Catalog 属性中显式增加 `fs.defaultFS` 属性。如果创建 Catalog 仅用于查询,则该参数可以省略。
+       
+       更多参数,请参阅 [Hive Catalog](../datalake-analytics/hive.md)
+
+- 删除
+
+       ```
+       DROP CATALOG [IF EXISTS] hive;
+       ```
+       
+       删除 Catalog 并不会删除 hive 中的任何库表信息。仅仅是在 Doris 中移除了对这个 Hive 集群的映射。
+       
+### Database
+
+- 创建
+
+       可以通过 `SWITCH` 语句切换到对应的 Catalog 下,执行 `CREATE DATABASE` 语句:
+               
+       ```
+       SWITCH hive;
+       CREATE DATABASE [IF NOT EXISTS] hive_db;
+       ```
+               
+       也可以使用全限定名创建,或指定 location,如:
+               
+       ```
+       CREATE DATABASE [IF NOT EXISTS] hive.hive_db;
+               
+       CREATE DATABASE [IF NOT EXISTS] hive.hive_db
+       PROPERTIES ('location'='hdfs://172.21.16.47:4007/path/to/db/');
+       ```
+               
+       之后可以通过 `SHOW CREATE DATABASE` 命令可以查看 Database 的 Location 信息:
+               
+       ```
+       mysql> SHOW CREATE DATABASE hive_db;
+       
+----------+---------------------------------------------------------------------------------------------+
+       | Database | Create Database                                            
                                 |
+       
+----------+---------------------------------------------------------------------------------------------+
+       | hive_db  | CREATE DATABASE `hive_db` LOCATION 
'hdfs://172.21.16.47:4007/usr/hive/warehouse/hive_db.db' |
+       
+----------+---------------------------------------------------------------------------------------------+
+       ```
+
+- 删除
+
+       ```
+       DROP DATABASE [IF EXISTS] hive.hive_db;
+       ```
+               
+       注意,对于 Hive Database,必须先删除这个 Database 下的所有表后,才能删除 
Database,否则会报错。这个操作会同步删除 Hive 中对应的 Database。
+
+       
+### Table
+
+- 创建
+
+       Doris 支持在 Hive 中创建分区或非分区表。
+       
+       ```
+       -- Create unpartitioned hive table
+       CREATE TABLE unpartitioned_table (
+         `col1` BOOLEAN COMMENT 'col1',
+         `col2` INT COMMENT 'col2',
+         `col3` BIGINT COMMENT 'col3',
+         `col4` CHAR(10) COMMENT 'col4',
+         `col5` FLOAT COMMENT 'col5',
+         `col6` DOUBLE COMMENT 'col6',
+         `col7` DECIMAL(9,4) COMMENT 'col7',
+         `col8` VARCHAR(11) COMMENT 'col8',
+         `col9` STRING COMMENT 'col9'
+       )  ENGINE=hive
+       PROPERTIES (
+         'file_format'='parquet'
+       );
+       
+       -- Create partitioned hive table
+       -- The partition columns must be in table's column definition list
+       CREATE TABLE partition_table (
+         `col1` BOOLEAN COMMENT 'col1',
+         `col2` INT COMMENT 'col2',
+         `col3` BIGINT COMMENT 'col3',
+         `col4` DECIMAL(2,1) COMMENT 'col4',
+         `pt1` VARCHAR COMMENT 'pt1',
+         `pt2` VARCHAR COMMENT 'pt2'
+       )  ENGINE=hive
+       PARTITION BY LIST (pt1, pt2) ()
+       PROPERTIES (
+         'file_format'='orc'
+       );
+       ```
+       
+       创建后,可以通过 `SHOW CREATE TABLE` 命令查看 Hive 的建表语句。
+       
+       注意,不同于 Hive 中的建表语句。在 Doris 中创建 Hive 分区表时,分区列也必须写到 Table 的 Schema 中。
+
+- 删除
+
+       可以通过 `DROP TABLE` 语句删除一个 Hive 表。当前删除表后,会同时删除数据,包括分区数据。
+       
+- 列类型
+
+       在 Doris 中创建 Hive 表所使用的列类型,和 Hive 中的列类型对应关系如下
+       
+       | Doris | Hive |
+       |---|---|
+       | BOOLEAN    | BOOLEAN |
+       | TINYINT    | TINYINT |
+       | SMALLINT   | SMALLINT |
+       | INT        | INT |
+       | BIGINT     | BIGINT |
+       | DATE     | DATE |
+       | DATETIME | TIMESTAMP |
+       | FLOAT      | FLOAT |
+       | DOUBLE     | DOUBLE |
+       | CHAR       | CHAR |
+       | VARCHAR    | STRING |
+       | STRING     | STRING |
+       | DECIMAL  | DECIMAL |
+       | ARRAY      | ARRAY |
+       | MAP        | MAP |
+       | STRUCT     | STRUCT |
+       
+       - 列类型只能为默认的 nullable,不支持  NOT NULL。
+       - Hive 3.0 支持设置默认值。如果需要设置默认值,则需要在 Catalog 属性中显式地添加 `"hive.version" = "3.0.0"`。
+       - 插入数据时,如果类型不能够兼容,例如 `'abc'` 插入到数值类型,则会转为 null 值后插入。
+
+- 分区
+
+       Hive 中的分区类型对应 Doris 中的 List 分区。因此,在 Doris 中创建 Hive 分区表,需使用 List 分区的建表语句,但无需显式地枚举各个分区。在写入数据时,Doris 会根据数据的值,自动创建对应的 Hive 分区。
+
+       支持创建单列或多列分区表。
+       
+- 文件格式
+
+       - Parquet
+       - ORC(默认格式)
+
+- 压缩格式
+
+       TODO
+
+- 存储介质
+
+       - 目前仅支持 HDFS,后续版本将支持对象存储。
+
+## 数据操作
+
+可以通过 INSERT 语句将数据写入到 Hive 表中。
+
+支持写入到由 Doris 创建的 Hive 表,或者 Hive 中已存在的且格式支持的表。
+
+对于分区表,会根据数据,自动写入到对应分区,或者创建新的分区。
+
+目前不支持指定分区写入。
+
+### INSERT
+
+INSERT 操作会将数据以追加的方式写入到目标表中。
+
+```
+INSERT INTO hive_tbl values (val1, val2, val3, val4);
+INSERT INTO hive.hive_db.hive_tbl SELECT col1, col2 FROM internal.db1.tbl1;
+
+INSERT INTO hive_tbl(col1, col2) values (val1, val2);
+INSERT INTO hive_tbl(col1, col2, partition_col1, partition_col2) values (1, 2, 
"beijing", "2023-12-12");
+```
+
+### INSERT OVERWRITE
+
+INSERT OVERWRITE 会使用新的数据完全覆盖原有表中的数据。
+
+```
+INSERT OVERWRITE TABLE hive_tbl VALUES (val1, val2, val3, val4);
+INSERT OVERWRITE TABLE hive.hive_db.hive_tbl(col1, col2) SELECT col1, col2 
FROM internal.db1.tbl1;
+```
+
+### CTAS(CREATE TABLE AS SELECT)
+       
+可以通过 `CTAS(CREATE TABLE AS SELECT)` 语句创建 Hive 表并写入数据:
+       
+```
+CREATE TABLE hive_ctas ENGINE=hive AS SELECT * FROM other_table;
+```
+       
+CTAS 支持指定文件格式、分区方式等信息,如:
+       
+```
+CREATE TABLE hive_ctas ENGINE=hive
+PARTITION BY LIST (pt1, pt2) ()
+AS SELECT col1,pt1,pt2 FROM part_ctas_src WHERE col1>0;
+       
+CREATE TABLE hive.hive_db.hive_ctas (col1,col2,pt1) ENGINE=hive
+PARTITION BY LIST (pt1) ()
+PROPERTIES (
+       "file_format"="parquet",
+       "parquet.compression"="zstd"
+)
+AS SELECT col1,pt1 as col2,pt2 as pt1 FROM test_ctas.part_ctas_src WHERE 
col1>0;
+```
+
+## 异常数据和数据转换
+
+TODO
+
+## 事务机制
+
+对 Hive 的写入操作会被放在一个单独的事务里,在事务提交前,数据对外不可见。只有当提交该事务后,表的相关操作才对其他人可见。
+
+事务能保证操作的原子性,事务内的所有操作,要么全部成功,要么全部失败。
+
+事务不能完全保证操作的隔离性,只能尽力而为,通过分离文件系统操作和对 Hive Metastore 的元数据操作来尽量减少不一致的时间窗口。
+
+比如在一个事务中,需要修改 Hive 
表的多个分区。假设这个任务分成两批进行操作,在第一批操作已经完成、第二批操作还未完成时,第一批分区已经对外可见,外部可以读取到第一批分区,但读不到第二批分区。
+
+在事务提交过程中出现任何异常,都会直接回退该事务,包括对 HDFS 文件的修改、以及对 Hive Metastore 元数据的修改,不需要用户做其他处理。
+
+## 相关参数
+
+### FE
+
+TODO
+
+### BE
+
+| 参数名称 | 描述 | 默认值 |
+| --- | --- | --- |
+| `hive_sink_max_file_size` | 最大的数据文件大小。当写入数据量超过该大小后会关闭当前文件,滚动产生一个新文件继续写入。| 
1GB |
+| `table_sink_partition_write_max_partition_nums_per_writer` | BE 节点上每个 
Instance 最大写入的分区数目。 |  128 |
+| `table_sink_non_partition_write_scaling_data_processed_threshold` | 非分区表开始 
scaling-write 的数据量阈值。每增加 
`table_sink_non_partition_write_scaling_data_processed_threshold` 数据就会发送给一个新的 
writer(instance) 进行写入。scaling-write 机制主要是为了根据数据量来使用不同数目的 writer(instance) 
来进行写入,会随着数据量的增加而增大写入的 writer(instance) 
数目,从而提高并发写入的吞吐。当数据量比较少的时候也会节省资源,并且尽可能地减少产生的文件数目。 | 25MB |
+| `table_sink_partition_write_min_data_processed_rebalance_threshold` | 分区表开始触发重平衡的最少数据量阈值。如果 `当前累积的数据量` - `自从上次触发重平衡或者最开始累积的数据量` >= `table_sink_partition_write_min_data_processed_rebalance_threshold`,就开始触发重平衡机制。如果发现最终生成的文件大小差异过大,可以调小该阈值来增加均衡度。当然过小的阈值会导致重平衡的成本增加,可能会影响性能。 | 25MB |
+| `table_sink_partition_write_min_partition_data_processed_rebalance_threshold` | 分区表开始进行重平衡时的最少的分区数据量阈值。如果 `当前分区的数据量` >= `阈值` * `当前分区已经分配的 task 数目`,就开始对该分区进行重平衡。如果发现最终生成的文件大小差异过大,可以调小该阈值来增加均衡度。当然过小的阈值会导致重平衡的成本增加,可能会影响性能。 | 15MB |
+
+
+
+
+
+
+
diff --git 
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse/file.md 
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse/file.md
index 97470c6ea67..0eb31aef3b5 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse/file.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse/file.md
@@ -1,6 +1,6 @@
 ---
 {
-    "title": "TVF",
+    "title": "文件分析",
     "language": "zh-CN"
 }
 ---
diff --git a/sidebars.json b/sidebars.json
index 0c5d2d08363..86e4ff1ee1a 100644
--- a/sidebars.json
+++ b/sidebars.json
@@ -264,12 +264,18 @@
                     "type": "category",
                     "label": "Data Lake Analytics",
                     "items": [
-                        "lakehouse/datalake/hive",
-                        "lakehouse/datalake/hudi",
-                        "lakehouse/datalake/iceberg",
-                        "lakehouse/datalake/paimon",
-                        "lakehouse/datalake/dlf",
-                        "lakehouse/datalake/cloud-auth"
+                        "lakehouse/datalake-analytics/hive",
+                        "lakehouse/datalake-analytics/hudi",
+                        "lakehouse/datalake-analytics/iceberg",
+                        "lakehouse/datalake-analytics/paimon",
+                        "lakehouse/datalake-analytics/dlf"
+                    ]
+                },
+                {
+                    "type": "category",
+                    "label": "Data Lake Building",
+                    "items": [
+                        "lakehouse/datalake-building/hive-build"
                     ]
                 },
                 {
@@ -284,6 +290,7 @@
                 "lakehouse/file",
                 "lakehouse/filecache",
                 "lakehouse/external-statistics",
+                "lakehouse/cloud-auth",
                 "lakehouse/sql-dialect"
             ]
         },
@@ -1516,4 +1523,4 @@
             ]
         }
     ]
-}
\ No newline at end of file
+}
diff --git a/versioned_docs/version-2.1/lakehouse/datalake/cloud-auth.md 
b/versioned_docs/version-2.1/lakehouse/cloud-auth.md
similarity index 100%
rename from versioned_docs/version-2.1/lakehouse/datalake/cloud-auth.md
rename to versioned_docs/version-2.1/lakehouse/cloud-auth.md
diff --git a/versioned_docs/version-2.1/lakehouse/datalake/dlf.md 
b/versioned_docs/version-2.1/lakehouse/datalake-analytics/dlf.md
similarity index 100%
rename from versioned_docs/version-2.1/lakehouse/datalake/dlf.md
rename to versioned_docs/version-2.1/lakehouse/datalake-analytics/dlf.md
diff --git a/versioned_docs/version-2.1/lakehouse/datalake/hive.md 
b/versioned_docs/version-2.1/lakehouse/datalake-analytics/hive.md
similarity index 100%
rename from versioned_docs/version-2.1/lakehouse/datalake/hive.md
rename to versioned_docs/version-2.1/lakehouse/datalake-analytics/hive.md
diff --git a/versioned_docs/version-2.1/lakehouse/datalake/hudi.md 
b/versioned_docs/version-2.1/lakehouse/datalake-analytics/hudi.md
similarity index 100%
rename from versioned_docs/version-2.1/lakehouse/datalake/hudi.md
rename to versioned_docs/version-2.1/lakehouse/datalake-analytics/hudi.md
diff --git a/versioned_docs/version-2.1/lakehouse/datalake/iceberg.md 
b/versioned_docs/version-2.1/lakehouse/datalake-analytics/iceberg.md
similarity index 100%
rename from versioned_docs/version-2.1/lakehouse/datalake/iceberg.md
rename to versioned_docs/version-2.1/lakehouse/datalake-analytics/iceberg.md
diff --git a/versioned_docs/version-2.1/lakehouse/datalake/paimon.md 
b/versioned_docs/version-2.1/lakehouse/datalake-analytics/paimon.md
similarity index 100%
rename from versioned_docs/version-2.1/lakehouse/datalake/paimon.md
rename to versioned_docs/version-2.1/lakehouse/datalake-analytics/paimon.md
diff --git 
a/versioned_docs/version-2.1/lakehouse/datalake-building/hive-build.md 
b/versioned_docs/version-2.1/lakehouse/datalake-building/hive-build.md
new file mode 100644
index 00000000000..e4988794dbe
--- /dev/null
+++ b/versioned_docs/version-2.1/lakehouse/datalake-building/hive-build.md
@@ -0,0 +1,278 @@
+---
+{
+    "title": "Hive",
+    "language": "en"
+}
+---
+
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+Starting from version 2.1.3, Doris supports DDL and DML operations for Hive. Users can directly create databases and tables in Hive through Doris and write data into Hive tables. With this feature, users can perform complete query and write operations on Hive through Doris, further simplifying the integrated lakehouse architecture.
+
+This article introduces Hive operations supported in Doris, including syntax 
and usage notes.
+
+## Metadata Creation and Deletion
+
+### Catalog
+
+- Create
+
+       ```
+       CREATE CATALOG [IF NOT EXISTS] hive PROPERTIES (
+           "type"="hms",
+           "hive.metastore.uris" = "thrift://172.21.16.47:7004",
+           "hadoop.username" = "hadoop",
+           "fs.defaultFS" = "hdfs://172.21.16.47:4007"
+       );
+       ```
+
+       Note that if you need to create Hive tables or write data through Doris, you must explicitly include the `fs.defaultFS` property in the Catalog properties. If the Catalog is created only for querying, this parameter can be omitted.
+
+       For more parameters, please refer to [Hive Catalog](../datalake-analytics/hive.md).
+
+- Drop
+
+       ```
+       DROP CATALOG [IF EXISTS] hive;
+       ```
+
+       Deleting a Catalog does not delete any database or table information in 
Hive. It merely removes the mapping to this Hive cluster in Doris.
+
+### Database
+
+- Create
+
+       You can switch to the corresponding Catalog and execute the `CREATE 
DATABASE` statement:
+
+       ```
+       SWITCH hive;
+       CREATE DATABASE [IF NOT EXISTS] hive_db;
+       ```
+
+       You can also create the database using its fully qualified name, or specify a location, for example:
+
+       ```
+       CREATE DATABASE [IF NOT EXISTS] hive.hive_db;
+
+       CREATE DATABASE [IF NOT EXISTS] hive.hive_db
+       PROPERTIES ('location'='hdfs://172.21.16.47:4007/path/to/db/');
+       ```
+
+       Later, you can view the Database's Location information using the `SHOW 
CREATE DATABASE` command:
+
+       ```
+       mysql> SHOW CREATE DATABASE hive_db;
+       
+----------+---------------------------------------------------------------------------------------------+
+       | Database | Create Database                                            
                                 |
+       
+----------+---------------------------------------------------------------------------------------------+
+       | hive_db  | CREATE DATABASE `hive_db` LOCATION 
'hdfs://172.21.16.47:4007/usr/hive/warehouse/hive_db.db' |
+       
+----------+---------------------------------------------------------------------------------------------+
+       ```
+
+- Drop
+
+       ```
+       DROP DATABASE [IF EXISTS] hive.hive_db;
+       ```
+
+       Note that for a Hive Database, all tables within the Database must be deleted first; otherwise, an error will occur. This operation also deletes the corresponding Database in Hive.
+
+### Table
+
+- Create
+
+       Doris supports creating partitioned or non-partitioned tables in Hive.
+
+       ```
+       -- Create unpartitioned hive table
+       CREATE TABLE unpartitioned_table (
+         `col1` BOOLEAN COMMENT 'col1',
+         `col2` INT COMMENT 'col2',
+         `col3` BIGINT COMMENT 'col3',
+         `col4` CHAR(10) COMMENT 'col4',
+         `col5` FLOAT COMMENT 'col5',
+         `col6` DOUBLE COMMENT 'col6',
+         `col7` DECIMAL(9,4) COMMENT 'col7',
+         `col8` VARCHAR(11) COMMENT 'col8',
+         `col9` STRING COMMENT 'col9'
+       )  ENGINE=hive
+       PROPERTIES (
+         'file_format'='parquet'
+       );
+
+       -- Create partitioned hive table
+       -- The partition columns must be in table's column definition list
+       CREATE TABLE partition_table (
+         `col1` BOOLEAN COMMENT 'col1',
+         `col2` INT COMMENT 'col2',
+         `col3` BIGINT COMMENT 'col3',
+         `col4` DECIMAL(2,1) COMMENT 'col4',
+         `pt1` VARCHAR COMMENT 'pt1',
+         `pt2` VARCHAR COMMENT 'pt2'
+       )  ENGINE=hive
+       PARTITION BY LIST (pt1, pt2) ()
+       PROPERTIES (
+         'file_format'='orc'
+       );
+       ```
+
+       After creation, you can view the Hive table creation statement using 
the `SHOW CREATE TABLE` command.
+
+       Note that, unlike Hive's table creation statement, in Doris the partition columns of a Hive partitioned table must also be included in the Table's Schema.
+
+- Drop
+
+       You can drop a Hive table using the `DROP TABLE` statement. Currently, 
deleting the table also removes the data, including partition data.
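+
+       For example (the fully qualified table name here is illustrative, following the pattern of the examples above):
+
+       ```
+       DROP TABLE [IF EXISTS] hive.hive_db.hive_tbl;
+       ```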
+
+- Column Types
+
+       The column types used when creating Hive tables in Doris correspond to 
those in Hive as follows:
+
+       | Doris | Hive |
+       |---|---|
+       | BOOLEAN    | BOOLEAN |
+       | TINYINT    | TINYINT |
+       | SMALLINT   | SMALLINT |
+       | INT        | INT |
+       | BIGINT     | BIGINT |
+       | DATE     | DATE |
+       | DATETIME | TIMESTAMP |
+       | FLOAT      | FLOAT |
+       | DOUBLE     | DOUBLE |
+       | CHAR       | CHAR |
+       | VARCHAR    | STRING |
+       | STRING     | STRING |
+       | DECIMAL  | DECIMAL |
+       | ARRAY      | ARRAY |
+       | MAP        | MAP |
+       | STRUCT     | STRUCT |
+
+       > - Column types can only be nullable (the default); NOT NULL is not supported.
+
+       > - Hive 3.0 supports setting default values. If you need to set default values, you need to explicitly add `"hive.version" = "3.0.0"` in the Catalog properties, as sketched after these notes.
+       
+       > - When inserting data, if a value is not compatible with the target type, such as `'abc'` being inserted into a numeric column, it will be converted to null before insertion.
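+
+       As a minimal sketch (reusing the Catalog example from above; only the added `hive.version` property is new):
+
+       ```
+       CREATE CATALOG [IF NOT EXISTS] hive PROPERTIES (
+           "type"="hms",
+           "hive.metastore.uris" = "thrift://172.21.16.47:7004",
+           "hadoop.username" = "hadoop",
+           "fs.defaultFS" = "hdfs://172.21.16.47:4007",
+           "hive.version" = "3.0.0"
+       );
+       ```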
+
+- Partitions
+
+       Hive partitions are mapped to List partitions in Doris. Therefore, when creating a Hive partitioned table in Doris, use the List partition creation syntax, but there is no need to explicitly enumerate the partitions. When writing data, Doris automatically creates the corresponding Hive partitions based on the values being written.
+
+       Both single-column and multi-column partitioned tables are supported.
+
+- File Formats
+
+       - Parquet
+       - ORC (default format)
+
+- Compression Formats
+
+       TODO
+
+- Storage Medium
+
+       - Currently, only HDFS is supported; support for object storage is planned for future versions.
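+
+As referenced above, the following is a minimal sketch of inspecting and dropping the tables created in the earlier examples (table names are taken from those examples; the exact output of `SHOW CREATE TABLE` depends on the Hive and Doris versions in use):
+
+```
+-- View the Hive DDL that Doris generated for the partitioned table
+SHOW CREATE TABLE partition_table;
+
+-- Drop a table; this also removes its data, including partition data
+DROP TABLE IF EXISTS unpartitioned_table;
+```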
+
+## Data Operations
+
+Data can be written into Hive tables through INSERT statements.
+
+Writing is supported both to Hive tables created by Doris and to existing Hive tables whose formats are supported.
+
+For partitioned tables, data is automatically written to the corresponding partitions, and new partitions are created as needed.
+
+Currently, writing to an explicitly specified partition is not supported.
+
+### INSERT
+
+The INSERT operation appends data to the target table.
+
+```
+INSERT INTO hive_tbl values (val1, val2, val3, val4);
+INSERT INTO hive.hive_db.hive_tbl SELECT col1, col2 FROM internal.db1.tbl1;
+
+INSERT INTO hive_tbl(col1, col2) values (val1, val2);
+INSERT INTO hive_tbl(col1, col2, partition_col1, partition_col2) values (1, 2, 
"beijing", "2023-12-12");
+```
+
+### INSERT OVERWRITE
+
+The INSERT OVERWRITE operation completely overwrites the existing data in the 
table with new data.
+
+```
+INSERT OVERWRITE TABLE hive_tbl VALUES (val1, val2, val3, val4);
+INSERT OVERWRITE TABLE hive.hive_db.hive_tbl(col1, col2) SELECT col1, col2 FROM internal.db1.tbl1;
+```
+
+### CTAS (CREATE TABLE AS SELECT)
+
+A Hive table can be created and populated with data using the `CTAS (CREATE 
TABLE AS SELECT)` statement:
+
+```
+CREATE TABLE hive_ctas ENGINE=hive AS SELECT * FROM other_table;
+```
+
+CTAS supports specifying file formats, partitioning methods, and other 
information, such as:
+
+```
+CREATE TABLE hive_ctas ENGINE=hive
+PARTITION BY LIST (pt1, pt2) ()
+AS SELECT col1, pt1, pt2 FROM part_ctas_src WHERE col1 > 0;
+
+CREATE TABLE hive.hive_db.hive_ctas (col1, col2, pt1) ENGINE=hive
+PARTITION BY LIST (pt1) ()
+PROPERTIES (
+  "file_format"="parquet",
+  "parquet.compression"="zstd"
+)
+AS SELECT col1, pt1 AS col2, pt2 AS pt1 FROM test_ctas.part_ctas_src WHERE col1 > 0;
+```
+
+## Exception Data and Data Transformation
+
+TODO
+
+## Transaction Mechanism
+
+Write operations to Hive are carried out in a separate transaction. Until the transaction is committed, the written data is not visible externally; only after the commit do the table changes become visible to others.
+
+Transactions ensure the atomicity of operations—all operations within a 
transaction either succeed completely or fail altogether.
+
+Transactions do not fully guarantee isolation of operations; they strive to 
minimize the inconsistency window by separating file system operations from 
metadata operations on the Hive Metastore.
+
+For example, if a transaction modifies multiple partitions of a Hive table and the work is divided into two batches, then once the first batch has completed but the second has not yet started, the partitions written by the first batch are already visible externally and can be read, while the partitions of the second batch cannot.
+
+If any anomalies occur during the transaction commit process, the transaction 
will be directly rolled back, including modifications to HDFS files and 
metadata in the Hive Metastore, without requiring further action from the user.
+
+## Relevant Parameters
+
+### FE
+
+TODO
+
+### BE
+
+| Parameter Name | Description | Default Value |
+| --- | --- | --- |
+| `hive_sink_max_file_size` | Maximum file size for data files. When the volume of written data exceeds this size, the current file is closed, and a new file is opened for continued writing. | 1GB |
+| `table_sink_partition_write_max_partition_nums_per_writer` | Maximum number of partitions that can be written by each Instance on a BE node. | 128 |
+| `table_sink_non_partition_write_scaling_data_processed_threshold` | Threshold of data volume for starting scaling-write in non-partitioned tables. For every increase of `table_sink_non_partition_write_scaling_data_processed_threshold` in data volume, a new writer (instance) will be engaged for writing. The scaling-write mechanism aims to use a different number of writers (instances) based on the volume of data to increase the throughput of concurrent writing. When the volume of data is [...]
+| `table_sink_partition_write_min_data_processed_rebalance_threshold` | Minimum data volume threshold for triggering rebalance in partitioned tables. If `current accumulated data volume` - `data volume accumulated since the last rebalance or from the start` >= `table_sink_partition_write_min_data_processed_rebalance_threshold`, rebalancing is triggered. If there is a significant difference in the final file sizes, you can reduce this threshold to increase balance. However, too small a th [...]
+| `table_sink_partition_write_min_partition_data_processed_rebalance_threshold` | Minimum data volume threshold per partition for rebalancing in partitioned tables. If `current partition's data volume` >= `threshold` * `number of tasks allocated to the current partition`, rebalancing for that partition begins. If there is a significant difference in the final file sizes, you can reduce this threshold to increase balance. However, too small a threshold may increase the cost of rebalancing [...]
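+
+As a hedged illustration only: BE-side parameters such as those above are normally set in the BE's `be.conf` file, followed by a BE restart. The snippet below is a sketch, not a recommendation; the parameter names come from the table above, and the values simply restate the documented defaults (assuming `hive_sink_max_file_size` is expressed in bytes):
+
+```
+# be.conf (sketch) -- Hive sink related settings; values restate the documented defaults
+hive_sink_max_file_size = 1073741824
+table_sink_partition_write_max_partition_nums_per_writer = 128
+```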
+
diff --git a/versioned_docs/version-2.1/lakehouse/file.md 
b/versioned_docs/version-2.1/lakehouse/file.md
index d5b4b7fdc22..f75e0bd522a 100644
--- a/versioned_docs/version-2.1/lakehouse/file.md
+++ b/versioned_docs/version-2.1/lakehouse/file.md
@@ -1,6 +1,6 @@
 ---
 {
-    "title": "Table Value Function (TVF)",
+    "title": "File Analytics",
     "language": "en"
 }
 ---
@@ -197,4 +197,4 @@ FROM s3(
 
 1. If the URI specified by the `S3 / HDFS` TVF is not matched with the file, 
or all the matched files are empty files, then the` S3 / HDFS` TVF will return 
to the empty result set. In this case, using the `DESC FUNCTION` to view the 
schema of this file, you will get a dummy column` __dummy_col`, which can be 
ignored.
 
-2. If the format of the TVF is specified to `CSV`, and the read file is not a 
empty file but the first line of this file is empty, then it will prompt the 
error `The first line is empty, can not parse column numbers`. This is because 
the schema cannot be parsed from the first line of the file
\ No newline at end of file
+2. If the format of the TVF is specified to `CSV`, and the read file is not a 
empty file but the first line of this file is empty, then it will prompt the 
error `The first line is empty, can not parse column numbers`. This is because 
the schema cannot be parsed from the first line of the file
diff --git a/versioned_sidebars/version-2.1-sidebars.json 
b/versioned_sidebars/version-2.1-sidebars.json
index a12d9e939c3..04680ae4d19 100644
--- a/versioned_sidebars/version-2.1-sidebars.json
+++ b/versioned_sidebars/version-2.1-sidebars.json
@@ -264,12 +264,18 @@
                     "type": "category",
                     "label": "Data Lake Analytics",
                     "items": [
-                        "lakehouse/datalake/hive",
-                        "lakehouse/datalake/hudi",
-                        "lakehouse/datalake/iceberg",
-                        "lakehouse/datalake/paimon",
-                        "lakehouse/datalake/dlf",
-                        "lakehouse/datalake/cloud-auth"
+                        "lakehouse/datalake-analytics/hive",
+                        "lakehouse/datalake-analytics/hudi",
+                        "lakehouse/datalake-analytics/iceberg",
+                        "lakehouse/datalake-analytics/paimon",
+                        "lakehouse/datalake-analytics/dlf"
+                    ]
+                },
+                {
+                    "type": "category",
+                    "label": "Data Lake Building",
+                    "items": [
+                        "lakehouse/datalake-building/hive-build"
                     ]
                 },
                 {
@@ -284,6 +290,7 @@
                 "lakehouse/file",
                 "lakehouse/filecache",
                 "lakehouse/external-statistics",
+                "lakehouse/cloud-auth",
                 "lakehouse/sql-dialect"
             ]
         },
@@ -1479,4 +1486,4 @@
             ]
         }
     ]
-}
\ No newline at end of file
+}

