This is an automated email from the ASF dual-hosted git repository.

luzhijing pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris-website.git


The following commit(s) were added to refs/heads/master by this push:
     new 0b0e8103f9f [opt](docs) Opt documents for datatypes and compaction 
(#792)
0b0e8103f9f is described below

commit 0b0e8103f9f5b9ee9fb6d801e3ba2cea016ba67e
Author: Kang <kxiao.ti...@gmail.com>
AuthorDate: Fri Jun 28 10:19:51 2024 +0800

    [opt](docs) Opt documents for datatypes and compaction (#792)
    
    1. reorg datatypes order
    2. opt JSON type docs
    3. add single replica and time_series compaction docs
    
    ---------
    
    Co-authored-by: Luzhijing <82810928+luzhij...@users.noreply.github.com>
---
 docs/admin-manual/compaction.md                    | 58 ++++++++++++++--
 docs/sql-manual/sql-types/Data-Types/JSON.md       | 57 ++++++++++++---
 docs/table-design/data-type.md                     | 24 +++----
 .../current/admin-manual/compaction.md             | 51 +++++++++++++-
 .../sql-manual/sql-types/Data-Types/JSON.md        | 81 ++++++++++++++++------
 .../current/table-design/data-type.md              | 30 ++++----
 .../version-2.0/admin-manual/compaction.md         | 51 +++++++++++++-
 .../sql-manual/sql-reference/Data-Types/JSON.md    | 81 ++++++++++++++++------
 .../version-2.0/table-design/data-type.md          | 29 ++++----
 .../version-2.1/admin-manual/compaction.md         | 51 +++++++++++++-
 .../sql-manual/sql-types/Data-Types/JSON.md        | 61 ++++++++++++----
 .../version-2.1/table-design/data-type.md          | 30 ++++----
 sidebars.json                                      | 12 ++--
 .../version-2.0/admin-manual/compaction.md         | 55 ++++++++++++++-
 .../sql-manual/sql-reference/Data-Types/JSON.md    | 57 ++++++++++++---
 .../version-2.0/table-design/data-type.md          | 22 +++---
 .../version-2.1/admin-manual/compaction.md         | 55 ++++++++++++++-
 .../sql-manual/sql-types/Data-Types/JSON.md        | 57 ++++++++++++---
 .../version-2.1/table-design/data-type.md          | 24 +++----
 versioned_sidebars/version-2.0-sidebars.json       |  6 +-
 versioned_sidebars/version-2.1-sidebars.json       | 12 ++--
 21 files changed, 708 insertions(+), 196 deletions(-)

diff --git a/docs/admin-manual/compaction.md b/docs/admin-manual/compaction.md
index e48756b57e2..b953cb5c137 100644
--- a/docs/admin-manual/compaction.md
+++ b/docs/admin-manual/compaction.md
@@ -29,13 +29,10 @@ under the License.
 
 Doris writes data through a structure similar to LSM-Tree, and continuously 
merges small files into large ordered files through compaction in the 
background. Compaction handles operations such as deletion and updating. 
 
-Appropriately adjusting the compaction strategy can greatly improve load and 
query efficiency. Doris provides the following two compaction strategies for 
tuning:
+Appropriately adjusting the compaction strategy can greatly improve load and 
query efficiency. Doris provides the following compaction strategies for tuning:
 
 
-## Vertical Compaction
-
-<version since="1.2.2">
-</version>
+## Vertical compaction
 
 Vertical compaction is a new compaction algorithm implemented in Doris 1.2.2, 
which is used to optimize compaction execution efficiency and resource overhead 
in large-scale and wide table scenarios. It can effectively reduce the memory 
overhead of compaction and improve the execution speed of compaction. The test 
results show that the memory consumption by vertical compaction is only 1/10 of 
the original compaction algorithm, and the compaction rate is increased by 15%.
 
@@ -47,7 +44,7 @@ BE configuration:
 - `vertical_compaction_max_segment_size` is used to configure the size of the 
disk file after vertical compaction, the default value is 268435456 (bytes)
 
 
-## Segment Compaction
+## Segment compaction
 
 Segment compaction mainly deals with large-scale data loads. Segment compaction operates during the load process and compacts segments inside the job, which is different from normal compaction and vertical compaction. This mechanism can effectively reduce the number of generated segments and avoid -238 (OLAP_ERR_TOO_MANY_SEGMENTS) errors.
 
@@ -72,3 +69,52 @@ Situations where segment compaction is not recommended:
 - When the load operation itself has exhausted memory resources, it is not recommended to use segment compaction, to avoid further increasing memory pressure and causing the load job to fail.
 
 Refer to this [link](https://github.com/apache/doris/pull/12866) for more 
information about implementation and test results.
+
+## Single replica compaction
+
+By default, compaction for multiple replicas is performed independently, with each replica consuming CPU and IO resources. When single replica compaction is enabled, only one replica performs the compaction; the other replicas then pull the compacted files from it, so the compaction CPU cost is paid only once instead of N times (where N is the number of replicas).
+
+Single replica compaction is specified in the table's PROPERTIES via the 
parameter `enable_single_replica_compaction`, which is false by default 
(disabled). To enable it, set the parameter to true.
+
+This parameter can be specified when creating the table or modified later 
using:
+```sql
+ALTER TABLE table_name SET("enable_single_replica_compaction" = "true");
+```
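+
+For illustration, a minimal sketch of enabling it at table creation (the table, columns, and `replication_num` value are placeholders):
+```sql
+CREATE TABLE table_name (
+    id INT,
+    v STRING
+)
+DUPLICATE KEY(id)
+DISTRIBUTED BY HASH(id) BUCKETS 1
+PROPERTIES (
+    "replication_num" = "3",
+    "enable_single_replica_compaction" = "true"
+);
+```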
+
+## Compaction strategy
+
+The compaction strategy determines when and which small files are merged into 
larger files. Doris currently offers two compaction strategies, specified by 
the `compaction_policy` parameter in the table properties.
+
+### Size-based compaction strategy
+
+The size-based compaction strategy is the default strategy and is suitable for 
most scenarios.
+```
+"compaction_policy" = "size_based"
+```
+
+### Time series compaction strategy
+
+The time series compaction strategy is optimized for scenarios like logs and time-series data. It leverages the time locality of time-series data, merging small files written at adjacent times into larger files. Each file participates in compaction only once, reducing the write amplification caused by repeated compaction.
+
+```
+"compaction_policy" = "time_series"
+```
+
+The time series compaction strategy is triggered when any of the following 
conditions are met:
+- The size of unmerged files exceeds `time_series_compaction_goal_size_mbytes` 
(default 1 GB).
+- The number of unmerged files exceeds 
`time_series_compaction_file_count_threshold` (default 2000).
+- The time since the last compaction exceeds 
`time_series_compaction_time_threshold_seconds` (default 1 hour).
+
+These parameters are set in the table's PROPERTIES and can be specified when 
creating the table or modified later using:
+```
+ALTER TABLE table_name SET("name" = "value");
+```
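+
+For instance, to raise the goal size for a table (the value 2048 is a hypothetical setting, not a recommendation):
+```
+ALTER TABLE table_name SET("time_series_compaction_goal_size_mbytes" = "2048");
+```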
+
+## Compaction concurrency control
+
+Compaction runs in the background and consumes CPU and IO resources. The 
resource consumption can be controlled by adjusting the number of concurrent 
compaction threads.
+
+The number of concurrent compaction threads is configured in the BE 
configuration file, including the following parameters:
+- `max_base_compaction_threads`: Number of base compaction threads, default is 
4.
+- `max_cumu_compaction_threads`: Number of cumulative compaction threads, 
default is 10.
+- `max_single_replica_compaction_threads`: Number of threads for fetching data 
files during single replica compaction, default is 10.
diff --git a/docs/sql-manual/sql-types/Data-Types/JSON.md 
b/docs/sql-manual/sql-types/Data-Types/JSON.md
index 341e43dbd55..4ce84debda6 100644
--- a/docs/sql-manual/sql-types/Data-Types/JSON.md
+++ b/docs/sql-manual/sql-types/Data-Types/JSON.md
@@ -26,20 +26,57 @@ under the License.
 
 ## JSON
 
-<version since="1.2.0">
+The JSON data type stores [JSON](https://www.rfc-editor.org/rfc/rfc8785) data 
efficiently in a binary format and allows access to its internal fields through 
JSON functions.
 
-</version>
+By default, it supports up to 1048576 bytes (1MB), and can be increased up to 
2147483643 bytes (2GB). This can be adjusted via the 
`string_type_length_soft_limit_bytes` configuration.
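+
+For example, raising the limit in the BE configuration file be.conf (10485760, i.e. 10 MB, is a hypothetical value):
+```
+string_type_length_soft_limit_bytes = 10485760
+```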
 
-NOTICE: In version 1.2.x the data type name is JSONB. It's renamed to JSON to 
be more compatible to version 2.0.0. And the old tables can still be used.
+Compared to storing JSON strings in a regular STRING type, the JSON type has 
two main advantages:
+1. JSON format validation during data insertion.
+2. More efficient binary storage format, enabling faster access to JSON 
internal fields using functions like `json_extract`, compared to `get_json_xx` 
functions.
 
-### description
-    JSON (Binary) datatype.
-        Use binary JSON format for storage and json function to extract field. 
Default support is 1048576 bytes (1M), adjustable up to 2147483643 bytes 
(2G),and the JSONB type is also limited by the be configuration 
`jsonb_type_length_soft_limit_bytes`.
+**Note**: In version 1.2.x, the JSON type was named JSONB. To maintain 
compatibility with MySQL, it was renamed to JSON starting from version 2.0.0. 
Older tables can still use the previous name.
 
-### note
-    There are some advantanges for JSON over plain JSON STRING.
-    1. JSON syntax will be validated on write to ensure data quality
-    2. JSON binary format is more efficient. Using json_extract functions on 
JSON datatype is 2-4 times faster than get_json_xx on JSON STRING format.
+### Syntax
+
+**Definition:**
+```sql
+json_column_name JSON
+```
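+
+For instance, a minimal sketch of a table with a JSON column (names and properties are placeholders):
+```sql
+CREATE TABLE table_name (
+    id INT,
+    json_column_name JSON
+)
+DUPLICATE KEY(id)
+DISTRIBUTED BY HASH(id) BUCKETS 1
+PROPERTIES ("replication_num" = "1");
+```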
+
+**Insertion:**
+- Using `INSERT INTO VALUES`, where the JSON value is a string surrounded by quotes. For example:
+```sql
+INSERT INTO table_name(id, json_column_name) VALUES (1, '{"k1": "100"}')
+```
+
+- For STREAM LOAD, the format for the corresponding column is a string without 
additional quotes. For example:
+```
+12     {"k1":"v31", "k2": 300}
+13     []
+14     [123, 456]
+```
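+
+A minimal stream load invocation for such a tab-separated file might look like the following sketch (user, host, port, and names are placeholders):
+```
+curl --location-trusted -u user:passwd -T test_json.csv http://fe_host:8030/api/db_name/table_name/_stream_load
+```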
+
+**Query:**
+- Directly select the entire JSON column:
+```sql
+SELECT json_column_name FROM table_name;
+```
+
+- Extract specific fields or other information from JSON using JSON functions. 
For example:
+```sql
+SELECT json_extract(json_column_name, '$.k1') FROM table_name;
+```
+
+- The JSON type can be cast to and from integers, strings, BOOLEAN, ARRAY, and 
MAP. For example:
+```sql
+SELECT CAST('{"k1": "100"}' AS JSON);
+SELECT CAST(json_column_name AS STRING) FROM table_name;
+SELECT CAST(json_extract(json_column_name, '$.k1') AS INT) FROM table_name;
+```
+
+:::tip
+The JSON type currently cannot be used for `GROUP BY`, `ORDER BY`, or 
comparison operations.
+:::
 
 ### example
 A tutorial for JSON datatype including create table, load data and query.
diff --git a/docs/table-design/data-type.md b/docs/table-design/data-type.md
index 47ba73ecbba..281aba8bafa 100644
--- a/docs/table-design/data-type.md
+++ b/docs/table-design/data-type.md
@@ -28,7 +28,7 @@ Apache Doris supports standard SQL syntax, using MySQL Network Connection Protoco
 
 The list of data types supported by Doris is as follows:
 
-| Type name      | Number of bytes | Description                               
                   |
+| Type name      | Storage (bytes) | Description                                                  |
 | -------------- | --------------- | 
------------------------------------------------------------ |
 | BOOLEAN        | 1               | Boolean data type that stores only two values: 0 represents false, 1 represents true. |
 | TINYINT        | 1               | Integer value, signed range is from -128 
to 127.             |
@@ -42,16 +42,16 @@ The list of data types supported by Doris is as follows:
 | DATE           | 16              | DATE holds values for a calendar year, 
month and day, the  supported range is ['0000-01-01', '9999-12-31'].  Default 
print format: 'yyyy-MM-dd'. |
 | DATETIME       | 16              | A DATE and TIME combination. Format: DATETIME([P]). The optional parameter P represents the time precision, with a value range of [0, 6], supporting up to 6 decimal places (microseconds); it is 0 when not set. The supported range is ['0000-01-01 00:00:00[.000000]', '9999-12-31 23:59:59[.999999]']. Default print format: 'yyyy-MM-dd HH:mm:ss.SSSSSS'. |
 | CHAR           | M               | A FIXED length string, the parameter M 
specifies the column length in characters. The range of M is from 1 to 255. |
-| VARCHAR        | M               | A VARIABLE length string , the parameter 
M specifies the maximum string length in characters. The range of M is from 1 
to 65533.   The variable-length string is stored in UTF-8 encoding. English 
characters occupy 1 byte, and Chinese characters occupy 3 bytes. |
-| STRING         | /               | A VARIABLE length string, default 
supports 1048576 bytes (1 MB), and a limit of maximum precision of 2147483643 
bytes (2 GB).   Size can be configured string_type_length_soft_limit_bytes 
adjusted through BE.   String type can only be used in value column, not in key 
column and partition bucket column. |
-| HLL            | /               | HLL stands for HyperLogLog, is a fuzzy 
deduplication. It performs better than Count Distinct when dealing with large 
datasets.   The error rate of HLL is typically around 1%, and sometimes it can 
reach 2%. HLL cannot be used as a key column, and the aggregation type is 
HLL_UNION when creating a table.  Users do not need to specify the length or 
default value as it is internally controlled based on the aggregation level of 
the data.  HLL columns can on [...]
-| BITMAP         | /               | BITMAP type can be used in Aggregate 
tables or Unique tables.  - When used in a Unique table, BITMAP must be 
employed as non-key columns.  - When used in an Aggregate table, BITMAP must 
also serve as non-key columns, and the aggregation type must be set to 
BITMAP_UNION during table creation.  Users do not need to specify the length or 
default value as it is internally controlled based on the aggregation level of 
the data. BITMAP columns can only be qu [...]
-| QUANTILE_STATE | /               | A type used to calculate approximate 
quantile values.  When loading, it performs pre-aggregation for the same keys 
with different values. When the number of values does not exceed 2048, it 
records all data in detail. When the number of values is greater than 2048, it 
employs the TDigest algorithm to aggregate (cluster) the data and store the 
centroid points after clustering.   QUANTILE_STATE cannot be used as a key 
column and should be paired with the [...]
-| ARRAY          | /               | Arrays composed of elements of type T 
cannot be used as key columns. Currently supported for use in tables with 
Duplicate and Unique models. |
-| MAP            | /               | Maps consisting of elements of type K and 
V, cannot be used as Key columns. These maps are currently supported in tables 
using the Duplicate and Unique models. |
-| STRUCT         | /               | A structure composed of multiple Fields 
can also be understood as a collection of multiple columns. It cannot be used 
as a Key. Currently, STRUCT can only be used in tables of Duplicate models. The 
name and number of Fields in a Struct are fixed and are always Nullable.|
-| JSON           | /               | Binary JSON type, stored in binary JSON 
format, access internal JSON fields through JSON function.   Supported up to 
1048576 bytes (1MB) by default, and can be adjusted to a maximum of 2147483643 
bytes (2GB). This limit can be modified through the BE configuration parameter 
'jsonb_type_length_soft_limit_bytes'. |
-| AGG_STATE      | /               | Aggregate function can only be used with 
state/merge/union function combiners.   AGG_STATE cannot be used as a key 
column. When creating a table, the signature of the aggregate function needs to 
be declared alongside.   Users do not need to specify the length or default 
value. The actual data storage size depends on the function's implementation. |
-| VARIANT        | /               | Variant allows storing complex data 
structures containing different data types (such as integers, strings, boolean 
values, etc.) without the need to define specific columns in the table 
structure beforehand.During the writing process, this type can automatically 
infer column information based on the structure and types of the columns, 
dynamicly merge written schemas. It stores JSON keys and their corresponding 
values as columns and dynamic sub-columns. |
+| VARCHAR        | Variable Length | A variable-length string; the parameter M specifies the maximum string length in characters. The range of M is from 1 to 65533. Variable-length strings are stored in UTF-8 encoding; English characters occupy 1 byte, and Chinese characters occupy 3 bytes. |
+| STRING         | Variable Length | A variable-length string; the default limit is 1048576 bytes (1 MB), and it can be raised to a maximum of 2147483643 bytes (2 GB) via the BE configuration string_type_length_soft_limit_bytes. The STRING type can only be used in value columns, not in key columns or partition/bucket columns. |
+| ARRAY          | Variable Length | Arrays composed of elements of type T 
cannot be used as key columns. Currently supported for use in tables with 
Duplicate and Unique models. |
+| MAP            | Variable Length | Maps consisting of elements of type K and 
V, cannot be used as Key columns. These maps are currently supported in tables 
using the Duplicate and Unique models. |
+| STRUCT         | Variable Length | A structure composed of multiple Fields 
can also be understood as a collection of multiple columns. It cannot be used 
as a Key. Currently, STRUCT can only be used in tables of Duplicate models. The 
name and number of Fields in a Struct are fixed and are always Nullable.|
+| JSON           | Variable Length | Binary JSON type, stored in binary JSON 
format, access internal JSON fields through JSON function.   Supported up to 
1048576 bytes (1MB) by default, and can be adjusted to a maximum of 2147483643 
bytes (2GB). This limit can be modified through the BE configuration parameter 
'jsonb_type_length_soft_limit_bytes'. |
+| VARIANT        | Variable Length | The VARIANT data type is dynamically 
adaptable, specifically designed for semi-structured data like JSON. It can 
store any JSON object and automatically splits JSON fields into subcolumns for 
improved storage efficiency and query performance. The length limits and 
configuration methods are the same as for the STRING type. However, the VARIANT 
type can only be used in value columns and cannot be used in key columns or 
partition / bucket columns. |
+| HLL            | Variable Length | HLL stands for HyperLogLog, a fuzzy deduplication method that performs better than Count Distinct when dealing with large datasets. The error rate of HLL is typically around 1%, and sometimes it can reach 2%. HLL cannot be used as a key column, and the aggregation type is HLL_UNION when creating a table. Users do not need to specify the length or default value as it is internally controlled based on the aggregation level of the data. HLL columns can on [...]
+| BITMAP         | Variable Length | BITMAP type can be used in Aggregate 
tables or Unique tables.  - When used in a Unique table, BITMAP must be 
employed as non-key columns.  - When used in an Aggregate table, BITMAP must 
also serve as non-key columns, and the aggregation type must be set to 
BITMAP_UNION during table creation.  Users do not need to specify the length or 
default value as it is internally controlled based on the aggregation level of 
the data. BITMAP columns can only be qu [...]
+| QUANTILE_STATE | Variable Length | A type used to calculate approximate 
quantile values.  When loading, it performs pre-aggregation for the same keys 
with different values. When the number of values does not exceed 2048, it 
records all data in detail. When the number of values is greater than 2048, it 
employs the TDigest algorithm to aggregate (cluster) the data and store the 
centroid points after clustering.   QUANTILE_STATE cannot be used as a key 
column and should be paired with the [...]
+| AGG_STATE      | Variable Length | An aggregation function state that can only be used with the state/merge/union function combinators. AGG_STATE cannot be used as a key column. When creating a table, the signature of the aggregate function must be declared alongside it. Users do not need to specify the length or default value; the actual data storage size depends on the function's implementation. |
 
 You can also view all the data types supported by Doris with the `SHOW DATA TYPES;` statement.
diff --git 
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/admin-manual/compaction.md 
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/admin-manual/compaction.md
index 0c79b1d60f1..0e91d0733fb 100644
--- 
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/admin-manual/compaction.md
+++ 
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/admin-manual/compaction.md
@@ -1,6 +1,6 @@
 ---
 {
-    "title": "Compaction 调整",
+    "title": "Compaction 优化",
     "language": "zh-CN"
 }
 ---
@@ -27,7 +27,7 @@ under the License.
 
 
 Doris writes data through an LSM-Tree-like structure and continuously merges small files into ordered large files through compaction in the background, which also handles operations such as data deletion and updates. Appropriately adjusting the compaction strategy can greatly improve load and query efficiency.
-Doris provides the following 2 compaction methods for tuning:
+Doris provides the following compaction methods for tuning:
 
 
 ## Vertical compaction
@@ -84,3 +84,50 @@ Segment compaction has the following characteristics:
 - When the load operation itself has already exhausted memory resources, using segment compaction is not recommended, to avoid adding further memory pressure and failing the load.
 
 See [this link](https://github.com/apache/doris/pull/12866) for the implementation and test results of segment compaction.
+
+
+## Single replica compaction
+
+By default, compaction for multiple replicas is performed independently, and every replica consumes CPU and IO resources. With single replica compaction enabled, after one replica performs the compaction, the other replicas pull the compacted files from it, so the CPU cost is paid only once, saving N - 1 times the CPU consumption (where N is the number of replicas).
+
+Single replica compaction is specified in the table PROPERTIES via the parameter `enable_single_replica_compaction`; it defaults to false (disabled) and is enabled by setting it to true.
+
+The parameter can be specified at table creation, or modified via `ALTER TABLE table_name SET("enable_single_replica_compaction" = "true")`.
+
+## Compaction strategy
+
+The compaction strategy determines when and which small files are merged into large files. Doris currently provides 2 compaction strategies, specified by the `compaction_policy` parameter in the table properties.
+
+### size_based compaction strategy
+
+The size_based compaction strategy is the default and is suitable for most scenarios.
+
+```
+"compaction_policy" = "size_based"
+```
+
+### time_series compaction strategy
+
+The time_series compaction strategy is optimized for scenarios such as logs and time-series data. It leverages the time locality of time-series data, merging small files written at adjacent times into large files; each file participates in compaction only once, reducing the write amplification caused by repeated compaction.
+
+```
+"compaction_policy" = "time_series"
+```
+
+The time_series compaction strategy triggers the merging of small files when any of the following 3 conditions is met:
+- The size of unmerged files exceeds `time_series_compaction_goal_size_mbytes` (default 1 GB)
+- The number of unmerged files exceeds `time_series_compaction_file_count_threshold` (default 2000)
+- The time since the last compaction exceeds `time_series_compaction_time_threshold_seconds` (default 1 hour)
+
+The above parameters are set in the table PROPERTIES; they can be specified at table creation, or modified via `ALTER TABLE table_name SET("name" = "value")`.
+
+
+## Compaction concurrency control
+
+Compaction runs in the background and consumes CPU and IO resources; the consumption can be controlled via the number of concurrent compaction threads.
+
+The number of concurrent compaction threads is configured in the BE configuration file and includes the following:
+- `max_base_compaction_threads`: number of base compaction threads, default 4
+- `max_cumu_compaction_threads`: number of cumulative compaction threads, default 10
+- `max_single_replica_compaction_threads`: number of threads for pulling data files in single replica compaction, default 10
+
diff --git 
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-types/Data-Types/JSON.md
 
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-types/Data-Types/JSON.md
index fb6e4755914..751c22c0714 100644
--- 
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-types/Data-Types/JSON.md
+++ 
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-types/Data-Types/JSON.md
@@ -24,24 +24,63 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-## JSON
+The JSON data type stores [JSON](https://www.rfc-editor.org/rfc/rfc8785) data efficiently in a binary format, with its internal fields accessed through JSON functions.
 
-<version since="1.2.0">
+By default it supports 1048576 bytes (1 MB), can be raised to 2147483643 bytes (2 GB), and is adjustable via the BE configuration `string_type_length_soft_limit_bytes`.
 
-</version>
+    Compared with JSON strings stored in a regular String type, the JSON type has two advantages
+    1. JSON format validation is performed when data is written
+    2. The binary storage format is more efficient; functions such as json_extract can access JSON internal fields several times faster than the get_json_xx functions
 
-Note: In version 1.2.x the JSON type was named JSONB; to stay as compatible with MySQL as possible, it was renamed JSON starting from version 2.0.0. Old tables can still be used.
+    :::caution Note
+    In version 1.2.x the JSON type was named JSONB; to stay as compatible with MySQL as possible, it was renamed JSON starting from version 2.0.0. Old tables can still be used.
+    :::
 
-### description
-    JSON type
-        Binary JSON type, stored in binary JSON format; internal JSON fields are accessed through JSON functions. Supports 1048576 bytes (1M) by default, can be raised to 2147483643 bytes (2G), adjustable via the BE configuration `jsonb_type_length_soft_limit_bytes`
+### Syntax
 
-### note
-    Compared with JSON strings stored in a regular STRING type, the JSON type has two advantages
-    1. JSON format validation is performed when data is written
-    2. The binary storage format is more efficient; functions such as json_extract can access JSON internal fields several times faster than the get_json_xx functions
+**Definition**
+```sql
+json_column_name JSON
+```
+
+**Insert**
+- For INSERT INTO VALUES, the format is a string surrounded by quotes. For example:
+```sql
+INSERT INTO table_name(id, json_column_name) VALUES (1, '{"k1": "100"}')
+```
+
+- For STREAM LOAD, the format of the corresponding column is a string without extra surrounding quotes. For example:
+```
+12     {"k1":"v31", "k2": 300}
+13     []
+14     [123, 456]
+```
+
+**Query**
+- SELECT the entire JSON column directly
+```sql
+SELECT json_column_name FROM table_name;
+```
+
+- Extract the required fields or other information from JSON; see the JSON functions. For example:
+```sql
+SELECT json_extract(json_column_name, '$.k1') FROM table_name;
+```
+
+- The JSON type can be CAST to and from integers, strings, BOOLEAN, ARRAY, and MAP. For example:
+```sql
+SELECT CAST('{"k1": "100"}' AS JSON)
+SELECT CAST(json_column_name AS String) FROM table_name;
+SELECT CAST(json_extract(json_column_name, '$.k1') AS INT) FROM table_name;
+```
+
+:::tip
+
+The JSON type cannot currently be used in GROUP BY, ORDER BY, or comparison operations
+
+:::
 
-### example
+### Usage example
    A full-cycle example covering table creation, data loading, and querying illustrates the features and usage of the JSON data type.
 
 #### Create database and table
@@ -64,8 +103,8 @@ PROPERTIES("replication_num" = "1");
 
 ##### Load the test_json.csv test data via stream load
 
-- The test data has 2 columns: the first is id, the second is json
-- The test data has 25 rows; the json in the first 18 rows is valid, and the json in the last 7 rows is invalid
+- The test data has 2 columns: the first is ID, the second is JSON
+- The test data has 25 rows; the JSON in the first 18 rows is valid, and the JSON in the last 7 rows is invalid
 
 ```
 1      \N
@@ -143,7 +182,7 @@ curl --location-trusted -u root: -H 'max_filter_ratio: 0.3' 
-T test_json.csv htt
 }
 ```
 
-- Inspect the data loaded via stream load; the JSON-typed column j is automatically rendered as a JSON string
+- Inspect the data loaded via stream load; the JSON-typed column j is automatically rendered as a JSON String
 
 ```
 mysql> SELECT * FROM test_json ORDER BY id;
@@ -359,7 +398,7 @@ mysql> SELECT id, j, json_extract(j, '$.a1[0]'), 
json_extract(j, '$.a1[0].k1') F
 ```
 
 1. Extracting fields of a specific type
-- json_extract_string extracts string-typed fields; non-string types are converted to string
+- json_extract_string extracts String-typed fields; non-String types are converted to String
 ```
 mysql> SELECT id, j, json_extract_string(j, '$') FROM test_json ORDER BY id;
 
+------+---------------------------------------------------------------+---------------------------------------------------------------+
@@ -636,8 +675,8 @@ mysql> SELECT id, j, json_extract_bool(j, '$[1]') FROM 
test_json ORDER BY id;
 19 rows in set (0.01 sec)
 ```
 
-- json_extract_isnull extracts json null-typed fields, returning 1 for null and 0 for non-null
-- Note that json null differs from SQL NULL: SQL NULL means a field's value does not exist, while json null means the value exists but is the special value null
+- json_extract_isnull extracts JSON NULL-typed fields, returning 1 for null and 0 for non-null
+- Note that JSON NULL differs from SQL NULL: SQL NULL means a field's value does not exist, while JSON NULL means the value exists but is the special value null
 ```
 mysql> SELECT id, j, json_extract_isnull(j, '$') FROM test_json ORDER BY id;
 
+------+---------------------------------------------------------------+--------------------------------+
@@ -751,9 +790,9 @@ mysql> SELECT id, j, json_exists_path(j, '$[2]') FROM 
test_json ORDER BY id;
 
 ```
 
-##### Use json_type to get the type of a field inside json
+##### Use json_type to get the type of a field inside JSON
 
-- Returns the json field type at the json path; returns NULL if it does not exist
+- Returns the JSON field type at the json path; returns NULL if it does not exist
 ```
 mysql> SELECT id, j, json_type(j, '$') FROM test_json ORDER BY id;
 
+------+---------------------------------------------------------------+----------------------+
@@ -810,4 +849,4 @@ mysql> select id, j, json_type(j, '$.k1') from test_json 
order by id;
 ```
 
 ### keywords
-JSON, json_parse, json_parse_error_to_null, json_parse_error_to_value, 
json_extract, json_extract_isnull, json_extract_bool, json_extract_int, 
json_extract_bigint, json_extract_double, json_extract_string, 
json_exists_path, json_type
+JSON, json_parse, json_parse_error_to_null, json_parse_error_to_value, json_extract, json_extract_isnull, json_extract_bool, json_extract_int, json_extract_bigint, json_extract_double, json_extract_string, json_exists_path, json_type
diff --git 
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/table-design/data-type.md 
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/table-design/data-type.md
index f685aa7ab3b..b6e9b6f95af 100644
--- 
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/table-design/data-type.md
+++ 
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/table-design/data-type.md
@@ -28,9 +28,9 @@ Apache Doris supports standard SQL syntax, uses the MySQL network connection protocol, and is highly
 
 ## Data types
 
-The list of data types supported by Doris is as follows:
+The list of data types supported by Apache Doris is as follows:
 
-| Type name      | Bytes     | Description                                                  |
+| Type name      | Storage (bytes) | Description                                            |
 | -------------- | --------- | ------------------------------------------------------------ |
 | BOOLEAN        | 1         | Boolean value; 0 represents false, 1 represents true.        |
 | TINYINT        | 1         | Signed integer, range [-128, 127].                           |
@@ -44,16 +44,16 @@ The list of data types supported by Doris is as follows:
 | DATE           | 16        | Date type; the current value range is ['0000-01-01', '9999-12-31'], and the default print format is 'yyyy-MM-dd'. |
 | DATETIME       | 16        | Date-time type. Format: DATETIME([P]). The optional parameter P indicates the time precision, with a value range of [0, 6], i.e. up to 6 decimal places (microseconds); 0 when not set.<p>The value range is ['0000-01-01 00:00:00[.000000]', '9999-12-31 23:59:59[.999999]']. The print format is 'yyyy-MM-dd HH:mm:ss.SSSSSS'. </p>|
 | CHAR           | M         | Fixed-length string; M is the byte length of the string. The range of M is 1-255. |
-| VARCHAR        | M         | Variable-length string; M is the byte length of the string. The range of M is 1-65533. Variable-length strings are stored in UTF-8 encoding, so English characters usually occupy 1 byte and Chinese characters 3 bytes. |
-| STRING         | /         | Variable-length string; supports 1048576 bytes (1MB) by default, can be raised to 2147483643 bytes (2GB), adjustable via the BE configuration string_type_length_soft_limit_bytes. The String type can only be used in value columns, not in key columns or partition/bucket columns. |
-| HLL            | /         | HLL performs fuzzy deduplication and outperforms Count Distinct on large data volumes. The error of HLL is usually around 1% and sometimes reaches 2%. HLL cannot be used as a key column; at table creation it is paired with the aggregation type HLL_UNION.<p>Users do not need to specify the length or default value; the length is controlled internally based on the aggregation level of the data. HLL columns can only be queried or used through the companion functions hll_union_agg, hll_raw_agg, hll_cardinality, and hll_hash.</p> |
-| BITMAP         | /         | BITMAP-typed columns can be used in Aggregate or Unique tables. In a Unique table they must be non-key columns; in an Aggregate table they must be non-key columns and paired with the aggregation type BITMAP_UNION at table creation.<p>Users do not need to specify the length or default value; the length is controlled internally based on the aggregation level of the data. BITMAP columns can only be queried or used through the companion functions bitmap_union_count, bitmap_union, bitmap_hash, bitmap_hash64, etc.</p> |
-| QUANTILE_STATE | /         | QUANTILE_STATE is a type for computing approximate quantile values. During loading, it pre-aggregates different values for identical keys; when the number of values does not exceed 2048, all data is recorded in detail, and when it exceeds 2048, the TDigest algorithm is used to aggregate (cluster) the data and store the centroids after clustering. QUANTILE_STATE cannot be used as a key column; at table creation it is paired with the aggregation type QUANTILE_UNION.<p>Users do not need to specify the length or default value; the length is controlled internally based on the aggregation level of the data. QUANTILE_STATE columns can only be queried or used through the companion functions QUANTILE_PERCENT, QUANTILE_UNION, TO_QUANTILE_STATE, etc.</p> |
-| ARRAY          | /         | An array of elements of type T; cannot be used as a key column. Currently supported in tables of the Duplicate and Unique models. |
-| MAP            | /         | A map of elements of types K and V; cannot be used as a key column. Currently supported in tables of the Duplicate and Unique models. |
-| STRUCT         | /         | A structure composed of multiple Fields, which can also be understood as a collection of columns. It cannot be used as a key; currently STRUCT is only supported in tables of the Duplicate model. The names and number of Fields in a Struct are fixed and always Nullable.|
-| JSON           | /         | Binary JSON type, stored in binary JSON format; internal JSON fields are accessed through JSON functions. Supports 1048576 bytes (1MB) by default, can be raised to 2147483643 bytes (2GB), adjustable via the BE configuration jsonb_type_length_soft_limit_bytes. |
-| AGG_STATE      | /         | Aggregation function state; can only be used with the state/merge/union function combinators. AGG_STATE cannot be used as a key column; the signature of the aggregate function must be declared at table creation. Users do not need to specify the length or default value; the actual stored data size depends on the function implementation. |
-| VARIANT        | /         | VARIANT allows storing complex data structures containing different data types (such as integers, strings, and booleans) without defining specific columns in the table schema in advance. The VARIANT type is especially suitable for complex nested structures that may change at any time. During writes, it can automatically infer column information from the structure and types of the columns, dynamically merge the written schemas, and store JSON keys and their corresponding values as columns and dynamic subcolumns |
-
-You can also view all data types supported by Doris via the `SHOW DATA TYPES;` statement.
+| VARCHAR        | Variable length | Variable-length string; M is the byte length of the string. The range of M is 1-65533. Variable-length strings are stored in UTF-8 encoding, so English characters usually occupy 1 byte and Chinese characters 3 bytes. |
+| STRING         | Variable length | Variable-length string; supports 1048576 bytes (1MB) by default, can be raised to 2147483643 bytes (2GB), adjustable via the BE configuration string_type_length_soft_limit_bytes. The String type can only be used in value columns, not in key columns or partition/bucket columns. |
+| ARRAY          | Variable length | An array of elements of type T; cannot be used as a key column. Currently supported in tables of the Duplicate and Unique models. |
+| MAP            | Variable length | A map of elements of types K and V; cannot be used as a key column. Currently supported in tables of the Duplicate and Unique models. |
+| STRUCT         | Variable length | A structure composed of multiple Fields, which can also be understood as a collection of columns. It cannot be used as a key; currently STRUCT is only supported in tables of the Duplicate model. The names and number of Fields in a Struct are fixed and always Nullable.|
+| JSON           | Variable length | Binary JSON type, stored in binary JSON format; internal JSON fields are accessed through JSON functions. The length limit and configuration method are the same as for String |
+| VARIANT        | Variable length | A dynamically adaptable data type designed for semi-structured data such as JSON. It can store any JSON value, automatically splitting JSON fields into subcolumns to improve storage efficiency and query/analysis performance. The length limit and configuration method are the same as for String. The Variant type can only be used in value columns, not in key columns or partition/bucket columns.|
+| HLL            | Variable length | HLL performs fuzzy deduplication and outperforms Count Distinct on large data volumes. The error of HLL is usually around 1% and sometimes reaches 2%. HLL cannot be used as a key column; at table creation it is paired with the aggregation type HLL_UNION.<p>Users do not need to specify the length or default value; the length is controlled internally based on the aggregation level of the data. HLL columns can only be queried or used through the companion functions hll_union_agg, hll_raw_agg, hll_cardinality, and hll_hash.</p> |
+| BITMAP         | Variable length | Bitmap-typed columns can be used in Aggregate or Unique tables. In a Unique table they must be non-key columns; in an Aggregate table they must be non-key columns and paired with the aggregation type BITMAP_UNION at table creation.<p>Users do not need to specify the length or default value; the length is controlled internally based on the aggregation level of the data. BITMAP columns can only be queried or used through the companion functions bitmap_union_count, bitmap_union, bitmap_hash, bitmap_hash64, etc.</p> |
+| QUANTILE_STATE | Variable length | QUANTILE_STATE is a type for computing approximate quantile values. During loading, it pre-aggregates different values for identical keys; when the number of values does not exceed 2048, all data is recorded in detail, and when it exceeds 2048, the TDigest algorithm is used to aggregate (cluster) the data and store the centroids after clustering. QUANTILE_STATE cannot be used as a key column; at table creation it is paired with the aggregation type QUANTILE_UNION.<p>Users do not need to specify the length or default value; the length is controlled internally based on the aggregation level of the data. QUANTILE_STATE columns can only be queried or used through the companion functions QUANTILE_PERCENT, QUANTILE_UNION, TO_QUANTILE_STATE, etc.</p> |
+| AGG_STATE      | Variable length | Aggregation function state; can only be used with the state/merge/union function combinators. AGG_STATE cannot be used as a key column; the signature of the aggregate function must be declared at table creation. Users do not need to specify the length or default value; the actual stored data size depends on the function implementation. |
+
+You can also view all data types supported by Apache Doris via the `SHOW DATA TYPES;` statement.
diff --git 
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/admin-manual/compaction.md
 
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/admin-manual/compaction.md
index 0c79b1d60f1..0e91d0733fb 100644
--- 
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/admin-manual/compaction.md
+++ 
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/admin-manual/compaction.md
@@ -1,6 +1,6 @@
 ---
 {
-    "title": "Compaction 调整",
+    "title": "Compaction 优化",
     "language": "zh-CN"
 }
 ---
@@ -27,7 +27,7 @@ under the License.
 
 
 Doris writes data through an LSM-Tree-like structure and continuously merges small files into ordered large files through compaction in the background, which also handles operations such as data deletion and updates. Appropriately adjusting the compaction strategy can greatly improve load and query efficiency.
-Doris provides the following 2 compaction methods for tuning:
+Doris provides the following compaction methods for tuning:
 
 
 ## Vertical compaction
@@ -84,3 +84,50 @@ Segment compaction has the following characteristics:
 - When the load operation itself has already exhausted memory resources, using segment compaction is not recommended, to avoid adding further memory pressure and failing the load.
 
 See [this link](https://github.com/apache/doris/pull/12866) for the implementation and test results of segment compaction.
+
+
+## Single replica compaction
+
+By default, compaction for multiple replicas is performed independently, and every replica consumes CPU and IO resources. With single replica compaction enabled, after one replica performs the compaction, the other replicas pull the compacted files from it, so the CPU cost is paid only once, saving N - 1 times the CPU consumption (where N is the number of replicas).
+
+Single replica compaction is specified in the table PROPERTIES via the parameter `enable_single_replica_compaction`; it defaults to false (disabled) and is enabled by setting it to true.
+
+The parameter can be specified at table creation, or modified via `ALTER TABLE table_name SET("enable_single_replica_compaction" = "true")`.
+
+## Compaction strategy
+
+The compaction strategy determines when and which small files are merged into large files. Doris currently provides 2 compaction strategies, specified by the `compaction_policy` parameter in the table properties.
+
+### size_based compaction strategy
+
+The size_based compaction strategy is the default and is suitable for most scenarios.
+
+```
+"compaction_policy" = "size_based"
+```
+
+### time_series compaction strategy
+
+The time_series compaction strategy is optimized for scenarios such as logs and time-series data. It leverages the time locality of time-series data, merging small files written at adjacent times into large files; each file participates in compaction only once, reducing the write amplification caused by repeated compaction.
+
+```
+"compaction_policy" = "time_series"
+```
+
+The time_series compaction strategy triggers the merging of small files when any of the following 3 conditions is met:
+- The size of unmerged files exceeds `time_series_compaction_goal_size_mbytes` (default 1 GB)
+- The number of unmerged files exceeds `time_series_compaction_file_count_threshold` (default 2000)
+- The time since the last compaction exceeds `time_series_compaction_time_threshold_seconds` (default 1 hour)
+
+The above parameters are set in the table PROPERTIES; they can be specified at table creation, or modified via `ALTER TABLE table_name SET("name" = "value")`.
+
+
+## Compaction concurrency control
+
+Compaction runs in the background and consumes CPU and IO resources; the consumption can be controlled via the number of concurrent compaction threads.
+
+The number of concurrent compaction threads is configured in the BE configuration file and includes the following:
+- `max_base_compaction_threads`: number of base compaction threads, default 4
+- `max_cumu_compaction_threads`: number of cumulative compaction threads, default 10
+- `max_single_replica_compaction_threads`: number of threads for pulling data files in single replica compaction, default 10
+
diff --git 
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/sql-manual/sql-reference/Data-Types/JSON.md
 
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/sql-manual/sql-reference/Data-Types/JSON.md
index fb6e4755914..751c22c0714 100644
--- 
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/sql-manual/sql-reference/Data-Types/JSON.md
+++ 
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/sql-manual/sql-reference/Data-Types/JSON.md
@@ -24,24 +24,63 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-## JSON
+The JSON data type stores [JSON](https://www.rfc-editor.org/rfc/rfc8785) data efficiently in a binary format, with its internal fields accessed through JSON functions.
 
-<version since="1.2.0">
+By default it supports 1048576 bytes (1 MB), can be raised to 2147483643 bytes (2 GB), and is adjustable via the BE configuration `string_type_length_soft_limit_bytes`.
 
-</version>
+    Compared with JSON strings stored in a regular String type, the JSON type has two advantages
+    1. JSON format validation is performed when data is written
+    2. The binary storage format is more efficient; functions such as json_extract can access JSON internal fields several times faster than the get_json_xx functions
 
-Note: In version 1.2.x the JSON type was named JSONB; to stay as compatible with MySQL as possible, it was renamed JSON starting from version 2.0.0. Old tables can still be used.
+    :::caution Note
+    In version 1.2.x the JSON type was named JSONB; to stay as compatible with MySQL as possible, it was renamed JSON starting from version 2.0.0. Old tables can still be used.
+    :::
 
-### description
-    JSON type
-        Binary JSON type, stored in binary JSON format; internal JSON fields are accessed through JSON functions. Supports 1048576 bytes (1M) by default, can be raised to 2147483643 bytes (2G), adjustable via the BE configuration `jsonb_type_length_soft_limit_bytes`
+### Syntax
 
-### note
-    Compared with JSON strings stored in a regular STRING type, the JSON type has two advantages
-    1. JSON format validation is performed when data is written
-    2. The binary storage format is more efficient; functions such as json_extract can access JSON internal fields several times faster than the get_json_xx functions
+**Definition**
+```sql
+json_column_name JSON
+```
+
+**Insert**
+- For INSERT INTO VALUES, the format is a string surrounded by quotes. For example:
+```sql
+INSERT INTO table_name(id, json_column_name) VALUES (1, '{"k1": "100"}')
+```
+
+- For STREAM LOAD, the format of the corresponding column is a string without extra surrounding quotes. For example:
+```
+12     {"k1":"v31", "k2": 300}
+13     []
+14     [123, 456]
+```
+
+**Query**
+- SELECT the entire JSON column directly
+```sql
+SELECT json_column_name FROM table_name;
+```
+
+- Extract the required fields or other information from JSON; see the JSON functions. For example:
+```sql
+SELECT json_extract(json_column_name, '$.k1') FROM table_name;
+```
+
+- The JSON type can be CAST to and from integers, strings, BOOLEAN, ARRAY, and MAP. For example:
+```sql
+SELECT CAST('{"k1": "100"}' AS JSON)
+SELECT CAST(json_column_name AS String) FROM table_name;
+SELECT CAST(json_extract(json_column_name, '$.k1') AS INT) FROM table_name;
+```
+
+:::tip
+
+The JSON type cannot currently be used in GROUP BY, ORDER BY, or comparison operations
+
+:::
 
-### example
+### Usage example
    A full-cycle example covering table creation, data loading, and querying illustrates the features and usage of the JSON data type.
 
 #### Create database and table
@@ -64,8 +103,8 @@ PROPERTIES("replication_num" = "1");
 
 ##### Load the test_json.csv test data via stream load
 
-- The test data has 2 columns: the first is id, the second is json
-- The test data has 25 rows; the json in the first 18 rows is valid, and the json in the last 7 rows is invalid
+- The test data has 2 columns: the first is ID, the second is JSON
+- The test data has 25 rows; the JSON in the first 18 rows is valid, and the JSON in the last 7 rows is invalid
 
 ```
 1      \N
@@ -143,7 +182,7 @@ curl --location-trusted -u root: -H 'max_filter_ratio: 0.3' 
-T test_json.csv htt
 }
 ```
 
-- Inspect the data loaded via stream load; the JSON-typed column j is automatically rendered as a JSON string
+- Inspect the data loaded via stream load; the JSON-typed column j is automatically rendered as a JSON String
 
 ```
 mysql> SELECT * FROM test_json ORDER BY id;
@@ -359,7 +398,7 @@ mysql> SELECT id, j, json_extract(j, '$.a1[0]'), 
json_extract(j, '$.a1[0].k1') F
 ```
 
 1. Extracting fields of a specific type
-- json_extract_string extracts string-typed fields; non-string types are converted to string
+- json_extract_string extracts String-typed fields; non-String types are converted to String
 ```
 mysql> SELECT id, j, json_extract_string(j, '$') FROM test_json ORDER BY id;
 
+------+---------------------------------------------------------------+---------------------------------------------------------------+
@@ -636,8 +675,8 @@ mysql> SELECT id, j, json_extract_bool(j, '$[1]') FROM 
test_json ORDER BY id;
 19 rows in set (0.01 sec)
 ```
 
-- json_extract_isnull extracts json null-typed fields, returning 1 for null and 0 for non-null
-- Note that json null differs from SQL NULL: SQL NULL means a field's value does not exist, while json null means the value exists but is the special value null
+- json_extract_isnull extracts JSON NULL-typed fields, returning 1 for null and 0 for non-null
+- Note that JSON NULL differs from SQL NULL: SQL NULL means a field's value does not exist, while JSON NULL means the value exists but is the special value null
 ```
 mysql> SELECT id, j, json_extract_isnull(j, '$') FROM test_json ORDER BY id;
 
+------+---------------------------------------------------------------+--------------------------------+
@@ -751,9 +790,9 @@ mysql> SELECT id, j, json_exists_path(j, '$[2]') FROM 
test_json ORDER BY id;
 
 ```
 
-##### Use json_type to get the type of a field inside json
+##### Use json_type to get the type of a field inside JSON
 
-- Returns the json field type at the json path; returns NULL if it does not exist
+- Returns the JSON field type at the json path; returns NULL if it does not exist
 ```
 mysql> SELECT id, j, json_type(j, '$') FROM test_json ORDER BY id;
 
+------+---------------------------------------------------------------+----------------------+
@@ -810,4 +849,4 @@ mysql> select id, j, json_type(j, '$.k1') from test_json 
order by id;
 ```
 
 ### keywords
-JSON, json_parse, json_parse_error_to_null, json_parse_error_to_value, 
json_extract, json_extract_isnull, json_extract_bool, json_extract_int, 
json_extract_bigint, json_extract_double, json_extract_string, 
json_exists_path, json_type
+JSON, json_parse, json_parse_error_to_null, json_parse_error_to_value, json_extract, json_extract_isnull, json_extract_bool, json_extract_int, json_extract_bigint, json_extract_double, json_extract_string, json_exists_path, json_type
diff --git 
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/table-design/data-type.md
 
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/table-design/data-type.md
index eb7813b7553..b6e9b6f95af 100644
--- 
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/table-design/data-type.md
+++ 
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/table-design/data-type.md
@@ -28,9 +28,9 @@ Apache Doris supports standard SQL syntax, uses the MySQL network connection protocol, and is highly
 
 ## Data types
 
-The list of data types supported by Doris is as follows:
+The list of data types supported by Apache Doris is as follows:
 
-| Type name      | Bytes     | Description                                                  |
+| Type name      | Storage (bytes) | Description                                            |
 | -------------- | --------- | ------------------------------------------------------------ |
 | BOOLEAN        | 1         | Boolean value; 0 represents false, 1 represents true.        |
 | TINYINT        | 1         | Signed integer, range [-128, 127].                           |
@@ -44,15 +44,16 @@ The list of data types supported by Doris is as follows:
 | DATE           | 16        | Date type; the current value range is ['0000-01-01', '9999-12-31'], and the default print format is 'yyyy-MM-dd'. |
 | DATETIME       | 16        | Date-time type. Format: DATETIME([P]). The optional parameter P indicates the time precision, with a value range of [0, 6], i.e. up to 6 decimal places (microseconds); 0 when not set.<p>The value range is ['0000-01-01 00:00:00[.000000]', '9999-12-31 23:59:59[.999999]']. The print format is 'yyyy-MM-dd HH:mm:ss.SSSSSS'. </p>|
 | CHAR           | M         | Fixed-length string; M is the byte length of the string. The range of M is 1-255. |
-| VARCHAR        | M         | Variable-length string; M is the byte length of the string. The range of M is 1-65533. Variable-length strings are stored in UTF-8 encoding, so English characters usually occupy 1 byte and Chinese characters 3 bytes. |
-| STRING         | /         | Variable-length string; supports 1048576 bytes (1MB) by default, can be raised to 2147483643 bytes (2GB), adjustable via the BE configuration string_type_length_soft_limit_bytes. The String type can only be used in value columns, not in key columns or partition/bucket columns. |
-| HLL            | /         | HLL performs fuzzy deduplication and outperforms Count Distinct on large data volumes. The error of HLL is usually around 1% and sometimes reaches 2%. HLL cannot be used as a key column; at table creation it is paired with the aggregation type HLL_UNION.<p>Users do not need to specify the length or default value; the length is controlled internally based on the aggregation level of the data. HLL columns can only be queried or used through the companion functions hll_union_agg, hll_raw_agg, hll_cardinality, and hll_hash.</p> |
-| BITMAP         | /         | BITMAP-typed columns can be used in Aggregate or Unique tables. In a Unique table they must be non-key columns; in an Aggregate table they must be non-key columns and paired with the aggregation type BITMAP_UNION at table creation.<p>Users do not need to specify the length or default value; the length is controlled internally based on the aggregation level of the data. BITMAP columns can only be queried or used through the companion functions bitmap_union_count, bitmap_union, bitmap_hash, bitmap_hash64, etc.</p> |
-| QUANTILE_STATE | /         | QUANTILE_STATE is a type for computing approximate quantile values. During loading, it pre-aggregates different values for identical keys; when the number of values does not exceed 2048, all data is recorded in detail, and when it exceeds 2048, the TDigest algorithm is used to aggregate (cluster) the data and store the centroids after clustering. QUANTILE_STATE cannot be used as a key column; at table creation it is paired with the aggregation type QUANTILE_UNION.<p>Users do not need to specify the length or default value; the length is controlled internally based on the aggregation level of the data. QUANTILE_STATE columns can only be queried or used through the companion functions QUANTILE_PERCENT, QUANTILE_UNION, TO_QUANTILE_STATE, etc.</p> |
-| ARRAY          | /         | An array of elements of type T; cannot be used as a key column. Currently supported in tables of the Duplicate and Unique models. |
-| MAP            | /         | A map of elements of types K and V; cannot be used as a key column. Currently supported in tables of the Duplicate and Unique models. |
-| STRUCT         | /         | A structure composed of multiple Fields, which can also be understood as a collection of columns. It cannot be used as a key; currently STRUCT is only supported in tables of the Duplicate model. The names and number of Fields in a Struct are fixed and always Nullable.|
-| JSON           | /         | Binary JSON type, stored in binary JSON format; internal JSON fields are accessed through JSON functions. Supports 1048576 bytes (1MB) by default, can be raised to 2147483643 bytes (2GB), adjustable via the BE configuration jsonb_type_length_soft_limit_bytes. |
-| AGG_STATE      | /         | Aggregation function state; can only be used with the state/merge/union function combinators. AGG_STATE cannot be used as a key column; the signature of the aggregate function must be declared at table creation. Users do not need to specify the length or default value; the actual stored data size depends on the function implementation. |
-
-You can also view all data types supported by Doris via the `SHOW DATA TYPES;` statement.
+| VARCHAR        | Variable length | Variable-length string; M is the byte length of the string. The range of M is 1-65533. Variable-length strings are stored in UTF-8 encoding, so English characters usually occupy 1 byte and Chinese characters 3 bytes. |
+| STRING         | Variable length | Variable-length string; supports 1048576 bytes (1MB) by default, can be raised to 2147483643 bytes (2GB), adjustable via the BE configuration string_type_length_soft_limit_bytes. The String type can only be used in value columns, not in key columns or partition/bucket columns. |
+| ARRAY          | Variable length | An array of elements of type T; cannot be used as a key column. Currently supported in tables of the Duplicate and Unique models. |
+| MAP            | Variable length | A map of elements of types K and V; cannot be used as a key column. Currently supported in tables of the Duplicate and Unique models. |
+| STRUCT         | Variable length | A structure composed of multiple Fields, which can also be understood as a collection of columns. It cannot be used as a key; currently STRUCT is only supported in tables of the Duplicate model. The names and number of Fields in a Struct are fixed and always Nullable.|
+| JSON           | Variable length | Binary JSON type, stored in binary JSON format; internal JSON fields are accessed through JSON functions. The length limit and configuration method are the same as for String |
+| VARIANT        | Variable length | A dynamically adaptable data type designed for semi-structured data such as JSON. It can store any JSON value, automatically splitting JSON fields into subcolumns to improve storage efficiency and query/analysis performance. The length limit and configuration method are the same as for String. The Variant type can only be used in value columns, not in key columns or partition/bucket columns.|
+| HLL            | Variable length | HLL performs fuzzy deduplication and outperforms Count Distinct on large data volumes. The error of HLL is usually around 1% and sometimes reaches 2%. HLL cannot be used as a key column; at table creation it is paired with the aggregation type HLL_UNION.<p>Users do not need to specify the length or default value; the length is controlled internally based on the aggregation level of the data. HLL columns can only be queried or used through the companion functions hll_union_agg, hll_raw_agg, hll_cardinality, and hll_hash.</p> |
+| BITMAP         | Variable length | Bitmap-typed columns can be used in Aggregate or Unique tables. In a Unique table they must be non-key columns; in an Aggregate table they must be non-key columns and paired with the aggregation type BITMAP_UNION at table creation.<p>Users do not need to specify the length or default value; the length is controlled internally based on the aggregation level of the data. BITMAP columns can only be queried or used through the companion functions bitmap_union_count, bitmap_union, bitmap_hash, bitmap_hash64, etc.</p> |
+| QUANTILE_STATE | Variable length | QUANTILE_STATE is a type for computing approximate quantile values. During loading, it pre-aggregates different values for identical keys; when the number of values does not exceed 2048, all data is recorded in detail, and when it exceeds 2048, the TDigest algorithm is used to aggregate (cluster) the data and store the centroids after clustering. QUANTILE_STATE cannot be used as a key column; at table creation it is paired with the aggregation type QUANTILE_UNION.<p>Users do not need to specify the length or default value; the length is controlled internally based on the aggregation level of the data. QUANTILE_STATE columns can only be queried or used through the companion functions QUANTILE_PERCENT, QUANTILE_UNION, TO_QUANTILE_STATE, etc.</p> |
+| AGG_STATE      | Variable length | Aggregation function state; can only be used with the state/merge/union function combinators. AGG_STATE cannot be used as a key column; the signature of the aggregate function must be declared at table creation. Users do not need to specify the length or default value; the actual stored data size depends on the function implementation. |
+
+You can also view all data types supported by Apache Doris via the `SHOW DATA TYPES;` statement.
diff --git 
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/admin-manual/compaction.md
 
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/admin-manual/compaction.md
index 0c79b1d60f1..0e91d0733fb 100644
--- 
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/admin-manual/compaction.md
+++ 
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/admin-manual/compaction.md
@@ -1,6 +1,6 @@
 ---
 {
-    "title": "Compaction 调整",
+    "title": "Compaction 优化",
     "language": "zh-CN"
 }
 ---
@@ -27,7 +27,7 @@ under the License.
 
 
 Doris writes data through an LSM-Tree-like structure and continuously merges small files into ordered large files through compaction in the background, which also handles operations such as data deletion and updates. Appropriately adjusting the compaction strategy can greatly improve load and query efficiency.
-Doris provides the following 2 compaction methods for tuning:
+Doris provides the following compaction methods for tuning:
 
 
 ## Vertical compaction
@@ -84,3 +84,50 @@ Segment compaction has the following characteristics:
 - When the load operation itself has already exhausted memory resources, using segment compaction is not recommended, to avoid adding further memory pressure and failing the load.
 
 See [this link](https://github.com/apache/doris/pull/12866) for the implementation and test results of segment compaction.
+
+
+## Single replica compaction
+
+By default, compaction for multiple replicas is performed independently, and every replica consumes CPU and IO resources. With single replica compaction enabled, after one replica performs the compaction, the other replicas pull the compacted files from it, so the CPU cost is paid only once, saving N - 1 times the CPU consumption (where N is the number of replicas).
+
+Single replica compaction is specified in the table PROPERTIES via the parameter `enable_single_replica_compaction`; it defaults to false (disabled) and is enabled by setting it to true.
+
+The parameter can be specified at table creation, or modified via `ALTER TABLE table_name SET("enable_single_replica_compaction" = "true")`.
+
+## Compaction strategy
+
+The compaction strategy determines when and which small files are merged into large files. Doris currently provides 2 compaction strategies, specified by the `compaction_policy` parameter in the table properties.
+
+### size_based compaction strategy
+
+The size_based compaction strategy is the default and is suitable for most scenarios.
+
+```
+"compaction_policy" = "size_based"
+```
+
+### time_series compaction strategy
+
+The time_series compaction strategy is optimized for scenarios such as logs and time-series data. It leverages the time locality of time-series data, merging small files written at adjacent times into large files; each file participates in compaction only once, reducing the write amplification caused by repeated compaction.
+
+```
+"compaction_policy" = "time_series"
+```
+
+The time_series compaction strategy triggers the merging of small files when any of the following 3 conditions is met:
+- The size of unmerged files exceeds `time_series_compaction_goal_size_mbytes` (default 1 GB)
+- The number of unmerged files exceeds `time_series_compaction_file_count_threshold` (default 2000)
+- The time since the last compaction exceeds `time_series_compaction_time_threshold_seconds` (default 1 hour)
+
+The above parameters are set in the table PROPERTIES; they can be specified at table creation, or modified via `ALTER TABLE table_name SET("name" = "value")`.
+
+
+## Compaction concurrency control
+
+Compaction runs in the background and consumes CPU and IO resources; the consumption can be controlled via the number of concurrent compaction threads.
+
+The number of concurrent compaction threads is configured in the BE configuration file and includes the following:
+- `max_base_compaction_threads`: number of base compaction threads, default 4
+- `max_cumu_compaction_threads`: number of cumulative compaction threads, default 10
+- `max_single_replica_compaction_threads`: number of threads for pulling data files in single replica compaction, default 10
+
diff --git 
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/sql-manual/sql-types/Data-Types/JSON.md
 
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/sql-manual/sql-types/Data-Types/JSON.md
index fb6e4755914..473b39b46b4 100644
--- 
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/sql-manual/sql-types/Data-Types/JSON.md
+++ 
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/sql-manual/sql-types/Data-Types/JSON.md
@@ -24,24 +24,61 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-## JSON
+The JSON data type stores [JSON](https://www.rfc-editor.org/rfc/rfc8785) data efficiently in a binary format, with its internal fields accessed through JSON functions.
 
-<version since="1.2.0">
+By default it supports 1048576 bytes (1M), can be raised to 2147483643 bytes (2G), and is adjustable via the BE configuration `string_type_length_soft_limit_bytes`.
 
-</version>
-
-Note: In version 1.2.x the JSON type was named JSONB; to stay as compatible with MySQL as possible, it was renamed JSON starting from version 2.0.0. Old tables can still be used.
-
-### description
-    JSON type
-        Binary JSON type, stored in binary JSON format; internal JSON fields are accessed through JSON functions. Supports 1048576 bytes (1M) by default, can be raised to 2147483643 bytes (2G), adjustable via the BE configuration `jsonb_type_length_soft_limit_bytes`
-
-### note
     Compared with JSON strings stored in a regular STRING type, the JSON type has two advantages
     1. JSON format validation is performed when data is written
     2. The binary storage format is more efficient; functions such as json_extract can access JSON internal fields several times faster than the get_json_xx functions
 
-### example
+    Note: In version 1.2.x the JSON type was named JSONB; to stay as compatible with MySQL as possible, it was renamed JSON starting from version 2.0.0. Old tables can still be used.
+
+### Syntax
+
+**Definition**
+```sql
+json_column_name JSON
+```
+
+**Insert**
+- For INSERT INTO VALUES, the format is a string surrounded by quotes. For example:
+```sql
+INSERT INTO table_name(id, json_column_name) VALUES (1, '{"k1": "100"}')
+```
+
+- For STREAM LOAD, the format of the corresponding column is a string without extra surrounding quotes. For example:
+```
+12     {"k1":"v31", "k2": 300}
+13     []
+14     [123, 456]
+```
+
+**Query**
+- SELECT the entire JSON column directly
+```sql
+SELECT json_column_name FROM table_name;
+```
+
+- Extract the required fields or other information from JSON; see the JSON functions. For example:
+```sql
+SELECT json_extract(json_column_name, '$.k1') FROM table_name;
+```
+
+- The JSON type can be CAST to and from integers, strings, BOOLEAN, ARRAY, and MAP. For example:
+```sql
+SELECT CAST('{"k1": "100"}' AS JSON)
+SELECT CAST(json_column_name AS STRING) FROM table_name;
+SELECT CAST(json_extract(json_column_name, '$.k1') AS INT) FROM table_name;
+```
+
+:::tip
+
+The JSON type cannot currently be used in GROUP BY, ORDER BY, or comparison operations
+
+:::
+
+### Usage example
    A full-cycle example covering table creation, data loading, and querying illustrates the features and usage of the JSON data type.
 
 #### Create database and table
diff --git 
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/table-design/data-type.md
 
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/table-design/data-type.md
index f685aa7ab3b..b6e9b6f95af 100644
--- 
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/table-design/data-type.md
+++ 
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/table-design/data-type.md
@@ -28,9 +28,9 @@ Apache Doris supports standard SQL syntax, uses the MySQL network connection protocol, and is highly
 
 ## Data types
 
-The list of data types supported by Doris is as follows:
+The list of data types supported by Apache Doris is as follows:
 
-| Type name      | Bytes     | Description                                                  |
+| Type name      | Storage (bytes) | Description                                            |
 | -------------- | --------- | ------------------------------------------------------------ |
 | BOOLEAN        | 1         | Boolean value; 0 represents false, 1 represents true.        |
 | TINYINT        | 1         | Signed integer, range [-128, 127].                           |
@@ -44,16 +44,16 @@ The list of data types supported by Doris is as follows:
 | DATE           | 16        | Date type; the current value range is ['0000-01-01', '9999-12-31'], and the default print format is 'yyyy-MM-dd'. |
 | DATETIME       | 16        | Date-time type. Format: DATETIME([P]). The optional parameter P indicates the time precision, with a value range of [0, 6], i.e. up to 6 decimal places (microseconds); 0 when not set.<p>The value range is ['0000-01-01 00:00:00[.000000]', '9999-12-31 23:59:59[.999999]']. The print format is 'yyyy-MM-dd HH:mm:ss.SSSSSS'. </p>|
 | CHAR           | M         | Fixed-length string; M is the byte length of the string. The range of M is 1-255. |
-| VARCHAR        | M         | Variable-length string; M is the byte length of the string. The range of M is 1-65533. Variable-length strings are stored in UTF-8 encoding, so English characters usually occupy 1 byte and Chinese characters 3 bytes. |
-| STRING         | /         | Variable-length string; supports 1048576 bytes (1MB) by default, can be raised to 2147483643 bytes (2GB), adjustable via the BE configuration string_type_length_soft_limit_bytes. The String type can only be used in value columns, not in key columns or partition/bucket columns. |
-| HLL            | /         | HLL performs fuzzy deduplication and outperforms Count Distinct on large data volumes. The error of HLL is usually around 1% and sometimes reaches 2%. HLL cannot be used as a key column; at table creation it is paired with the aggregation type HLL_UNION.<p>Users do not need to specify the length or default value; the length is controlled internally based on the aggregation level of the data. HLL columns can only be queried or used through the companion functions hll_union_agg, hll_raw_agg, hll_cardinality, and hll_hash.</p> |
-| BITMAP         | /         | BITMAP-typed columns can be used in Aggregate or Unique tables. In a Unique table they must be non-key columns; in an Aggregate table they must be non-key columns and paired with the aggregation type BITMAP_UNION at table creation.<p>Users do not need to specify the length or default value; the length is controlled internally based on the aggregation level of the data. BITMAP columns can only be queried or used through the companion functions bitmap_union_count, bitmap_union, bitmap_hash, bitmap_hash64, etc.</p> |
-| QUANTILE_STATE | /         | QUANTILE_STATE is a type for computing approximate quantile values. During loading, it pre-aggregates different values for identical keys; when the number of values does not exceed 2048, all data is recorded in detail, and when it exceeds 2048, the TDigest algorithm is used to aggregate (cluster) the data and store the centroids after clustering. QUANTILE_STATE cannot be used as a key column; at table creation it is paired with the aggregation type QUANTILE_UNION.<p>Users do not need to specify the length or default value; the length is controlled internally based on the aggregation level of the data. QUANTILE_STATE columns can only be queried or used through the companion functions QUANTILE_PERCENT, QUANTILE_UNION, TO_QUANTILE_STATE, etc.</p> |
-| ARRAY          | /         | An array of elements of type T; cannot be used as a key column. Currently supported in tables of the Duplicate and Unique models. |
-| MAP            | /         | A map of elements of types K and V; cannot be used as a key column. Currently supported in tables of the Duplicate and Unique models. |
-| STRUCT         | /         | A structure composed of multiple Fields, which can also be understood as a collection of columns. It cannot be used as a key; currently STRUCT is only supported in tables of the Duplicate model. The names and number of Fields in a Struct are fixed and always Nullable.|
-| JSON           | /         | Binary JSON type, stored in binary JSON format; internal JSON fields are accessed through JSON functions. Supports 1048576 bytes (1MB) by default, can be raised to 2147483643 bytes (2GB), adjustable via the BE configuration jsonb_type_length_soft_limit_bytes. |
-| AGG_STATE      | /         | Aggregation function state; can only be used with the state/merge/union function combinators. AGG_STATE cannot be used as a key column; the signature of the aggregate function must be declared at table creation. Users do not need to specify the length or default value; the actual stored data size depends on the function implementation. |
-| VARIANT        | /         | VARIANT allows storing complex data structures containing different data types (such as integers, strings, and booleans) without defining specific columns in the table schema in advance. The VARIANT type is especially suitable for complex nested structures that may change at any time. During writes, it can automatically infer column information from the structure and types of the columns, dynamically merge the written schemas, and store JSON keys and their corresponding values as columns and dynamic subcolumns |
-
-You can also view all data types supported by Doris via the `SHOW DATA TYPES;` statement.
+| VARCHAR        | Variable length | Variable-length string; M is the byte length of the string. The range of M is 1-65533. Variable-length strings are stored in UTF-8 encoding, so English characters usually occupy 1 byte and Chinese characters 3 bytes. |
+| STRING         | Variable length | Variable-length string; supports 1048576 bytes (1MB) by default, can be raised to 2147483643 bytes (2GB), adjustable via the BE configuration string_type_length_soft_limit_bytes. The String type can only be used in value columns, not in key columns or partition/bucket columns. |
+| ARRAY          | Variable length | An array of elements of type T; cannot be used as a key column. Currently supported in tables of the Duplicate and Unique models. |
+| MAP            | Variable length | A map of elements of types K and V; cannot be used as a key column. Currently supported in tables of the Duplicate and Unique models. |
+| STRUCT         | Variable length | A structure composed of multiple Fields, which can also be understood as a collection of columns. It cannot be used as a key; currently STRUCT is only supported in tables of the Duplicate model. The names and number of Fields in a Struct are fixed and always Nullable.|
+| JSON           | Variable length | Binary JSON type, stored in binary JSON format; internal JSON fields are accessed through JSON functions. The length limit and configuration method are the same as for String |
+| VARIANT        | Variable length | A dynamically adaptable data type designed for semi-structured data such as JSON. It can store any JSON value, automatically splitting JSON fields into subcolumns to improve storage efficiency and query/analysis performance. The length limit and configuration method are the same as for String. The Variant type can only be used in value columns, not in key columns or partition/bucket columns.|
+| HLL            | Variable length | HLL performs fuzzy deduplication and outperforms Count Distinct on large data volumes. The error of HLL is usually around 1% and sometimes reaches 2%. HLL cannot be used as a key column; at table creation it is paired with the aggregation type HLL_UNION.<p>Users do not need to specify the length or default value; the length is controlled internally based on the aggregation level of the data. HLL columns can only be queried or used through the companion functions hll_union_agg, hll_raw_agg, hll_cardinality, and hll_hash.</p> |
+| BITMAP         | Variable length | Bitmap-typed columns can be used in Aggregate or Unique tables. In a Unique table they must be non-key columns; in an Aggregate table they must be non-key columns and paired with the aggregation type BITMAP_UNION at table creation.<p>Users do not need to specify the length or default value; the length is controlled internally based on the aggregation level of the data. BITMAP columns can only be queried or used through the companion functions bitmap_union_count, bitmap_union, bitmap_hash, bitmap_hash64, etc.</p> |
+| QUANTILE_STATE | Variable length | QUANTILE_STATE is a type for computing approximate quantile values. During loading, it pre-aggregates different values for identical keys; when the number of values does not exceed 2048, all data is recorded in detail, and when it exceeds 2048, the TDigest algorithm is used to aggregate (cluster) the data and store the centroids after clustering. QUANTILE_STATE cannot be used as a key column; at table creation it is paired with the aggregation type QUANTILE_UNION.<p>Users do not need to specify the length or default value; the length is controlled internally based on the aggregation level of the data. QUANTILE_STATE columns can only be queried or used through the companion functions QUANTILE_PERCENT, QUANTILE_UNION, TO_QUANTILE_STATE, etc.</p> |
+| AGG_STATE      | Variable length | Aggregation function state; can only be used with the state/merge/union function combinators. AGG_STATE cannot be used as a key column; the signature of the aggregate function must be declared at table creation. Users do not need to specify the length or default value; the actual stored data size depends on the function implementation. |
+
+You can also view all data types supported by Apache Doris via the `SHOW DATA TYPES;` statement.
diff --git a/sidebars.json b/sidebars.json
index 6cc02caaca0..a2314401d86 100644
--- a/sidebars.json
+++ b/sidebars.json
@@ -1149,17 +1149,17 @@
                                 "sql-manual/sql-types/Data-Types/CHAR",
                                 "sql-manual/sql-types/Data-Types/VARCHAR",
                                 "sql-manual/sql-types/Data-Types/STRING",
-                                "sql-manual/sql-types/Data-Types/HLL",
-                                "sql-manual/sql-types/Data-Types/BITMAP",
-                                
"sql-manual/sql-types/Data-Types/QUANTILE_STATE",
+                                "sql-manual/sql-types/Data-Types/IPV4",
+                                "sql-manual/sql-types/Data-Types/IPV6",
                                 "sql-manual/sql-types/Data-Types/ARRAY",
                                 "sql-manual/sql-types/Data-Types/MAP",
                                 "sql-manual/sql-types/Data-Types/STRUCT",
                                 "sql-manual/sql-types/Data-Types/JSON",
-                                "sql-manual/sql-types/Data-Types/AGG_STATE",
                                 "sql-manual/sql-types/Data-Types/VARIANT",
-                                "sql-manual/sql-types/Data-Types/IPV4",
-                                "sql-manual/sql-types/Data-Types/IPV6"
+                                "sql-manual/sql-types/Data-Types/HLL",
+                                "sql-manual/sql-types/Data-Types/BITMAP",
+                                
"sql-manual/sql-types/Data-Types/QUANTILE_STATE",
+                                "sql-manual/sql-types/Data-Types/AGG_STATE"
                             ]
                         }
                     ]
diff --git a/versioned_docs/version-2.0/admin-manual/compaction.md 
b/versioned_docs/version-2.0/admin-manual/compaction.md
index e48756b57e2..c103d5e4e93 100644
--- a/versioned_docs/version-2.0/admin-manual/compaction.md
+++ b/versioned_docs/version-2.0/admin-manual/compaction.md
@@ -29,10 +29,10 @@ under the License.
 
 Doris writes data through a structure similar to LSM-Tree, and continuously 
merges small files into large ordered files through compaction in the 
background. Compaction handles operations such as deletion and updating. 
 
-Appropriately adjusting the compaction strategy can greatly improve load and 
query efficiency. Doris provides the following two compaction strategies for 
tuning:
+Appropriately adjusting the compaction strategy can greatly improve load and 
query efficiency. Doris provides the following compaction strategies for tuning:
 
 
-## Vertical Compaction
+## Vertical compaction
 
 <version since="1.2.2">
 </version>
@@ -47,7 +47,7 @@ BE configuration:
 - `vertical_compaction_max_segment_size` is used to configure the size of the 
disk file after vertical compaction, the default value is 268435456 (bytes)
 
 
-## Segment Compaction
+## Segment compaction
 
 Segment compaction mainly deals with large-scale data loads. Unlike normal compaction and vertical compaction, segment compaction operates during the load process and compacts segments inside the load job. This mechanism effectively reduces the number of generated segments and avoids -238 (OLAP_ERR_TOO_MANY_SEGMENTS) errors.
 
@@ -72,3 +72,52 @@ Situations where segment compaction is not recommended:
 - When the load operation itself has exhausted memory resources, it is not recommended to use segment compaction, to avoid further increasing memory pressure and causing the load job to fail.
 
 Refer to this [link](https://github.com/apache/doris/pull/12866) for more 
information about implementation and test results.
+
+## Single replica compaction
+
+By default, each replica performs compaction independently, and every replica consumes CPU and IO resources for it. When single replica compaction is enabled, only one replica performs the compaction and the other replicas pull the compacted files from it, so the compaction CPU cost is paid only once instead of N times (where N is the number of replicas).
+
+Single replica compaction is specified in the table's PROPERTIES via the 
parameter `enable_single_replica_compaction`, which is false by default 
(disabled). To enable it, set the parameter to true.
+
+This parameter can be specified when creating the table or modified later 
using:
+```sql
+ALTER TABLE table_name SET("enable_single_replica_compaction" = "true");
+```
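+
+As a minimal sketch (the table name, schema, and bucket count below are hypothetical; only the PROPERTIES are the point), the property can also be set directly at table creation:
+```sql
+-- hypothetical schema for illustration only
+CREATE TABLE sensor_data
+(
+    `device_id` BIGINT,
+    `ts` DATETIME,
+    `metric` DOUBLE
+)
+DUPLICATE KEY(`device_id`, `ts`)
+DISTRIBUTED BY HASH(`device_id`) BUCKETS 10
+PROPERTIES (
+    "replication_num" = "3",
+    "enable_single_replica_compaction" = "true"
+);
+```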
+
+## Compaction strategy
+
+The compaction strategy determines when and which small files are merged into 
larger files. Doris currently offers two compaction strategies, specified by 
the `compaction_policy` parameter in the table properties.
+
+### Size-based compaction strategy
+
+The size-based compaction strategy is the default strategy and is suitable for 
most scenarios.
+```
+"compaction_policy" = "size_based"
+```
+
+### Time series compaction strategy
+
+The time series compaction strategy is optimized for scenarios such as logs and time-series data. It exploits the time locality of such data, merging small files written at adjacent times into larger files. Each file participates in compaction only once, which reduces the write amplification caused by repeated compaction.
+
+```
+"compaction_policy" = "time_series"
+```
+
+The time series compaction strategy is triggered when any of the following 
conditions are met:
+- The size of unmerged files exceeds `time_series_compaction_goal_size_mbytes` 
(default 1 GB).
+- The number of unmerged files exceeds 
`time_series_compaction_file_count_threshold` (default 2000).
+- The time since the last compaction exceeds 
`time_series_compaction_time_threshold_seconds` (default 1 hour).
+
+These parameters are set in the table's PROPERTIES and can be specified when 
creating the table or modified later using:
+```
+ALTER TABLE table_name SET("name" = "value");
+```
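+
+For example, a log table might enable the strategy and tune its thresholds at creation time. A sketch, with a hypothetical schema and illustrative threshold values:
+```sql
+CREATE TABLE app_log
+(
+    `ts` DATETIME,
+    `host` VARCHAR(64),
+    `msg` STRING
+)
+DUPLICATE KEY(`ts`)
+DISTRIBUTED BY HASH(`host`) BUCKETS 10
+PROPERTIES (
+    -- threshold values below are illustrative, not recommendations
+    "compaction_policy" = "time_series",
+    "time_series_compaction_goal_size_mbytes" = "2048",
+    "time_series_compaction_file_count_threshold" = "5000",
+    "time_series_compaction_time_threshold_seconds" = "7200"
+);
+```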
+
+## Compaction concurrency control
+
+Compaction runs in the background and consumes CPU and IO resources. The 
resource consumption can be controlled by adjusting the number of concurrent 
compaction threads.
+
+The number of concurrent compaction threads is configured in the BE 
configuration file, including the following parameters:
+- `max_base_compaction_threads`: Number of base compaction threads, default is 
4.
+- `max_cumu_compaction_threads`: Number of cumulative compaction threads, 
default is 10.
+- `max_single_replica_compaction_threads`: Number of threads for fetching data 
files during single replica compaction, default is 10.
diff --git 
a/versioned_docs/version-2.0/sql-manual/sql-reference/Data-Types/JSON.md 
b/versioned_docs/version-2.0/sql-manual/sql-reference/Data-Types/JSON.md
index 341e43dbd55..4ce84debda6 100644
--- a/versioned_docs/version-2.0/sql-manual/sql-reference/Data-Types/JSON.md
+++ b/versioned_docs/version-2.0/sql-manual/sql-reference/Data-Types/JSON.md
@@ -26,20 +26,57 @@ under the License.
 
 ## JSON
 
-<version since="1.2.0">
+The JSON data type stores [JSON](https://www.rfc-editor.org/rfc/rfc8785) data 
efficiently in a binary format and allows access to its internal fields through 
JSON functions.
 
-</version>
+By default, it supports up to 1048576 bytes (1 MB), and the limit can be increased to 2147483643 bytes (2 GB) by adjusting the BE configuration `string_type_length_soft_limit_bytes`.
 
-NOTICE: In version 1.2.x the data type name is JSONB. It's renamed to JSON to 
be more compatible to version 2.0.0. And the old tables can still be used.
+Compared to storing JSON strings in a regular STRING type, the JSON type has 
two main advantages:
+1. JSON format validation during data insertion.
+2. More efficient binary storage format, enabling faster access to JSON 
internal fields using functions like `json_extract`, compared to `get_json_xx` 
functions.
 
-### description
-    JSON (Binary) datatype.
-        Use binary JSON format for storage and json function to extract field. 
Default support is 1048576 bytes (1M), adjustable up to 2147483643 bytes 
(2G),and the JSONB type is also limited by the be configuration 
`jsonb_type_length_soft_limit_bytes`.
+**Note**: In version 1.2.x, the JSON type was named JSONB. To maintain 
compatibility with MySQL, it was renamed to JSON starting from version 2.0.0. 
Older tables can still use the previous name.
 
-### note
-    There are some advantanges for JSON over plain JSON STRING.
-    1. JSON syntax will be validated on write to ensure data quality
-    2. JSON binary format is more efficient. Using json_extract functions on 
JSON datatype is 2-4 times faster than get_json_xx on JSON STRING format.
+### Syntax
+
+**Definition:**
+```sql
+json_column_name JSON
+```
+
+**Insertion:**
+- Using `INSERT INTO VALUES` with the format as a string surrounded by quotes. 
For example:
+```sql
+INSERT INTO table_name(id, json_column_name) VALUES (1, '{"k1": "100"}');
+```
+
+- For STREAM LOAD, the format for the corresponding column is a string without 
additional quotes. For example:
+```
+12     {"k1":"v31", "k2": 300}
+13     []
+14     [123, 456]
+```
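+
+A sketch of loading such a tab-separated file with Stream Load (host, port, credentials, and database/table names are placeholders):
+```
+curl --location-trusted -u user:passwd \
+    -H "column_separator:\t" \
+    -T data.tsv \
+    http://fe_host:8030/api/db_name/table_name/_stream_load
+```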
+
+**Query:**
+- Directly select the entire JSON column:
+```sql
+SELECT json_column_name FROM table_name;
+```
+
+- Extract specific fields or other information from JSON using JSON functions. 
For example:
+```sql
+SELECT json_extract(json_column_name, '$.k1') FROM table_name;
+```
+
+- The JSON type can be cast to and from integers, strings, BOOLEAN, ARRAY, and 
MAP. For example:
+```sql
+SELECT CAST('{"k1": "100"}' AS JSON);
+SELECT CAST(json_column_name AS STRING) FROM table_name;
+SELECT CAST(json_extract(json_column_name, '$.k1') AS INT) FROM table_name;
+```
+
+:::tip
+The JSON type currently cannot be used for `GROUP BY`, `ORDER BY`, or 
comparison operations.
+:::
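+
+As a workaround sketch (using the placeholder column and table names above), cast an extracted field to a comparable type first:
+```sql
+SELECT CAST(json_extract(json_column_name, '$.k1') AS INT) AS k1, COUNT(*)
+FROM table_name
+GROUP BY k1;
+```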
 
 ### example
 A tutorial for JSON datatype including create table, load data and query.
diff --git a/versioned_docs/version-2.0/table-design/data-type.md 
b/versioned_docs/version-2.0/table-design/data-type.md
index 2b5822b22f2..f093502a21f 100644
--- a/versioned_docs/version-2.0/table-design/data-type.md
+++ b/versioned_docs/version-2.0/table-design/data-type.md
@@ -28,7 +28,7 @@ Apache Doris support standard SQL syntax, using MySQL Network 
Connection Protoco
 
 The list of data types supported by Doris is as follows:
 
-| Type name      | Number of bytes | Description                               
                   |
+| Type name      | Storage (bytes) | Description                                                  |
 | -------------- | --------------- | 
------------------------------------------------------------ |
 | BOOLEAN        | 1               | Boolean data type that stores only two values: 0 represents false, 1 represents true. |
 | TINYINT        | 1               | Integer value, signed range is from -128 
to 127.             |
@@ -42,15 +42,15 @@ The list of data types supported by Doris is as follows:
 | DATE           | 16              | DATE holds values for a calendar year, month, and day. The supported range is ['0000-01-01', '9999-12-31']. Default print format: 'yyyy-MM-dd'. |
 | DATETIME       | 16              | A DATE and TIME combination. Format: DATETIME([P]). The optional parameter P represents the time precision, with a value range of [0,6], supporting up to 6 decimal places (microseconds); it is 0 when not set. The supported range is ['0000-01-01 00:00:00[.000000]', '9999-12-31 23:59:59[.999999]']. Default print format: 'yyyy-MM-dd HH:mm:ss.SSSSSS'. |
 | CHAR           | M               | A FIXED length string, the parameter M 
specifies the column length in characters. The range of M is from 1 to 255. |
-| VARCHAR        | M               | A VARIABLE length string , the parameter 
M specifies the maximum string length in characters. The range of M is from 1 
to 65533.   The variable-length string is stored in UTF-8 encoding. English 
characters occupy 1 byte, and Chinese characters occupy 3 bytes. |
-| STRING         | /               | A VARIABLE length string, default 
supports 1048576 bytes (1 MB), and a limit of maximum precision of 2147483643 
bytes (2 GB).   Size can be configured string_type_length_soft_limit_bytes 
adjusted through BE.   String type can only be used in value column, not in key 
column and partition bucket column. |
-| HLL            | /               | HLL stands for HyperLogLog, is a fuzzy 
deduplication. It performs better than Count Distinct when dealing with large 
datasets.   The error rate of HLL is typically around 1%, and sometimes it can 
reach 2%. HLL cannot be used as a key column, and the aggregation type is 
HLL_UNION when creating a table.  Users do not need to specify the length or 
default value as it is internally controlled based on the aggregation level of 
the data.  HLL columns can on [...]
-| BITMAP         | /               | BITMAP type can be used in Aggregate 
tables or Unique tables.  - When used in a Unique table, BITMAP must be 
employed as non-key columns.  - When used in an Aggregate table, BITMAP must 
also serve as non-key columns, and the aggregation type must be set to 
BITMAP_UNION during table creation.  Users do not need to specify the length or 
default value as it is internally controlled based on the aggregation level of 
the data. BITMAP columns can only be qu [...]
-| QUANTILE_STATE | /               | A type used to calculate approximate 
quantile values.  When loading, it performs pre-aggregation for the same keys 
with different values. When the number of values does not exceed 2048, it 
records all data in detail. When the number of values is greater than 2048, it 
employs the TDigest algorithm to aggregate (cluster) the data and store the 
centroid points after clustering.   QUANTILE_STATE cannot be used as a key 
column and should be paired with the [...]
-| ARRAY          | /               | Arrays composed of elements of type T 
cannot be used as key columns. Currently supported for use in tables with 
Duplicate and Unique models. |
-| MAP            | /               | Maps consisting of elements of type K and 
V, cannot be used as Key columns. These maps are currently supported in tables 
using the Duplicate and Unique models. |
-| STRUCT         | /               | A structure composed of multiple Fields 
can also be understood as a collection of multiple columns. It cannot be used 
as a Key. Currently, STRUCT can only be used in tables of Duplicate models. The 
name and number of Fields in a Struct are fixed and are always Nullable.|
-| JSON           | /               | Binary JSON type, stored in binary JSON 
format, access internal JSON fields through JSON function.   Supported up to 
1048576 bytes (1MB) by default, and can be adjusted to a maximum of 2147483643 
bytes (2GB). This limit can be modified through the BE configuration parameter 
'jsonb_type_length_soft_limit_bytes'. |
-| AGG_STATE      | /               | Aggregate function can only be used with 
state/merge/union function combiners.   AGG_STATE cannot be used as a key 
column. When creating a table, the signature of the aggregate function needs to 
be declared alongside.   Users do not need to specify the length or default 
value. The actual data storage size depends on the function's implementation. |
+| VARCHAR        | Variable Length | A variable-length string; the parameter M specifies the maximum string length in characters. The range of M is from 1 to 65533. Variable-length strings are stored in UTF-8 encoding; English characters occupy 1 byte and Chinese characters occupy 3 bytes. |
+| STRING         | Variable Length | A variable-length string; supports 1048576 bytes (1 MB) by default, up to a maximum of 2147483643 bytes (2 GB). The limit can be adjusted through the BE configuration string_type_length_soft_limit_bytes. STRING columns can only be used as value columns, not as key columns or partition/bucket columns. |
+| ARRAY          | Variable Length | An array of elements of type T. It cannot be used as a key column. Currently supported in tables with the Duplicate and Unique models. |
+| MAP            | Variable Length | A map of elements of types K and V. It cannot be used as a key column. Currently supported in tables with the Duplicate and Unique models. |
+| STRUCT         | Variable Length | A structure composed of multiple fields, which can also be understood as a collection of multiple columns. It cannot be used as a key column. Currently, STRUCT is only supported in tables with the Duplicate model. The names and number of fields in a struct are fixed, and fields are always nullable. |
+| JSON           | Variable Length | Binary JSON type, stored in a binary JSON format; internal JSON fields are accessed through JSON functions. Supports up to 1048576 bytes (1 MB) by default, adjustable to a maximum of 2147483643 bytes (2 GB). This limit can be modified through the BE configuration parameter 'jsonb_type_length_soft_limit_bytes'. |
+| HLL            | Variable Length | HLL stands for HyperLogLog, a fuzzy deduplication type. It performs better than COUNT DISTINCT on large datasets. The error rate of HLL is typically around 1% and can sometimes reach 2%. HLL cannot be used as a key column, and the aggregation type is HLL_UNION when creating a table. Users do not need to specify the length or default value, as it is controlled internally based on the aggregation level of the data. HLL columns can on [...]
+| BITMAP         | Variable Length | The BITMAP type can be used in Aggregate or Unique tables. When used in a Unique table, BITMAP must be a non-key column. When used in an Aggregate table, BITMAP must also be a non-key column, and the aggregation type must be set to BITMAP_UNION during table creation. Users do not need to specify the length or default value, as it is controlled internally based on the aggregation level of the data. BITMAP columns can only be qu [...]
+| QUANTILE_STATE | Variable Length | A type used to calculate approximate quantile values. When loading, it pre-aggregates values that share the same key: when the number of values does not exceed 2048, all data is recorded in detail; when it exceeds 2048, the TDigest algorithm is used to aggregate (cluster) the data and store the centroids after clustering. QUANTILE_STATE cannot be used as a key column and should be paired with the [...]
+| AGG_STATE      | Variable Length | Aggregation state, which can only be used with the state/merge/union function combinators. AGG_STATE cannot be used as a key column, and the signature of the aggregate function must be declared when creating the table. Users do not need to specify the length or default value; the actual storage size depends on the function's implementation. |
 
 You can also view all the data types supported by Doris with the `SHOW DATA TYPES;` statement.
diff --git a/versioned_docs/version-2.1/admin-manual/compaction.md 
b/versioned_docs/version-2.1/admin-manual/compaction.md
index e48756b57e2..c103d5e4e93 100644
--- a/versioned_docs/version-2.1/admin-manual/compaction.md
+++ b/versioned_docs/version-2.1/admin-manual/compaction.md
@@ -29,10 +29,10 @@ under the License.
 
 Doris writes data through a structure similar to LSM-Tree, and continuously 
merges small files into large ordered files through compaction in the 
background. Compaction handles operations such as deletion and updating. 
 
-Appropriately adjusting the compaction strategy can greatly improve load and 
query efficiency. Doris provides the following two compaction strategies for 
tuning:
+Appropriately adjusting the compaction strategy can greatly improve load and 
query efficiency. Doris provides the following compaction strategies for tuning:
 
 
-## Vertical Compaction
+## Vertical compaction
 
 <version since="1.2.2">
 </version>
@@ -47,7 +47,7 @@ BE configuration:
 - `vertical_compaction_max_segment_size` is used to configure the size of the 
disk file after vertical compaction, the default value is 268435456 (bytes)
 
 
-## Segment Compaction
+## Segment compaction
 
 Segment compaction mainly deals with large-scale data loads. Unlike normal compaction and vertical compaction, segment compaction operates during the load process and compacts segments inside the load job. This mechanism effectively reduces the number of generated segments and avoids -238 (OLAP_ERR_TOO_MANY_SEGMENTS) errors.
 
@@ -72,3 +72,52 @@ Situations where segment compaction is not recommended:
 - When the load operation itself has exhausted memory resources, it is not recommended to use segment compaction, to avoid further increasing memory pressure and causing the load job to fail.
 
 Refer to this [link](https://github.com/apache/doris/pull/12866) for more 
information about implementation and test results.
+
+## Single replica compaction
+
+By default, each replica performs compaction independently, and every replica consumes CPU and IO resources for it. When single replica compaction is enabled, only one replica performs the compaction and the other replicas pull the compacted files from it, so the compaction CPU cost is paid only once instead of N times (where N is the number of replicas).
+
+Single replica compaction is specified in the table's PROPERTIES via the 
parameter `enable_single_replica_compaction`, which is false by default 
(disabled). To enable it, set the parameter to true.
+
+This parameter can be specified when creating the table or modified later 
using:
+```sql
+ALTER TABLE table_name SET("enable_single_replica_compaction" = "true");
+```
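+
+To verify that the property took effect (the table name is a placeholder), inspect the table definition:
+```sql
+SHOW CREATE TABLE table_name;
+```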
+
+## Compaction strategy
+
+The compaction strategy determines when and which small files are merged into 
larger files. Doris currently offers two compaction strategies, specified by 
the `compaction_policy` parameter in the table properties.
+
+### Size-based compaction strategy
+
+The size-based compaction strategy is the default strategy and is suitable for 
most scenarios.
+```
+"compaction_policy" = "size_based"
+```
+
+### Time series compaction strategy
+
+The time series compaction strategy is optimized for scenarios such as logs and time-series data. It exploits the time locality of such data, merging small files written at adjacent times into larger files. Each file participates in compaction only once, which reduces the write amplification caused by repeated compaction.
+
+```
+"compaction_policy" = "time_series"
+```
+
+The time series compaction strategy is triggered when any of the following 
conditions are met:
+- The size of unmerged files exceeds `time_series_compaction_goal_size_mbytes` 
(default 1 GB).
+- The number of unmerged files exceeds 
`time_series_compaction_file_count_threshold` (default 2000).
+- The time since the last compaction exceeds 
`time_series_compaction_time_threshold_seconds` (default 1 hour).
+
+These parameters are set in the table's PROPERTIES and can be specified when 
creating the table or modified later using:
+```
+ALTER TABLE table_name SET("name" = "value");
+```
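+
+For instance, to relax the file-count trigger on an existing table (the value is illustrative only):
+```
+ALTER TABLE table_name SET("time_series_compaction_file_count_threshold" = "5000");
+```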
+
+## Compaction concurrency control
+
+Compaction runs in the background and consumes CPU and IO resources. The 
resource consumption can be controlled by adjusting the number of concurrent 
compaction threads.
+
+The number of concurrent compaction threads is configured in the BE 
configuration file, including the following parameters:
+- `max_base_compaction_threads`: Number of base compaction threads, default is 
4.
+- `max_cumu_compaction_threads`: Number of cumulative compaction threads, 
default is 10.
+- `max_single_replica_compaction_threads`: Number of threads for fetching data 
files during single replica compaction, default is 10.
diff --git a/versioned_docs/version-2.1/sql-manual/sql-types/Data-Types/JSON.md 
b/versioned_docs/version-2.1/sql-manual/sql-types/Data-Types/JSON.md
index 341e43dbd55..4ce84debda6 100644
--- a/versioned_docs/version-2.1/sql-manual/sql-types/Data-Types/JSON.md
+++ b/versioned_docs/version-2.1/sql-manual/sql-types/Data-Types/JSON.md
@@ -26,20 +26,57 @@ under the License.
 
 ## JSON
 
-<version since="1.2.0">
+The JSON data type stores [JSON](https://www.rfc-editor.org/rfc/rfc8785) data 
efficiently in a binary format and allows access to its internal fields through 
JSON functions.
 
-</version>
+By default, it supports up to 1048576 bytes (1 MB), and the limit can be increased to 2147483643 bytes (2 GB) by adjusting the BE configuration `string_type_length_soft_limit_bytes`.
 
-NOTICE: In version 1.2.x the data type name is JSONB. It's renamed to JSON to 
be more compatible to version 2.0.0. And the old tables can still be used.
+Compared to storing JSON strings in a regular STRING type, the JSON type has 
two main advantages:
+1. JSON format validation during data insertion.
+2. More efficient binary storage format, enabling faster access to JSON 
internal fields using functions like `json_extract`, compared to `get_json_xx` 
functions.
 
-### description
-    JSON (Binary) datatype.
-        Use binary JSON format for storage and json function to extract field. 
Default support is 1048576 bytes (1M), adjustable up to 2147483643 bytes 
(2G),and the JSONB type is also limited by the be configuration 
`jsonb_type_length_soft_limit_bytes`.
+**Note**: In version 1.2.x, the JSON type was named JSONB. To maintain 
compatibility with MySQL, it was renamed to JSON starting from version 2.0.0. 
Older tables can still use the previous name.
 
-### note
-    There are some advantanges for JSON over plain JSON STRING.
-    1. JSON syntax will be validated on write to ensure data quality
-    2. JSON binary format is more efficient. Using json_extract functions on 
JSON datatype is 2-4 times faster than get_json_xx on JSON STRING format.
+### Syntax
+
+**Definition:**
+```sql
+json_column_name JSON
+```
+
+**Insertion:**
+- Using `INSERT INTO VALUES` with the format as a string surrounded by quotes. 
For example:
+```sql
+INSERT INTO table_name(id, json_column_name) VALUES (1, '{"k1": "100"}');
+```
+
+- For STREAM LOAD, the format for the corresponding column is a string without 
additional quotes. For example:
+```
+12     {"k1":"v31", "k2": 300}
+13     []
+14     [123, 456]
+```
+
+**Query:**
+- Directly select the entire JSON column:
+```sql
+SELECT json_column_name FROM table_name;
+```
+
+- Extract specific fields or other information from JSON using JSON functions. 
For example:
+```sql
+SELECT json_extract(json_column_name, '$.k1') FROM table_name;
+```
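+
+JSON path expressions can also address array elements; a sketch (assuming the column holds a JSON array, as in rows 13 and 14 of the Stream Load sample above):
+```sql
+SELECT json_extract(json_column_name, '$[0]') FROM table_name;
+```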
+
+- The JSON type can be cast to and from integers, strings, BOOLEAN, ARRAY, and 
MAP. For example:
+```sql
+SELECT CAST('{"k1": "100"}' AS JSON);
+SELECT CAST(json_column_name AS STRING) FROM table_name;
+SELECT CAST(json_extract(json_column_name, '$.k1') AS INT) FROM table_name;
+```
+
+:::tip
+The JSON type currently cannot be used for `GROUP BY`, `ORDER BY`, or 
comparison operations.
+:::
 
 ### example
 A tutorial for JSON datatype including create table, load data and query.
diff --git a/versioned_docs/version-2.1/table-design/data-type.md 
b/versioned_docs/version-2.1/table-design/data-type.md
index 47ba73ecbba..281aba8bafa 100644
--- a/versioned_docs/version-2.1/table-design/data-type.md
+++ b/versioned_docs/version-2.1/table-design/data-type.md
@@ -28,7 +28,7 @@ Apache Doris support standard SQL syntax, using MySQL Network 
Connection Protoco
 
 The list of data types supported by Doris is as follows:
 
-| Type name      | Number of bytes | Description                               
                   |
+| Type name      | Storage (bytes) | Description                                                  |
 | -------------- | --------------- | 
------------------------------------------------------------ |
 | BOOLEAN        | 1               | Boolean data type that stores only two values: 0 represents false, 1 represents true. |
 | TINYINT        | 1               | Integer value, signed range is from -128 
to 127.             |
@@ -42,16 +42,16 @@ The list of data types supported by Doris is as follows:
 | DATE           | 16              | DATE holds values for a calendar year, month, and day. The supported range is ['0000-01-01', '9999-12-31']. Default print format: 'yyyy-MM-dd'. |
 | DATETIME       | 16              | A DATE and TIME combination. Format: DATETIME([P]). The optional parameter P represents the time precision, with a value range of [0,6], supporting up to 6 decimal places (microseconds); it is 0 when not set. The supported range is ['0000-01-01 00:00:00[.000000]', '9999-12-31 23:59:59[.999999]']. Default print format: 'yyyy-MM-dd HH:mm:ss.SSSSSS'. |
 | CHAR           | M               | A FIXED length string, the parameter M 
specifies the column length in characters. The range of M is from 1 to 255. |
-| VARCHAR        | M               | A VARIABLE length string , the parameter 
M specifies the maximum string length in characters. The range of M is from 1 
to 65533.   The variable-length string is stored in UTF-8 encoding. English 
characters occupy 1 byte, and Chinese characters occupy 3 bytes. |
-| STRING         | /               | A VARIABLE length string, default 
supports 1048576 bytes (1 MB), and a limit of maximum precision of 2147483643 
bytes (2 GB).   Size can be configured string_type_length_soft_limit_bytes 
adjusted through BE.   String type can only be used in value column, not in key 
column and partition bucket column. |
-| HLL            | /               | HLL stands for HyperLogLog, is a fuzzy 
deduplication. It performs better than Count Distinct when dealing with large 
datasets.   The error rate of HLL is typically around 1%, and sometimes it can 
reach 2%. HLL cannot be used as a key column, and the aggregation type is 
HLL_UNION when creating a table.  Users do not need to specify the length or 
default value as it is internally controlled based on the aggregation level of 
the data.  HLL columns can on [...]
-| BITMAP         | /               | BITMAP type can be used in Aggregate 
tables or Unique tables.  - When used in a Unique table, BITMAP must be 
employed as non-key columns.  - When used in an Aggregate table, BITMAP must 
also serve as non-key columns, and the aggregation type must be set to 
BITMAP_UNION during table creation.  Users do not need to specify the length or 
default value as it is internally controlled based on the aggregation level of 
the data. BITMAP columns can only be qu [...]
-| QUANTILE_STATE | /               | A type used to calculate approximate 
quantile values.  When loading, it performs pre-aggregation for the same keys 
with different values. When the number of values does not exceed 2048, it 
records all data in detail. When the number of values is greater than 2048, it 
employs the TDigest algorithm to aggregate (cluster) the data and store the 
centroid points after clustering.   QUANTILE_STATE cannot be used as a key 
column and should be paired with the [...]
-| ARRAY          | /               | Arrays composed of elements of type T 
cannot be used as key columns. Currently supported for use in tables with 
Duplicate and Unique models. |
-| MAP            | /               | Maps consisting of elements of type K and 
V, cannot be used as Key columns. These maps are currently supported in tables 
using the Duplicate and Unique models. |
-| STRUCT         | /               | A structure composed of multiple Fields 
can also be understood as a collection of multiple columns. It cannot be used 
as a Key. Currently, STRUCT can only be used in tables of Duplicate models. The 
name and number of Fields in a Struct are fixed and are always Nullable.|
-| JSON           | /               | Binary JSON type, stored in binary JSON 
format, access internal JSON fields through JSON function.   Supported up to 
1048576 bytes (1MB) by default, and can be adjusted to a maximum of 2147483643 
bytes (2GB). This limit can be modified through the BE configuration parameter 
'jsonb_type_length_soft_limit_bytes'. |
-| AGG_STATE      | /               | Aggregate function can only be used with 
state/merge/union function combiners.   AGG_STATE cannot be used as a key 
column. When creating a table, the signature of the aggregate function needs to 
be declared alongside.   Users do not need to specify the length or default 
value. The actual data storage size depends on the function's implementation. |
-| VARIANT        | /               | Variant allows storing complex data 
structures containing different data types (such as integers, strings, boolean 
values, etc.) without the need to define specific columns in the table 
structure beforehand.During the writing process, this type can automatically 
infer column information based on the structure and types of the columns, 
dynamicly merge written schemas. It stores JSON keys and their corresponding 
values as columns and dynamic sub-columns. |
+| VARCHAR        | Variable Length | A variable-length string; the parameter M specifies the maximum string length in characters. The range of M is from 1 to 65533. Variable-length strings are stored in UTF-8 encoding; English characters occupy 1 byte and Chinese characters occupy 3 bytes. |
+| STRING         | Variable Length | A variable-length string; supports 1048576 bytes (1 MB) by default, up to a maximum of 2147483643 bytes (2 GB). The limit can be adjusted through the BE configuration string_type_length_soft_limit_bytes. STRING columns can only be used as value columns, not as key columns or partition/bucket columns. |
+| ARRAY          | Variable Length | An array of elements of type T. It cannot be used as a key column. Currently supported in tables with the Duplicate and Unique models. |
+| MAP            | Variable Length | A map of elements of types K and V. It cannot be used as a key column. Currently supported in tables with the Duplicate and Unique models. |
+| STRUCT         | Variable Length | A structure composed of multiple fields, which can also be understood as a collection of multiple columns. It cannot be used as a key column. Currently, STRUCT is only supported in tables with the Duplicate model. The names and number of fields in a struct are fixed, and fields are always nullable. |
+| JSON           | Variable Length | Binary JSON type, stored in a binary JSON format; internal JSON fields are accessed through JSON functions. Supports up to 1048576 bytes (1 MB) by default, adjustable to a maximum of 2147483643 bytes (2 GB). This limit can be modified through the BE configuration parameter 'jsonb_type_length_soft_limit_bytes'. |
+| VARIANT        | Variable Length | The VARIANT data type is dynamically adaptable and designed for semi-structured data such as JSON. It can store any JSON value and automatically splits JSON fields into sub-columns for improved storage efficiency and query performance. The length limits and configuration methods are the same as for the STRING type. VARIANT can only be used in value columns and cannot be used in key columns or partition/bucket columns. |
+| HLL            | Variable Length | HLL stands for HyperLogLog, a fuzzy deduplication type. It performs better than COUNT DISTINCT on large datasets. The error rate of HLL is typically around 1% and can sometimes reach 2%. HLL cannot be used as a key column, and the aggregation type is HLL_UNION when creating a table. Users do not need to specify the length or default value, as it is controlled internally based on the aggregation level of the data. HLL columns can on [...]
+| BITMAP         | Variable Length | The BITMAP type can be used in Aggregate or Unique tables. When used in a Unique table, BITMAP must be a non-key column. When used in an Aggregate table, BITMAP must also be a non-key column, and the aggregation type must be set to BITMAP_UNION during table creation. Users do not need to specify the length or default value, as it is controlled internally based on the aggregation level of the data. BITMAP columns can only be qu [...]
+| QUANTILE_STATE | Variable Length | A type used to calculate approximate quantile values. When loading, it pre-aggregates values that share the same key: when the number of values does not exceed 2048, all data is recorded in detail; when it exceeds 2048, the TDigest algorithm is used to aggregate (cluster) the data and store the centroids after clustering. QUANTILE_STATE cannot be used as a key column and should be paired with the [...]
+| AGG_STATE      | Variable Length | Aggregation state, which can only be used with the state/merge/union function combinators. AGG_STATE cannot be used as a key column, and the signature of the aggregate function must be declared when creating the table. Users do not need to specify the length or default value; the actual storage size depends on the function's implementation. |
 
 You can also view all the data types supported by Doris with the `SHOW DATA TYPES;` statement.
diff --git a/versioned_sidebars/version-2.0-sidebars.json 
b/versioned_sidebars/version-2.0-sidebars.json
index f0dccf793ef..ffcd9681862 100644
--- a/versioned_sidebars/version-2.0-sidebars.json
+++ b/versioned_sidebars/version-2.0-sidebars.json
@@ -1131,13 +1131,13 @@
                                 "sql-manual/sql-reference/Data-Types/CHAR",
                                 "sql-manual/sql-reference/Data-Types/VARCHAR",
                                 "sql-manual/sql-reference/Data-Types/STRING",
-                                "sql-manual/sql-reference/Data-Types/HLL",
-                                "sql-manual/sql-reference/Data-Types/BITMAP",
-                                
"sql-manual/sql-reference/Data-Types/QUANTILE_STATE",
                                 "sql-manual/sql-reference/Data-Types/ARRAY",
                                 "sql-manual/sql-reference/Data-Types/MAP",
                                 "sql-manual/sql-reference/Data-Types/STRUCT",
                                 "sql-manual/sql-reference/Data-Types/JSON",
+                                "sql-manual/sql-reference/Data-Types/HLL",
+                                "sql-manual/sql-reference/Data-Types/BITMAP",
+                                
"sql-manual/sql-reference/Data-Types/QUANTILE_STATE",
                                 "sql-manual/sql-reference/Data-Types/AGG_STATE"
                             ]
                         },
diff --git a/versioned_sidebars/version-2.1-sidebars.json 
b/versioned_sidebars/version-2.1-sidebars.json
index b20095f8876..498e538204e 100644
--- a/versioned_sidebars/version-2.1-sidebars.json
+++ b/versioned_sidebars/version-2.1-sidebars.json
@@ -1140,17 +1140,17 @@
                                 "sql-manual/sql-types/Data-Types/CHAR",
                                 "sql-manual/sql-types/Data-Types/VARCHAR",
                                 "sql-manual/sql-types/Data-Types/STRING",
-                                "sql-manual/sql-types/Data-Types/HLL",
-                                "sql-manual/sql-types/Data-Types/BITMAP",
-                                
"sql-manual/sql-types/Data-Types/QUANTILE_STATE",
+                                "sql-manual/sql-types/Data-Types/IPV4",
+                                "sql-manual/sql-types/Data-Types/IPV6",
                                 "sql-manual/sql-types/Data-Types/ARRAY",
                                 "sql-manual/sql-types/Data-Types/MAP",
                                 "sql-manual/sql-types/Data-Types/STRUCT",
                                 "sql-manual/sql-types/Data-Types/JSON",
-                                "sql-manual/sql-types/Data-Types/AGG_STATE",
                                 "sql-manual/sql-types/Data-Types/VARIANT",
-                                "sql-manual/sql-types/Data-Types/IPV4",
-                                "sql-manual/sql-types/Data-Types/IPV6"
+                                "sql-manual/sql-types/Data-Types/HLL",
+                                "sql-manual/sql-types/Data-Types/BITMAP",
+                                
"sql-manual/sql-types/Data-Types/QUANTILE_STATE",
+                                "sql-manual/sql-types/Data-Types/AGG_STATE"
                             ]
                         }
                     ]


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org
