Re: [PR] [doc] Update date-partition EN doc in 2.1 & dev version [doris-website]

via GitHub Mon, 08 Jul 2024 05:49:42 -0700


luzhijing commented on code in PR #824:
URL: https://github.com/apache/doris-website/pull/824#discussion_r1668560787



##########
versioned_docs/version-2.1/table-design/data-partition.md:
##########
@@ -38,31 +38,31 @@ A table consists of rows and columns:
 
 - Column: Used to describe different fields in a row of data;
 
-- Columns can be divided into two types: Key and Value. From a business 
perspective, Key and Value can correspond to dimension columns and metric 
columns, respectively. The key columns in Doris are those specified in the 
table creation statement, which are the columns following the keywords `unique 
key`, `aggregate key`, or `duplicate key`. The remaining columns are value 
columns. From the perspective of the aggregation model, rows with the same Key 
columns will be aggregated into a single row. The aggregation method for value 
columns is specified by the user during table creation. For more information on 
aggregation models, refer to the Doris [Data 
Model](../table-design/data-model/overview).
+- Columns can be divided into two types: Key and Value. From a business 
perspective, Key and Value can correspond to dimension columns and metric 
columns respectively. The key columns in Apache Doris are those specified in 
the table creation statement, which are the columns following the keywords  
`unique key`, `aggregate key`, or `duplicate key`. The remaining columns are 
value columns. From the perspective of the aggregation model, rows with the 
same Key columns will be aggregated into a single row. The aggregation method 
for value columns is specified by the user during table creation. For more 
information on aggregation models, refer to the Doris [Data 
Model](../table-design/data-model/overview).
 
 ### Partition & Tablet
 
-Doris supports two levels of data partitioning. The first level is 
Partitioning, which supports Range and List partition. The second level is 
Bucket (also known as Tablet), which supports Hash and Random . If no 
partitioning is established during table creation, Doris generates a default 
partition that is transparent to the user. When using the default partition, 
only Bucket is supported.
+Apache Doris supports two levels of data partitioning. The first level is 
partition, which supports RANGE partitioning and LIST partitioning. The second 
level is tablet (also called bucket), which supports Hash bucket and Random 
bucket. If no partition is established during table creation, Apache Doris 
generates a default partition that is transparent to the user. When using the 
default partition, only bucket is supported.
 
-In the Doris storage engine, data is horizontally partitioned into several 
tablets. Each tablet contains several rows of data. There is no overlap between 
the data in different tablets, and they are stored physically independently.
+In the Apache Doris storage engine, data is horizontally partitioned into 
several tablets. Each tablet contains several rows of data. There is no overlap 
between the data in different tablets, and they are stored physically 
independently.
 
 Multiple tablets logically belong to different partitions. A single tablet 
belongs to only one partition, while a partition contains several tablets. 
Because tablets are stored physically independently, partitions can also be 
considered physically independent. The tablet is the smallest physical storage 
unit for operations such as data movement and replication.
 
 Several partitions compose a table. The partition can be considered the 
smallest logical management unit.
 
-Benefits of Two-Level data partitioning:
+The benefits of Apache Doris's two-level data partitioning are as follows:
 
-- For dimensions with time or similar ordered values, such dimension columns 
can be used as partitioning columns. The partition granularity can be evaluated 
based on import frequency and partition data volume.
+- Columns with ordered values can be used as partitioning columns. The 
partition granularity can be evaluated based on import frequency and partition 
data volume.
 
-- Historical data deletion requirements: If there is a need to delete 
historical data (such as retaining only the data for the most recent several 
days), composite partition can be used to achieve this goal by deleting 
historical partitions. Alternatively, DELETE statements can be sent within 
specified partitions to delete data.
+- If there is a need to delete historical data (such as retaining only the 
data for the most recent several days), composite partition can be used to 
achieve this goal by deleting historical partitions. Alternatively, `DELETE` 
statements can be sent within specified partitions to delete data.
 
-- Solving data skew issues: Each partition can specify the number of buckets 
independently. For example, when partitioning by day and there are significant 
differences in data volume between days, the number of buckets for each 
partition can be specified to reasonably distribute data across different 
partitions. It is recommended to choose a column with high distinctiveness as 
the bucketing column.
+- Each partition can specify the number of buckets independently. For example, 
when data is partitioned by day and there are significant differences in data 
volume between days, the number of tablets for each partition can be specified 
to reasonably distribute data across different partitions. It is recommended to 
choose a column with high distinctiveness as the bucketing column.

Review Comment:
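   Use "buckets" here instead of "tablets" ("the number of buckets for each partition"), consistent with the first sentence of this bullet.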
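
   For context, the per-partition bucket count described in this bullet could look roughly like the DDL below. This is a minimal sketch and is not taken from the PR: the table name `events`, its columns, the partition names, and the bucket counts are all hypothetical, and `replication_num = 1` assumes a single-node test cluster.

```sql
-- Hypothetical table: first level is RANGE partitioning by day,
-- second level is hash bucketing on user_id.
CREATE TABLE events (
    event_date DATE NOT NULL,
    user_id    BIGINT NOT NULL,
    cost       BIGINT SUM DEFAULT "0"
)
AGGREGATE KEY(event_date, user_id)
PARTITION BY RANGE(event_date) (
    PARTITION p20240701 VALUES LESS THAN ("2024-07-02"),
    PARTITION p20240702 VALUES LESS THAN ("2024-07-03")
)
DISTRIBUTED BY HASH(user_id) BUCKETS 16
PROPERTIES ("replication_num" = "1");

-- A day expected to hold more data gets its own, larger bucket count.
ALTER TABLE events
ADD PARTITION p20240703 VALUES LESS THAN ("2024-07-04")
DISTRIBUTED BY HASH(user_id) BUCKETS 32;

-- Historical data is removed by dropping the whole partition.
ALTER TABLE events DROP PARTITION p20240701;
```

   Here the word "buckets" matches the `BUCKETS` keyword that sets the count per partition, which is why it reads more naturally than "tablets" in this sentence.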



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@doris.apache.org
For additional commands, e-mail: dev-h...@doris.apache.org
