This is an automated email from the ASF dual-hosted git repository.

jakevin pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris.git

The following commit(s) were added to refs/heads/master by this push:
     new aa87e31b6f [doc](cold hot separation)cold hot separation document adjustment (#15811)
aa87e31b6f is described below

commit aa87e31b6f617eb9bf3418642e08eb14ea9de941
Author: catpineapple <42031973+catpineap...@users.noreply.github.com>
AuthorDate: Tue Jan 24 23:24:28 2023 +0800

    [doc](cold hot separation)cold hot separation document adjustment (#15811)
---
 docs/en/docs/admin-manual/config/be-config.md      | 10 +++++++--
 docs/en/docs/advanced/cold_hot_separation.md       | 26 +++++++++++++++-------
 .../Alter/ALTER-STORAGE-POLICY.md                  |  2 ++
 docs/zh-CN/docs/admin-manual/config/be-config.md   |  8 ++++++-
 docs/zh-CN/docs/advanced/cold_hot_separation.md    | 16 +++++++++++--
 .../Alter/ALTER-STORAGE-POLICY.md                  |  2 ++
 6 files changed, 51 insertions(+), 13 deletions(-)

diff --git a/docs/en/docs/admin-manual/config/be-config.md b/docs/en/docs/admin-manual/config/be-config.md
index f5c01c2291..7823885289 100644
--- a/docs/en/docs/admin-manual/config/be-config.md
+++ b/docs/en/docs/admin-manual/config/be-config.md
@@ -989,8 +989,8 @@ Metrics: {"filtered_rows":0,"input_row_num":3346807,"input_rowsets_count":42,"in
 #### `file_cache_type`

 * Type: string
-* Description: Type of cache file. whole_ file_ Cache: download the entire segment file, sub_ file_ Cache: the segment file is divided into multiple files by size.
-* Default value: null
+* Description: Type of cache file.`whole_file_cache`: download the entire segment file, `sub_file_cache`: the segment file is divided into multiple files by size. if set "", no cache, please set this parameter when caching is required.
+* Default value: ""

 #### `file_cache_alive_time_sec`

@@ -998,6 +998,12 @@ Metrics: {"filtered_rows":0,"input_row_num":3346807,"input_rowsets_count":42,"in
 * Description: Save time of cache file
 * Default value: 604800 (1 week)

+#### `file_cache_max_size_per_disk`
+
+* Type: int64
+* Description: The cache occupies the disk size. Once this setting is exceeded, the cache that has not been accessed for the longest time will be deleted. If it is 0, the size is not limited. unit is bytes.
+* Default value: 0
+
 #### `max_sub_cache_file_size`

 * Type: int64
diff --git a/docs/en/docs/advanced/cold_hot_separation.md b/docs/en/docs/advanced/cold_hot_separation.md
index faac4e522c..63d1617b8f 100644
--- a/docs/en/docs/advanced/cold_hot_separation.md
+++ b/docs/en/docs/advanced/cold_hot_separation.md
@@ -1,7 +1,7 @@
 ---
 {
- "title": "cold hot separation",
- "language": "en"
+"title": "cold hot separation",
+"language": "en"
 }
 ---

@@ -51,6 +51,10 @@ The cold and hot separation supports all doris functions, but only places some d

 The storage policy is the entry to use the cold and hot separation function. Users only need to associate a storage policy with a table or partition during table creation or doris use. that is, they can use the cold and hot separation function.

+<version since="dev"></version> When creating an S3 RESOURCE, the S3 remote link verification will be performed to ensure that the RESOURCE is created correctly.
+
+In addition, fe configuration needs to be added: `enable_storage_policy=true`
+
 For example:

 ```
@@ -95,21 +99,27 @@ Or associate a storage policy with an existing partition
 ```
 ALTER TABLE create_table_partition MODIFY PARTITION (*) SET("storage_policy"="test_policy");
 ```
-For details, please refer to the resource, policy, create table, alter and other documents in the docs directory
+For details, please refer to the [resource](../sql-manual/sql-reference/Data-Definition-Statements/Create/CREATE-RESOURCE.md), [policy](../sql-manual/sql-reference/Data-Definition-Statements/Create/CREATE-POLICY.md), create table, alter and other documents in the docs directory

 ### Some restrictions

 - A single table or a single partition can only be associated with one storage policy. After association, the storage policy cannot be dropped
 - The object information associated with the storage policy does not support modifying the data storage path information, such as bucket, endpoint, and root_ Path and other information
-- Currently, the storage policy only supports creation, not deletion
+- Currently, the storage policy only supports creation and modification, not deletion

 ## Show size of objects occupied by cold data

-方式一:
-Through show proc '/backends', you can view the size of each object being uploaded to, and the RemoteUsedCapacity item.
+1. Through show proc '/backends', you can view the size of each object being uploaded to, and the RemoteUsedCapacity item.
+
+2. Through show tables from tableName, you can view the object size occupied by each table, and the RemoteDataSize item.

-方式二:
-Through show tables from tableName, you can view the object size occupied by each table, and the RemoteDataSize item.
+## cold data cache
+As above, cold data introduces the cache in order to optimize query performance. After the first hit after cooling, Doris will reload the cooled data to be's local disk. The cache has the following characteristics:
+- The cache is actually stored on the be local disk and does not occupy memory.
+- the cache can limit expansion and clean up data through LRU
+- The be parameter `file_cache_alive_time_sec` can set the maximum storage time of the cache data after it has not been accessed. The default is 604800, which is one week.
+- The be parameter `file_cache_max_size_per_disk` can set the disk size occupied by the cache. Once this setting is exceeded, the cache that has not been accessed for the longest time will be deleted. The default is 0, means no limit to the size, unit: byte.
+- The be parameter `file_cache_type` is optional `sub_file_cache` (segment the remote file for local caching) and `whole_file_cache` (the entire remote file for local caching), the default is "", means no file is cached, please set it when caching is required this parameter.

 ## Unfinished Matters

diff --git a/docs/en/docs/sql-manual/sql-reference/Data-Definition-Statements/Alter/ALTER-STORAGE-POLICY.md b/docs/en/docs/sql-manual/sql-reference/Data-Definition-Statements/Alter/ALTER-STORAGE-POLICY.md
index ed17566afd..30b44285c0 100644
--- a/docs/en/docs/sql-manual/sql-reference/Data-Definition-Statements/Alter/ALTER-STORAGE-POLICY.md
+++ b/docs/en/docs/sql-manual/sql-reference/Data-Definition-Statements/Alter/ALTER-STORAGE-POLICY.md
@@ -48,6 +48,8 @@ ALTER STORAGE POLICY has_test_policy_to_alter PROPERTIES("cooldown_datetime" = "
 2. Modify the name to coolown_countdown of hot and cold separation data migration of ttl
 ```sql
 ALTER STORAGE POLICY has_test_policy_to_alter PROPERTIES ("cooldown_ttl" = "10000");
+ALTER STORAGE POLICY has_test_policy_to_alter PROPERTIES ("cooldown_ttl" = "1h");
+ALTER STORAGE POLICY has_test_policy_to_alter PROPERTIES ("cooldown_ttl" = "3d");
 ```

 ### Keywords
diff --git a/docs/zh-CN/docs/admin-manual/config/be-config.md b/docs/zh-CN/docs/admin-manual/config/be-config.md
index ef8a498f51..754c11294e 100644
--- a/docs/zh-CN/docs/admin-manual/config/be-config.md
+++ b/docs/zh-CN/docs/admin-manual/config/be-config.md
@@ -1003,7 +1003,7 @@ Metrics: {"filtered_rows":0,"input_row_num":3346807,"input_rowsets_count":42,"in
 #### `file_cache_type`

 * 类型:string
-* 描述:缓存文件的类型。whole_file_cache:将segment文件整个下载,sub_file_cache:将segment文件按大小切分成多个文件。
+* 描述:缓存文件的类型。`whole_file_cache`:将segment文件整个下载,`sub_file_cache`:将segment文件按大小切分成多个文件。设置为"",则不缓存文件,需要缓存的时候请设置此参数。
 * 默认值:""

 #### `file_cache_alive_time_sec`
@@ -1012,6 +1012,12 @@ Metrics: {"filtered_rows":0,"input_row_num":3346807,"input_rowsets_count":42,"in
 * 描述:缓存文件的保存时间,单位:秒
 * 默认值:604800(1个星期)

+#### `file_cache_max_size_per_disk`
+
+* 类型:int64
+* 描述:缓存占用磁盘大小,一旦超过这个设置,会删除最久未访问的缓存,为0则不限制大小。单位字节
+* 默认值:0
+
 #### `max_sub_cache_file_size`

 * 类型:int64
diff --git a/docs/zh-CN/docs/advanced/cold_hot_separation.md b/docs/zh-CN/docs/advanced/cold_hot_separation.md
index 02d6aad789..1c9a84b745 100644
--- a/docs/zh-CN/docs/advanced/cold_hot_separation.md
+++ b/docs/zh-CN/docs/advanced/cold_hot_separation.md
@@ -51,6 +51,10 @@ under the License.

 存储策略是使用冷热分离功能的入口,用户只需要在建表或使用doris过程中,给表或分区关联上storage policy,即可以使用冷热分离的功能。

+<version since="dev"></version> 创建S3 RESOURCE的时候,会进行S3远端的链接校验,以保证RESOURCE创建的正确。
+
+此外,需要新增fe配置:`enable_storage_policy=true`
+
 例如:

 ```
@@ -95,13 +99,13 @@ ALTER TABLE create_table_not_have_policy set ("storage_policy" = "test_policy");
 ```
 ALTER TABLE create_table_partition MODIFY PARTITION (*) SET("storage_policy"="test_policy");
 ```
-具体可以参考docs目录下resource、policy、create table、alter等文档,里面有详细介绍
+具体可以参考docs目录下[resource](../sql-manual/sql-reference/Data-Definition-Statements/Create/CREATE-RESOURCE.md)、[policy](../sql-manual/sql-reference/Data-Definition-Statements/Create/CREATE-POLICY.md)、create table、alter等文档,里面有详细介绍

 ### 一些限制

 - 单表或单partition只能关联一个storage policy,关联后不能drop掉storage policy
 - storage policy关联的对象信息不支持修改数据存储path的信息,比如bucket、endpoint、root_path等信息
-- storage policy目前只支持创建,不支持删除
+- storage policy目前只支持创建和修改,不支持删除

 ## 冷数据占用对象大小
 方式一:
@@ -110,6 +114,14 @@ ALTER TABLE create_table_partition MODIFY PARTITION (*) SET("storage_policy"="te
 方式二:
 通过show tablets from tableName可以查看到表的每个tablet占用的对象大小,RemoteDataSize项

+## 冷数据的cache
+上文提到冷数据为了优化查询的性能和对象存储资源节省,引入了cache的概念。在冷却后首次命中,Doris会将已经冷却的数据又重新加载到be的本地磁盘,cache有以下特性:
+- cache实际存储于be磁盘,不占用内存空间。
+- cache可以限制膨胀,通过LRU进行数据的清理
+- be参数`file_cache_alive_time_sec`可以设置cache数据再未被访问后的最大保存时间,默认是604800,即一周。
+- be参数`file_cache_max_size_per_disk` 可以设置cache占用磁盘大小,一旦超过这个设置,会删除最久未访问cache,默认是0,单位:字节,即不限制大小。
+- be参数`file_cache_type` 可选项`sub_file_cache`(切分远端文件进行本地缓存)和`whole_file_cache`(整个远端文件进行本地缓存),默认为"",即不缓存文件,需要缓存的时候请设置此参数。
+
 ## 未尽事项


diff --git a/docs/zh-CN/docs/sql-manual/sql-reference/Data-Definition-Statements/Alter/ALTER-STORAGE-POLICY.md b/docs/zh-CN/docs/sql-manual/sql-reference/Data-Definition-Statements/Alter/ALTER-STORAGE-POLICY.md
index eab0fbb3c1..90273e49f5 100644
--- a/docs/zh-CN/docs/sql-manual/sql-reference/Data-Definition-Statements/Alter/ALTER-STORAGE-POLICY.md
+++ b/docs/zh-CN/docs/sql-manual/sql-reference/Data-Definition-Statements/Alter/ALTER-STORAGE-POLICY.md
@@ -48,6 +48,8 @@ ALTER STORAGE POLICY has_test_policy_to_alter PROPERTIES("cooldown_datetime" = "
 2. 修改名为 cooldown_ttl的冷热分离数据迁移倒计时
 ```sql
 ALTER STORAGE POLICY has_test_policy_to_alter PROPERTIES ("cooldown_ttl" = "10000");
+ALTER STORAGE POLICY has_test_policy_to_alter PROPERTIES ("cooldown_ttl" = "1h");
+ALTER STORAGE POLICY has_test_policy_to_alter PROPERTIES ("cooldown_ttl" = "3d");
 ```

 ### Keywords

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org
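
Editor's note: the cache rules this patch documents for `file_cache_alive_time_sec` and `file_cache_max_size_per_disk` can be illustrated with a toy model. The sketch below is not Doris code. The class `FileCacheSketch` and its methods are hypothetical names made up for illustration, and it assumes only what the patched text states: entries unused for longer than the alive time are dropped, a size limit of 0 means unlimited, and once the limit is exceeded the entries that have gone longest without access are deleted first.

```python
from collections import OrderedDict


class FileCacheSketch:
    """Toy model of the documented cache rules (hypothetical, not Doris code)."""

    def __init__(self, max_size_bytes=0, alive_time_sec=604800):
        self.max_size_bytes = max_size_bytes  # 0 means "no size limit"
        self.alive_time_sec = alive_time_sec  # default mirrors the documented 1 week
        # name -> (size_in_bytes, last_access_time); least recently used first
        self._entries = OrderedDict()

    def used_bytes(self):
        return sum(size for size, _ in self._entries.values())

    def access(self, name, size, now):
        """Record a load or hit of `name` at time `now` (in seconds)."""
        self._entries.pop(name, None)
        self._entries[name] = (size, now)  # most recently used goes last
        self._evict(now)

    def _evict(self, now):
        # Rule 1: drop entries not accessed within alive_time_sec.
        for name, (_, last_access) in list(self._entries.items()):
            if now - last_access > self.alive_time_sec:
                del self._entries[name]
        # Rule 2: with a size limit set, drop least recently accessed entries
        # until the total size is no longer above the limit.
        if self.max_size_bytes > 0:
            while self._entries and self.used_bytes() > self.max_size_bytes:
                self._entries.popitem(last=False)

    def names(self):
        return list(self._entries)
```

With a 100-byte limit, loading `a` (60 bytes), `b` (30 bytes), touching `a` again, and then loading `c` (40 bytes) evicts `b`, because it is the entry that has gone longest without access.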