This is an automated email from the ASF dual-hosted git repository.

kassiez pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris-website.git


The following commit(s) were added to refs/heads/master by this push:
     new 0d9a8d4baa5 [Fix](dictionary) Fix wrong description in dictionary docs (#2352)
0d9a8d4baa5 is described below

commit 0d9a8d4baa572299533fea14c76be451feee42ea
Author: zclllyybb <zhaochan...@selectdb.com>
AuthorDate: Tue May 6 12:22:43 2025 +0800

    [Fix](dictionary) Fix wrong description in dictionary docs (#2352)
    
    ## Versions
    
    - [x] dev
    - [ ] 3.0
    - [ ] 2.1
    - [ ] 2.0
    
    ## Languages
    
    - [x] Chinese
    - [x] English
    
    ## Docs Checklist
    
    - [ ] Checked by AI
    - [ ] Test Cases Built
---
 docs/query-acceleration/dictionary.md                     | 13 +++++++++----
 .../current/query-acceleration/dictionary.md              | 15 ++++++++++-----
 2 files changed, 19 insertions(+), 9 deletions(-)

diff --git a/docs/query-acceleration/dictionary.md b/docs/query-acceleration/dictionary.md
index cd79b71625e..09ab51c28f7 100644
--- a/docs/query-acceleration/dictionary.md
+++ b/docs/query-acceleration/dictionary.md
@@ -196,7 +196,7 @@ Currently, two layout types are supported:
 
 |Property Name|Value Type|Meaning|Required|
 |-|-|-|-|
-|`date_lifetime`|Integer, unit in seconds|Data validity period. When the time since the last update of this dictionary exceeds this value, it will automatically initiate a import. The import logic is detailed in [Automatic Import](#automatic-import)|Yes|
+|`date_lifetime`|Integer, unit in seconds|Data validity period. When the time since the last update of this dictionary exceeds this value and the source table has data changes, it will automatically initiate a import. The import logic is detailed in [Automatic Import](#automatic-import)|Yes|
 |`skip_null_key`|Boolean|If the Key column contains null values when load to a dictionary, skip the row if the value is `true`, otherwise raise an error. The default value is `false`|No|
 |`memory_limit`|Integer, unit in bytes|The upper limit of memory occupied by this dictionary on a single BE. The deafult value is `2147483648`, which equals to 2GB.|No|
 
@@ -248,7 +248,11 @@ Automatic import occurs at the following times:
 
 1. After the dictionary is established
 2. When the dictionary data expires (see [Property](#property))
-3. When the BE state shows the loss of the dictionary data (new BE going online, or old BE restarting, etc.)
+3. When the BE state shows the lack of the dictionary data (new BE going online, or old BE restarting, etc.)
+
+Doris will check all dictionary data for expiration every `dictionary_auto_refresh_interval_seconds` seconds. When a dictionary has not been updated for more than `data_lifetime` seconds, and the source table data has changed compared to the last import, Doris will automatically submit the import for that dictionary.
+
+If some BEs are missing data and the source table data has not changed compared to the last import, Doris will only fill in the current version of the data on the corresponding BEs, will not submit the refresh task for all BEs, and the dictionary's version will not change.
 
 #### Manual Import
 
@@ -290,7 +294,7 @@ Among:
 - `<query_key_values>` is a STRUCT that contains all Key columns of the data to be queried in a dictionary.
 
 The return type of `dict_get` is the dictionary column type corresponding to `<query_column>`.
-The return type of `dict_get_many` is a [STRUCT](../sql-manual/sql-data-types/semi-structured/STRUCT) corresponding to the types of various dictionary columns in `<query_columns>`。
+The return type of `dict_get_many` is a [STRUCT](../sql-manual/basic-element/sql-data-types/semi-structured/STRUCT) corresponding to the types of various dictionary columns in `<query_columns>`。
 
 #### Query Example
 
@@ -347,6 +351,7 @@ returns type of `STRUCT<float, varchar>`。
 1. When the query Key data does not exist in the dictionary table, **or the Key data is null**, return null.
 2. For IP_TRIE type queries, **`<query_key_value>` type must be `IPV4` or `IPV6`**.
 3. When using an IP_TRIE type dictionary, the data in the Key column `<key_column>` and the `<query_key_value>` used for querying both support `IPV4` and `IPV6` format data.
+4. When a specific BE lacks dictionary data due to reasons such as new launch or restart, executing a query using corresponding dictionary on that BE will fail. Whether the query is scheduled to that BE depends on various factors. Reducing the value of the configuration item `dictionary_auto_refresh_interval_seconds` when the FE Master is not under heavy pressure can shorten the time when the dictionary is unavailable.
 
 ### Dictionary Management
 
@@ -379,7 +384,7 @@ The dictionary table supports the following configuration items, all of which ar
 1. `dictionary_task_queue_size` —— The queue length of the thread pool for all tasks in the dictionary is not dynamically adjustable. The default value is 1024, and it is generally not necessary to adjust it.
 2. `job_dictionary_task_consumer_thread_num` —— The number of threads in the thread pool for all tasks in the dictionary is not dynamically adjustable. Default value is 3.
 3. `dictionary_rpc_timeout_ms` —— The timeout duration for all related RPCs in the dictionary can be dynamically adjusted. The default is 5000 (i.e., 5 seconds), and it generally does not need to be adjusted.
-4. `dictionary_auto_refresh_interval_seconds` —— The interval for automatically checking if all dictionary data is up to date is default 60 (seconds), and it can be dynamically adjusted.
+4. `dictionary_auto_refresh_interval_seconds` —— The interval for automatically checking if all dictionary data is up to date is default 5 (seconds), and it can be dynamically adjusted.
 
 ### Status Display
 
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/query-acceleration/dictionary.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/query-acceleration/dictionary.md
index 9f2a91704b5..dd5fb9a9f52 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/query-acceleration/dictionary.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/query-acceleration/dictionary.md
@@ -196,7 +196,7 @@ PROPERTIES(
 
 |属性名|值类型|含义|必须项|
 |-|-|-|-|
-|`date_lifetime`|整数,单位为秒|数据有效期。当该字典上次更新距今时间超过该值时,将会自动发起重新导入,导入逻辑详见[自动导入](#自动导入)|是|
+|`date_lifetime`|整数,单位为秒|数据有效期。当该字典上次更新距今时间超过该值且基表有数据变化时,将会自动发起重新导入,导入逻辑详见[自动导入](#自动导入)|是|
 |`skip_null_key`|布尔值|向字典导入时如果 Key 列中出现 null 值,如果该值为 `true`,跳过该行数据,否则报错。缺省值为 `false`|否|
 |`memory_limit`|整数,单位为 byte|该字典在单一 BE 上所占内存的上限,缺省值为 `2147483648` 即 2GB|否|
 
@@ -240,7 +240,7 @@ PROPERTIES('data_lifetime' = '600');
 
 ### 导入(刷新)数据
 
-字典支持自动与手动导入。字典的导入也称为”刷新“操作。
+字典支持自动与手动导入。字典的导入也被称为“刷新”操作。
 
 #### 自动导入
 
@@ -248,7 +248,11 @@ PROPERTIES('data_lifetime' = '600');
 
 1. 字典建立以后
 2. 字典数据过期时(见[属性](#属性))
-3. BE 状态显示丢失该字典数据(有新 BE 上线,或旧 BE 重启等均有可能造成)
+3. BE 状态显示缺少该字典数据(有新 BE 上线,或旧 BE 重启等均有可能造成)
+
+Doris 将每隔 `dictionary_auto_refresh_interval_seconds` 秒检查所有字典数据是否过期。当某字典未更新数据超过 `data_lifetime` 秒,且**基表数据相比上次导入时有变化**时,Doris 将会自动提交对该字典的导入。
+
+如果部分 BE 缺少数据,且基表数据相比上次导入没有变化,则 Doris 仅会在对应 BE 上补齐当前版本的数据,不会提交全体 BE 的刷新任务,字典的 version 也不会变化。
 
 #### 手动导入
 
@@ -290,7 +294,7 @@ dict_get_many("<db_name>.<dict_name>", <query_columns>, <query_key_values>);
 - `<query_key_values>` 为一个包含该字典**所有 key 列**的需查询数据的 STRUCT
 
 `dict_get` 的返回类型为 `<query_column>` 对应的字典列类型。
-`dict_get_many` 的返回类型为 `<query_columns>` 对应的各个字典列类型所组成的 [STRUCT](../sql-manual/sql-data-types/semi-structured/STRUCT)。
+`dict_get_many` 的返回类型为 `<query_columns>` 对应的各个字典列类型所组成的 [STRUCT](../sql-manual/basic-element/sql-data-types/semi-structured/STRUCT)。
 
 #### 查询示例
 
@@ -347,6 +351,7 @@ SELECT dict_get_many("test_db.multi_key_dict", ["k2", "k3"], struct(2, 'ABC'));
 1. 当查询的 Key 数据不存在于字典表内,**或 Key 数据为 null 时**,返回 null。
 2. IP_TRIE 类型进行查询时,**`<query_key_value>` 类型必须为 `IPV4` 或 `IPV6`**。
 3. 使用 IP_TRIE 类型字典时,key 列 `<key_column>` 内的数据和查询时使用的 `<query_key_value>` 同时支持 `IPV4` 和 `IPV6` 格式数据。
+4. 当特定 BE 因为新上线或宕机重启等原因没有字典数据时,如果在该 BE 上执行对应字典的查询将会失败。查询是否调度到该 BE 取决于多种因素。在 FE Master 压力不大时减小[配置项](#配置项) `dictionary_auto_refresh_interval_seconds` 的值可以缩短字典不可用时间。
 
 ### 字典表管理
 
@@ -379,7 +384,7 @@ SELECT dict_get_many("test_db.multi_key_dict", ["k2", "k3"], struct(2, 'ABC'));
 1. `dictionary_task_queue_size` —— 字典所有任务的线程池的队列长度,不可动态调整。默认值 1024,一般不需要调整。
 2. `job_dictionary_task_consumer_thread_num` —— 字典所有任务的线程池的线程数量,不可动态调整。默认值 3。
 3. `dictionary_rpc_timeout_ms` —— 字典所有相关 rpc 的超时时间,可以动态调整。默认 5000(即 5s),一般不需要调整。
-4. `dictionary_auto_refresh_interval_seconds` —— 自动检查所有字典数据是否过期的间隔,默认 60(秒),可以动态调整。
+4. `dictionary_auto_refresh_interval_seconds` —— 自动检查所有字典数据是否过期的间隔,默认 5(秒),可以动态调整。
 
 ### 状态显示
 

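The automatic-import behaviour described in this commit can be read as a small decision rule evaluated on each periodic check. The Python below is only an illustrative sketch under stated assumptions, not Doris code: `Action`, `DictState`, and `decide` are hypothetical names introduced here. The one case the changed docs leave open (a BE is missing data while the source table has changed but the dictionary has not yet expired) is returned as `UNSPECIFIED` rather than guessed.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    NONE = "none"                      # nothing to do on this check
    FULL_REFRESH = "full_refresh"      # submit an import for the dictionary
    FILL_MISSING_BES = "fill_missing"  # fill current version on lagging BEs only
    UNSPECIFIED = "unspecified"        # not covered by the documented text


@dataclass
class DictState:
    """Hypothetical snapshot of one dictionary at check time."""
    seconds_since_update: int
    data_lifetime: int
    source_changed: bool    # source table changed since the last import
    bes_missing_data: bool  # at least one BE lacks this dictionary's data


def decide(state: DictState) -> Action:
    # "Not updated for more than data_lifetime seconds" means strictly greater.
    expired = state.seconds_since_update > state.data_lifetime
    if expired and state.source_changed:
        return Action.FULL_REFRESH
    if state.bes_missing_data and not state.source_changed:
        # Only the affected BEs are filled; the dictionary version is unchanged.
        return Action.FILL_MISSING_BES
    if state.bes_missing_data:
        # Source changed but not expired: the docs do not say what happens.
        return Action.UNSPECIFIED
    return Action.NONE


if __name__ == "__main__":
    print(decide(DictState(700, 600, True, False)))
    print(decide(DictState(100, 600, False, True)))
```

Under this reading, an expired dictionary whose source table is unchanged triggers nothing, which is the correction the commit makes to the `data_lifetime` description.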
