airborne12 opened a new pull request, #60790:
URL: https://github.com/apache/doris/pull/60790

   ### What problem does this PR solve?
   
   Problem Summary:
   
   This PR adds searcher cache reuse and a DSL result cache for the `search()` 
function to improve query performance on repeated search queries against the 
same segments.
   
   **Key changes:**
   
   1. **DSL result cache**: Caches the final roaring bitmap per (segment, DSL) 
pair so repeated identical `search()` queries skip Lucene execution entirely. 
Uses length-prefix key encoding to avoid hash collisions.
   
   2. **Deep-copy bitmap semantics**: Bitmaps are deep-copied on both cache 
read and write to prevent `mask_out_null()` from polluting cached entries.
   
   3. **Type-safe cache accessor**: Replaces raw `void*` return with a template 
`get_value<T>()` that uses `static_assert` to ensure T derives from 
`LRUCacheValueBase`.
   
   4. **Session-level cache toggle**: Adds `enable_search_function_query_cache` 
session variable (default: true) to allow disabling the cache per query via 
`SET_VAR`.
   
   5. **Const-correctness fix**: Removes unsafe `const_cast` in 
`build_dsl_signature` by copying TSearchParam before Thrift serialization.
   
   6. **Defensive improvements**: Adds a null check for `result_bitmap` on cache 
hit, plus logging for the serialization fallback and cache bypass paths.
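
As a rough illustration of the length-prefix key encoding in item 1, the sketch below builds a key for a (segment, DSL) pair by writing each field's length before its bytes. The helper names `append_field`, `make_cache_key`, and `naive_key` are hypothetical and are not taken from this PR; the actual key layout in the patch may differ.

```cpp
#include <cstdint>
#include <string>
#include <string_view>

// Hypothetical helper: append a 4-byte little-endian length, then the bytes.
void append_field(std::string& key, std::string_view field) {
    const auto len = static_cast<uint32_t>(field.size());
    for (int shift = 0; shift < 32; shift += 8) {
        key.push_back(static_cast<char>((len >> shift) & 0xFF));
    }
    key.append(field);
}

// Hypothetical key builder for a (segment, DSL) pair.
std::string make_cache_key(std::string_view segment_id, std::string_view dsl) {
    std::string key;
    key.reserve(8 + segment_id.size() + dsl.size());
    append_field(key, segment_id);
    append_field(key, dsl);
    return key;
}

// Plain concatenation, shown only for comparison: field boundaries are lost.
std::string naive_key(std::string_view segment_id, std::string_view dsl) {
    return std::string(segment_id) + std::string(dsl);
}
```

With plain concatenation, ("seg1", "0title:a") and ("seg10", "title:a") produce the same string; with the length prefixes the two keys differ because the field boundaries are part of the key.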
   
   ### Release note
   
   Add DSL result cache for search() function to skip repeated Lucene execution 
on identical queries.
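
A minimal sketch of how a cache hit can skip execution while keeping cached bitmaps isolated from callers is shown below. It is a toy model, not the code in this PR: `Bitmap` is a `std::set` stand-in for a roaring bitmap, `CacheValueBase`, `DslResultValue`, and `ToyDslCache` are hypothetical names, and there is no LRU eviction. It illustrates the deep copy on both write and read and a `get_value<T>()` accessor guarded by `static_assert`.

```cpp
#include <cstdint>
#include <memory>
#include <optional>
#include <set>
#include <string>
#include <type_traits>
#include <unordered_map>
#include <utility>

// Stand-in for a roaring bitmap so the sketch has no external dependencies.
using Bitmap = std::set<uint32_t>;

// Hypothetical base class, playing the role of LRUCacheValueBase.
struct CacheValueBase {
    virtual ~CacheValueBase() = default;
};

struct DslResultValue : CacheValueBase {
    std::shared_ptr<Bitmap> bitmap;
};

class ToyDslCache {
public:
    // Write path: store a private copy so later caller mutations are not cached.
    void insert(const std::string& key, const Bitmap& result) {
        auto value = std::make_unique<DslResultValue>();
        value->bitmap = std::make_shared<Bitmap>(result);
        _entries[key] = std::move(value);
    }

    // Typed accessor: rejects unrelated types at compile time.
    template <typename T>
    const T* get_value(const std::string& key) const {
        static_assert(std::is_base_of_v<CacheValueBase, T>,
                      "T must derive from CacheValueBase");
        auto it = _entries.find(key);
        return it == _entries.end() ? nullptr
                                    : dynamic_cast<const T*>(it->second.get());
    }

    // Read path: null-check the entry and hand back a copy, never the original.
    std::optional<Bitmap> lookup(const std::string& key) const {
        const auto* value = get_value<DslResultValue>(key);
        if (value == nullptr || value->bitmap == nullptr) {
            return std::nullopt;  // miss, or a defensive bypass on a null bitmap
        }
        return *value->bitmap;
    }

private:
    std::unordered_map<std::string, std::unique_ptr<CacheValueBase>> _entries;
};
```

Because `lookup` returns a copy, a caller that later strips rows from its result (as `mask_out_null()` does in the PR description) leaves the cached entry unchanged for the next identical query.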
   
   ### Check List (For Author)
   
   - Test
       - [x] Regression test
       - [x] Unit Test
       - [ ] Manual test (add detailed scripts or steps below)
       - [ ] No need to test or manual test. Explain why:
           - [ ] This is a refactor/code format and no logic has been changed.
           - [ ] Previous test can cover this change.
           - [ ] No code files have been changed.
           - [ ] Other reason
   
   - Behavior changed:
       - [x] Yes. Adds a new in-memory LRU cache for search() DSL results, 
sized by the BE config `search_function_query_cache_limit` (default 5% of 
mem_limit). It can be disabled for a session with 
`SET enable_search_function_query_cache = false`, or per query via `SET_VAR`.
   
   - Does this need documentation?
       - [x] Yes. New session variable `enable_search_function_query_cache` and 
BE config `search_function_query_cache_limit`.
   
   ### Check List (For Reviewer who merge this PR)
   
   - [ ] Confirm the release note
   - [ ] Confirm test cases
   - [ ] Confirm document
   - [ ] Add branch pick label


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

