Haoyu Zhai created LUCENE-10103:
-----------------------------------

             Summary: QueryCache not estimating query size properly
                 Key: LUCENE-10103
                 URL: https://issues.apache.org/jira/browse/LUCENE-10103
             Project: Lucene - Core
          Issue Type: Improvement
          Components: core/search
            Reporter: Haoyu Zhai


QueryCache appears to estimate the size of each cached query using a 
[constant|https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java#L302].
 This can cause an OOM error in extreme cases where the cached queries use 
far more memory than assumed. (The default QueryCache tries to use [only 5% 
of 
heap|https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/search/IndexSearcher.java#L89].)
 One example of such a memory-hungry query is AutomatonQuery: each instance 
carries a 
[RunAutomaton|https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/util/automaton/RunAutomaton.java#L42],
 which consumes a good amount of memory in exchange for speed.

On the other hand, we already have a good implementation of the {{Accountable}} 
interface for AutomatonQuery (though it will become a bit more complicated 
later, since this query will eventually be rewritten to something else), so 
maybe QueryCache could use that estimate directly (using an {{instanceof}} 
check)? Or, going further, we could make all {{Query}} classes implement 
{{Accountable}}, with a default implementation that just returns the constant 
we currently use, and override the method only for the potentially troublesome 
queries?
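
A minimal sketch of the {{instanceof}} idea, to make the proposal concrete. It does not use Lucene itself: the {{Accountable}} interface below is a one-method stand-in for Lucene's, {{SketchTermQuery}}, {{SketchAutomatonQuery}} and {{QuerySizeSketch}} are hypothetical names, and the 1024-byte default and the 4-bytes-per-entry estimate are placeholder assumptions, not the values the cache actually uses.

```java
// Stand-in for Lucene's Accountable interface (assumption: only this method is needed here).
interface Accountable {
    long ramBytesUsed();
}

// Hypothetical cheap query that does not report its own size.
final class SketchTermQuery {
    final String term;

    SketchTermQuery(String term) {
        this.term = term;
    }
}

// Hypothetical memory-heavy query, playing the role of AutomatonQuery with its RunAutomaton.
final class SketchAutomatonQuery implements Accountable {
    private final int[] transitions;

    SketchAutomatonQuery(int tableSize) {
        this.transitions = new int[tableSize];
    }

    @Override
    public long ramBytesUsed() {
        // Rough estimate: 4 bytes per int entry, ignoring object and array headers.
        return 4L * transitions.length;
    }
}

final class QuerySizeSketch {
    // Placeholder for the constant the cache currently charges per query (assumed value).
    static final long DEFAULT_QUERY_BYTES = 1024L;

    // Use the query's own estimate when it offers one, otherwise fall back to the constant.
    static long estimate(Object query) {
        if (query instanceof Accountable) {
            return ((Accountable) query).ramBytesUsed();
        }
        return DEFAULT_QUERY_BYTES;
    }
}
```

With this shape, {{QuerySizeSketch.estimate(new SketchTermQuery("a"))}} falls back to the constant, while a {{SketchAutomatonQuery}} is charged according to its table size. The second option (every {{Query}} implements {{Accountable}}) would move the fallback into a default {{ramBytesUsed()}} on the base class and drop the {{instanceof}} check.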



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

