Haoyu Zhai created LUCENE-10103:
-----------------------------------
Summary: QueryCache not estimating query size properly
Key: LUCENE-10103
URL: https://issues.apache.org/jira/browse/LUCENE-10103
Project: Lucene - Core
Issue Type: Improvement
Components: core/search
Reporter: Haoyu Zhai
QueryCache seems to estimate the size of each cached query using a
[constant|https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java#L302],
which can cause an OOM error in extreme cases where the cached queries use
far more memory than assumed. (The default QueryCache tries to use [only 5% of
heap|https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/search/IndexSearcher.java#L89].)
One example of such a memory-hungry query is AutomatonQuery: each instance
carries a
[RunAutomaton|https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/util/automaton/RunAutomaton.java#L42],
which trades a good amount of memory for speed.
On the other hand, we already have a good implementation of the {{Accountable}}
interface for AutomatonQuery (though it gets a bit more complicated, since
this query is eventually rewritten to something else), so maybe QueryCache
could use that estimate directly (via an {{instanceof}} check)? Or, going
further, we could make every {{Query}} implement {{Accountable}}, with a
default implementation that just returns the constant we use today, and
override the method only for the potentially troublesome queries?
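
To make the first option concrete, here is a minimal, self-contained sketch of the {{instanceof}} approach. It does not use the real Lucene classes: {{Accountable}}, {{Query}} and {{HeavyQuery}} below are hypothetical stand-ins so the snippet compiles on its own, and {{DEFAULT_QUERY_RAM_BYTES}} is an assumed placeholder for the constant QueryCache uses today, not a value taken from this issue.

```java
final class QuerySizeSketch {

    /** Hypothetical stand-in for Lucene's Accountable interface. */
    interface Accountable {
        long ramBytesUsed();
    }

    /** Hypothetical stand-in for a query that reports nothing about its size. */
    static class Query {}

    /** Hypothetical memory-heavy query (think AutomatonQuery) that reports its own size. */
    static class HeavyQuery extends Query implements Accountable {
        private final long bytes;

        HeavyQuery(long bytes) {
            this.bytes = bytes;
        }

        @Override
        public long ramBytesUsed() {
            return bytes;
        }
    }

    /** Assumed placeholder for the per-query constant the cache uses today. */
    static final long DEFAULT_QUERY_RAM_BYTES = 1024;

    /** Use the query's own estimate when it offers one, else fall back to the constant. */
    static long estimate(Query query) {
        if (query instanceof Accountable) {
            return ((Accountable) query).ramBytesUsed();
        }
        return DEFAULT_QUERY_RAM_BYTES;
    }

    public static void main(String[] args) {
        System.out.println(estimate(new Query()));               // falls back to the constant
        System.out.println(estimate(new HeavyQuery(5_000_000L))); // uses the reported size
    }
}
```

The second option would move the fallback into {{Query}} itself (a default {{ramBytesUsed()}} returning the constant), so the cache would not need the {{instanceof}} check at all.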