sgup432 opened a new issue, #14222: URL: https://github.com/apache/lucene/issues/14222
### Description Given the significant role of LRUQueryCache in Lucene, I see opportunities to enhance its performance. Although there have been many discussions on this topic like [here](https://github.com/apache/lucene/issues/10081), they haven’t led to anywhere. Currently, QueryCache uses a `Map<CacheKey, LeafCache>`, where: - **CacheKey** is tied to a specific segment. - **LeafCache** is per-segment and maintains a Map<Query, CacheAndCount>, where CacheAndCount is essentially a docID set. ### Current Issue: We use a global read/write lock to put items into the cache. However, since CacheKey is our primary key, this limits concurrency. For example, if there are only 10 segments but 10,000 unique queries, we end up with just 10 unique cache keys. This means a large number of requests compete for the same write lock, significantly degrading performance by either waiting or not using cache (by skipping it if lock not obtained) ### One simple Solution: - **Introduce a Composite Key** - Instead of CacheKey alone, use (CacheKey, Query) as the key. - The new structure would be `Map<CompositeKey, CacheCount>` - **Partition the Cache for Better Concurrency** - Create multiple LRU partitions (e.g., 8, 16, or 64 , configurable). - Each partition has its own read/write lock to reduce contention. - Assign incoming requests to specific partition using: `hash(CacheKey, Query) % partition_count` (as an example) - I also see we use uniqueQueries/mostRecentlyUsedQueries [here](https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java#L94-L98), to maintain a LRU list for the queries. I think we can remove it and instead maintain a separate LRU list per partition, using CompositeKey. - For Eviction & Cleanup, we can collect stale `CacheKey` entries in a `Set<CacheKey>` when segment is closed due to merge. A background thread later iterates over cache entries and clears them, similar to how RequestCache in OpenSearch works. Let me know what you think. I think it might work pretty well(and simple enough) but I am open to other ideas. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org