sgup432 opened a new issue, #14222:
URL: https://github.com/apache/lucene/issues/14222

   ### Description
   
   Given the significant role of LRUQueryCache in Lucene, I see opportunities 
to enhance its performance. Although there have been many discussions on this 
topic, like [here](https://github.com/apache/lucene/issues/10081), they haven't 
led anywhere. 
   
   Currently, QueryCache uses a `Map<CacheKey, LeafCache>`, where:
   
   - **CacheKey** is tied to a specific segment.
   - **LeafCache** is per-segment and maintains a `Map<Query, CacheAndCount>`, 
where `CacheAndCount` is essentially a docID set.
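   For reference, the current layout can be sketched roughly as follows (a simplified stand-in, not Lucene's actual code; the placeholder classes mimic `IndexReader.CacheKey`, `Query`, and `CacheAndCount`):

   ```java
   import java.util.HashMap;
   import java.util.Map;

   // Simplified sketch of the current LRUQueryCache layout.
   // All classes here are illustrative stand-ins for Lucene's real types.
   class CacheSketch {
     static class CacheKey {}       // identifies a segment
     static class Query {}          // a Lucene query
     static class CacheAndCount {}  // essentially a cached docID set plus a count

     // Per-segment cache: one entry per unique query seen on that segment.
     static class LeafCache {
       final Map<Query, CacheAndCount> cache = new HashMap<>();
     }

     // Top-level map keyed only by segment, so all queries against one
     // segment funnel into the same LeafCache entry.
     final Map<CacheKey, LeafCache> cache = new HashMap<>();
   }
   ```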
   
   ### Current Issue:
   We use a global read/write lock to put items into the cache. However, since 
CacheKey is our primary key, this limits concurrency.
   
   For example, if there are only 10 segments but 10,000 unique queries, we end 
up with just 10 unique cache keys. A large number of requests then compete for 
the same write lock, significantly degrading performance: they either block 
waiting for the lock or skip the cache entirely when the lock cannot be obtained.
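   The skip-on-contention behavior can be illustrated with a small sketch (this is the general pattern, not Lucene's actual code; `computeAndCache`/`computeUncached` are hypothetical names):

   ```java
   import java.util.concurrent.locks.ReentrantReadWriteLock;

   // Illustrative pattern: try the lock without blocking, and fall back to
   // running the query uncached if the lock is contended.
   class ContentionSketch {
     private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

     int search(int query) {
       if (lock.writeLock().tryLock()) {   // non-blocking attempt
         try {
           return computeAndCache(query);  // lock acquired: use the cache
         } finally {
           lock.writeLock().unlock();
         }
       }
       // Lock contended: skip the cache entirely rather than wait.
       return computeUncached(query);
     }

     int computeAndCache(int query) { return query * 2; }
     int computeUncached(int query) { return query * 2; }
   }
   ```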
   
   ### One simple Solution:
   
   - **Introduce a Composite Key**
     - Instead of CacheKey alone, use (CacheKey, Query) as the key.
     - The new structure would be `Map<CompositeKey, CacheAndCount>`
   
   - **Partition the Cache for Better Concurrency**
     - Create multiple LRU partitions (e.g., 8, 16, or 64; configurable).
     - Each partition has its own read/write lock to reduce contention.
     - Assign each incoming request to a specific partition using, for example:
         `hash(CacheKey, Query) % partition_count`
   
   - I also see we use `uniqueQueries`/`mostRecentlyUsedQueries` 
[here](https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java#L94-L98) 
to maintain an LRU list for the queries. I think we can remove it and instead 
maintain a separate LRU list per partition, keyed by `CompositeKey`.
   
   - For eviction & cleanup, we can collect stale `CacheKey` entries in a 
`Set<CacheKey>` when a segment is closed due to a merge. A background thread 
later iterates over the cache entries and clears them, similar to how the 
RequestCache in OpenSearch works.
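   A minimal sketch of the composite-key, partitioned design described above (all names are illustrative; the partition count and per-partition LRU bound are assumed to be configurable; an access-ordered `LinkedHashMap` stands in for whatever LRU structure is ultimately used):

   ```java
   import java.util.LinkedHashMap;
   import java.util.Map;
   import java.util.Objects;
   import java.util.concurrent.locks.ReentrantReadWriteLock;

   // Sketch only: CacheKey/Query are represented as plain Objects.
   class PartitionedCacheSketch<V> {
     static final class CompositeKey {
       final Object cacheKey; // per-segment key
       final Object query;    // the query itself
       CompositeKey(Object cacheKey, Object query) {
         this.cacheKey = cacheKey;
         this.query = query;
       }
       @Override public boolean equals(Object o) {
         return o instanceof CompositeKey
             && Objects.equals(cacheKey, ((CompositeKey) o).cacheKey)
             && Objects.equals(query, ((CompositeKey) o).query);
       }
       @Override public int hashCode() { return Objects.hash(cacheKey, query); }
     }

     // One LRU map and one lock per partition; contention is limited to
     // requests that hash to the same partition.
     private final Map<CompositeKey, V>[] partitions;
     private final ReentrantReadWriteLock[] locks;

     @SuppressWarnings("unchecked")
     PartitionedCacheSketch(int partitionCount, int maxEntriesPerPartition) {
       partitions = new Map[partitionCount];
       locks = new ReentrantReadWriteLock[partitionCount];
       for (int i = 0; i < partitionCount; i++) {
         // Access-ordered LinkedHashMap gives simple per-partition LRU.
         partitions[i] = new LinkedHashMap<>(16, 0.75f, true) {
           @Override protected boolean removeEldestEntry(Map.Entry<CompositeKey, V> e) {
             return size() > maxEntriesPerPartition;
           }
         };
         locks[i] = new ReentrantReadWriteLock();
       }
     }

     private int partitionOf(CompositeKey key) {
       return Math.floorMod(key.hashCode(), partitions.length);
     }

     void put(CompositeKey key, V value) {
       int p = partitionOf(key);
       locks[p].writeLock().lock();
       try {
         partitions[p].put(key, value);
       } finally {
         locks[p].writeLock().unlock();
       }
     }

     V get(CompositeKey key) {
       int p = partitionOf(key);
       // An access-ordered LinkedHashMap mutates on get, so take the write lock.
       locks[p].writeLock().lock();
       try {
         return partitions[p].get(key);
       } finally {
         locks[p].writeLock().unlock();
       }
     }
   }
   ```

   One design note: with the composite key, stale-segment cleanup can no longer evict a whole segment by removing a single map entry, which is why the `Set<CacheKey>` plus background sweep mentioned above becomes necessary.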
   
   
   Let me know what you think. I think it could work pretty well (and is simple 
enough), but I am open to other ideas.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

