sgup432 commented on issue #14222: URL: https://github.com/apache/lucene/issues/14222#issuecomment-2673153965
I got busy with other things but found some time to run an initial benchmark for this. For simplicity, I micro-benchmarked the `putIfAbsent()` and `get()` methods of the QueryCache in isolation. Here is the benchmark [code](https://github.com/sgup432/lucene/blob/query_cache_test/lucene/benchmark-jmh/src/java/org/apache/lucene/benchmark/jmh/QueryCacheBenchmark.java). It creates 10,000 sample queries and the corresponding cacheHelpers (assuming 16 Lucene segments).

I created an [LRUQueryCacheV2](https://github.com/sgup432/lucene/blob/query_cache_test/lucene/core/src/java/org/apache/lucene/search/LRUQueryCacheV2.java) that incorporates the changes recommended above. It creates 16 QueryCacheSegments (for this test), each with its own in-memory [map](https://github.com/sgup432/lucene/blob/query_cache_test/lucene/core/src/java/org/apache/lucene/search/LRUQueryCacheV2.java#L170) that stores a composite key and its value. The composite key is simply the pair `(CacheKey, Query)`, and its `hashCode()` determines which partition an entry ends up in. The rest is pretty similar to the existing [LRUQueryCache](https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java). Some parts of the eviction logic are not yet fully written for V2, such as clearing entries when a Lucene segment is merged. I also ran the existing unit tests for QueryCache against LRUQueryCacheV2 as a high-level correctness check: 16 are passing and 10 are failing (mainly due to the incomplete eviction logic).

Coming to the results: here v1 refers to the existing QueryCache and v2 refers to my version of QueryCache.

Benchmarks can be run using: `java --module-path lucene/benchmark-jmh/build/benchmarks --module org.apache.lucene.benchmark.jmh QueryCacheBenchmark`

## Performance Comparison: v1 vs. v2

| **Benchmark** | **Version** | **Throughput (ops/s)** | **Error (ops/s)** | **Performance Gain (v2 vs. v1)** |
|-------------------------------------------|------------|------------------------|--------------------|---------------------------|
| **Concurrent Get & Put (Mixed Load)** | | | | |
| `concurrentGetAndPuts` | v1 | **1,857,864** | ±57,408 | **3.02x** |
| `concurrentGetAndPuts_v2` | v2 | **5,614,289** | ±96,352 | |
| **Get Performance (Read-Only in Mixed Load)** | | | | |
| `concurrentGetAndPuts_get` | v1 | **814,891** | ±75,165 | **5.27x** |
| `concurrentGetAndPuts_getV2` | v2 | **4,298,377** | ±114,633 | |
| **Put Performance (Write-Only in Mixed Load)** | | | | |
| `concurrentGetAndPuts_put` | v1 | **1,042,973** | ±49,868 | **1.26x** |
| `concurrentGetAndPuts_putV2` | v2 | **1,315,912** | ±32,133 | |
| **Concurrent Puts (Write-Only Load)** | | | | |
| `concurrent_puts_v1` | v1 | **1,387,740** | ±35,309 | **2.83x** |
| `concurrent_puts_v2` | v2 | **3,933,324** | ±58,046 | |

Raw results:

```
Benchmark                                                              Mode  Cnt        Score        Error  Units
QueryCacheBenchmark.concurrentGetAndPuts                              thrpt   25  1857864.371 ±  57408.178  ops/s
QueryCacheBenchmark.concurrentGetAndPuts:concurrentGetAndPuts_get     thrpt   25   814891.042 ±  75165.491  ops/s
QueryCacheBenchmark.concurrentGetAndPuts:concurrentGetAndPuts_put     thrpt   25  1042973.329 ±  49868.486  ops/s
QueryCacheBenchmark.concurrentGetAndPuts_v2                           thrpt   25  5614289.356 ±  96352.346  ops/s
QueryCacheBenchmark.concurrentGetAndPuts_v2:concurrentGetAndPuts_getV2  thrpt   25  4298377.070 ± 114633.945  ops/s
QueryCacheBenchmark.concurrentGetAndPuts_v2:concurrentGetAndPuts_putV2  thrpt   25  1315912.286 ±  32133.146  ops/s
QueryCacheBenchmark.concurrent_puts_v1                                thrpt   25  1387740.110 ±  35309.681  ops/s
QueryCacheBenchmark.concurrent_puts_v2                                thrpt   25  3933324.449 ±  58046.222  ops/s
```

I assumed only 16 Lucene segments for this test, which is low for an OpenSearch node with multiple indices. With more segments, we should see larger improvements.
Also, eviction with respect to segment merges will be handled on a separate thread in v2, which this benchmark does not account for, but even with that, it should be highly performant.
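
For readers who do not want to open the branch, here is a minimal sketch of the partitioning idea described above: a composite `(cacheKey, query)` key whose `hashCode()` selects one of N independent partitions. This is not the LRUQueryCacheV2 code. The class name `PartitionedCacheSketch`, the per-partition `ReentrantLock`, and the access-ordered `LinkedHashMap` with a fixed entry cap are all assumptions made for illustration, and generic type parameters stand in for Lucene's cache key and `Query` types.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical sketch of a hash-partitioned LRU cache; names are illustrative only. */
final class PartitionedCacheSketch<K, Q, V> {

  /** Composite key (cache key, query); a record derives equals() and hashCode(). */
  record CompositeKey<A, B>(A cacheKey, B query) {}

  /** One partition: an access-ordered map guarded by its own lock (an assumption). */
  private static final class Partition<A, B, C> {
    final ReentrantLock lock = new ReentrantLock();
    final LinkedHashMap<CompositeKey<A, B>, C> map;

    Partition(int maxEntries) {
      // accessOrder = true keeps the least recently used entry first.
      this.map = new LinkedHashMap<CompositeKey<A, B>, C>(16, 0.75f, true) {
        @Override
        protected boolean removeEldestEntry(Map.Entry<CompositeKey<A, B>, C> eldest) {
          return size() > maxEntries; // evict once the per-partition cap is exceeded
        }
      };
    }
  }

  private final List<Partition<K, Q, V>> partitions = new ArrayList<>();

  PartitionedCacheSketch(int partitionCount, int maxEntriesPerPartition) {
    for (int i = 0; i < partitionCount; i++) {
      partitions.add(new Partition<>(maxEntriesPerPartition));
    }
  }

  /** Maps a composite key to a partition; floorMod handles negative hash codes. */
  int partitionIndex(CompositeKey<K, Q> key) {
    return Math.floorMod(key.hashCode(), partitions.size());
  }

  V get(K cacheKey, Q query) {
    CompositeKey<K, Q> key = new CompositeKey<>(cacheKey, query);
    Partition<K, Q, V> p = partitions.get(partitionIndex(key));
    p.lock.lock(); // an access-ordered get() reorders the map, so reads lock too
    try {
      return p.map.get(key);
    } finally {
      p.lock.unlock();
    }
  }

  /** Returns the previously cached value, or null if this call inserted the value. */
  V putIfAbsent(K cacheKey, Q query, V value) {
    CompositeKey<K, Q> key = new CompositeKey<>(cacheKey, query);
    Partition<K, Q, V> p = partitions.get(partitionIndex(key));
    p.lock.lock();
    try {
      return p.map.putIfAbsent(key, value);
    } finally {
      p.lock.unlock();
    }
  }
}
```

Threads that touch keys in different partitions never contend on the same lock in this sketch, which is the intuition behind why throughput should improve as the number of partitions grows. A real implementation would also need size accounting in bytes and eviction on segment close or merge, which this sketch leaves out.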