sgup432 commented on code in PR #15124:
URL: https://github.com/apache/lucene/pull/15124#discussion_r2361181686


##########
lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java:
##########
@@ -426,10 +425,8 @@ public void clear() {
   }
 
   private static long getRamBytesUsed(Query query) {
-    return LINKED_HASHTABLE_RAM_BYTES_PER_ENTRY
-        + (query instanceof Accountable accountableQuery
-            ? accountableQuery.ramBytesUsed()
-            : QUERY_DEFAULT_RAM_BYTES_USED);
+    long queryRamBytesUsed = RamUsageEstimator.sizeOf(query, 0);
+    return LINKED_HASHTABLE_RAM_BYTES_PER_ENTRY + queryRamBytesUsed;

Review Comment:
   Okay, I made the change to cache the query size as well in the existing map.
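   
   For context, a simplified sketch of the idea is below. It is not the actual diff: the class and field names are made up for illustration, the real patch calls `RamUsageEstimator.sizeOf(query, 0)` as shown above, and the real `LRUQueryCache` guards its maps under its existing lock. The point is only that the size estimate is computed once per unique query and remembered, instead of being recomputed on every put/eviction.
   
   ```
   import java.util.HashMap;
   import java.util.Map;
   
   import org.apache.lucene.search.Query;
   import org.apache.lucene.util.RamUsageEstimator;
   
   // Hypothetical stand-in for the existing per-query map in LRUQueryCache,
   // extended so the estimated size of each query is computed only once.
   final class QuerySizeCacheSketch {
   
     private final Map<Query, Long> queryRamBytes = new HashMap<>();
   
     // Returns the cached estimate if this query was seen before, otherwise
     // runs the (comparatively expensive) reflective size estimation once.
     long ramBytesUsedFor(Query query) {
       return queryRamBytes.computeIfAbsent(query, RamUsageEstimator::sizeOfObject);
     }
   }
   ```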
   
   **Baseline (100% write)**
   ```
   Benchmark                               Mode  Cnt        Score        Error  Units
   QueryCacheBenchmark.concurrentPutOnly  thrpt    9  4057602.444 ± 380153.561  ops/s
   ```
   
   **My changes (100% write)** 
   ```
   Benchmark                               Mode  Cnt        Score        Error  Units
   QueryCacheBenchmark.concurrentPutOnly  thrpt    9  3028840.412 ± 483817.248  ops/s
   ```
   
   --------------
   
   **Baseline - 60% read, 40% write**  (6 read threads, 4 write threads)
   ```
   Benchmark                                                           Mode  Cnt        Score         Error  Units
   QueryCacheBenchmark.concurrentGetAndPuts                           thrpt    9  8074623.026 ± 1229792.902  ops/s
   ```
   
   **My changes - 60% read, 40% write** 
   ```
   Benchmark                                                           Mode  Cnt        Score         Error  Units
   QueryCacheBenchmark.concurrentGetAndPuts                           thrpt    9  8262227.277 ± 1224929.417  ops/s
   ```
   
   
   So with 100% writes, which is the worst case for any cache, my changes (the RamUsageEstimator call and the BooleanQuery.visit() changes) perform about 25% worse than the baseline. But this isn't an ideal workload for any cache.
   
   So I ran another benchmark with 60% reads / 40% writes, which is a more realistic distribution for a cache.
   
   Here both are almost equivalent, and no degradation is seen.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

