Re: [PR] Add prefetching support to stored fields. [lucene]

via GitHub Mon, 27 May 2024 13:37:27 -0700


jpountz commented on code in PR #13424:
URL: https://github.com/apache/lucene/pull/13424#discussion_r1616375903



##########
lucene/core/src/java/org/apache/lucene/codecs/lucene90/compressing/Lucene90CompressingStoredFieldsReader.java:
##########
@@ -609,6 +622,23 @@ public void skipBytes(long numBytes) throws IOException {
     }
   }
 
+  @Override
+  public void prefetch(int docID) throws IOException {
+    final long blockID = indexReader.getBlockID(docID);
+
+    for (long prefetchedBlockID : prefetchedBlockIDCache) {
+      if (prefetchedBlockID == blockID) {
+        return;
+      }
+    }

Review Comment:
   The cache is currently an unsorted rolling buffer, which is why I scan all 
entries linearly and also why I keep it small (16 entries).
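
   For illustration, here is a minimal sketch of such a rolling buffer. The 
class name `PrefetchedBlockCache`, the `checkAndAdd` method and the `-1` 
empty-slot marker are assumptions made up for this example, not the code in 
the PR:

```java
import java.util.Arrays;

/** Hypothetical stand-in for a small cache of prefetched block IDs. */
final class PrefetchedBlockCache {
  private final long[] blockIDs;
  private int next; // index of the slot that gets overwritten next

  PrefetchedBlockCache(int size) {
    blockIDs = new long[size];
    // Assumption: block IDs are non-negative, so -1 can mark an empty slot.
    Arrays.fill(blockIDs, -1L);
  }

  /** Returns true if blockID is already cached; otherwise records it and returns false. */
  boolean checkAndAdd(long blockID) {
    // Entries are unsorted, so a lookup has to scan every slot.
    for (long cached : blockIDs) {
      if (cached == blockID) {
        return true;
      }
    }
    // Overwrite the oldest entry and advance the write position, wrapping around.
    blockIDs[next] = blockID;
    next = (next + 1) % blockIDs.length;
    return false;
  }
}
```

   With a small fixed size, the linear scan stays cheap, and the oldest entry 
is the one that gets evicted.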
   
   I considered caching only the last block ID, which would have been 
slightly simpler. To get a good hit ratio, though, callers would need to fetch 
stored documents in doc ID order, and I'm not sure how common that is today. I 
could switch to caching only the last block ID and add javadocs suggesting that 
applications fetch stored documents in doc ID order. What do you think? Maybe 
even add a helper method that takes a ScoreDoc[] array and returns the 
associated stored documents while doing the right thing, i.e. sorting doc IDs 
and prefetching so that I/O is parallelized?
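
   A rough sketch of what such a helper could look like. To keep it 
self-contained, it takes a plain `int[]` of doc IDs rather than a ScoreDoc[] 
array, and `DocSource` is a hypothetical interface standing in for the stored 
fields reader, so none of these names are Lucene's API:

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

final class SortedFetchSketch {
  /** Hypothetical minimal view of a stored fields reader; not Lucene's API. */
  interface DocSource<T> {
    void prefetch(int docID) throws IOException;

    T document(int docID) throws IOException;
  }

  /** Fetches documents in doc ID order but returns them in the caller's order. */
  static <T> List<T> fetchAll(DocSource<T> source, int[] docIDs) throws IOException {
    // Sort positions rather than doc IDs so results can be put back in input order.
    Integer[] order = new Integer[docIDs.length];
    for (int i = 0; i < order.length; i++) {
      order[i] = i;
    }
    Arrays.sort(order, Comparator.comparingInt(i -> docIDs[i]));

    // Issue every prefetch first so the underlying I/O has a chance to overlap.
    for (int i : order) {
      source.prefetch(docIDs[i]);
    }

    // Then read in increasing doc ID order, which suits a last-block-only cache.
    List<T> docs = new ArrayList<>(Collections.<T>nCopies(docIDs.length, null));
    for (int i : order) {
      docs.set(i, source.document(docIDs[i]));
    }
    return docs;
  }
}
```

   Callers would get the documents back in the order they asked for, e.g. by 
descending score, while the reads themselves happen in doc ID order.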



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

