jpountz commented on code in PR #13424:
URL: https://github.com/apache/lucene/pull/13424#discussion_r1616375903
##########
lucene/core/src/java/org/apache/lucene/codecs/lucene90/compressing/Lucene90CompressingStoredFieldsReader.java:
##########
@@ -609,6 +622,23 @@ public void skipBytes(long numBytes) throws IOException {
}
}
+ @Override
+ public void prefetch(int docID) throws IOException {
+ final long blockID = indexReader.getBlockID(docID);
+
+ for (long prefetchedBlockID : prefetchedBlockIDCache) {
+ if (prefetchedBlockID == blockID) {
+ return;
+ }
+ }
Review Comment:
The cache is currently an unsorted rolling buffer, which is why I scan all
entries linearly and why I keep it small (16 entries).
I considered caching only the last block ID, which would have been slightly
simpler. To get a good hit ratio with that approach, though, callers would need
to fetch stored documents in doc ID order, and I'm not sure how common that is
today. I could switch to caching only the last block ID and add javadocs
suggesting that applications fetch stored documents in doc ID order. What do
you think? We could even add a helper method that takes a ScoreDoc[] array and
returns the associated stored documents while doing the right thing, i.e.
sorting doc IDs and prefetching so that I/O is parallelized.
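
To make the trade-off concrete, here is a minimal standalone sketch, not the
code in this PR. The class `PrefetchedBlockCache` and its methods are
hypothetical names, and I'm assuming block IDs are non-negative so that -1 can
mark an empty slot. It counts how many prefetches would be issued for a given
sequence of block IDs with a rolling buffer of a given size; size 1 models
caching only the last block ID.

```java
import java.util.Arrays;

/** Hypothetical sketch: a small unsorted rolling buffer of prefetched block IDs. */
final class PrefetchedBlockCache {
  private final long[] blockIDs;
  private int next; // index of the slot that gets overwritten next

  PrefetchedBlockCache(int size) {
    blockIDs = new long[size];
    // Assumption: real block IDs are non-negative, so -1 marks an empty slot.
    Arrays.fill(blockIDs, -1L);
  }

  /**
   * Returns true if blockID is not in the buffer, meaning a prefetch should be
   * issued, and records it. Returns false if it was already recorded.
   */
  boolean shouldPrefetch(long blockID) {
    // Linear scan: acceptable because the buffer is unsorted and tiny.
    for (long cached : blockIDs) {
      if (cached == blockID) {
        return false;
      }
    }
    blockIDs[next] = blockID;
    next = (next + 1) % blockIDs.length;
    return true;
  }

  /** Counts the prefetches issued for a sequence of block IDs. */
  static int countPrefetches(int cacheSize, long[] blockSequence) {
    PrefetchedBlockCache cache = new PrefetchedBlockCache(cacheSize);
    int count = 0;
    for (long blockID : blockSequence) {
      if (cache.shouldPrefetch(blockID)) {
        count++;
      }
    }
    return count;
  }
}
```

With an access pattern that alternates between two blocks (3, 7, 3, 7, 3, 7),
the 16-entry buffer issues 2 prefetches while a single-entry cache issues 6.
If the same accesses are made in sorted order (3, 3, 3, 7, 7, 7), the
single-entry cache also issues only 2, which is the case the javadocs
suggestion and the ScoreDoc[] helper would steer callers towards.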
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]