mikemccand commented on code in PR #13253: URL: https://github.com/apache/lucene/pull/13253#discussion_r1576373763
########## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/SegmentTermsEnumFrame.java: ########## @@ -287,6 +287,68 @@ void rewind() { */ } + // Only rewind, don't force reload block. + // Reset readers' position, don't read, decompress. + // Current term greater than target, reduce endCount. + void rewindWithoutReload() { + // Set nextEnt to 0, to prevent force load. + nextEnt = 0; + suffixesReader.setPosition(0); + suffixLengthsReader.setPosition(0); + statsReader.setPosition(0); + bytesReader.setPosition(0); + + // TODO: Since we only rewind without reload for fist floor(currentFrame.fp == Review Comment: `fist` -> `first` ########## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/SegmentTermsEnum.java: ########## @@ -434,8 +436,29 @@ public boolean seekExact(BytesRef target) throws IOException { // System.out.println(" target is before current (shares prefixLen=" + targetUpto + "); // rewind frame ord=" + lastFrame.ord); // } + + // We got lastFrame by comparing target and term, and target less than last seeked term in + // currentFrame. If lastFrame's fp is same with currentFrame's fp, we can reduce entCount to + // nextEnt. + boolean currentIsLast = currentFrame.fp == lastFrame.fp; currentFrame = lastFrame; - currentFrame.rewind(); + + // Only rewindWithoutReload for non-floor block or first floor block. + // TODO: We need currentFrame's first entry to judge whether we can rewindWithoutReload for + // non-first floor blocks. Review Comment: The FST terms index has a lower bound `BytesRef` pointer to the start term of all floor multi-blocks, but I don't think we save this anywhere today. Let's leave it as a future `TODO`? ########## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/SegmentTermsEnum.java: ########## @@ -518,7 +541,20 @@ public boolean seekExact(BytesRef target) throws IOException { currentFrame.loadBlock(); + // We still stay on withOutReload frame, reduce entCount to nextEnt. + int origEntCount = -1; + if (currentFrame.fp == withOutReloadFp && origNextEnt != 0) { + origEntCount = currentFrame.entCount; + currentFrame.entCount = origNextEnt; + } + final SeekStatus result = currentFrame.scanToTerm(target, true); + + // Revert entCount to origEntCount. Review Comment: `Revert` -> `Restore`? ########## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/SegmentTermsEnum.java: ########## @@ -573,7 +609,20 @@ public boolean seekExact(BytesRef target) throws IOException { currentFrame.loadBlock(); + // We still stay on withOutReload frame, reduce entCount to nextEnt. + int origEntCount = -1; + if (currentFrame.fp == withOutReloadFp && origNextEnt != 0) { + origEntCount = currentFrame.entCount; + currentFrame.entCount = origNextEnt; + } + final SeekStatus result = currentFrame.scanToTerm(target, true); + + // Revert entCount to origEntCount. Review Comment: `Revert` -> `Restore`? ########## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/SegmentTermsEnum.java: ########## @@ -710,8 +761,28 @@ public SeekStatus seekCeil(BytesRef target) throws IOException { // System.out.println(" target is before current (shares prefixLen=" + targetUpto + "); // rewind frame ord=" + lastFrame.ord); // } + + // We got lastFrame by comparing target and term, and target less than last seeked term in + // currentFrame. If lastFrame's fp is same with currentFrame's fp, we can reduce entCount to + // nextEnt. Review Comment: Maybe change to `we can rewind without reloading`? We don't seem to reduce `entCount` anymore? ########## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/SegmentTermsEnum.java: ########## @@ -434,8 +436,29 @@ public boolean seekExact(BytesRef target) throws IOException { // System.out.println(" target is before current (shares prefixLen=" + targetUpto + "); // rewind frame ord=" + lastFrame.ord); // } + + // We got lastFrame by comparing target and term, and target less than last seeked term in + // currentFrame. If lastFrame's fp is same with currentFrame's fp, we can reduce entCount to + // nextEnt. + boolean currentIsLast = currentFrame.fp == lastFrame.fp; currentFrame = lastFrame; - currentFrame.rewind(); + + // Only rewindWithoutReload for non-floor block or first floor block. + // TODO: We need currentFrame's first entry to judge whether we can rewindWithoutReload for + // non-first floor blocks. + if (currentFrame.fp != currentFrame.fpOrig + || currentFrame.entCount == 0 + || currentFrame.nextEnt == -1) { + currentFrame.rewind(); + } else { + // Prepare to reduce entCount. + if (currentIsLast && currentFrame.isLeafBlock) { Review Comment: I am confused why there are two cases under the `rewindWithoutReload` case. One case you can roll back `entCount` temporarily, because you know the target term is before the current term? And in the 2nd case, you don't know that, but you do know the target term is in this current block? But then why are we even rewinding? ########## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/SegmentTermsEnum.java: ########## @@ -434,8 +436,29 @@ public boolean seekExact(BytesRef target) throws IOException { // System.out.println(" target is before current (shares prefixLen=" + targetUpto + "); // rewind frame ord=" + lastFrame.ord); // } + + // We got lastFrame by comparing target and term, and target less than last seeked term in + // currentFrame. If lastFrame's fp is same with currentFrame's fp, we can reduce entCount to + // nextEnt. + boolean currentIsLast = currentFrame.fp == lastFrame.fp; currentFrame = lastFrame; - currentFrame.rewind(); + + // Only rewindWithoutReload for non-floor block or first floor block. + // TODO: We need currentFrame's first entry to judge whether we can rewindWithoutReload for + // non-first floor blocks. + if (currentFrame.fp != currentFrame.fpOrig Review Comment: Could you add comment for each of these reasons to still reload the frame? E.g.: `currentFrame.fp != currentFrame.fpOrig // this is a floor multi-block and not the first one`` `currentFrame.entCount == 0 // why does this happen?` etc.? I don't understand when/why the 2nd two conditions occur. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org