jpountz commented on code in PR #13563: URL: https://github.com/apache/lucene/pull/13563#discussion_r1677636214
##########
lucene/core/src/java/org/apache/lucene/codecs/lucene90/Lucene90DocValuesProducer.java:
##########
@@ -1792,61 +1794,88 @@ public DocValuesSkipper getSkipper(FieldInfo field) throws IOException {
     if (input.length() > 0) {
       input.prefetch(0, 1);
     }
+    // TODO: should we write to disk the actual max level for this segment?
     return new DocValuesSkipper() {
-      int minDocID = -1;
-      int maxDocID = -1;
-      long minValue, maxValue;
-      int docCount;
+      final int[] minDocID = new int[SKIP_INDEX_MAX_LEVEL];
+      final int[] maxDocID = new int[SKIP_INDEX_MAX_LEVEL];
+
+      {
+        for (int i = 0; i < SKIP_INDEX_MAX_LEVEL; i++) {
+          minDocID[i] = maxDocID[i] = -1;
+        }
+      }
+
+      final long[] minValue = new long[SKIP_INDEX_MAX_LEVEL];
+      final long[] maxValue = new long[SKIP_INDEX_MAX_LEVEL];
+      final int[] docCount = new int[SKIP_INDEX_MAX_LEVEL];
+      int levels;

       @Override
       public void advance(int target) throws IOException {
         if (target > entry.maxDocId) {
-          minDocID = DocIdSetIterator.NO_MORE_DOCS;
-          maxDocID = DocIdSetIterator.NO_MORE_DOCS;
+          // skipper is exhausted
+          for (int i = 0; i < SKIP_INDEX_MAX_LEVEL; i++) {
+            minDocID[i] = maxDocID[i] = DocIdSetIterator.NO_MORE_DOCS;
+          }
         } else {
+          // find next interval
+          assert target > maxDocID[0] : "target must be bigger that current interval";
           while (true) {
-            maxDocID = input.readInt();
-            if (maxDocID >= target) {
-              minDocID = input.readInt();
-              maxValue = input.readLong();
-              minValue = input.readLong();
-              docCount = input.readInt();
+            levels = input.readByte();

Review Comment:
   I see. I need to think more about it. It makes sense to me for top-level queries which would visit the full doc ID range anyway. But if the query is part of a conjunction, then the leading clause of the conjunction could advance this clause to an arbitrary doc in the doc ID space, and I wonder if we're losing potential efficiency by not making the higher levels visible.
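
   To make the conjunction concern concrete, here is a toy sketch of how visible higher levels could help after an arbitrary jump. It is not the PR's code: `SkipLevelsSketch`, `Interval`, `firstCandidate` and `demoLevels` are hypothetical names, the intervals are held in memory rather than read from `input`, and the sketch assumes the level-0 intervals tile the doc ID space with each higher-level interval covering several lower-level ones.

```java
import java.util.List;

/** Toy model of a multi-level skip index. Hypothetical names; not Lucene's API. */
public final class SkipLevelsSketch {
  static final int NO_MORE_DOCS = Integer.MAX_VALUE;

  /** One skip interval: a doc ID range plus the min/max value of the docs in it. */
  static final class Interval {
    final int minDocID, maxDocID;
    final long minValue, maxValue;

    Interval(int minDocID, int maxDocID, long minValue, long maxValue) {
      this.minDocID = minDocID;
      this.maxDocID = maxDocID;
      this.minValue = minValue;
      this.maxValue = maxValue;
    }

    boolean containsDoc(int doc) {
      return doc >= minDocID && doc <= maxDocID;
    }

    /** True if some value in this interval could fall in [lo, hi]. */
    boolean mayMatch(long lo, long hi) {
      return maxValue >= lo && minValue <= hi;
    }
  }

  /**
   * Returns the smallest doc >= target that may hold a value in [lo, hi].
   * levels.get(0) is the finest level; higher indexes are coarser.
   * Assumption: level-0 intervals are sorted and tile the doc ID space.
   */
  static int firstCandidate(List<List<Interval>> levels, int target, long lo, long hi) {
    List<Interval> finest = levels.get(0);
    int lastDoc = finest.get(finest.size() - 1).maxDocID;
    int doc = target;
    outer:
    while (doc <= lastDoc) {
      // Check the coarsest level first so one test can rule out many docs.
      for (int level = levels.size() - 1; level >= 0; level--) {
        for (Interval iv : levels.get(level)) {
          if (iv.containsDoc(doc) && !iv.mayMatch(lo, hi)) {
            doc = iv.maxDocID + 1; // skip the whole non-competitive interval
            continue outer;
          }
        }
      }
      return doc; // no level rules this doc out
    }
    return NO_MORE_DOCS;
  }

  /** Sample data: four level-0 intervals of 10 docs, grouped in pairs at level 1. */
  static List<List<Interval>> demoLevels() {
    List<Interval> level0 = List.of(
        new Interval(0, 9, 0, 5),
        new Interval(10, 19, 1, 4),
        new Interval(20, 29, 2, 3),
        new Interval(30, 39, 100, 200));
    List<Interval> level1 = List.of(
        new Interval(0, 19, 0, 5),
        new Interval(20, 39, 2, 200));
    return List.of(level0, level1);
  }
}
```

   With this sample data, a leading clause that advances to doc 3 for a range of [100, 200] can be moved past docs 0-19 with a single level-1 check, instead of testing the two level-0 intervals one after the other. Whether the saving is worth the extra state in the real skipper is exactly the open question.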