Re: [PR] Add levels to DocValues skipper index [lucene]

via GitHub Mon, 15 Jul 2024 03:49:10 -0700


jpountz commented on code in PR #13563:
URL: https://github.com/apache/lucene/pull/13563#discussion_r1677636214



##########
lucene/core/src/java/org/apache/lucene/codecs/lucene90/Lucene90DocValuesProducer.java:
##########
@@ -1792,61 +1794,88 @@ public DocValuesSkipper getSkipper(FieldInfo field) throws IOException {
     if (input.length() > 0) {
       input.prefetch(0, 1);
     }
+    // TODO: should we write to disk the actual max level for this segment?
     return new DocValuesSkipper() {
-      int minDocID = -1;
-      int maxDocID = -1;
-      long minValue, maxValue;
-      int docCount;
+      final int[] minDocID = new int[SKIP_INDEX_MAX_LEVEL];
+      final int[] maxDocID = new int[SKIP_INDEX_MAX_LEVEL];
+
+      {
+        for (int i = 0; i < SKIP_INDEX_MAX_LEVEL; i++) {
+          minDocID[i] = maxDocID[i] = -1;
+        }
+      }
+
+      final long[] minValue = new long[SKIP_INDEX_MAX_LEVEL];
+      final long[] maxValue = new long[SKIP_INDEX_MAX_LEVEL];
+      final int[] docCount = new int[SKIP_INDEX_MAX_LEVEL];
+      int levels;
 
       @Override
       public void advance(int target) throws IOException {
         if (target > entry.maxDocId) {
-          minDocID = DocIdSetIterator.NO_MORE_DOCS;
-          maxDocID = DocIdSetIterator.NO_MORE_DOCS;
+          // skipper is exhausted
+          for (int i = 0; i < SKIP_INDEX_MAX_LEVEL; i++) {
+            minDocID[i] = maxDocID[i] = DocIdSetIterator.NO_MORE_DOCS;
+          }
         } else {
+          // find next interval
+          assert target > maxDocID[0] : "target must be bigger than current interval";
           while (true) {
-            maxDocID = input.readInt();
-            if (maxDocID >= target) {
-              minDocID = input.readInt();
-              maxValue = input.readLong();
-              minValue = input.readLong();
-              docCount = input.readInt();
+            levels = input.readByte();

Review Comment:
   I see. I need to think more about it. It makes sense to me for top-level 
queries, which would visit the full doc ID range anyway. But if the query is 
part of a conjunction, then the leading clause of the conjunction could advance 
this clause to an arbitrary doc in the doc ID space, and I wonder if we're 
losing potential efficiency by not making the higher levels visible.
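
   To make the conjunction concern concrete, here is a rough sketch of a toy
in-memory model, not the on-disk format in this PR. It counts how many
intervals get inspected to reach an arbitrary target doc when only level 0 is
usable versus when higher levels can be consulted first. The class name
`ToySkipIndex`, the fan-out of 8, and the fixed-size intervals are all
assumptions made for illustration.

```java
/** Toy model of a multi-level skip index; hypothetical, not Lucene's format. */
final class ToySkipIndex {
  // Assumption: each interval at level N+1 groups 8 intervals of level N.
  static final int FANOUT = 8;

  // maxDocID[level][i] = last doc ID covered by the i-th interval at that level.
  final int[][] maxDocID;

  ToySkipIndex(int numIntervals, int docsPerInterval, int numLevels) {
    maxDocID = new int[numLevels][];
    int totalDocs = numIntervals * docsPerInterval;
    int span = docsPerInterval;
    int count = numIntervals;
    for (int level = 0; level < numLevels; level++) {
      maxDocID[level] = new int[count];
      for (int i = 0; i < count; i++) {
        maxDocID[level][i] = Math.min((i + 1) * span, totalDocs) - 1;
      }
      span *= FANOUT;
      count = (count + FANOUT - 1) / FANOUT;
    }
  }

  /** Intervals inspected when scanning level 0 linearly. Target must be in range. */
  int stepsLevelZeroOnly(int target) {
    int steps = 0;
    for (int i = 0; i < maxDocID[0].length; i++) {
      steps++;
      if (maxDocID[0][i] >= target) {
        break;
      }
    }
    return steps;
  }

  /** Intervals inspected when descending from the top level. Target must be in range. */
  int stepsWithLevels(int target) {
    int steps = 0;
    int start = 0; // first candidate interval at the current level
    for (int level = maxDocID.length - 1; level >= 0; level--) {
      int i = start;
      while (true) {
        steps++;
        if (maxDocID[level][i] >= target) {
          break;
        }
        i++;
      }
      start = i * FANOUT; // first child of interval i at the level below
    }
    return steps;
  }
}
```

   With 512 level-0 intervals of 4096 docs and 3 levels, advancing to doc
2,000,000 inspects 489 intervals using level 0 only and 15 when descending
from the top level. A top-level query that walks every interval would not see
this difference, which is why the question mainly matters for conjunctions.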



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

