kaivalnp commented on code in PR #12857:
URL: https://github.com/apache/lucene/pull/12857#discussion_r1412393818


##########
lucene/core/src/java/org/apache/lucene/search/AbstractKnnVectorQuery.java:
##########
@@ -118,13 +118,38 @@ private TopDocs getLeafResults(LeafReaderContext ctx, 
Weight filterWeight) throw
       return NO_RESULTS;
     }
 
-    BitSet acceptDocs = createBitSet(scorer.iterator(), liveDocs, maxDoc);
-    int cost = acceptDocs.cardinality();
+    DocIdSetIterator iterator = scorer.iterator();
+    BitSet bitSet =
+        iterator instanceof BitSetIterator
+            ? ((BitSetIterator) iterator).getBitSet()
+            : BitSet.of(iterator, maxDoc);
+    Bits acceptDocs =
+        new Bits() {
+          @Override
+          public boolean get(int index) {
+            return liveDocs != null ? liveDocs.get(index) & bitSet.get(index) 
: bitSet.get(index);
+          }
+
+          @Override
+          public int length() {
+            return bitSet.cardinality();
+          }

Review Comment:
   `length()` is more like the [maximum 
doc](https://github.com/apache/lucene/blob/0e96b9cd8c1267acee02684bcac5be89d60b3bf4/lucene/core/src/java/org/apache/lucene/util/Bits.java#L73)
 you can request, this should be `bitSet.length()`?



##########
lucene/core/src/java/org/apache/lucene/search/AbstractKnnVectorQuery.java:
##########
@@ -118,13 +118,38 @@ private TopDocs getLeafResults(LeafReaderContext ctx, 
Weight filterWeight) throw
       return NO_RESULTS;
     }
 
-    BitSet acceptDocs = createBitSet(scorer.iterator(), liveDocs, maxDoc);
-    int cost = acceptDocs.cardinality();
+    DocIdSetIterator iterator = scorer.iterator();
+    BitSet bitSet =
+        iterator instanceof BitSetIterator
+            ? ((BitSetIterator) iterator).getBitSet()
+            : BitSet.of(iterator, maxDoc);
+    Bits acceptDocs =
+        new Bits() {
+          @Override
+          public boolean get(int index) {
+            return liveDocs != null ? liveDocs.get(index) & bitSet.get(index) 
: bitSet.get(index);

Review Comment:
   nit: Can we make this cleaner by `return (liveDocs == null || 
liveDocs.get(index)) && bitSet.get(index)` ?



##########
lucene/core/src/java/org/apache/lucene/search/AbstractKnnVectorQuery.java:
##########
@@ -118,13 +118,38 @@ private TopDocs getLeafResults(LeafReaderContext ctx, 
Weight filterWeight) throw
       return NO_RESULTS;
     }
 
-    BitSet acceptDocs = createBitSet(scorer.iterator(), liveDocs, maxDoc);
-    int cost = acceptDocs.cardinality();
+    DocIdSetIterator iterator = scorer.iterator();
+    BitSet bitSet =
+        iterator instanceof BitSetIterator
+            ? ((BitSetIterator) iterator).getBitSet()
+            : BitSet.of(iterator, maxDoc);
+    Bits acceptDocs =
+        new Bits() {

Review Comment:
   Perhaps we can wrap the `BitSet` in a new `Bits` only for the case we're 
trying to optimize (when `iterator instanceof BitSetIterator`) -- not changing 
the `BitSet.of` flow like @shubhamvishu also mentioned?



##########
lucene/core/src/java/org/apache/lucene/search/AbstractKnnVectorQuery.java:
##########
@@ -118,13 +118,38 @@ private TopDocs getLeafResults(LeafReaderContext ctx, 
Weight filterWeight) throw
       return NO_RESULTS;
     }
 
-    BitSet acceptDocs = createBitSet(scorer.iterator(), liveDocs, maxDoc);
-    int cost = acceptDocs.cardinality();
+    DocIdSetIterator iterator = scorer.iterator();
+    BitSet bitSet =
+        iterator instanceof BitSetIterator
+            ? ((BitSetIterator) iterator).getBitSet()
+            : BitSet.of(iterator, maxDoc);
+    Bits acceptDocs =
+        new Bits() {
+          @Override
+          public boolean get(int index) {
+            return liveDocs != null ? liveDocs.get(index) & bitSet.get(index) 
: bitSet.get(index);
+          }
+
+          @Override
+          public int length() {
+            return bitSet.cardinality();
+          }
+        };
+
+    FilteredDocIdSetIterator filteredDocIdSetIterator =
+        new FilteredDocIdSetIterator(new BitSetIterator(bitSet, maxDoc)) {
+          @Override
+          protected boolean match(int doc) {
+            return liveDocs == null || liveDocs.get(doc);
+          }
+        };
+
+    int cost = acceptDocs.length();

Review Comment:
   The `cost` here determines what limit to set for `#approximateSearch`
   
   If we use `acceptDocs.length()`, this will be equal to `maxDoc` (and always 
complete graph search without falling back to exact search, even when we want 
to..)
   
   Perhaps this should be `acceptDocs.cardinality()`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

Reply via email to