Re: [PR] Update QueryUtils to use CollectorManager [lucene]

via GitHub Tue, 10 Sep 2024 15:20:53 -0700


javanna commented on code in PR #13748:
URL: https://github.com/apache/lucene/pull/13748#discussion_r1752834380



##########
lucene/test-framework/src/java/org/apache/lucene/tests/search/QueryUtils.java:
##########
@@ -345,175 +348,201 @@ public static void checkSkipTo(final Query q, final 
IndexSearcher s) throws IOEx
       // FUTURE: ensure scorer.doc()==-1
 
       final float maxDiff = 1e-5f;
-      final LeafReader[] lastReader = {null};
 
-      s.search(
-          q,
-          new SimpleCollector() {
-            private Scorable sc;
-            private Scorer scorer;
-            private DocIdSetIterator iterator;
-            private int leafPtr;
+      List<LeafReader> lastReaders =
+          s.search(
+              q,
+              new CollectorManager<SimpleCollectorWithLastReader, 
List<LeafReader>>() {
+                @Override
+                public SimpleCollectorWithLastReader newCollector() {
+                  return new SimpleCollectorWithLastReader() {
+                    LeafReader lastReader = null;
+                    private Scorable sc;
+                    private Scorer scorer;
+                    private DocIdSetIterator iterator;
+                    private int leafPtr;
+
+                    @Override
+                    public void setScorer(Scorable scorer) {
+                      this.sc = scorer;
+                    }
 
-            @Override
-            public void setScorer(Scorable scorer) {
-              this.sc = scorer;
-            }
+                    @Override
+                    public void collect(int doc) throws IOException {
+                      float score = sc.score();
+                      lastDoc[0] = doc;
+                      try {
+                        if (scorer == null) {
+                          Query rewritten = s.rewrite(q);
+                          Weight w = s.createWeight(rewritten, 
ScoreMode.COMPLETE, 1);
+                          LeafReaderContext context = 
readerContextArray.get(leafPtr);
+                          scorer = w.scorer(context);
+                          iterator = scorer.iterator();
+                        }
+
+                        int op = order[(opidx[0]++) % order.length];
+                        // System.out.println(op==skip_op ?
+                        // "skip("+(sdoc[0]+1)+")":"next()");
+                        boolean more =
+                            op == skip_op
+                                ? iterator.advance(scorer.docID() + 1)
+                                    != DocIdSetIterator.NO_MORE_DOCS
+                                : iterator.nextDoc() != 
DocIdSetIterator.NO_MORE_DOCS;
+                        int scorerDoc = scorer.docID();
+                        float scorerScore = scorer.score();
+                        float scorerScore2 = scorer.score();
+                        float scoreDiff = Math.abs(score - scorerScore);
+                        float scorerDiff = Math.abs(scorerScore2 - 
scorerScore);
+
+                        boolean success = false;
+                        try {
+                          assertTrue(more);
+                          assertEquals("scorerDoc=" + scorerDoc + ",doc=" + 
doc, scorerDoc, doc);
+                          assertTrue(
+                              "score=" + score + ", scorerScore=" + 
scorerScore,
+                              scoreDiff <= maxDiff);
+                          assertTrue(
+                              "scorerScorer=" + scorerScore + ", 
scorerScore2=" + scorerScore2,
+                              scorerDiff <= maxDiff);
+                          success = true;
+                        } finally {
+                          if (!success) {
+                            if (LuceneTestCase.VERBOSE) {
+                              StringBuilder sbord = new StringBuilder();
+                              for (int i = 0; i < order.length; i++) {
+                                sbord.append(order[i] == skip_op ? " skip()" : 
" next()");
+                              }
+                              System.out.println(
+                                  "ERROR matching docs:"
+                                      + "\n\t"
+                                      + (doc != scorerDoc ? "--> " : "")
+                                      + "doc="
+                                      + doc
+                                      + ", scorerDoc="
+                                      + scorerDoc
+                                      + "\n\t"
+                                      + (!more ? "--> " : "")
+                                      + "tscorer.more="
+                                      + more
+                                      + "\n\t"
+                                      + (scoreDiff > maxDiff ? "--> " : "")
+                                      + "scorerScore="
+                                      + scorerScore
+                                      + " scoreDiff="
+                                      + scoreDiff
+                                      + " maxDiff="
+                                      + maxDiff
+                                      + "\n\t"
+                                      + (scorerDiff > maxDiff ? "--> " : "")
+                                      + "scorerScore2="
+                                      + scorerScore2
+                                      + " scorerDiff="
+                                      + scorerDiff
+                                      + "\n\thitCollector.doc="
+                                      + doc
+                                      + " score="
+                                      + score
+                                      + "\n\t Scorer="
+                                      + scorer
+                                      + "\n\t Query="
+                                      + q
+                                      + "  "
+                                      + q.getClass().getName()
+                                      + "\n\t Searcher="
+                                      + s
+                                      + "\n\t Order="
+                                      + sbord
+                                      + "\n\t Op="
+                                      + (op == skip_op ? " skip()" : " 
next()"));
+                            }
+                          }
+                        }
+                      } catch (IOException e) {
+                        throw new RuntimeException(e);
+                      }
+                    }
 
-            @Override
-            public void collect(int doc) throws IOException {
-              float score = sc.score();
-              lastDoc[0] = doc;
-              try {
-                if (scorer == null) {
-                  Query rewritten = s.rewrite(q);
-                  Weight w = s.createWeight(rewritten, ScoreMode.COMPLETE, 1);
-                  LeafReaderContext context = readerContextArray.get(leafPtr);
-                  scorer = w.scorer(context);
-                  iterator = scorer.iterator();
-                }
+                    @Override
+                    public ScoreMode scoreMode() {
+                      return ScoreMode.COMPLETE;
+                    }
 
-                int op = order[(opidx[0]++) % order.length];
-                // System.out.println(op==skip_op ?
-                // "skip("+(sdoc[0]+1)+")":"next()");
-                boolean more =
-                    op == skip_op
-                        ? iterator.advance(scorer.docID() + 1) != 
DocIdSetIterator.NO_MORE_DOCS
-                        : iterator.nextDoc() != DocIdSetIterator.NO_MORE_DOCS;
-                int scorerDoc = scorer.docID();
-                float scorerScore = scorer.score();
-                float scorerScore2 = scorer.score();
-                float scoreDiff = Math.abs(score - scorerScore);
-                float scorerDiff = Math.abs(scorerScore2 - scorerScore);
-
-                boolean success = false;
-                try {
-                  assertTrue(more);
-                  assertEquals("scorerDoc=" + scorerDoc + ",doc=" + doc, 
scorerDoc, doc);
-                  assertTrue(
-                      "score=" + score + ", scorerScore=" + scorerScore, 
scoreDiff <= maxDiff);
-                  assertTrue(
-                      "scorerScorer=" + scorerScore + ", scorerScore2=" + 
scorerScore2,
-                      scorerDiff <= maxDiff);
-                  success = true;
-                } finally {
-                  if (!success) {
-                    if (LuceneTestCase.VERBOSE) {
-                      StringBuilder sbord = new StringBuilder();
-                      for (int i = 0; i < order.length; i++) {
-                        sbord.append(order[i] == skip_op ? " skip()" : " 
next()");
+                    @Override
+                    protected void doSetNextReader(LeafReaderContext context) 
throws IOException {
+                      // confirm that skipping beyond the last doc, on the
+                      // previous reader, hits NO_MORE_DOCS
+                      if (lastReader != null) {
+                        final LeafReader previousReader = lastReader;
+                        IndexSearcher indexSearcher =
+                            LuceneTestCase.newSearcher(previousReader, false);
+                        indexSearcher.setSimilarity(s.getSimilarity());
+                        Query rewritten = indexSearcher.rewrite(q);
+                        Weight w = indexSearcher.createWeight(rewritten, 
ScoreMode.COMPLETE, 1);
+                        LeafReaderContext ctx =
+                            (LeafReaderContext) 
indexSearcher.getTopReaderContext();
+                        Scorer scorer = w.scorer(ctx);
+                        if (scorer != null) {
+                          DocIdSetIterator iterator = scorer.iterator();
+                          boolean more = false;
+                          final Bits liveDocs = context.reader().getLiveDocs();
+                          for (int d = iterator.advance(lastDoc[0] + 1);
+                              d != DocIdSetIterator.NO_MORE_DOCS;
+                              d = iterator.nextDoc()) {
+                            if (liveDocs == null || liveDocs.get(d)) {
+                              more = true;
+                              break;
+                            }
+                          }
+                          Assert.assertFalse(
+                              "query's last doc was "
+                                  + lastDoc[0]
+                                  + " but advance("
+                                  + (lastDoc[0] + 1)
+                                  + ") got to "
+                                  + scorer.docID(),
+                              more);
+                        }
+                        leafPtr++;
                       }
-                      System.out.println(
-                          "ERROR matching docs:"
-                              + "\n\t"
-                              + (doc != scorerDoc ? "--> " : "")
-                              + "doc="
-                              + doc
-                              + ", scorerDoc="
-                              + scorerDoc
-                              + "\n\t"
-                              + (!more ? "--> " : "")
-                              + "tscorer.more="
-                              + more
-                              + "\n\t"
-                              + (scoreDiff > maxDiff ? "--> " : "")
-                              + "scorerScore="
-                              + scorerScore
-                              + " scoreDiff="
-                              + scoreDiff
-                              + " maxDiff="
-                              + maxDiff
-                              + "\n\t"
-                              + (scorerDiff > maxDiff ? "--> " : "")
-                              + "scorerScore2="
-                              + scorerScore2
-                              + " scorerDiff="
-                              + scorerDiff
-                              + "\n\thitCollector.doc="
-                              + doc
-                              + " score="
-                              + score
-                              + "\n\t Scorer="
-                              + scorer
-                              + "\n\t Query="
-                              + q
-                              + "  "
-                              + q.getClass().getName()
-                              + "\n\t Searcher="
-                              + s
-                              + "\n\t Order="
-                              + sbord
-                              + "\n\t Op="
-                              + (op == skip_op ? " skip()" : " next()"));
+                      lastReader = context.reader();
+                      assert readerContextArray.get(leafPtr).reader() == 
context.reader();

Review Comment:
   I did not have time to review this in depth, but there is more to it than 
just the assertion. These collectors are rather hard to follow at first glance. 
Also, this is a case where we cannot resort to a single-threaded collector 
manager unfortunately, because the searcher is externally provided and that 
would require disabling concurrency in all those tests that call `QueryUtils` 
methods somehow. Calling search(Query, Collector) bypassed concurrency so far 
despite an executor is provided to the searcher, which is why we want to remove 
it, but that also means that once an executor is provided to the searcher, 
there is no longer a way to disable search concurrency for specific queries.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

Re: [PR] Update QueryUtils to use CollectorManager [lucene]

Reply via email to