maosuhan opened a new pull request, #11939: URL: https://github.com/apache/lucene/pull/11939
### Description

When Lucene executes a `TermRangeQuery` or `TermInSetQuery`, it uses `DocIdSetBuilder` to collect the matching doc IDs. Once the doc ID list becomes large, `upgradeToBitSet` converts it from an array to a bitset. When new doc IDs are added after that conversion, the `counter` variable of `DocIdSetBuilder` is left unchanged, so the cost computed in `DocIdSetBuilder.build` is incorrect.

How to reproduce:

```java
Directory dir = FSDirectory.open(Files.createTempDirectory(null, new FileAttribute[0]));
IndexWriter w = new IndexWriter(dir, new IndexWriterConfig());
// Index 200,000 documents whose "f1" values are "100000" .. "299999".
for (int i = 100000; i < 300000; ++i) {
  Document doc = new Document();
  doc.add(new StringField("f1", i + "", Field.Store.NO));
  w.addDocument(doc);
}
w.forceMerge(1);
IndexReader reader = DirectoryReader.open(w);
IndexSearcher searcher = new IndexSearcher(reader);
searcher.setQueryCache(null);
// The range matches the 100,000 documents "200000" .. "299999".
Query query = new TermRangeQuery("f1", new BytesRef("200000"), new BytesRef("300000"), true, true);
Weight weight = searcher.createWeight(searcher.rewrite(query), ScoreMode.COMPLETE, 1);
ScorerSupplier scorerSupplier = weight.scorerSupplier(searcher.getIndexReader().leaves().get(0));
System.out.println(scorerSupplier.cost());
```

This prints a wrong cost of 1026; the actual cost should be 100000. An underestimated cost like this can cause unexpected performance problems, for example in lead selection in a boolean query.
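The counter drift described above can be illustrated with a small standalone model. This is only a sketch: `ToyDocIdCollector`, its `threshold` parameter, and the `countAfterUpgrade` flag are hypothetical names invented for illustration, and the class is not Lucene's `DocIdSetBuilder`. It assumes a builder that buffers doc IDs in an array, switches to a bitset past a threshold, and reports a running counter as its cost.

```java
import java.util.Arrays;
import java.util.BitSet;

/** Toy model (not Lucene code): an int buffer that upgrades to a bitset past a threshold. */
final class ToyDocIdCollector {
    private final int threshold;              // hypothetical: buffer size that triggers the upgrade
    private final boolean countAfterUpgrade;  // hypothetical: false models the reported bug
    private int[] buffer = new int[16];
    private int size = 0;                     // number of doc IDs held in the buffer
    private BitSet bits = null;               // non-null once the collector has upgraded
    private long counter = 0;                 // running cost estimate used after the upgrade

    ToyDocIdCollector(int threshold, boolean countAfterUpgrade) {
        this.threshold = threshold;
        this.countAfterUpgrade = countAfterUpgrade;
    }

    void add(int doc) {
        if (bits == null && size >= threshold) {
            upgradeToBitSet();
        }
        if (bits != null) {
            bits.set(doc);
            if (countAfterUpgrade) {
                counter++;                    // without this, cost() stays frozen at the upgrade size
            }
            return;
        }
        if (size == buffer.length) {
            buffer = Arrays.copyOf(buffer, size * 2);
        }
        buffer[size++] = doc;
    }

    private void upgradeToBitSet() {
        bits = new BitSet();
        for (int i = 0; i < size; i++) {
            bits.set(buffer[i]);
        }
        counter = size;                       // seed the counter with what was buffered so far
        buffer = null;
    }

    long cost() {
        return bits == null ? size : counter;
    }

    /** Adds doc IDs 0..numDocs-1 and returns the reported cost. */
    public static long costFor(int numDocs, int threshold, boolean countAfterUpgrade) {
        ToyDocIdCollector c = new ToyDocIdCollector(threshold, countAfterUpgrade);
        for (int doc = 0; doc < numDocs; doc++) {
            c.add(doc);
        }
        return c.cost();
    }

    public static void main(String[] args) {
        System.out.println(costFor(100_000, 1_000, false)); // prints 1000: counter frozen at upgrade
        System.out.println(costFor(100_000, 1_000, true));  // prints 100000: counter kept in sync
    }
}
```

The toy threshold of 1,000 is arbitrary, so the frozen value does not reproduce the 1026 reported above; the point is only that the cost stops growing once additions bypass the counter.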