maosuhan opened a new pull request, #11939:
URL: https://github.com/apache/lucene/pull/11939

   ### Description
   
   When a TermRangeQuery or TermInSetQuery is executed, Lucene uses
   DocIdSetBuilder to collect the matching doc IDs. Once the doc ID list grows
   large, upgradeToBitSet converts it from an array to a bitset. After that
   conversion, adding new doc IDs leaves the `counter` variable of
   DocIdSetBuilder unchanged, so the cost computed in DocIdSetBuilder.build is
   incorrect.
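
The effect described above can be illustrated with a toy model. This is only a sketch, not Lucene's actual DocIdSetBuilder code: the class `ToyBuilder`, its threshold of 1024, and the `updateCounterAfterUpgrade` flag are hypothetical names and values chosen for illustration. It assumes the counter is initialized at upgrade time and then compares a builder that forgets to update it with one that keeps it in sync.

```java
import java.util.BitSet;

class CounterSketch {
    /** Toy builder (hypothetical): buffers doc IDs in an array, then upgrades to a bitset. */
    static final class ToyBuilder {
        private final int threshold;                      // assumed upgrade threshold
        private final boolean updateCounterAfterUpgrade;  // false models the reported bug
        private int[] buffer;
        private int size;
        private BitSet bitSet;
        private long counter = -1;

        ToyBuilder(int threshold, boolean updateCounterAfterUpgrade) {
            this.threshold = threshold;
            this.updateCounterAfterUpgrade = updateCounterAfterUpgrade;
            this.buffer = new int[threshold];
        }

        void add(int doc) {
            if (bitSet != null) {
                bitSet.set(doc);
                if (updateCounterAfterUpgrade) {
                    counter++; // keep the cost estimate in sync with the bitset
                }
                return;
            }
            if (size == threshold) {
                upgradeToBitSet();
                add(doc);
                return;
            }
            buffer[size++] = doc;
        }

        private void upgradeToBitSet() {
            bitSet = new BitSet();
            for (int i = 0; i < size; i++) {
                bitSet.set(buffer[i]);
            }
            counter = size; // counter only reflects what was buffered so far
            buffer = null;
        }

        long cost() {
            return bitSet != null ? counter : size;
        }
    }

    /** Adds doc IDs 0..numDocs-1 and returns the cost the toy builder reports. */
    static long costAfterAdding(int numDocs, boolean fixed) {
        ToyBuilder builder = new ToyBuilder(1024, fixed);
        for (int doc = 0; doc < numDocs; doc++) {
            builder.add(doc);
        }
        return builder.cost();
    }

    public static void main(String[] args) {
        System.out.println(costAfterAdding(100_000, false)); // prints 1024
        System.out.println(costAfterAdding(100_000, true));  // prints 100000
    }
}
```

In the toy model the stale counter stays at the value it had when the upgrade happened, so the reported cost is close to the threshold rather than to the number of docs actually added.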
   
   How to reproduce:
   
           // Index 200,000 single-term documents with values "100000".."299999".
           Directory dir = FSDirectory.open(Files.createTempDirectory(null, new FileAttribute[0]));
           IndexWriter w = new IndexWriter(dir, new IndexWriterConfig());
           for (int i = 100000; i < 300000; ++i) {
               Document doc = new Document();
               doc.add(new StringField("f1", i + "", Field.Store.NO));
               w.addDocument(doc);
           }
           w.forceMerge(1); // single segment, so leaves().get(0) covers every doc
           IndexReader reader = DirectoryReader.open(w);
           IndexSearcher searcher = new IndexSearcher(reader);
           searcher.setQueryCache(null); // disable the query cache
   
           // The inclusive range ["200000", "300000"] matches the 100,000 docs "200000".."299999".
           Query query = new TermRangeQuery("f1", new BytesRef("200000"), new BytesRef("300000"), true, true);
           Weight weight = searcher.createWeight(searcher.rewrite(query), ScoreMode.COMPLETE, 1);
           ScorerSupplier scorerSupplier = weight.scorerSupplier(searcher.getIndexReader().leaves().get(0));
           System.out.println(scorerSupplier.cost());
   
   This prints a cost of 1026, which is wrong; the actual cost should be
   100000. An underestimated cost can cause unexpected performance problems,
   for example in lead selection in a boolean query.
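
To make the lead-selection concern concrete, here is a minimal sketch that assumes a conjunction picks the clause with the lowest cost as its lead iterator. The `Clause` class, the `pickLead` helper, and the competing term clause with cost 5000 are hypothetical and exist only for illustration; this is not Lucene's conjunction code.

```java
import java.util.Arrays;
import java.util.List;

class LeadSelectionSketch {
    /** Hypothetical clause: a name plus the cost its scorer supplier reports. */
    static final class Clause {
        final String name;
        final long cost;

        Clause(String name, long cost) {
            this.name = name;
            this.cost = cost;
        }
    }

    /** Picks the lowest-cost clause as the lead (clauses must be non-empty). */
    static Clause pickLead(List<Clause> clauses) {
        Clause lead = clauses.get(0);
        for (Clause clause : clauses) {
            if (clause.cost < lead.cost) {
                lead = clause;
            }
        }
        return lead;
    }

    /** Returns the lead's name when a range clause competes with a term clause of cost 5000. */
    static String leadName(long rangeCost) {
        Clause range = new Clause("range", rangeCost);
        Clause term = new Clause("term", 5_000); // hypothetical second clause
        return pickLead(Arrays.asList(range, term)).name;
    }

    public static void main(String[] args) {
        System.out.println(leadName(1_026));   // prints range
        System.out.println(leadName(100_000)); // prints term
    }
}
```

With the underestimated cost the range clause is chosen as the lead even though it actually matches far more docs than the hypothetical term clause, so the conjunction would iterate over the larger set first.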
   

