Brain2000 opened a new issue, #12082: URL: https://github.com/apache/lucene/issues/12082
### Description It looks like there's a problem in the TopFieldCollector.java where it calls "compareBottom" without calling "setBottom" first. I believe this is an issue if getLeafComparer( ) is called more than once, as the new class will no longer have the previous bottom slot value. I found a workaround is to rescope the variable storing the bottom slot set from calling "setBottom( )", to outside of the LeafFieldComparator class to make it persistent (just like the setTopValue( ) is called outside of the LeafFieldComparator class). Rescoping the bottom slot should be safe unless multiple threads are being run at the same time and need their own version of bottomSlot for the LeafFieldComparator class. The documentation here https://lucene.apache.org/core/8_0_0/core/org/apache/lucene/search/LeafFieldComparator.html states that _"[setBottom] will always be called before compareBottom(int)"_ So one might think the LeafFieldComparator would be used like this: ``` //get new leaf comparator LeafFieldComparator lc = getLeafComparator( ); //setBottom before compareBottom lc.setBottom(xxx); lc.compareBottom(yyy); //get new leaf comparator lc = getLeafComparator( ); //setBottom before compareBottom lc.setBottom(xxx2); lc.compareBottom(yyy2); ``` However, it is being called in this fashion: ``` //get new leaf comparator LeafFieldComparator lc = getLeafComparator( ); //setBottom before compareBottom lc.setBottom(xxx); lc.compareBottom(yyy); //get new leaf comparator LeafFieldComparator lc = getLeafComparator( ); //compareBottom... hopefully the slot is correct... lc.compareBottom(yyy); ``` As you can see, the 2nd set of code does not treat the LeafComparator as if each instance has it's own bottomSlot variable. I believe the purpose of this would be to allow multiple threads to call this simultaneously, which would require each one to have it's own bottomSlot, otherwise it would not be thread safe. Also, if setBottom was meant to be outside of the scope of LeafComparator, then why was the time taken to put setTop outside and setBottom inside? This seems to have been done very deliberately according to the documentation. Here is a sample code snippet: ``` import org.apache.lucene.index.LeafReaderContext; import org.apache.lucene.search.FieldComparator; import org.apache.lucene.search.LeafFieldComparator; import org.apache.lucene.search.Scorable; public final class MySimpleStringComparator extends FieldComparator<String> { private String[] values; private final String field; private String topValue; //SCOPING bottomSlot HERE WILL CAUSE ITEMS TO RETURN PROPERLY //private int bottomSlot; public MySimpleStringComparator(int numHits, String field) { this.values = new String[numHits]; this.field = field; } @Override public int compare(int slot1, int slot2) { return this.compareValues(this.values[slot1], this.values[slot2]); } @Override public LeafFieldComparator getLeafComparator(LeafReaderContext lrc) throws IOException { final LeafReaderContext frc = lrc; return new LeafFieldComparator() { //SCOPING bottomSlot HERE WILL CAUSE ITEMS ON A SUBSET TO RANDOMLY DISAPPEAR private int bottomSlot; public void setScorer(Scorable arg0) throws IOException { } public void setBottom(int slot) throws IOException { bottomSlot = slot; } public void copy(int slot, int doc) throws IOException { values[slot] = frc.reader().document(doc).get(field); } public int compareTop(int doc) throws IOException { return compareValues(topValue, frc.reader().document(doc).get(field)); } public int compareBottom(int doc) throws IOException { return compareValues(values[bottomSlot], frc.reader().document(doc).get(field)); } }; } @Override public void setTopValue(String val) { this.topValue = val; } @Override public String value(int slot) { return this.values[slot]; } } ``` Here is a code snippet that runs a search/sort for a single page that utilizes the MySimpleStringComparator above. ``` ComplexPhraseQueryParser cpqp = new ComplexPhraseQueryParser("somefield", analyzer); Query query = cpqp.parse("somevalue"); pageSize = 10; pageNum = 1; requestedRecords = pageSize * pageNum; startOffset = (pageNum - 1) * pageSize; FieldComparatorSource fsc = new FieldComparatorSource() { @Override public FieldComparator<String> newComparator(String fieldname, int numhits, int sortPos, boolean reversed) { return new MySimpleStringComparator(numhits, fieldname); } }; Sort sort = new Sort(new SortField("firstname", fsc, false)); IndexSearcher searcher = new IndexSearcher(reader); TopFieldCollector tfcollector = TopFieldCollector.create(sort, requestedRecords, Integer.MAX_VALUE); searcher.search(query, tfcollector); ScoreDoc[] hits = tfcollector.topDocs(startOffset, pageSize).scoreDocs; ``` ### Version and environment details This is running on Windows Server 2019 using Tomcat 8.5 with Lucene 8.11.2 (also reproduceable in 8.6.0) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org