benwtrent opened a new issue, #14517: URL: https://github.com/apache/lucene/issues/14517
### Description

With Lucene 10.2, we have seen some troubling exceptions. It appears that the `intoBitSet` code is buggy when multiple layers of iterators are involved. It is strange that we have run into this type of failure on a couple of different paths. This indicates either a significant API shift (e.g. how iterators need to be handled is now fundamentally different) or some underlying bugs in the new implementation.

Here is one trace. It seems possible if the iterator has already iterated past `windowBase` and `windowMax` in:

https://github.com/apache/lucene/blob/2772951f5f15ea02e6accd44f6904077b4a09060/lucene/core/src/java/org/apache/lucene/search/DenseConjunctionBulkScorer.java#L238

```
DocIdSetIterator lead = iterators.get(0);
if (lead.docID() < windowBase) {
  lead.advance(windowBase);
}
lead.intoBitSet(windowMax, windowMatches, windowBase);
```

```
Caused by: java.lang.ArrayIndexOutOfBoundsException: Index -33 out of bounds for length 64
	at org.apache.lucene.core@10.2.0/org.apache.lucene.util.FixedBitSet.set(FixedBitSet.java:283)
	at org.apache.lucene.core@10.2.0/org.apache.lucene.search.DocIdSetIterator.intoBitSet(DocIdSetIterator.java:268)
	at org.apache.lucene.core@10.2.0/org.apache.lucene.search.comparators.TermOrdValComparator$CompetitiveIterator.intoBitSet(TermOrdValComparator.java:539)
	at org.apache.lucene.core@10.2.0/org.apache.lucene.search.DenseConjunctionBulkScorer.scoreWindowUsingBitSet(DenseConjunctionBulkScorer.java:242)
	at org.apache.lucene.core@10.2.0/org.apache.lucene.search.DenseConjunctionBulkScorer.scoreWindow(DenseConjunctionBulkScorer.java:210)
	at org.apache.lucene.core@10.2.0/org.apache.lucene.search.DenseConjunctionBulkScorer.score(DenseConjunctionBulkScorer.java:132)
	at org.elasticsearch.server@9.1.0/org.elasticsearch.search.internal.CancellableBulkScorer.score(CancellableBulkScorer.java:46)
	at org.elasticsearch.server@9.1.0/org.elasticsearch.search.internal.ContextIndexSearcher.searchLeaf(ContextIndexSearcher.java:460)
	at org.apache.lucene.core@10.2.0/org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:809)
	at org.elasticsearch.server@9.1.0/org.elasticsearch.search.internal.ContextIndexSearcher.search(ContextIndexSearcher.java:387)
	at org.elasticsearch.server@9.1.0/org.elasticsearch.search.internal.ContextIndexSearcher.lambda$search$3(ContextIndexSearcher.java:365)
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:328)
	at org.apache.lucene.core@10.2.0/org.apache.lucene.search.TaskExecutor$Task.run(TaskExecutor.java:173)
	at org.apache.lucene.core@10.2.0/org.apache.lucene.search.TaskExecutor.invokeAll(TaskExecutor.java:111)
	at org.elasticsearch.server@9.1.0/org.elasticsearch.search.internal.ContextIndexSearcher.search(ContextIndexSearcher.java:369)
	at org.elasticsearch.server@9.1.0/org.elasticsearch.search.internal.ContextIndexSearcher.search(ContextIndexSearcher.java:336)
	at org.elasticsearch.server@9.1.0/org.elasticsearch.search.query.QueryPhase.addCollectorsAndSearch(QueryPhase.java:212)
	... 15 more
```

The following trace seems possible if `DISIDocIdStream#count(upTo)` is called with `upTo` > `max`. Even though we check for `iterator.docID() >= upTo`, there is no check for whether we are iterating past the configured `max`. Maybe we are handling the iterators incorrectly?
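To make the suspected `count` failure concrete, here is a minimal sketch of how a range like `[0, -2)` could come about. It is not Lucene's code: the class `CountRangeSketch` and both methods are hypothetical, and the step that clamps `upTo` to `max` before subtracting the iterator position is an assumption inferred from the exception message, not something confirmed by the source. The only real API used is `java.util.Objects.checkFromToIndex`, which appears in the trace.

```java
import java.util.Objects;

/** Simplified stand-in for the count path; hypothetical, not Lucene's implementation. */
public class CountRangeSketch {

  // Matches "length 4096" in the exception message.
  static final int WINDOW_SIZE = 4096;

  /**
   * Assumed shape of the unguarded path: only docID >= upTo is checked,
   * then upTo is clamped to max and the range [0, upTo - docID) is validated.
   */
  static int rangeEndUnguarded(int docID, int upTo, int max) {
    if (docID >= upTo) {
      return 0; // the one check mentioned in the issue
    }
    int clamped = Math.min(upTo, max); // assumption: upTo is capped at max
    int rangeEnd = clamped - docID;    // negative when docID is already past max
    Objects.checkFromToIndex(0, rangeEnd, WINDOW_SIZE);
    return rangeEnd;
  }

  /** Same logic, but the early return compares against the clamped bound. */
  static int rangeEndGuarded(int docID, int upTo, int max) {
    int clamped = Math.min(upTo, max);
    if (docID >= clamped) {
      return 0; // iterator is already at or past max: nothing to count
    }
    int rangeEnd = clamped - docID;
    Objects.checkFromToIndex(0, rangeEnd, WINDOW_SIZE);
    return rangeEnd;
  }

  public static void main(String[] args) {
    // The iterator sits 2 docs past max while the caller asks for a larger upTo.
    try {
      rangeEndUnguarded(102, 200, 100);
    } catch (IndexOutOfBoundsException e) {
      // prints "Range [0, -2) out of bounds for length 4096"
      System.out.println(e.getMessage());
    }
    // prints "0"
    System.out.println(rangeEndGuarded(102, 200, 100));
  }
}
```

Under these assumptions, an iterator positioned two docs past `max` reproduces the same message as the trace, while comparing against the clamped bound returns 0 instead. Whether the right fix is a guard like this or a change in how callers position the iterator is exactly the open question of this issue.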

https://github.com/apache/lucene/blob/0018e62cf21fe589633b01812fe9ec49fd42b426/lucene/core/src/java/org/apache/lucene/search/DISIDocIdStream.java#L53

```
Suppressed: java.lang.IndexOutOfBoundsException: Range [0, -2) out of bounds for length 4096
	at java.base/jdk.internal.util.Preconditions.outOfBounds(Preconditions.java:100)
	at java.base/jdk.internal.util.Preconditions.outOfBoundsCheckFromToIndex(Preconditions.java:112)
	at java.base/jdk.internal.util.Preconditions.checkFromToIndex(Preconditions.java:349)
	at java.base/java.util.Objects.checkFromToIndex(Objects.java:391)
	at org.apache.lucene.core@10.2.0/org.apache.lucene.util.FixedBitSet.cardinality(FixedBitSet.java:212)
	at org.apache.lucene.core@10.2.0/org.apache.lucene.search.DISIDocIdStream.count(DISIDocIdStream.java:63)
	at org.apache.lucene.core@10.2.0/org.apache.lucene.search.DocIdStream.count(DocIdStream.java:50)
	at org.apache.lucene.core@10.2.0/org.apache.lucene.search.TotalHitCountCollector$1.collect(TotalHitCountCollector.java:69)
	at org.apache.lucene.core@10.2.0/org.apache.lucene.search.DenseConjunctionBulkScorer.scoreWindow(DenseConjunctionBulkScorer.java:208)
	at org.apache.lucene.core@10.2.0/org.apache.lucene.search.DenseConjunctionBulkScorer.score(DenseConjunctionBulkScorer.java:132)
	at org.apache.lucene.core@10.2.0/org.apache.lucene.search.ReqExclBulkScorer.score(ReqExclBulkScorer.java:66)
	at org.elasticsearch.server@9.1.0/org.elasticsearch.search.internal.CancellableBulkScorer.score(CancellableBulkScorer.java:46)
	at org.elasticsearch.server@9.1.0/org.elasticsearch.search.internal.ContextIndexSearcher.searchLeaf(ContextIndexSearcher.java:460)
	at org.apache.lucene.core@10.2.0/org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:809)
	at org.elasticsearch.server@9.1.0/org.elasticsearch.search.internal.ContextIndexSearcher.search(ContextIndexSearcher.java:387)
	at org.elasticsearch.server@9.1.0/org.elasticsearch.search.internal.ContextIndexSearcher.lambda$search$3(ContextIndexSearcher.java:365)
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:328)
	at org.apache.lucene.core@10.2.0/org.apache.lucene.search.TaskExecutor$Task.run(TaskExecutor.java:173)
	at org.apache.lucene.core@10.2.0/org.apache.lucene.search.TaskExecutor.lambda$invokeAll$1(TaskExecutor.java:98)
	... 6 more
```

### Version and environment details

Lucene 10.2