benwtrent opened a new issue, #15324: URL: https://github.com/apache/lucene/issues/15324
### Description

We are hitting a weird EOF in Lucene. It appears that it's possible for an essentialQueue scorer to hit `NO_MORE_DOCS`, advance the TermScorer past its maxDoc, and then attempt to gather a score.

I do see we adjusted this path in 10.2: https://github.com/apache/lucene/pull/14186

I haven't been able to test this same data in 10.2 yet. But I have been staring at these code paths for days and just cannot see how we are progressing the `top` iterator to a doc past the maxDoc in the segment when using a filter.

Note:

- It HAS to be a filter. I tried the exact same query, but with a `must` clause boosted by `0` (so not contributing to the score at all), and we don't hit the EOF.
- It MUST be done with disjunctions. I tried with conjunctions and a filter, and it worked just fine.
- It requires a fairly restrictive filter (matching a few percent of docs, or less than one percent).
- One of the clauses must match more docs than the other (the particular failure is when one clause matches 3x more docs). Both clauses match less than 1/3 of the docs, but both are less restrictive than the filter.

As for the TermScorer and such, that code path hasn't changed in a long time. Basically, it never checks whether it's at `NO_MORE_DOCS` when scoring; it just always accepts the `advanceExact` result as `true`, even when the target is past the maxDoc. That already seems like a mistake. I would expect `advanceExact` to return `false` if it's advancing past maxDoc, right?
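
To make the suspicion about `advanceExact` concrete, here is a toy model of the failure mode. It is not Lucene's code: `NormsEofSketch`, `ToyDenseNorms`, `normUnguarded`, and `normGuarded` are hypothetical names, and a plain `byte[]` stands in for the norms data. It only assumes what is described above: `advanceExact` accepts any target and returns `true`, and the value read then uses that target as an offset.

```java
// Toy model only: every class and method here is hypothetical, not Lucene's implementation.
final class NormsEofSketch {

    /** Same sentinel value as DocIdSetIterator.NO_MORE_DOCS (Integer.MAX_VALUE). */
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    /** Hypothetical dense norms: one byte per doc, every doc assumed to have a value. */
    static final class ToyDenseNorms {
        private final byte[] data;
        private int doc = -1;

        ToyDenseNorms(byte[] data) {
            this.data = data;
        }

        /** Accepts any target and reports a value, with no maxDoc check. */
        boolean advanceExact(int target) {
            doc = target;
            return true;
        }

        /** Reads the value at the current doc; fails if doc is outside the data. */
        long longValue() {
            return data[doc];
        }
    }

    /** Unguarded path: trusts the return value of advanceExact. */
    static long normUnguarded(ToyDenseNorms norms, int docID) {
        long norm = 1L;
        if (norms.advanceExact(docID)) {
            norm = norms.longValue();
        }
        return norm;
    }

    /** Guarded variant: refuses to read norms once the iterator is exhausted. */
    static long normGuarded(ToyDenseNorms norms, int docID) {
        if (docID == NO_MORE_DOCS) {
            throw new IllegalStateException("score() called on an exhausted iterator");
        }
        return normUnguarded(norms, docID);
    }

    public static void main(String[] args) {
        ToyDenseNorms norms = new ToyDenseNorms(new byte[] {10, 20, 30}); // maxDoc = 3
        System.out.println(normUnguarded(norms, 1)); // prints 20
        try {
            normUnguarded(norms, NO_MORE_DOCS);
        } catch (ArrayIndexOutOfBoundsException e) {
            // In this toy model the out-of-bounds read plays the role of the EOF.
            System.out.println("out-of-bounds read past maxDoc");
        }
    }
}
```

In this sketch the guard only turns the out-of-bounds read into an explicit failure at the scoring call. It does not explain why the `top` iterator is asked for a score after reaching `NO_MORE_DOCS` in the first place, which is the actual open question.
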
```
org.apache.lucene.store.MemorySegmentIndexInput$SingleSegmentImpl.readByte(MemorySegmentIndexInput.java:762)
    at org.apache.lucene.codecs.lucene90.Lucene90NormsProducer$3.longValue(Lucene90NormsProducer.java:399)
    at org.apache.lucene.search.TermScorer.score(TermScorer.java:93)
    at org.apache.lucene.search.DisjunctionSumScorer.score(DisjunctionSumScorer.java:43)
    at org.apache.lucene.search.DisjunctionScorer.score(DisjunctionScorer.java:176)
    at org.apache.lucene.search.MaxScoreBulkScorer.scoreInnerWindowWithFilter(MaxScoreBulkScorer.java:201)
    at org.apache.lucene.search.MaxScoreBulkScorer.scoreInnerWindow(MaxScoreBulkScorer.java:147)
    at org.apache.lucene.search.MaxScoreBulkScorer.score(MaxScoreBulkScorer.java:128)
    at org.elasticsearch.search.internal.CancellableBulkScorer.score(CancellableBulkScorer.java:46)
    at org.elasticsearch.search.internal.ContextIndexSearcher.searchLeaf(ContextIndexSearcher.java:461)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:810)
    at org.elasticsearch.search.internal.ContextIndexSearcher.search(ContextIndexSearcher.java:388)
    at org.elasticsearch.search.internal.ContextIndexSearcher.lambda$search$3(ContextIndexSearcher.java:368)
    at java.util.concurrent.FutureTask.run(FutureTask.java:328)
    at org.apache.lucene.search.TaskExecutor$Task.run(TaskExecutor.java:173)
    at org.apache.lucene.search.TaskExecutor.lambda$invokeAll$1(TaskExecutor.java:98)
```

### Version and environment details

Lucene 10.1.0
