pmpailis commented on PR #14882: URL: https://github.com/apache/lucene/pull/14882#issuecomment-3048383136
I think the issue is with the following line:

```
int disiTo = Math.min(upTo, bitSet.length())
```

This computation does not take the initial `offset` into account, so we could end up reading less data than was requested. For example, with `bitSet.length: 100`, `offset: 90`, and `upTo: 150`, instead of reading `[90, 150]` we would end up operating on just the `[90, 100]` range, causing all sorts of issues later in the pipeline.

Switching back to

```
int disiTo = upTo == DocIdSetIterator.NO_MORE_DOCS ? bitSet.length() : upTo;
```

seems to address all failures (I had 50 successful full test runs).

Would you suggest applying the correction only when `upTo == NO_MORE_DOCS` (or something like `upTo - offset >= bitSet.length()`), or restricting the fix to the initial `knn` filter case and providing a custom `upTo` at that point?
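To make the arithmetic in the example concrete, here is a minimal standalone sketch that evaluates both expressions with the numbers from the comment. It does not touch Lucene: the class name, the method names, and the local `NO_MORE_DOCS` constant are illustrative stand-ins, and how `offset` is applied afterwards is left out because the comment does not show that code.

```java
/** Standalone sketch with no Lucene dependency. All names here are hypothetical. */
final class DisiToSketch {
  // Local stand-in for DocIdSetIterator.NO_MORE_DOCS (Integer.MAX_VALUE in Lucene).
  static final int NO_MORE_DOCS = Integer.MAX_VALUE;

  /** The expression under discussion: always clamp upTo to the bit set length. */
  static int clampedDisiTo(int upTo, int bitSetLength) {
    return Math.min(upTo, bitSetLength);
  }

  /** The earlier expression: substitute the length only for the NO_MORE_DOCS sentinel. */
  static int sentinelDisiTo(int upTo, int bitSetLength) {
    return upTo == NO_MORE_DOCS ? bitSetLength : upTo;
  }

  public static void main(String[] args) {
    // Values taken from the example in the comment.
    int bitSetLength = 100;
    int offset = 90;
    int upTo = 150;

    // prints "clamped:  [90, 100]"
    System.out.println("clamped:  [" + offset + ", " + clampedDisiTo(upTo, bitSetLength) + "]");
    // prints "sentinel: [90, 150]"
    System.out.println("sentinel: [" + offset + ", " + sentinelDisiTo(upTo, bitSetLength) + "]");
  }
}
```

Both variants return the bit set length when `upTo` is the sentinel, so they differ only for a finite `upTo` that exceeds `bitSet.length()`, which is the case the question above is about.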