jpountz commented on PR #1068: URL: https://github.com/apache/lucene/pull/1068#issuecomment-1241631941
Thank you for your comments, I think I understand the bug now. I think a better description of the bug is that `BitSetConjunctionDISI#docID()` doesn't honor its contract that it must return `NO_MORE_DOCS` when the iterator is exhausted. This issue can only occur if any of the bitsets involved in the conjunction has a length that is less than `maxDoc`, which is not typical. This would explain why we haven't seen this bug earlier.

> Is it valid for an iterator to advance outside of the ConjunctionDISI? I can't find anywhere that prevents this, but I was under the assumption it should be invalid to do this.

It is indeed invalid. The goal of the assertion that all iterators are on the same doc is to catch such issues.

> I was trying to keep all sub-iterators on the same doc all the time.

I have a preference for not advancing other iterators, because it should not be necessary, and if it is, then this means we have another bug somewhere else. The `ConjunctionDISI` that doesn't optimize for bitsets only advances other iterators when reaching `NO_MORE_DOCS` because avoiding this would require introducing more conditionals, which would have overhead. But it isn't necessary.

I would suggest no longer advancing other iterators, and changing the test to make sure that `docID()` returns `NO_MORE_DOCS` when the iterator is exhausted, instead of checking whether all sub-iterators are on the same doc ID.
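
To make the suggestion concrete, here is a minimal, self-contained sketch of the intended behavior. It is not Lucene's actual code: `ArrayIterator`, `Conjunction`, and `bitsLength` are hypothetical stand-ins, and `java.util.BitSet` is used in place of Lucene's own bitset classes. The idea it illustrates is that the conjunction records exhaustion itself, so `docID()` reports `NO_MORE_DOCS` as soon as the lead moves past the end of a bitset shorter than `maxDoc`, without advancing any other iterator.

```java
import java.util.BitSet;

class BitSetConjunctionSketch {
    // Same sentinel value that Lucene uses for exhausted iterators.
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    /** Hypothetical lead iterator over a sorted array of doc IDs. */
    static final class ArrayIterator {
        private final int[] docs;
        private int idx = -1;

        ArrayIterator(int... docs) {
            this.docs = docs;
        }

        int docID() {
            if (idx < 0) {
                return -1; // not positioned yet
            }
            return idx < docs.length ? docs[idx] : NO_MORE_DOCS;
        }

        int nextDoc() {
            if (idx < docs.length) {
                idx++;
            }
            return docID();
        }
    }

    /** Hypothetical conjunction of a lead iterator and a single bitset. */
    static final class Conjunction {
        private final ArrayIterator lead;
        private final BitSet bits;
        private final int bitsLength; // assumed logical length, may be < maxDoc
        private int doc = -1;

        Conjunction(ArrayIterator lead, BitSet bits, int bitsLength) {
            this.lead = lead;
            this.bits = bits;
            this.bitsLength = bitsLength;
        }

        // Reports the conjunction's own position, not the lead's position.
        int docID() {
            return doc;
        }

        int nextDoc() {
            if (doc == NO_MORE_DOCS) {
                return doc; // already exhausted, leave the lead untouched
            }
            for (int d = lead.nextDoc(); ; d = lead.nextDoc()) {
                if (d >= bitsLength) {
                    // Past the end of the bitset (this also covers an exhausted lead):
                    // record exhaustion here instead of advancing other iterators.
                    return doc = NO_MORE_DOCS;
                }
                if (bits.get(d)) {
                    return doc = d;
                }
            }
        }
    }

    public static void main(String[] args) {
        BitSet bits = new BitSet();
        bits.set(1);
        bits.set(3);
        ArrayIterator lead = new ArrayIterator(1, 2, 3, 7, 9);
        Conjunction conjunction = new Conjunction(lead, bits, 5);

        System.out.println(conjunction.nextDoc()); // 1
        System.out.println(conjunction.nextDoc()); // 3
        System.out.println(conjunction.nextDoc() == NO_MORE_DOCS); // true
        System.out.println(conjunction.docID() == NO_MORE_DOCS);   // true
        System.out.println(lead.docID()); // 7: the lead was not pushed to the end
    }
}
```

In this sketch the lead stops on doc 7, which lies beyond the bitset's length of 5, yet the conjunction still reports `NO_MORE_DOCS`. A test along these lines would check `docID()` on the conjunction after exhaustion rather than asserting that every sub-iterator sits on the same doc.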