mikemccand opened a new pull request, #12530: URL: https://github.com/apache/lucene/pull/12530
### Description

Relates #7820.

`CheckIndex` today only detects and exorcises corruption in the latest commit point, yet `IndexWriter` will be angry on init if there are older commit points and they have "on load" corruption (e.g. missing `_N.si` files). This is badly inconsistent: if `IndexWriter` or `IndexReader` will hit an index corruption, `CheckIndex` should always be able to find it too. And it is conceivable, though exceptionally unlikely, for an index to legitimately get into this broken state on power loss or an OS/JVM crash at just the right/wrong time during `IndexWriter.commit`.

This PR is just a first step: it fixes the detection bug in `CheckIndex`, adding a new test case carried over and iterated from [this nice test case](https://github.com/apache/lucene/issues/7009#issuecomment-1223544484) (thank you @buzztaiki!). So `CheckIndex` will now catch this exotic form of corruption.

But it does not yet fix `-exorcise` to be able to correct such a situation. That's trickier, especially for missing `_N.si` files, since `CheckIndex -exorcise` cannot correct that error even on the latest commit point.

Progress not perfection!
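For context, each Lucene commit point is recorded in a `segments_N` file in the index directory, where `N` is the commit generation written in base 36. The following is a minimal sketch, using only the JDK rather than Lucene's own API, of how one might list the commit generations present in a directory to see whether older commit points sit alongside the latest one. The class and method names are hypothetical and not part of this PR, and the sketch only looks at file names; it does not validate anything inside the files.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper for illustration only; not part of Lucene or of this PR.
class CommitPointLister {
  private static final String PREFIX = "segments_";

  // Returns the generations of all segments_N files in indexDir, in ascending order.
  static List<Long> listCommitGenerations(Path indexDir) throws IOException {
    List<Long> generations = new ArrayList<>();
    // The glob is matched against the whole file name, so names such as
    // pending_segments_N are not picked up.
    try (DirectoryStream<Path> stream = Files.newDirectoryStream(indexDir, PREFIX + "*")) {
      for (Path file : stream) {
        String suffix = file.getFileName().toString().substring(PREFIX.length());
        try {
          // The generation suffix is written in base 36 (Character.MAX_RADIX).
          generations.add(Long.parseLong(suffix, Character.MAX_RADIX));
        } catch (NumberFormatException e) {
          // Not a well-formed generation suffix; ignore this file.
        }
      }
    }
    Collections.sort(generations);
    return generations;
  }

  public static void main(String[] args) throws IOException {
    List<Long> generations = listCommitGenerations(Paths.get(args[0]));
    System.out.println("Commit generations found: " + generations);
    if (generations.size() > 1) {
      System.out.println("Older commit points exist besides generation "
          + generations.get(generations.size() - 1));
    }
  }
}
```

Any generation below the largest one is an older commit point, the kind that, per the description above, `CheckIndex` previously did not examine.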