mikemccand commented on code in PR #12530: URL: https://github.com/apache/lucene/pull/12530#discussion_r1388366133
##########
lucene/core/src/java/org/apache/lucene/index/CheckIndex.java:
##########
@@ -610,6 +610,31 @@ public Status checkIndex(List<String> onlySegments, ExecutorService executorServ
 return result;
 }

+ // https://github.com/apache/lucene/issues/7820: also attempt to open any older commit points (segments_N), which will catch certain
+ // corruption like missing _N.si files for segments not also referenced by the newest commit point (which was already loaded,

Review Comment:
> @mikemccand Just to clarify this comment - I was using @buzztaiki 's [original test case](https://github.com/apache/lucene/issues/7009#issuecomment-1223544484) with slight modifications to test this:

Hmm, I'm confused -- doesn't `parseSegmentInfos` read a single `segments_N` file? It goes through that segments file and reads each separate `SegmentInfo`, but not the other `segments_(N-1)` files in the index?

I thought the issue here was that `segments_N` (and all the separate segments / `.si` files it references) is intact, but `segments_(N-1)` is broken because it references a segment whose `.si` file is missing?

> I would like to try to make missing .si files behave the same way as having missing .cfs do currently and make it possible to use -exorcise for this case

Maybe we could change the exception thrown when a `.si` cannot be found to be a subclass of `CorruptIndexException`, and add a member, e.g. `set/getAffectedSegment`, that would tell us which segment the `.si` belonged to? `CheckIndex` could then catch that and do its `exorcise` thing?

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
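
For illustration, the exception idea floated in the review comment above could look roughly like the sketch below. It is self-contained and deliberately does not depend on Lucene: `MissingSegmentInfoException`, `readSegmentInfo` and `findSegmentsToExorcise` are hypothetical names, and the exception extends `IOException` directly as a stand-in for a subclass of `CorruptIndexException` (which itself extends `IOException`). The segment name is passed through the constructor rather than a setter; that is a choice made for this sketch, not something the comment prescribes.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Sketch only: none of these types or methods exist in Lucene under these names.
final class AffectedSegmentSketch {

  /** Stand-in for a hypothetical subclass of Lucene's CorruptIndexException. */
  static final class MissingSegmentInfoException extends IOException {
    private final String affectedSegment;

    MissingSegmentInfoException(String message, String affectedSegment) {
      super(message);
      this.affectedSegment = affectedSegment;
    }

    /** Name of the segment whose .si file could not be found. */
    String getAffectedSegment() {
      return affectedSegment;
    }
  }

  /** Hypothetical loader: throws if the segment's .si file is not among the files present. */
  static void readSegmentInfo(String segmentName, List<String> filesPresent)
      throws MissingSegmentInfoException {
    String siFile = segmentName + ".si";
    if (!filesPresent.contains(siFile)) {
      throw new MissingSegmentInfoException("missing " + siFile, segmentName);
    }
  }

  /** Collects the segments a CheckIndex-like caller could offer to drop via -exorcise. */
  static List<String> findSegmentsToExorcise(
      List<String> referencedSegments, List<String> filesPresent) {
    List<String> toExorcise = new ArrayList<>();
    for (String segment : referencedSegments) {
      try {
        readSegmentInfo(segment, filesPresent);
      } catch (MissingSegmentInfoException e) {
        // The exception itself says which segment is affected, so the caller
        // does not need to parse the message or the missing file name.
        toExorcise.add(e.getAffectedSegment());
      }
    }
    return toExorcise;
  }
}
```

The point of carrying the segment name on the exception is that the caller can keep checking the remaining segments and act on the affected one, instead of aborting at the first missing `.si` file.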