gokaai commented on code in PR #12530: URL: https://github.com/apache/lucene/pull/12530#discussion_r1327351474
##########
lucene/core/src/java/org/apache/lucene/index/CheckIndex.java:
##########
@@ -610,6 +610,31 @@ public Status checkIndex(List<String> onlySegments, ExecutorService executorServ
       return result;
     }
 
+    // https://github.com/apache/lucene/issues/7820: also attempt to open any older commit points (segments_N), which will catch certain
+    // corruption like missing _N.si files for segments not also referenced by the newest commit point (which was already loaded,

Review Comment:
   > corruption like missing _N.si files for segments not also referenced by the newest commit point (which was already loaded, successfully, above

   I think loading the latest commit point, i.e., [`readCommit(lastSegmentsFile)`](https://github.com/apache/lucene/blob/17dd8179e52d88ffc9b5cf3bf69833b54f552022/lucene/core/src/java/org/apache/lucene/index/CheckIndex.java#L601), won't succeed as assumed here: it calls [`parseSegmentInfos`](https://github.com/apache/lucene/blob/17dd8179e52d88ffc9b5cf3bf69833b54f552022/lucene/core/src/java/org/apache/lucene/index/SegmentInfos.java#L372C1-L372C1), which will [iterate through all segmentInfos until the latest commit](https://github.com/apache/lucene/blob/17dd8179e52d88ffc9b5cf3bf69833b54f552022/lucene/core/src/java/org/apache/lucene/index/SegmentInfos.java#L395), and it will fail when it reaches the missing `.si` file.
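
To make the failure mode described in the review comment concrete, here is a toy sketch, not Lucene code. `CommitReadSketch` and its `readCommit` helper are hypothetical names invented for illustration. It assumes only that reading a commit eagerly opens the `.si` file of every segment the commit lists, so the first missing file aborts the whole read.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Toy model only: this is not Lucene's SegmentInfos, and every name here is hypothetical. */
public class CommitReadSketch {

  /** Reads each listed segment's .si file in order; the first missing file aborts the read. */
  static List<String> readCommit(Path dir, List<String> segmentNames) throws IOException {
    List<String> loaded = new ArrayList<>();
    for (String name : segmentNames) {
      Path si = dir.resolve(name + ".si");
      // Files.readAllBytes throws NoSuchFileException when the file is absent.
      Files.readAllBytes(si);
      loaded.add(name);
    }
    return loaded;
  }

  public static void main(String[] args) throws IOException {
    Path dir = Files.createTempDirectory("commit-sketch");
    Files.write(dir.resolve("_0.si"), new byte[] {1});
    // "_1.si" is deliberately never written, to simulate the corruption.
    try {
      readCommit(dir, List.of("_0", "_1"));
      System.out.println("commit loaded");
    } catch (NoSuchFileException e) {
      System.out.println("commit failed to load, missing: " + e.getFile());
    }
  }
}
```

Under that assumption, the read fails only when the commit being read lists the segment whose `.si` file is missing, so the open question for the PR is whether the newest commit still references that segment.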