mikemccand commented on code in PR #12530:
URL: https://github.com/apache/lucene/pull/12530#discussion_r1388366133


##########
lucene/core/src/java/org/apache/lucene/index/CheckIndex.java:
##########
@@ -610,6 +610,31 @@ public Status checkIndex(List<String> onlySegments, ExecutorService executorServ
       return result;
     }
 
+    // https://github.com/apache/lucene/issues/7820: also attempt to open any older commit points (segments_N), which will catch certain
+    // corruption like missing _N.si files for segments not also referenced by the newest commit point (which was already loaded,

Review Comment:
   > @mikemccand Just to clarify this comment - I was using @buzztaiki 's 
[original test 
case](https://github.com/apache/lucene/issues/7009#issuecomment-1223544484) 
with slight modifications to test this:
   
   Hmm I'm confused -- doesn't `parseSegmentInfos` read a single `segments_N` 
file?  It goes through that segments file and reads each separate 
`SegmentInfo`, but not the other `segments_(N-1)` files in the index?
   
   I thought the issue here was that `segments_N` (and all the separate 
segments / `.si` files it references) is intact, but `segments_(N-1)` is 
broken because it references a segment whose `.si` file is missing?
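
   To make that scenario concrete, here is a toy model in plain Java. It does not use Lucene's API: `CommitPointSketch`, `findBrokenCommits`, and the map-based representation of commit points are all hypothetical, assumed only for illustration. Each commit file maps to the segment names it references, and a commit counts as broken when any referenced segment has no `.si` file on disk.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Toy model only: not Lucene code. Commit points are modeled as
// "commit file name -> referenced segment names".
class CommitPointSketch {

  // For each commit, collect the referenced segments whose .si file is
  // absent from filesOnDisk. Commits with no missing files are omitted.
  static Map<String, List<String>> findBrokenCommits(
      Map<String, List<String>> commits, Set<String> filesOnDisk) {
    Map<String, List<String>> broken = new TreeMap<>();
    for (Map.Entry<String, List<String>> commit : commits.entrySet()) {
      List<String> missing = new ArrayList<>();
      for (String segment : commit.getValue()) {
        if (!filesOnDisk.contains(segment + ".si")) {
          missing.add(segment);
        }
      }
      if (!missing.isEmpty()) {
        broken.put(commit.getKey(), missing);
      }
    }
    return broken;
  }
}
```

   With `segments_2` referencing `_1` and `_2`, `segments_1` referencing `_0` and `_1`, and only `_1.si` and `_2.si` on disk, the newest commit loads cleanly while the older one is reported as broken because of `_0`.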
   
   > I would like to try to make missing .si files behave the same way as 
having missing .cfs do currently and make it possible to use -exorcise for this 
case
   
   Maybe we could change the exception thrown when a `.si` cannot be found to 
a subclass of `CorruptIndexException` and add a member, e.g. 
`set/getAffectedSegment`, that would tell us which segment the `.si` 
belonged to?  And `CheckIndex` could catch that and do its `exorcise` thing?
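
   A minimal sketch of that idea, kept self-contained so it runs without Lucene on the classpath. The names `MissingSegmentInfoException`, `ExorciseSketch`, `readSegmentInfo`, and `findSegmentsToDrop` are hypothetical, and the exception extends `IOException` directly here; in Lucene it would presumably extend `CorruptIndexException` (itself an `IOException`) instead. This only illustrates the shape of the proposal, not how `CheckIndex` actually works.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical exception type. Assumption: in Lucene this would subclass
// CorruptIndexException rather than IOException.
class MissingSegmentInfoException extends IOException {
  private final String affectedSegment;

  MissingSegmentInfoException(String message, String affectedSegment) {
    super(message + " (segment=" + affectedSegment + ")");
    this.affectedSegment = affectedSegment;
  }

  // Tells the caller which segment the missing .si file belonged to.
  String getAffectedSegment() {
    return affectedSegment;
  }
}

class ExorciseSketch {

  // Stand-in for reading one segment's .si file; throws if it is absent.
  static void readSegmentInfo(String segment, Set<String> filesOnDisk)
      throws MissingSegmentInfoException {
    String siFile = segment + ".si";
    if (!filesOnDisk.contains(siFile)) {
      throw new MissingSegmentInfoException("missing " + siFile, segment);
    }
  }

  // Catches the typed exception and records the segment so that an
  // exorcise-style step could later drop it, instead of aborting the check.
  static List<String> findSegmentsToDrop(
      List<String> referencedSegments, Set<String> filesOnDisk) {
    List<String> toDrop = new ArrayList<>();
    for (String segment : referencedSegments) {
      try {
        readSegmentInfo(segment, filesOnDisk);
      } catch (MissingSegmentInfoException e) {
        toDrop.add(e.getAffectedSegment());
      }
    }
    return toDrop;
  }
}
```

   The point of carrying the segment name on the exception is that the caller no longer has to parse it out of a message or file name before deciding what to remove.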



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

