mikemccand commented on code in PR #12872:
URL: https://github.com/apache/lucene/pull/12872#discussion_r1418947126


##########
lucene/core/src/java/org/apache/lucene/index/CheckIndex.java:
##########
@@ -873,13 +876,26 @@ public int compare(String a, String b) {
     if (0 == result.numBadSegments) {
       result.clean = true;
     } else {
-      msg(
-          infoStream,
-          "WARNING: "
-              + result.numBadSegments
-              + " broken segments (containing "
-              + result.totLoseDocCount
-              + " documents) detected");
+      if (result.numBadSegmentsWithoutSegmentInfo > 0) {
+        msg(
+            infoStream,
+            "WARNING: "
+                + result.numBadSegments
+                + " broken segments (containing "

Review Comment:
   Maybe say `containing at least`?



##########
lucene/core/src/java/org/apache/lucene/index/CheckIndex.java:
##########
@@ -664,7 +667,7 @@ public int compare(String a, String b) {
     // failure (fastFail == false).  so if we get here, we should // always 
have a valid lastCommit:
     assert lastCommit != null;
 
-    if (lastCommit == null) {
+    if (lastCommit == null) { // TODO: Remove/move unreachable if statement?

Review Comment:
   Hmm this might still happen?  E.g. some transient `IOException` or an Access 
Denied or so?



##########
lucene/core/src/java/org/apache/lucene/index/CheckIndex.java:
##########
@@ -957,6 +974,9 @@ private Status.SegmentInfoStatus testSegment(
     SegmentReader reader = null;
 
     try {
+      if (info.info.maxDoc() == 0) {
+        throw new CheckIndexException(" this segment has missing or corrupt 
segment info");

Review Comment:
   Remove the extra space before `this`?
   
   Also, can we include specifics about which segment we were unable to read 
the `.si` for in the exception message?



##########
lucene/core/src/java/org/apache/lucene/index/SegmentInfos.java:
##########
@@ -389,13 +389,39 @@ private static void parseSegmentInfos(
     }
 
     long totalDocs = 0;
+    SegmentInfo info;
     for (int seg = 0; seg < numSegments; seg++) {
       String segName = input.readString();
       byte[] segmentID = new byte[StringHelper.ID_LENGTH];
       input.readBytes(segmentID, 0, segmentID.length);
       Codec codec = readCodec(input);
-      SegmentInfo info =
-          codec.segmentInfoFormat().read(directory, segName, segmentID, 
IOContext.READ);
+      try {
+        info = codec.segmentInfoFormat().read(directory, segName, segmentID, 
IOContext.READ);
+      } catch (FileNotFoundException | NoSuchFileException e) {

Review Comment:
   Hmm can we upgrade this to catch any "normal" (not crazy things like OOME) 
`Throwable`?  E.g. if the `.si` file had some bits scrambled or what not we 
will get a more exotic exception or maybe an exception because the checksum is 
mismatched.
   
   Can we somehow return the root cause exception here and include it in 
`CheckIndexException` that we raise above?



##########
lucene/core/src/java/org/apache/lucene/index/CheckIndex.java:
##########
@@ -957,6 +974,9 @@ private Status.SegmentInfoStatus testSegment(
     SegmentReader reader = null;
 
     try {
+      if (info.info.maxDoc() == 0) {

Review Comment:
   Ahh I see, you now separately handle the `maxDoc == 0` case, here.  But, 
what if an index has an illegal 0 doc segment, and an intact `.si`?  We will 
now (incorrectly) state that it is missing its `SegmentInfo`?  Maybe we could 
add a separate boolean somewhere to track `segmentInfoMissing` or so?



##########
lucene/core/src/java/org/apache/lucene/index/SegmentInfos.java:
##########
@@ -389,13 +389,39 @@ private static void parseSegmentInfos(
     }
 
     long totalDocs = 0;
+    SegmentInfo info;
     for (int seg = 0; seg < numSegments; seg++) {
       String segName = input.readString();
       byte[] segmentID = new byte[StringHelper.ID_LENGTH];
       input.readBytes(segmentID, 0, segmentID.length);
       Codec codec = readCodec(input);
-      SegmentInfo info =
-          codec.segmentInfoFormat().read(directory, segName, segmentID, 
IOContext.READ);
+      try {
+        info = codec.segmentInfoFormat().read(directory, segName, segmentID, 
IOContext.READ);
+      } catch (FileNotFoundException | NoSuchFileException e) {
+        // https://github.com/apache/lucene/issues/7820 - If we throw a 
CorruptIndex exception here,
+        // a missing segment info will cause parseSegmentInfos() to 
short-circuit and stop reading
+        // the rest of the segments. So, when encountering a segment with a 
missing
+        // segment info (.si) file, we initialize an empty segment info 
(containing 0 docs)
+        // to keep track of the segment name, so that we can continue reading 
other segments, and
+        // the segment with the missing .si file can be later validated or 
exorcised by CheckIndex()
+        info =

Review Comment:
   Hmm I don't like that this will suppress corruption when `SegmentInfos` is 
being loaded NOT from `CheckIndex`?  This is the wrong layer to have such 
leniency?  This method should throw the exception back up to the caller and 
caller should handle properly?
   
   Could we instead 1) create a subclass of `CorruptIndexException`, maybe 
`CorruptSegmentInfoException` or so, 2) require that caller pass in the segment 
name, and root cause exception, when creating it, and 3) catch any `Throwable` 
here and re-throw that subclass?
   
   Then we can fix `CheckIndex` to specifically catch that, pull the segment 
name out, and do its thing.



##########
lucene/core/src/java/org/apache/lucene/index/CheckIndex.java:
##########
@@ -948,7 +965,7 @@ private Status.SegmentInfoStatus testSegment(
     segInfoStat.maxDoc = info.info.maxDoc();
 
     final Version version = info.info.getVersion();
-    if (info.info.maxDoc() <= 0) {
+    if (info.info.maxDoc() < 0) {

Review Comment:
   Hmm why did you have to remove the `0` case?  Lucene considers this 
corruption (IW should never flush or merge to a 0 doc segment).



##########
lucene/core/src/java/org/apache/lucene/index/CheckIndex.java:
##########
@@ -4270,11 +4290,21 @@ public int doCheck(Options opts) throws IOException, 
InterruptedException {
                 + result.totLoseDocCount
                 + " documents would be lost, if -exorcise were specified\n");
       } else {
-        opts.out.println("WARNING: " + result.totLoseDocCount + " documents 
will be lost\n");
-        opts.out.println(
-            "NOTE: will write new segments file in 5 seconds; this will remove 
"
-                + result.totLoseDocCount
-                + " docs from the index. YOU WILL LOSE DATA. THIS IS YOUR LAST 
CHANCE TO CTRL+C!");
+        if (result.numBadSegmentsWithoutSegmentInfo > 0) {
+          opts.out.println(
+              "WARNING: " + result.totLoseDocCount + " or more documents will 
be lost\n");

Review Comment:
   Can we change the wording to `at least ...`?  I think that's stronger / more 
alarming on casual read than ` or more`.
   
   Or maybe `more than XX documents will be lost`, since we (almost certainly) 
know the lost segment(s) have `maxDoc > 0`?
   
   Whichever wording we use, let's be consistent in the thrown exception 
message above too.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

Reply via email to