gokaai commented on code in PR #12530:
URL: https://github.com/apache/lucene/pull/12530#discussion_r1388271749


##########
lucene/core/src/java/org/apache/lucene/index/CheckIndex.java:
##########
@@ -610,6 +610,31 @@ public Status checkIndex(List<String> onlySegments, ExecutorService executorServ
       return result;
     }
 
+    // https://github.com/apache/lucene/issues/7820: also attempt to open any older commit points (segments_N), which will catch certain
+    // corruption like missing _N.si files for segments not also referenced by the newest commit point (which was already loaded,

Review Comment:
   @mikemccand Just to clarify this comment: I used @buzztaiki's [original test case](https://github.com/apache/lucene/issues/7009#issuecomment-1223544484) with slight modifications to test this:
   
   ```java
   package org.apache.lucene.index;
   
   import java.io.IOException;
   import java.nio.file.DirectoryStream;
   import java.nio.file.Files;
   import java.nio.file.Path;
   
   import org.apache.lucene.document.Document;
   import org.apache.lucene.document.Field.Store;
   import org.apache.lucene.document.IntField;
   import org.apache.lucene.store.FSDirectory;
   
   public class TestBrokenIndex {
       public static void main(String[] args) throws Exception {
           Path path = Files.createTempDirectory("lucene_");
           // Clean up the temporary index when the JVM exits.
           Runtime.getRuntime().addShutdownHook(new Thread(new DirectoryCleaner(path)));
   
           // Disable merging so that each commit leaves its own segment on disk.
           IndexWriterConfig iwc = new IndexWriterConfig()
                   .setMergePolicy(NoMergePolicy.INSTANCE);
           try (FSDirectory dir = FSDirectory.open(path);
                   IndexWriter iw = new IndexWriter(dir, iwc)) {
               for (int i = 0; i < 10; i++) {
                   Document doc = new Document();
                   doc.add(new IntField("id", i, Store.NO));
                   iw.addDocument(doc);
                   iw.commit();
               }
           }
   
           // Corrupt the index by deleting the first .si file found.
           try (DirectoryStream<Path> files = Files.newDirectoryStream(path)) {
               for (Path f : files) {
                   String fname = f.getFileName().toString();
                   if (fname.endsWith(".si")) {
                       Files.delete(f);
                       break;
                   }
               }
           }
           CheckIndex.main(new String[]{path.toString()});
       }
   
       private static class DirectoryCleaner implements Runnable {
           private final Path path;
   
           DirectoryCleaner(Path path) {
               this.path = path;
           }
   
           @Override
           public void run() {
               // Delete every file in the index directory, then the directory itself.
               try (DirectoryStream<Path> files = Files.newDirectoryStream(path)) {
                   for (Path f : files) {
                       Files.delete(f);
                   }
                   Files.delete(path);
               } catch (IOException e) {
                   throw new RuntimeException(e);
               }
           }
       }
   }
   ```
   
   When I run this test, the error does not clearly say that a corrupt segment (here, one with a missing `.si` file) is what causes `CheckIndex` to fail:
   
   ```text
   Opening index @ /var/folders/xh/fjjhkc7n23bg9lw00x7ft4_np1t662/T/lucene_15492910455934766862
   
   Checking index with threadCount: 1
   ERROR: could not read any segments file in directory
   org.apache.lucene.index.CorruptIndexException: Unexpected file read error while reading index. (resource=BufferedChecksumIndexInput(MemorySegmentIndexInput(path="/private/var/folders/xh/fjjhkc7n23bg9lw00x7ft4_np1t662/T/lucene_15492910455934766862/segments_a")))
        at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:297)
        at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:613)
        at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:559)
        at org.apache.lucene.index.CheckIndex.doCheck(CheckIndex.java:4181)
        at org.apache.lucene.index.CheckIndex.doMain(CheckIndex.java:4054)
        at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:3986)
        at org.apache.lucene.index.TestBrokenIndex.main(TestBrokenIndex.java:36)
   Caused by: java.nio.file.NoSuchFileException: /private/var/folders/xh/fjjhkc7n23bg9lw00x7ft4_np1t662/T/lucene_15492910455934766862/_4.si
   could not read any segments file in directory
   
        at java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:92)
        at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:106)
        at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)
        at java.base/sun.nio.fs.UnixFileSystemProvider.newFileChannel(UnixFileSystemProvider.java:224)
        at java.base/java.nio.channels.FileChannel.open(FileChannel.java:309)
        at java.base/java.nio.channels.FileChannel.open(FileChannel.java:369)
        at org.apache.lucene.store.MemorySegmentIndexInputProvider.openInput(MemorySegmentIndexInputProvider.java:51)
        at org.apache.lucene.store.MMapDirectory.openInput(MMapDirectory.java:252)
        at org.apache.lucene.store.Directory.openChecksumInput(Directory.java:156)
        at org.apache.lucene.codecs.lucene99.Lucene99SegmentInfoFormat.read(Lucene99SegmentInfoFormat.java:94)
        at org.apache.lucene.index.SegmentInfos.parseSegmentInfos(SegmentInfos.java:398)
        at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:359)
        at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:295)
   Caused by: java.nio.file.NoSuchFileException: /private/var/folders/xh/fjjhkc7n23bg9lw00x7ft4_np1t662/T/lucene_15492910455934766862/_4.si
   
        ... 6 more
        Suppressed: org.apache.lucene.index.CorruptIndexException: checksum passed (b2227536). possibly transient resource issue, or a Lucene or JVM bug (resource=BufferedChecksumIndexInput(MemorySegmentIndexInput(path="/private/var/folders/xh/fjjhkc7n23bg9lw00x7ft4_np1t662/T/lucene_15492910455934766862/segments_a")))
                at org.apache.lucene.codecs.CodecUtil.checkFooter(CodecUtil.java:501)
                at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:366)
                ... 7 more
   
   Execution failed for task ':lucene:core:TestBrokenIndex.main()'.
   ```
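
In the output above, the name of the missing file only shows up in the `Caused by:` part of the trace. As a minimal sketch of how a caller could surface it, using only the JDK: the class `MissingFileFinder`, the method `findMissingFile`, and the path in `main` are hypothetical names for illustration and are not part of Lucene or of this PR.

```java
import java.nio.file.NoSuchFileException;
import java.util.Optional;

// Hypothetical helper, not part of Lucene: digs the missing file name
// out of an exception's cause chain.
final class MissingFileFinder {

    /**
     * Walks the cause chain of {@code t} and returns the file reported by the
     * first NoSuchFileException found, if any. Suppressed exceptions are not
     * inspected.
     */
    public static Optional<String> findMissingFile(Throwable t) {
        for (Throwable cur = t; cur != null; cur = cur.getCause()) {
            if (cur instanceof NoSuchFileException) {
                return Optional.ofNullable(((NoSuchFileException) cur).getFile());
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Stand-in for the wrapped exception seen in the trace; the path is made up.
        Exception wrapped = new RuntimeException(
                "Unexpected file read error while reading index.",
                new NoSuchFileException("/tmp/index/_4.si"));
        System.out.println(findMissingFile(wrapped).orElse("<none>"));
    }
}
```

This only improves the message; it does not change where the failure happens.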
   
   The exception is thrown while reading the latest commit point (`segments_a`), well before the checks on the other segments are executed. By contrast, if I delete a data file (`.cfs`/`.cfe`) instead of a `.si` file, the error is much clearer:
   
   (After replacing `if (fname.endsWith(".si")) { Files.delete(f); ...` with `if (fname.endsWith(".cfs")) { Files.delete(f); ...`)
   
   ```text
   Opening index @ /var/folders/xh/fjjhkc7n23bg9lw00x7ft4_np1t662/T/lucene_17693986113485513388
   
   Checking index with threadCount: 1
   0.00% total deletions; 10 documents; 0 deletions
   Segments file=segments_a numSegments=10 version=10.0.0 id=5pu3cylg0srqobtopelrfelqu
   1 of 10: name=_0 maxDoc=1
       version=10.0.0
       id=5pu3cylg0srqobtopelrfelpr
       codec=Lucene99
       compound=true
       numFiles=3
       size (MB)=0.002
       diagnostics = {timestamp=1699544238618, java.runtime.version=21+35-LTS, os=Mac OS X, java.vendor=Amazon.com Inc., os.arch=aarch64, os.version=13.5.2, lucene.version=10.0.0, source=flush}
       no deletions
       test: open reader.........OK [took 0.013 sec]
       test: check integrity.....OK [took 0.000 sec]
       test: check live docs.....OK [took 0.000 sec]
       test: field infos.........OK [1 fields] [took 0.000 sec]
       test: field norms.........OK [0 fields] [took 0.000 sec]
       test: terms, freq, prox...    test: stored fields.......OK [0 total field count; avg 0.0 fields per doc] [took 0.001 sec]
       test: term vectors........OK [0 total term vector count; avg 0.0 term/freq vector fields per doc] [took 0.000 sec]
       test: docvalues...........OK [1 docvalues fields; 0 BINARY; 0 NUMERIC; 0 SORTED; 1 SORTED_NUMERIC; 0 SORTED_SET] [took 0.001 sec]
       test: points..............OK [1 fields, 1 points] [took 0.002 sec]
       test: vectors.............OK [0 fields, 0 vectors] [took 0.000 sec]
   
   .......
   
   10 of 10: name=_9 maxDoc=1
       version=10.0.0
       id=5pu3cylg0srqobtopelrfelqr
       codec=Lucene99
       compound=true
       numFiles=3
   FAILED
       WARNING: exorciseIndex() would remove reference to this segment; full exception:
   java.nio.file.NoSuchFileException: /private/var/folders/xh/fjjhkc7n23bg9lw00x7ft4_np1t662/T/lucene_17693986113485513388/_9.cfs
        at java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:92)
        at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:106)
        at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)
        at java.base/sun.nio.fs.UnixFileAttributeViews$Basic.readAttributes(UnixFileAttributeViews.java:55)
       WARNING: exorciseIndex() would remove reference to this segment; full exception:
   
        at java.base/sun.nio.fs.UnixFileSystemProvider.readAttributes(UnixFileSystemProvider.java:171)
        at java.base/java.nio.file.Files.readAttributes(Files.java:1853)
        at java.base/java.nio.file.Files.size(Files.java:2462)
        at org.apache.lucene.store.FSDirectory.fileLength(FSDirectory.java:208)
        at org.apache.lucene.index.SegmentCommitInfo.sizeInBytes(SegmentCommitInfo.java:230)
        at org.apache.lucene.index.CheckIndex.testSegment(CheckIndex.java:930)
        at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:739)
        at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:559)
        at org.apache.lucene.index.CheckIndex.doCheck(CheckIndex.java:4181)
        at org.apache.lucene.index.CheckIndex.doMain(CheckIndex.java:4054)
        at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:3986)
        at org.apache.lucene.index.TestBrokenIndex.main(TestBrokenIndex.java:36)
   
   WARNING: 1 broken segments (containing 1 documents) detected
   Took 0.040 sec total.
   1 broken segments (containing 1 documents) detected
   
   WARNING: would write new segments file, and 1 documents would be lost, if -exorcise were specified
   
   
   
   would write new segments file, and 1 documents would be lost, if -exorcise were specified
   
   Execution failed for task ':lucene:core:TestBrokenIndex.main()'.
   ```
   
   I would like to make a missing `.si` file behave the same way a missing `.cfs` file does today, so that `-exorcise` can be used for this case as well.
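
To illustrate the kind of per-segment reporting I have in mind, here is a rough, JDK-only sketch that does not read `segments_N` at all. It groups the files in an index directory by segment name and reports segments that have files but no `.si`. The class `MissingSiScanner` and its methods are hypothetical; this is not what the PR implements. It assumes per-segment files are named `_<segment>.<ext>` or `_<segment>_<suffix>.<ext>`, and it cannot see a segment whose files are all gone.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical diagnostic, not part of Lucene: finds segments that still have
// files on disk but whose .si file is missing.
final class MissingSiScanner {

    /**
     * Returns the segment prefix of a per-segment file, e.g. "_4" for "_4.cfs"
     * or "_4_Lucene99_0.doc". Returns null for files that do not start with an
     * underscore, such as "segments_a" or "write.lock".
     */
    public static String segmentName(String fileName) {
        if (!fileName.startsWith("_")) {
            return null;
        }
        int end = fileName.length();
        int dot = fileName.indexOf('.');
        int underscore = fileName.indexOf('_', 1);
        if (dot != -1) {
            end = Math.min(end, dot);
        }
        if (underscore != -1) {
            end = Math.min(end, underscore);
        }
        return end > 1 ? fileName.substring(0, end) : null;
    }

    /** Returns the sorted names of segments that have files but no .si file. */
    public static Set<String> segmentsMissingSi(Path indexDir) throws IOException {
        Set<String> seen = new TreeSet<>();
        Set<String> withSi = new TreeSet<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(indexDir)) {
            for (Path f : files) {
                String fileName = f.getFileName().toString();
                String segment = segmentName(fileName);
                if (segment == null) {
                    continue;
                }
                seen.add(segment);
                if (fileName.equals(segment + ".si")) {
                    withSi.add(segment);
                }
            }
        }
        seen.removeAll(withSi);
        return seen;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java MissingSiScanner.java /path/to/index
        System.out.println(segmentsMissingSi(Path.of(args[0])));
    }
}
```

Run against the directory produced by the test case above, this would be expected to report the single segment whose `.si` file was deleted, since its `.cfs` and `.cfe` files are still present.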



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

