[ https://issues.apache.org/jira/browse/LUCENE-9867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17308339#comment-17308339 ]
Robert Muir commented on LUCENE-9867:
-------------------------------------

[~sqshq] you really shouldn't see FileNotFoundException from Lucene at all, as it uses NIO.2 APIs exclusively: if there were a problem, I would expect NoSuchFileException. If you happen to have a stacktrace for that one, can you attach it here as well?

I will simulate the environment with CentOS 7 under KVM and try to trigger the problem, but it may take a few weeks (I am leaving on travel shortly for a bit). So please don't get discouraged.

I am a little bit confused about how this happens with the first indexing pattern, as I see {{segments_578fu}}, which is a large commit number that doesn't line up with what you have described (it seems to indicate a large number of commits?). Is there anything else going on (e.g. other processes) that we should know about?

> CorruptIndexException after failed segment merge caused by No space left on device
> ----------------------------------------------------------------------------------
>
>                 Key: LUCENE-9867
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9867
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/store
>    Affects Versions: 8.5
>            Reporter: Alexander L
>            Priority: Major
>
> Failed segment merge caused by "No space left on device" can't be recovered and Lucene fails with CorruptIndexException after restart. The expectation is that Lucene will be able to restart automatically without manual intervention.
>
> We have 2 indexing patterns:
> * Create and commit an empty index, then start long initial indexing process (might take hours), perform a second commit in the end
> * Using existing index, add no more than 4k documents and commit after that
>
> Seems like the first pattern might cause more problems, but we definitely witnessed a similar situation for the second pattern, although it was a bit different - caused by {{OutOfMemoryError: Java Heap Space}}, with missing {{_q.cfe}} file which produced only {{FileNotFoundException}}, not {{CorruptIndexException}}. Please let me know if we need a separate ticket for that.
>
> Lucene version: 8.5.0
> Java version: OpenJDK 11
> OS: CentOS Linux 7
> Kernel: Linux 3.10.0-1160.11.1.el7.x86_64
> Virtualization: kvm
> Filesystem: xfs
>
> Failed merge stacktrace:
> {code:java}
> 2021-02-02T08:51:51.679+0000 org.apache.lucene.index.MergePolicy$MergeException: java.io.IOException: No space left on device
>     at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:704)
>     at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:684)
> Caused by: java.io.IOException: No space left on device
>     at java.base/sun.nio.ch.FileDispatcherImpl.write0(Native Method)
>     at java.base/sun.nio.ch.FileDispatcherImpl.write(FileDispatcherImpl.java:62)
>     at java.base/sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:113)
>     at java.base/sun.nio.ch.IOUtil.write(IOUtil.java:79)
>     at java.base/sun.nio.ch.FileChannelImpl.write(FileChannelImpl.java:280)
>     at java.base/java.nio.channels.Channels.writeFullyImpl(Channels.java:74)
>     at java.base/java.nio.channels.Channels.writeFully(Channels.java:97)
>     at java.base/java.nio.channels.Channels$1.write(Channels.java:172)
>     at org.apache.lucene.store.FSDirectory$FSIndexOutput$1.write(FSDirectory.java:416)
>     at java.base/java.util.zip.CheckedOutputStream.write(CheckedOutputStream.java:74)
>     at java.base/java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:81)
>     at java.base/java.io.BufferedOutputStream.write(BufferedOutputStream.java:127)
>     at org.apache.lucene.store.OutputStreamIndexOutput.writeBytes(OutputStreamIndexOutput.java:53)
>     at org.apache.lucene.store.RateLimitedIndexOutput.writeBytes(RateLimitedIndexOutput.java:73)
>     at org.apache.lucene.util.compress.LZ4.encodeLiterals(LZ4.java:159)
>     at org.apache.lucene.util.compress.LZ4.encodeSequence(LZ4.java:172)
>     at org.apache.lucene.util.compress.LZ4.compress(LZ4.java:441)
>     at org.apache.lucene.codecs.compressing.CompressionMode$LZ4FastCompressor.compress(CompressionMode.java:165)
>     at org.apache.lucene.codecs.compressing.CompressingStoredFieldsWriter.flush(CompressingStoredFieldsWriter.java:229)
>     at org.apache.lucene.codecs.compressing.CompressingStoredFieldsWriter.finishDocument(CompressingStoredFieldsWriter.java:159)
>     at org.apache.lucene.codecs.compressing.CompressingStoredFieldsWriter.merge(CompressingStoredFieldsWriter.java:636)
>     at org.apache.lucene.index.SegmentMerger.mergeFields(SegmentMerger.java:229)
>     at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:106)
>     at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4463)
>     at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:4057)
>     at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:625)
>     at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:662)
> {code}
> Followed by failed startup:
> {code:java}
> 2021-02-02T08:52:07.926+0000 org.apache.lucene.index.CorruptIndexException: Unexpected file read error while reading index. (resource=BufferedChecksumIndexInput(NIOFSIndexInput(path="/data/5f91aa0b07ce4d5e7beffaa2/segments_578fu")))
>     at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:291)
>     at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:846)
> Caused by: java.nio.file.NoSuchFileException: /data/5f91aa0b07ce4d5e7beffaa2/_6lfem.si
>     at java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:92)
>     at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)
>     at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:116)
>     at java.base/sun.nio.fs.UnixFileSystemProvider.newFileChannel(UnixFileSystemProvider.java:182)
>     at java.base/java.nio.channels.FileChannel.open(FileChannel.java:292)
>     at java.base/java.nio.channels.FileChannel.open(FileChannel.java:345)
>     at org.apache.lucene.store.NIOFSDirectory.openInput(NIOFSDirectory.java:81)
>     at org.apache.lucene.store.Directory.openChecksumInput(Directory.java:157)
>     at org.apache.lucene.codecs.lucene70.Lucene70SegmentInfoFormat.read(Lucene70SegmentInfoFormat.java:91)
>     at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:353)
>     at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:289)
>     ... 33 common frames omitted
> {code}
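
For readers following the exception-type point in the comment: the two JDK file APIs report a missing file differently, which is why a FileNotFoundException hints at code outside Lucene's NIO.2 path. The sketch below is not Lucene code. The class name, helper names, and the temporary directory are hypothetical, and it only assumes standard JDK behavior: java.io.FileInputStream throws java.io.FileNotFoundException, while java.nio.file.Files.newInputStream throws java.nio.file.NoSuchFileException.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class MissingFileExceptions {

    // Tries to open the path with the legacy java.io API and returns the
    // class name of the exception thrown, or "opened" on success.
    public static String openWithJavaIo(Path p) {
        try (InputStream in = new FileInputStream(p.toFile())) {
            return "opened";
        } catch (IOException e) {
            return e.getClass().getName();
        }
    }

    // Same check through the NIO.2 API (java.nio.file.Files).
    public static String openWithNio2(Path p) {
        try (InputStream in = Files.newInputStream(p)) {
            return "opened";
        } catch (IOException e) {
            return e.getClass().getName();
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical scratch directory; the file inside is never created.
        Path dir = Files.createTempDirectory("missing-file-demo");
        Path missing = dir.resolve("_q.cfe");

        System.out.println("java.io: " + openWithJavaIo(missing));
        System.out.println("NIO.2:   " + openWithNio2(missing));

        Files.delete(dir);
    }
}
```

If an application log shows FileNotFoundException for an index file, a reasonable first step is to look at the stack trace for java.io frames in application or wrapper code rather than in org.apache.lucene.store.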
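
On the "large commit number" remark: Lucene writes the commit generation into the segments_N file name in base 36 (Character.MAX_RADIX), so the suffix can be decoded with plain JDK calls. The following is a minimal sketch under that assumption. The class and method names are hypothetical, and it does not use any Lucene API.

```java
public class CommitGeneration {

    // Decodes the base-36 generation from a commit file name such as
    // "segments_578fu". Rejects names that do not carry the expected prefix.
    public static long parseGeneration(String fileName) {
        String prefix = "segments_";
        if (!fileName.startsWith(prefix)) {
            throw new IllegalArgumentException("not a commit file name: " + fileName);
        }
        return Long.parseLong(fileName.substring(prefix.length()), Character.MAX_RADIX);
    }

    public static void main(String[] args) {
        // File name taken from the failed-startup stacktrace in the issue.
        System.out.println(parseGeneration("segments_578fu")); // prints 8735610
    }
}
```

A generation of roughly 8.7 million is hard to reconcile with an index that was committed only twice, which is the mismatch the comment asks about.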