[ 
https://issues.apache.org/jira/browse/LUCENE-9428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17157948#comment-17157948
 ] 

Adrien Grand commented on LUCENE-9428:
--------------------------------------

[~Lai_Ding] This exception occurs when the content of a file differs from the 
data that was originally written. Unfortunately, verifying this is very 
expensive, so Lucene does not verify checksums continuously; it only does so 
at merge time, since the files need to be read in full anyway.
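
To illustrate why this check is expensive, here is a minimal sketch of 
checksum-footer verification using only the JDK. It is not Lucene's code: the 
class name ChecksumFooterSketch, the file layout (payload followed by an 
8-byte CRC32 value) and the choice of java.util.zip.CRC32 are assumptions made 
for illustration, and Lucene's real footer layout differs. The point it shows 
is that the checksum can only be recomputed by reading every byte of the file.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.CRC32;

public class ChecksumFooterSketch {
    // Hypothetical layout: payload bytes, then an 8-byte big-endian CRC32 value.
    static final int FOOTER_LENGTH = Long.BYTES;

    // Writes the payload followed by the checksum of the payload.
    public static void writeWithFooter(Path file, byte[] payload) throws IOException {
        CRC32 crc = new CRC32();
        crc.update(payload, 0, payload.length);
        ByteBuffer out = ByteBuffer.allocate(payload.length + FOOTER_LENGTH);
        out.put(payload).putLong(crc.getValue());
        Files.write(file, out.array());
    }

    // Re-reads the whole file, recomputes the checksum over the payload and
    // compares it with the value stored in the footer.
    public static boolean verify(Path file) throws IOException {
        byte[] all = Files.readAllBytes(file);
        if (all.length < FOOTER_LENGTH) {
            return false; // too short to even contain a footer
        }
        int payloadLength = all.length - FOOTER_LENGTH;
        CRC32 crc = new CRC32();
        crc.update(all, 0, payloadLength);
        long expected = ByteBuffer.wrap(all, payloadLength, FOOTER_LENGTH).getLong();
        return expected == crc.getValue();
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("segment", ".bin");
        try {
            writeWithFooter(file, "some postings data".getBytes(StandardCharsets.UTF_8));
            System.out.println("intact file verifies: " + verify(file));

            // Simulate silent corruption by flipping one bit in the payload.
            byte[] bytes = Files.readAllBytes(file);
            bytes[3] ^= 0x01;
            Files.write(file, bytes);
            System.out.println("corrupted file verifies: " + verify(file));
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

In this sketch a single flipped bit is enough for verification to fail, but 
nothing notices the flip until the whole file is read again, which mirrors why 
the error in this issue only surfaced at merge time.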

> merge index failed with checksum failed (hardware problem?)
> -----------------------------------------------------------
>
>                 Key: LUCENE-9428
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9428
>             Project: Lucene - Core
>          Issue Type: Bug
>         Environment: lucene version:5.5.4
> jdk version :jdk1.8-1.8.0_231-fcs
>            Reporter: AllenL
>            Priority: Major
>
> Recently, a program using ElasticSearch failed to merge an index, with the 
> following exception:
>  
> {code:java}
> [2020-07-03 13:37:34,113][ERROR][index.engine             ] [Deathbird] [st-sess][4] failed to merge
> [2020-07-03 13:37:34,113][ERROR][index.engine             ] [Deathbird] [st-sess][4] failed to merge
> org.apache.lucene.index.CorruptIndexException: checksum failed (hardware problem?) : expected=31f090d9 actual=d9697caa (resource=BufferedChecksumIndexInput(MMapIndexInput(path="/var/lib/elasticsearch/17412c54-f974-11e9-9eef-80615f029e06/nodes/0/indices/st-sess/4/index/_3jm_Lucene50_0.tim")))
>     at org.apache.lucene.codecs.CodecUtil.checkFooter(CodecUtil.java:334)
>     at org.apache.lucene.codecs.CodecUtil.checksumEntireFile(CodecUtil.java:451)
>     at org.apache.lucene.codecs.blocktree.BlockTreeTermsReader.checkIntegrity(BlockTreeTermsReader.java:333)
>     at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsReader.checkIntegrity(PerFieldPostingsFormat.java:317)
>     at org.apache.lucene.codecs.FieldsConsumer.merge(FieldsConsumer.java:96)
>     at org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:193)
>     at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:95)
>     at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4086)
>     at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3666)
>     at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:588)
>     at org.elasticsearch.index.engine.ElasticsearchConcurrentMergeScheduler.doMerge(ElasticsearchConcurrentMergeScheduler.java:94)
>     at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:626)
> [2020-07-03 13:37:34,203][WARN ][index.engine             ] [Deathbird] [st-sess][4] failed engine [merge failed]
> org.apache.lucene.index.MergePolicy$MergeException: org.apache.lucene.index.CorruptIndexException: checksum failed (hardware problem?) : expected=31f090d9 actual=d9697caa (resource=BufferedChecksumIndexInput(MMapIndexInput(path="/var/lib/elasticsearch/shterm-17412c54-f974-11e9-9eef-80615f029e06/nodes/0/indices/st-sess/4/index/_3jm_Lucene50_0.tim")))
>     at org.elasticsearch.index.engine.InternalEngine$EngineMergeScheduler$1.doRun(InternalEngine.java:1237)
>     at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> {code}
>  
> The exception suggests a possible hardware problem. We checked the hardware 
> and found no errors. The checks were as follows:
>  # Checked the devices /dev/sda and /dev/sdb; no hardware errors were found
>      using command: smartctl --xall /dev/sdx
>  # Checked the message log /var/log/messages; no hardware problems were 
> reported
>  # The system has a state detection script; the system load it recorded was 
> normal and IOwait was very low
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

