I saw something like this a year ago, which I reported as a possible bug (https://issues.apache.org/jira/browse/SOLR-10840, which has a full description and stack traces).
This occurred very randomly on an AWS instance; moving the index directory to a different file system did not fix the problem. Eventually I cloned our environment to a new AWS instance, which proved to be the solution. Why, I have no idea...

-Simon

On Mon, Sep 24, 2018 at 1:13 PM, Susheel Kumar <susheel2...@gmail.com> wrote:

> Got it. I'll have the hardware folks check first, and if they don't find
> anything suspicious then I'll return here.
>
> Wondering if anybody has seen a similar error and whether they were able to
> confirm that it was a hardware fault.
>
> Thanks
>
> On Mon, Sep 24, 2018 at 1:01 PM Erick Erickson <erickerick...@gmail.com> wrote:
>
> > Mind you it could _still_ be Solr/Lucene, but let's check the hardware
> > first ;)
> >
> > On Mon, Sep 24, 2018 at 9:50 AM Susheel Kumar <susheel2...@gmail.com> wrote:
> >
> > > Hi Erick,
> > >
> > > Thanks so much for your reply. I'll now look mostly into possible
> > > hardware issues rather than Solr/Lucene.
> > >
> > > Thanks again.
> > >
> > > On Mon, Sep 24, 2018 at 12:43 PM Erick Erickson <erickerick...@gmail.com> wrote:
> > >
> > > > There are several reasons this would "suddenly" start appearing.
> > > > 1> Your disk went bad and some sector is no longer faithfully
> > > > recording the bits. In this case the checksum will be wrong.
> > > > 2> You ran out of disk space sometime and the index was corrupted.
> > > > This isn't really a hardware problem.
> > > > 3> Your disk controller is going wonky and not reading reliably.
> > > >
> > > > The "possible hardware issue" message is to alert you that this is
> > > > highly unusual and you should at least consider doing integrity
> > > > checks on your disk before assuming it's a Solr/Lucene problem.
> > > >
> > > > Best,
> > > > Erick
> > > >
> > > > On Mon, Sep 24, 2018 at 9:26 AM Susheel Kumar <susheel2...@gmail.com> wrote:
> > > >
> > > > > Hello,
> > > > >
> > > > > I am still trying to understand the corrupt index exception we saw in our
> > > > > logs. What does the "hardware problem" comment indicate here? Does that
> > > > > mean it was most likely caused by a hardware issue?
> > > > >
> > > > > We never had this problem in the last couple of months. Solr is 6.6.2 and
> > > > > ZK is 3.4.10.
> > > > >
> > > > > Please share your thoughts.
> > > > >
> > > > > Thanks,
> > > > > Susheel
> > > > >
> > > > > Caused by: org.apache.lucene.index.CorruptIndexException: checksum failed *(hardware problem?)* : expected=db243d1a actual=7a00d3d2 (resource=BufferedChecksumIndexInput(MMapIndexInput(path="/app/solr/data/COLL_shard1_replica1/data/index/_i27s.cfs") [slice=_i27s_Lucene50_0.tim])
> > > > >
> > > > > It suddenly started appearing in the logs; before that there was no such error.
> > > > > Searches and ingestion all seemed to be working prior to that.
> > > > >
> > > > > ----
> > > > >
> > > > > 2018-09-03 17:16:49.056 INFO (qtp834133664-519872) [c:COLL s:shard1 r:core_node1 x:COLL_shard1_replica1] o.a.s.u.p.StatelessScriptUpdateProcessorFactory update-script#processAdd: newid=G31MXMRZESC0CYPR!A-G31MXMRZESC0CYPR.2552019802_1-2552008480_1-en_US
> > > > > 2018-09-03 17:16:49.057 ERROR (qtp834133664-519872) [c:COLL s:shard1 r:core_node1 x:COLL_shard1_replica1] o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: Exception writing document id G31MXMRZESC0CYPR!A-G31MXMRZESC0CYPR.2552019802_1-2552008480_1-en_US to the index; possible analysis error.
> > > > >     at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:206)
> > > > >     at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:67)
> > > > >     at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:55)
> > > > >     at org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:979)
> > > > >     at org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:1192)
> > > > >     at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:748)
> > > > >     at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:55)
> > > > >     at org.apache.solr.update.processor.StatelessScriptUpdateProcessorFactory$ScriptUpdateProcessor.processAdd(StatelessScriptUpdateProcessorFactory.java:380)
> > > > >     at org.apache.solr.handler.loader.JavabinLoader$1.update(JavabinLoader.java:98)
> > > > >     at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readOuterMostDocIterator(JavaBinUpdateRequestCodec.java:180)
> > > > >     at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readIterator(JavaBinUpdateRequestCodec.java:136)
> > > > >     at org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinCodec.java:306)
> > > > >     at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:251)
> > > > >     at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readNamedList(JavaBinUpdateRequestCodec.java:122)
> > > > >     at org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinCodec.java:271)
> > > > >     at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:251)
> > > > >     at org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:173)
> > > > >     at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec.unmarshal(JavaBinUpdateRequestCodec.java:187)
> > > > >     at org.apache.solr.handler.loader.JavabinLoader.parseAndLoadDocs(JavabinLoader.java:108)
> > > > >     at org.apache.solr.handler.loader.JavabinLoader.load(JavabinLoader.java:55)
> > > > >     at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:97)
> > > > >     at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:68)
> > > > >     at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:173)
> > > > >     at org.apache.solr.core.SolrCore.execute(SolrCore.java:2477)
> > > > >     at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:723)
> > > > >     at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:529)
> > > > >     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:361)
> > > > >     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:305)
> > > > >     at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1691)
> > > > >     at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
> > > > >     at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
> > > > >     at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
> > > > >     at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
> > > > >     at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
> > > > >     at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
> > > > >     at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
> > > > >     at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
> > > > >     at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
> > > > >     at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
> > > > >     at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
> > > > >     at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
> > > > >     at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
> > > > >     at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
> > > > >     at org.eclipse.jetty.server.Server.handle(Server.java:534)
> > > > >     at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
> > > > >     at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
> > > > >     at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
> > > > >     at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
> > > > >     at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
> > > > >     at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
> > > > >     at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
> > > > >     at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
> > > > >     at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
> > > > >     at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
> > > > >     at java.lang.Thread.run(Thread.java:748)
> > > > > Caused by: org.apache.lucene.store.AlreadyClosedException: this IndexWriter is closed
> > > > >     at org.apache.lucene.index.IndexWriter.ensureOpen(IndexWriter.java:749)
> > > > >     at org.apache.lucene.index.IndexWriter.ensureOpen(IndexWriter.java:763)
> > > > >     at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1567)
> > > > >     at org.apache.solr.update.DirectUpdateHandler2.updateDocument(DirectUpdateHandler2.java:924)
> > > > >     at org.apache.solr.update.DirectUpdateHandler2.updateDocOrDocValues(DirectUpdateHandler2.java:913)
> > > > >     at org.apache.solr.update.DirectUpdateHandler2.doNormalUpdate(DirectUpdateHandler2.java:302)
> > > > >     at org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:239)
> > > > >     at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:194)
> > > > >     ... 54 more
> > > > > Caused by: org.apache.lucene.index.CorruptIndexException: checksum failed (hardware problem?) : expected=db243d1a actual=7a00d3d2 (resource=BufferedChecksumIndexInput(MMapIndexInput(path="/app/solr/data/COLL_shard1_replica1/data/index/_i27s.cfs") [slice=_i27s_Lucene50_0.tim]))
> > > > >     at org.apache.lucene.codecs.CodecUtil.checkFooter(CodecUtil.java:419)
> > > > >     at org.apache.lucene.codecs.CodecUtil.checksumEntireFile(CodecUtil.java:526)
> > > > >     at org.apache.lucene.codecs.blocktree.BlockTreeTermsReader.checkIntegrity(BlockTreeTermsReader.java:336)
> > > > >     at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsReader.checkIntegrity(PerFieldPostingsFormat.java:348)
> > > > >     at org.apache.lucene.codecs.perfield.PerFieldMergeState$FilterFieldsProducer.checkIntegrity(PerFieldMergeState.java:271)
> > > > >     at org.apache.lucene.codecs.FieldsConsumer.merge(FieldsConsumer.java:96)
> > > > >     at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsWriter.merge(PerFieldPostingsFormat.java:164)
> > > > >     at org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:216)
> > > > >     at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:101)
> > > > >     at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4356)
> > > > >     at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3931)
> > > > >     at org.apache.solr.update.SolrIndexWriter.merge(SolrIndexWriter.java:188)
> > > > >     at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:624)
> > > > >     at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:661)
> > > > >
> > > > > 2018-09-03 17:16:49.116 INFO (qtp834133664-519872) [c:COLL s:shard1 r:core_node1 x:COLL_shard1_replica1] o.a.s.c.S.Request [COLL_shard1_replica1] webapp=/solr path=/update params={wt=javabin&version=2} status=400 QTime=69
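
As a footnote to Erick's point about the checksum, here is a small sketch of the kind of comparison behind the "checksum failed ... expected=... actual=..." message. As far as I understand it, Lucene stores a CRC32 of each index file in that file's footer and recomputes it when the whole file is read, which in the trace above happens during a merge. The Python below only illustrates that idea. The 4-byte footer layout and the helper names write_with_footer and check_footer are made up for this example; they are not Lucene's actual file format or API.

```python
import zlib


def write_with_footer(payload: bytes) -> bytes:
    """Append a 4-byte big-endian CRC32 of the payload (simplified, hypothetical footer)."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def check_footer(blob: bytes):
    """Recompute the CRC32 of the payload and compare it with the stored footer value.

    Returns (ok, message), where message mimics the expected=/actual= style of the log.
    """
    payload, stored = blob[:-4], blob[-4:]
    expected = int.from_bytes(stored, "big")   # value recorded at write time
    actual = zlib.crc32(payload)               # value computed from what was read back
    return expected == actual, f"expected={expected:08x} actual={actual:08x}"


# A file that is read back exactly as it was written passes the check.
good = write_with_footer(b"terms dictionary bytes")
ok, msg = check_footer(good)

# Simulate a bad sector or flaky controller by flipping a single bit in the payload.
damaged = bytearray(good)
damaged[3] ^= 0x01
corrupt_ok, corrupt_msg = check_footer(bytes(damaged))

print(ok, msg)
print(corrupt_ok, corrupt_msg)
```

The point is that the check only says the bytes read back differ from the bytes written. It cannot say whether the disk, the controller, a full file system, or something else altered them, which is presumably why the message ends in a question mark.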