I realized that after doing the commit manually, two shards had a lot fewer files than the 3rd shard (which failed on commit). However, with the passage of time, the number of files continued to decrease for the shard with more files. FWIW, each shard has exactly same number of document and similar to size in bytes. So, i am not sure what caused the skew. What seems is that one shard was lagging in shard merging and the merging continues for couple hours even after indexing is finished. For this second attempt, I will commit after the file count starts to stabilize In my next attempt, I will increase number of merge threads and see how it goes.
Regards Nawab On Thu, Aug 3, 2017 at 12:52 PM, Nawab Zada Asad Iqbal <khi...@gmail.com> wrote: > Hi, > > I have a host with 3 solr processes running, each with one shard only; > there are no replicas. I am reindexing some 100 GB of data per solr (or per > shard since each solr has one shard). > > After about 3 hours, I manually committed once. I was able to get through > 40 GB in each shard, and the commit response came within 2 minutes. > > After another 3 hours, I stopped the indexing client and manually > committed again. Two shards returned within few minutes (total size now > 85GB +). However, 3rd shard (or solr process) was stuck for almost two > hours now (until I stopped the server). And has started to throw the > following exception: > This shard also has a lot more files in data/index folder. 68k vs 31k & in > other two shards. I am not sure if that can impact. FWIW, the file > descriptors limit on this host is 65k. > I am relying on default concurrent merge scheduler (which probably opens 4 > threads on each solr server). > > java.nio.file.NoSuchFileException: /box/var/solr/shard1/ > filesearch/data/index/segments_2 > at sun.nio.fs.UnixException.translateToIOException( > UnixException.java:86) > at sun.nio.fs.UnixException.rethrowAsIOException( > UnixException.java:102) > at sun.nio.fs.UnixException.rethrowAsIOException( > UnixException.java:107) > at sun.nio.fs.UnixFileAttributeViews$Basic.readAttributes( > UnixFileAttributeViews.java:55) > at sun.nio.fs.UnixFileSystemProvider.readAttributes( > UnixFileSystemProvider.java:144) > at sun.nio.fs.LinuxFileSystemProvider.readAttributes( > LinuxFileSystemProvider.java:99) > at java.nio.file.Files.readAttributes(Files.java:1737) > at java.nio.file.Files.size(Files.java:2332) > at org.apache.lucene.store.FSDirectory.fileLength( > FSDirectory.java:243) > at org.apache.solr.handler.admin.LukeRequestHandler.getFileLength( > LukeRequestHandler.java:615) > at org.apache.solr.handler.admin.LukeRequestHandler.getIndexInfo( > LukeRequestHandler.java:588) > at org.apache.solr.handler.admin.LukeRequestHandler.handleRequestBody( > LukeRequestHandler.java:138) > at org.apache.solr.handler.RequestHandlerBase.handleRequest( > RequestHandlerBase.java:173) > at org.apache.solr.core.SolrCore.execute(SolrCore.java:2477) > at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:723) > at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:529) > at org.apache.solr.servlet.SolrDispatchFilter.doFilter( > SolrDispatchFilter.java:361) > at org.apache.solr.servlet.SolrDispatchFilter.doFilter( > SolrDispatchFilter.java:305) > at org.eclipse.jetty.servlet.ServletHandler$CachedChain. > doFilter(ServletHandler.java:1691) > at org.eclipse.jetty.servlet.ServletHandler.doHandle( > ServletHandler.java:582) > at org.eclipse.jetty.server.handler.ScopedHandler.handle( > ScopedHandler.java:143) > at org.eclipse.jetty.security.SecurityHandler.handle( > SecurityHandler.java:548) > at org.eclipse.jetty.server.session.SessionHandler. > doHandle(SessionHandler.java:226) > at org.eclipse.jetty.server.handler.ContextHandler. > doHandle(ContextHandler.java:1180) > at org.eclipse.jetty.servlet.ServletHandler.doScope( > ServletHandler.java:512) > at org.eclipse.jetty.server.session.SessionHandler. > doScope(SessionHandler.java:185) > at org.eclipse.jetty.server.handler.ContextHandler. > doScope(ContextHandler.java:1112) > at org.eclipse.jetty.server.handler.ScopedHandler.handle( > ScopedHandler.java:141) > at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle( > ContextHandlerCollection.java:213) > at org.eclipse.jetty.server.handler.HandlerCollection. > handle(HandlerCollection.java:119) > at org.eclipse.jetty.server.handler.HandlerWrapper.handle( > HandlerWrapper.java:134) > at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle( > RewriteHandler.java:335) > at org.eclipse.jetty.server.handler.HandlerWrapper.handle( > HandlerWrapper.java:134) > at org.eclipse.jetty.server.Server.handle(Server.java:534) > at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320) > at org.eclipse.jetty.server.HttpConnection.onFillable( > HttpConnection.java:251) > at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded( > AbstractConnection.java:273) > at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95) > at org.eclipse.jetty.io.SelectChannelEndPoint$2.run( > SelectChannelEndPoint.java:93) > at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume. > executeProduceConsume(ExecuteProduceConsume.java:303) > at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume. > produceConsume(ExecuteProduceConsume.java:148) > at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run( > ExecuteProduceConsume.java:136) > at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob( > QueuedThreadPool.java:671) > at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run( > QueuedThreadPool.java:589) > at java.lang.Thread.run(Thread.java:748) > > ~~~~~~~~~~~~~~~~~~~~~~~~ > > After stopping the server, I noticed this stacktrace:- (It seems that one > indexing request was still stuck in the system even after 2 hours and it > was probably in a deadlock). > > org.apache.solr.common.SolrException: Exception writing document id > file_52520005033 to the index; possible analysis error. > at org.apache.solr.update.DirectUpdateHandler2.addDoc( > DirectUpdateHandler2.java:206) > at org.apache.solr.update.processor.RunUpdateProcessor.processAdd( > RunUpdateProcessorFactory.java:67) > at org.apache.solr.update.processor.UpdateRequestProcessor. > processAdd(UpdateRequestProcessor.java:55) > at org.apache.solr.update.processor.DistributedUpdateProcessor. > versionAdd(DistributedUpdateProcessor.java:1005) > at org.apache.solr.update.processor.DistributedUpdateProcessor. > processAdd(DistributedUpdateProcessor.java:748) > at org.apache.solr.update.processor.LogUpdateProcessorFactory$ > LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:103) > at org.apache.solr.update.processor.UpdateRequestProcessor. > processAdd(UpdateRequestProcessor.java:55) > at org.apache.solr.update.processor.LanguageIdentifierUpdateProces > sor.processAdd(LanguageIdentifierUpdateProcessor.java:205) > at org.apache.solr.handler.loader.XMLLoader. > processUpdate(XMLLoader.java:261) > at org.apache.solr.handler.loader.XMLLoader.load( > XMLLoader.java:188) > at org.apache.solr.handler.UpdateRequestHandler$1.load( > UpdateRequestHandler.java:97) > at org.apache.solr.handler.ContentStreamHandlerBase. > handleRequestBody(ContentStreamHandlerBase.java:68) > at org.apache.solr.handler.RequestHandlerBase.handleRequest( > RequestHandlerBase.java:173) > at org.apache.solr.core.SolrCore.execute(SolrCore.java:2477) > at org.apache.solr.servlet.HttpSolrCall.execute( > HttpSolrCall.java:723) > at org.apache.solr.servlet.HttpSolrCall.call( > HttpSolrCall.java:529) > at org.apache.solr.servlet.SolrDispatchFilter.doFilter( > SolrDispatchFilter.java:361) > at org.apache.solr.servlet.SolrDispatchFilter.doFilter( > SolrDispatchFilter.java:305) > at org.eclipse.jetty.servlet.ServletHandler$CachedChain. > doFilter(ServletHandler.java:1691) > at org.eclipse.jetty.servlet.ServletHandler.doHandle( > ServletHandler.java:582) > at org.eclipse.jetty.server.handler.ScopedHandler.handle( > ScopedHandler.java:143) > at org.eclipse.jetty.security.SecurityHandler.handle( > SecurityHandler.java:548) > at org.eclipse.jetty.server.session.SessionHandler. > doHandle(SessionHandler.java:226) > at org.eclipse.jetty.server.handler.ContextHandler. > doHandle(ContextHandler.java:1180) > at org.eclipse.jetty.server.session.SessionHandler. > doHandle(SessionHandler.java:226) > at org.eclipse.jetty.server.handler.ContextHandler. > doHandle(ContextHandler.java:1180) > at org.eclipse.jetty.servlet.ServletHandler.doScope( > ServletHandler.java:512) > at org.eclipse.jetty.server.session.SessionHandler. > doScope(SessionHandler.java:185) > at org.eclipse.jetty.server.handler.ContextHandler. > doScope(ContextHandler.java:1112) > at org.eclipse.jetty.server.handler.ScopedHandler.handle( > ScopedHandler.java:141) > at org.eclipse.jetty.server.handler.ContextHandlerCollection. > handle(ContextHandlerCollection.java:213) > at org.eclipse.jetty.server.handler.HandlerCollection. > handle(HandlerCollection.java:119) > at org.eclipse.jetty.server.handler.HandlerWrapper.handle( > HandlerWrapper.java:134) > at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle( > RewriteHandler.java:335) > at org.eclipse.jetty.server.handler.HandlerWrapper.handle( > HandlerWrapper.java:134) > at org.eclipse.jetty.server.Server.handle(Server.java:534) > at org.eclipse.jetty.server.HttpChannel.handle( > HttpChannel.java:320) > at org.eclipse.jetty.server.HttpConnection.onFillable( > HttpConnection.java:251) > at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded( > AbstractConnection.java:273) > at org.eclipse.jetty.io.FillInterest.fillable( > FillInterest.java:95) > at org.eclipse.jetty.io.SelectChannelEndPoint$2.run( > SelectChannelEndPoint.java:93) > at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume. > executeProduceConsume(ExecuteProduceConsume.java:303) > at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume. > produceConsume(ExecuteProduceConsume.java:148) > at org.eclipse.jetty.util.thread.strategy. > ExecuteProduceConsume.run(ExecuteProduceConsume.java:136) > at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob( > QueuedThreadPool.java:671) > at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run( > QueuedThreadPool.java:589) > at java.lang.Thread.run(Thread.java:748) > Caused by: org.apache.lucene.util.ThreadInterruptedException: > java.lang.InterruptedException > at org.apache.lucene.index.ConcurrentMergeScheduler.doStall( > ConcurrentMergeScheduler.java:618) > at org.apache.lucene.index.ConcurrentMergeScheduler.maybeStall( > ConcurrentMergeScheduler.java:602) > at org.apache.lucene.index.ConcurrentMergeScheduler.merge( > ConcurrentMergeScheduler.java:524) > at org.apache.lucene.index.IndexWriter.maybeMerge( > IndexWriter.java:2083) > at org.apache.lucene.index.IndexWriter.doAfterSegmentFlushed( > IndexWriter.java:5005) > at org.apache.lucene.index.DocumentsWriter$ > MergePendingEvent.process(DocumentsWriter.java:731) > at org.apache.lucene.index.IndexWriter.processEvents( > IndexWriter.java:5043) > at org.apache.lucene.index.IndexWriter.processEvents( > IndexWriter.java:5034) > at org.apache.lucene.index.IndexWriter.updateDocument( > IndexWriter.java:1574) > at org.apache.solr.update.DirectUpdateHandler2.updateDocument( > DirectUpdateHandler2.java:924) > at org.apache.solr.update.DirectUpdateHandler2. > updateDocOrDocValues(DirectUpdateHandler2.java:913) > at org.apache.solr.update.DirectUpdateHandler2.doNormalUpdate( > DirectUpdateHandler2.java:302) > at org.apache.solr.update.DirectUpdateHandler2.addDoc0( > DirectUpdateHandler2.java:239) > at org.apache.solr.update.DirectUpdateHandler2.addDoc( > DirectUpdateHandler2.java:194) > ... 44 more > Caused by: java.lang.InterruptedException > at java.lang.Object.wait(Native Method) > at org.apache.lucene.index.ConcurrentMergeScheduler.doStall( > ConcurrentMergeScheduler.java:616) > ... 57 m > > Then a long list of: > 2017-08-03 19:33:57.807 WARN (ShutdownMonitor) [ ] > o.e.j.u.t.QueuedThreadPool qtp761960786{STOPPING,10<=6509<=10000,i=0,q=6480} > Couldn't stop Thread[qtp761960786-14603,5,main] > ... >