To add to the details of the above issue: the query, as soon as it is executed and even before the OutOfMemoryError occurs, causes the Solr servers to become non-responsive.
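Since the servers hang before the OutOfMemoryError is actually thrown, a thread dump taken while they are unresponsive (for example with jstack <solr-pid>, where <solr-pid> stands for the Solr process id) should show what the query threads are doing at that point. A sketch of setting timeAllowed per request is included after the quoted message below.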
On Tue, Dec 23, 2014 at 5:04 PM, Modassar Ather <modather1...@gmail.com> wrote:
> Hi,
>
> I have a setup of a 4-shard Solr cluster with embedded ZooKeeper on one of
> them. The zkClientTimeout is set to 30 seconds, -Xms is 20g and -Xmx is 24g.
> When executing a huge query with many wildcards in it, the server crashes
> and becomes non-responsive. Even the dashboard does not respond and shows a
> connection lost error. This requires me to restart the servers.
>
> I have set the query *timeAllowed* to 5 minutes, but it does not seem to be
> honored either, and the query hangs around.
>
> Kindly help me debug and fix the issue, or suggest a way to have
> *timeAllowed* honored, or to stop a query that has been hanging for some
> time.
>
> *Following are a few exceptions.*
>
> *org.apache.zookeeper.server.NIOServerCnxn doIO*
> WARNING: caught end of stream exception
> EndOfStreamException: Unable to read additional data from client sessionid
> <session id>, likely client has closed socket
>         at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
>         at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
>         at java.lang.Thread.run(Thread.java:745)
>
> *org.apache.zookeeper.server.NIOServerCnxn sendBuffer*
> SEVERE: Unexpected Exception:
> java.nio.channels.CancelledKeyException
>         at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:73)
>         at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:77)
>         at org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:151)
>         at org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1081)
>         at org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:404)
>         at org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:200)
>         at org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:131)
>
> *org.apache.zookeeper.server.persistence.FileTxnLog commit*
> WARNING: fsync-ing the write ahead log in SyncThread:0 took 28346ms which
> will adversely effect operation latency. See the ZooKeeper troubleshooting
> guide
> *org.apache.solr.common.SolrException log*
> SEVERE: org.apache.solr.common.SolrException: no servers hosting shard:
>         at org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:149)
>         at org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:119)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:745)
>
> *Caused by: java.lang.OutOfMemoryError: Java heap space*
>         at org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$BlockDocsAndPositionsEnum.<init>(Lucene41PostingsReader.java:640)
>         at org.apache.lucene.codecs.lucene41.Lucene41PostingsReader.docsAndPositions(Lucene41PostingsReader.java:278)
>         at org.apache.lucene.codecs.blocktree.SegmentTermsEnum.docsAndPositions(SegmentTermsEnum.java:1011)
>         at org.apache.lucene.search.spans.SpanTermQuery.getSpans(SpanTermQuery.java:123)
>         at org.apache.lucene.search.spans.SpanOrQuery$1.initSpanQueue(SpanOrQuery.java:180)
>         at org.apache.lucene.search.spans.SpanOrQuery$1.next(SpanOrQuery.java:193)
>         at org.apache.lucene.search.spans.SpanOrQuery$1.initSpanQueue(SpanOrQuery.java:182)
>         at org.apache.lucene.search.spans.SpanOrQuery$1.next(SpanOrQuery.java:193)
>         at org.apache.lucene.search.spans.NearSpansUnordered$SpansCell.next(NearSpansUnordered.java:88)
>         at org.apache.lucene.search.spans.NearSpansUnordered.initList(NearSpansUnordered.java:295)
>         at org.apache.lucene.search.spans.NearSpansUnordered.next(NearSpansUnordered.java:164)
>         at org.apache.lucene.search.spans.SpanScorer.<init>(SpanScorer.java:46)
>         at org.apache.lucene.search.spans.SpanWeight.scorer(SpanWeight.java:88)
>         at org.apache.lucene.search.DisjunctionMaxQuery$DisjunctionMaxWeight.scorer(DisjunctionMaxQuery.java:160)
>         at org.apache.lucene.search.BooleanQuery$BooleanWeight.scorer(BooleanQuery.java:356)
>         at org.apache.lucene.search.BooleanQuery$BooleanWeight.scorer(BooleanQuery.java:356)
>         at org.apache.lucene.search.BooleanQuery$BooleanWeight.scorer(BooleanQuery.java:356)
>         at org.apache.lucene.search.BooleanQuery$BooleanWeight.scorer(BooleanQuery.java:356)
>         at org.apache.lucene.search.BooleanQuery$BooleanWeight.scorer(BooleanQuery.java:356)
>         at org.apache.lucene.search.BooleanQuery$BooleanWeight.scorer(BooleanQuery.java:356)
>         at org.apache.lucene.search.BooleanQuery$BooleanWeight.scorer(BooleanQuery.java:356)
>         at org.apache.lucene.queries.function.BoostedQuery$BoostedWeight.scorer(BoostedQuery.java:101)
>         at org.apache.lucene.search.Weight.bulkScorer(Weight.java:131)
>         at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:618)
>         at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:297)
>         at org.apache.solr.search.grouping.CommandHandler.searchWithTimeLimiter(CommandHandler.java:219)
>         at org.apache.solr.search.grouping.CommandHandler.execute(CommandHandler.java:154)
>         at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:362)
>         at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:218)
>         at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
>         at org.apache.solr.core.SolrCore.execute(SolrCore.java:1967)
>         at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:777)
>
> Thanks,
> Modassar
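For completeness, below is a minimal SolrJ 4.x sketch of how timeAllowed can be set on a request; note that the parameter is given in milliseconds, so 5 minutes is 300000. The URL, collection name, and query string are placeholders, not the actual ones from this thread.

    // Minimal sketch (assumptions: SolrJ 4.x on the classpath; placeholder
    // URL, collection, and query, not the actual ones from this thread).
    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.SolrServer;
    import org.apache.solr.client.solrj.impl.HttpSolrServer;
    import org.apache.solr.client.solrj.response.QueryResponse;

    public class TimeAllowedSketch {
        public static void main(String[] args) throws Exception {
            SolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");
            SolrQuery q = new SolrQuery("field:someprefix*");
            q.set("timeAllowed", 300000); // milliseconds: 5 minutes
            QueryResponse rsp = server.query(q);
            // If the limit is hit during document collection, Solr sets
            // partialResults=true in the response header.
            System.out.println(rsp.getResponseHeader().get("partialResults"));
            server.shutdown();
        }
    }

Note that timeAllowed is enforced by a time-limiting collector during document collection (visible as CommandHandler.searchWithTimeLimiter in the trace above), so time spent before collection starts, such as rewriting a query with many wildcard expansions, is not counted against it. That would be consistent with the limit appearing to be ignored here.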