Any thoughts on this / anything configuration items I can check? Could the 180 second clusterstatus timeout messages that I'm getting be related? Any issue with running 7 nodes in the zookeeper quorum? For reference the clusterstatus stack trace is:

org.apache.solr.common.SolrException: CLUSTERSTATUS the collection time out:180s at org.apache.solr.handler.admin.CollectionsHandler.handleResponse(CollectionsHandler.java:740) at org.apache.solr.handler.admin.CollectionsHandler.handleResponse(CollectionsHandler.java:692) at org.apache.solr.handler.admin.CollectionsHandler.handleClusterStatus(CollectionsHandler.java:1042) at org.apache.solr.handler.admin.CollectionsHandler.handleRequestBody(CollectionsHandler.java:259) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:143) at org.apache.solr.servlet.SolrDispatchFilter.handleAdminRequest(SolrDispatchFilter.java:783) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:282) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:220) at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419) at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137) at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557) at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231) at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075) at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384) at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193) at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135) at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255) at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154) at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
    at org.eclipse.jetty.server.Server.handle(Server.java:368)
at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489) at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53) at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:942) at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1004)
    at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:640)
at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235) at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72) at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264) at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608) at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
    at java.lang.Thread.run(Thread.java:745)

Thanks for any thoughts!

-Joe

On 6/3/2015 8:39 PM, Joseph Obernberger wrote:
Hi All - I've run into a problem where every-once in a while one or more of the shards (27 shard cluster) will loose connection to zookeeper and report "updates are disabled". In additional to the CLUSTERSTATUS timeout errors, which don't seem to cause any issue, this one certainly does as that shard no longer takes any (you guessed it!) updates!
We are using Zookeeper with 7 nodes (7 servers in our quorum).
There stack trace is:

---------
282833508 [qtp1221263105-801058] INFO org.apache.solr.update.processor.LogUpdateProcessor [UNCLASS shard17 core_node17 UNCLASS] â [UNCLASS] webapp=/solr path=/update params={wt=javabin&version=2} {add=[COLLECT20001208773720 (1502857505963769856)]} 0 3 282837711 [qtp1221263105-802489] INFO org.apache.solr.update.processor.LogUpdateProcessor [UNCLASS shard17 core_node17 UNCLASS] â [UNCLASS] webapp=/solr path=/update params={wt=javabin&version=2} {add=[COLLECT20001208773796 (1502857510369886208)]} 0 3 282839485 [qtp1221263105-800319] INFO org.apache.solr.update.processor.LogUpdateProcessor [UNCLASS shard17 core_node17 UNCLASS] â [UNCLASS] webapp=/solr path=/update params={wt=javabin&version=2} {add=[COLLECT20001208773821 (1502857512230060032)]} 0 4 282841460 [qtp1221263105-801228] INFO org.apache.solr.update.processor.LogUpdateProcessor [UNCLASS shard17 core_node17 UNCLASS] â [UNCLASS] webapp=/solr path=/update params={wt=javabin&version=2} {} 0 1 282841461 [qtp1221263105-801228] ERROR org.apache.solr.core.SolrCore [UNCLASS shard17 core_node17 UNCLASS] â org.apache.solr.common.SolrException: Cannot talk to ZooKeeper - Updates are disabled. at org.apache.solr.update.processor.DistributedUpdateProcessor.zkCheck(DistributedUpdateProcessor.java:1474) at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:661) at org.apache.solr.update.processor.LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:104) at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51) at org.apache.solr.update.processor.AbstractDefaultValueUpdateProcessorFactory$DefaultValueUpdateProcessor.processAdd(AbstractDefaultValueUpdateProcessorFactory.java:94) at org.apache.solr.handler.loader.JavabinLoader$1.update(JavabinLoader.java:96) at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readOuterMostDocIterator(JavaBinUpdateRequestCodec.java:166) at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readIterator(JavaBinUpdateRequestCodec.java:136) at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:225) at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readNamedList(JavaBinUpdateRequestCodec.java:121) at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:190) at org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:116) at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec.unmarshal(JavaBinUpdateRequestCodec.java:173) at org.apache.solr.handler.loader.JavabinLoader.parseAndLoadDocs(JavabinLoader.java:106) at org.apache.solr.handler.loader.JavabinLoader.load(JavabinLoader.java:58) at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:103) at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:143)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1984)
at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:829) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:446) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:220) at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419) at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137) at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557) at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231) at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075) at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384) at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193) at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135) at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255) at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154) at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
        at org.eclipse.jetty.server.Server.handle(Server.java:368)
at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489) at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53) at org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:953) at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:1014) at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:953) at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:240) at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72) at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264) at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608) at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
        at java.lang.Thread.run(Thread.java:745)
---------

Any ideas on how to debug this, or a solution? I believe this only happens when we are actively indexing, which is nearly 100% of the time. I checked the zookeeper logs, but I don't see any errors there.
Thank you!

-Joe



Reply via email to