Hi Erick,

I am running the code from Eclipse with the default heap size of 384 MB, and I am indexing with Solr's SimplePostTool, posting XML files over HTTP requests. I don't think this is a heap problem; if it were, I would expect the process to become noticeably slow, whereas what I see is that after every 2-3 hours of indexing one of the nodes goes down.
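For reference, the indexing path amounts to something like the small SolrJ loop below (just a rough sketch of what the SimplePostTool posts boil down to, not the command I actually run; the host/port, collection and field names are placeholders):

    import org.apache.solr.client.solrj.impl.HttpSolrServer;
    import org.apache.solr.common.SolrInputDocument;

    public class IndexSketch {
        public static void main(String[] args) throws Exception {
            // Placeholder base URL: one node of the cluster; updates get forwarded to the shard leaders.
            HttpSolrServer server = new HttpSolrServer("http://host:port1/solr/recollection");
            for (int i = 0; i < 1000; i++) {
                SolrInputDocument doc = new SolrInputDocument();
                doc.addField("id", "doc-" + i);          // placeholder unique key
                doc.addField("title", "document " + i);  // placeholder field
                server.add(doc);
            }
            server.commit();
            server.shutdown();
        }
    }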
I personally feel this may be due to leader re-election, given the last exception traced in my Cloud UI log (pasted below). A couple of questions strike me:

1) Is the leader election not completing within the time allotted by ZooKeeper?
2) Do I need to increase the ZooKeeper timeout (zkClientTimeout) parameter to something greater than its current value? (See the sketch just below this list.)
3) Can't we manually elect the leader, and only let an election happen if that leader goes down?
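On question 2, to make concrete what I mean by raising the timeout on the client side: if I moved the indexing client to SolrJ's CloudSolrServer, my understanding is the ZooKeeper timeouts could be bumped roughly like this (an unverified sketch, not what I currently run; the zkHost string and values are placeholders, and I realize the server-side zkClientTimeout in solr.xml is a separate setting):

    import org.apache.solr.client.solrj.impl.CloudSolrServer;
    import org.apache.solr.common.SolrInputDocument;

    public class CloudIndexSketch {
        public static void main(String[] args) throws Exception {
            // Placeholder ensemble string for the three ZooKeeper instances.
            CloudSolrServer server = new CloudSolrServer("zkhost1:2181,zkhost2:2182,zkhost3:2183");
            server.setDefaultCollection("recollection");
            // Raise the client-side ZooKeeper connect/session timeouts (milliseconds).
            server.setZkConnectTimeout(30000);
            server.setZkClientTimeout(30000);
            server.connect();

            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", "doc-1");  // placeholder unique key
            server.add(doc);
            server.commit();
            server.shutdown();
        }
    }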
This is the exception from the Cloud UI log:

ERROR StreamingSolrServers error
org.apache.solr.common.SolrException: Service Unavailable

request: http://host:port2/solr/recollection_shard3_replica2/update?update.distrib=TOLEADER&distrib.from=http%3A%2F%2Fhost%3Aport1%2Fsolr%2Frecollection_shard1_replica1%2F&wt=javabin&version=2
        at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer$Runner.run(ConcurrentUpdateSolrServer.java:240)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:722)

14:01:19 ERROR SolrCore org.apache.solr.common.SolrException: No registered leader was found, collection:recollection slice:shard3
org.apache.solr.common.SolrException: No registered leader was found, collection:recollection slice:shard3
        at org.apache.solr.common.cloud.ZkStateReader.getLeaderRetry(ZkStateReader.java:484)
        at org.apache.solr.common.cloud.ZkStateReader.getLeaderRetry(ZkStateReader.java:467)
        at org.apache.solr.update.SolrCmdDistributor$RetryNode.checkRetry(SolrCmdDistributor.java:351)
        at org.apache.solr.update.SolrCmdDistributor.doRetriesIfNeeded(SolrCmdDistributor.java:78)
        at org.apache.solr.update.SolrCmdDistributor.finish(SolrCmdDistributor.java:61)
        at org.apache.solr.update.processor.DistributedUpdateProcessor.doFinish(DistributedUpdateProcessor.java:499)
        at org.apache.solr.update.processor.DistributedUpdateProcessor.finish(DistributedUpdateProcessor.java:1288)
        at org.apache.solr.update.processor.LogUpdateProcessor.finish(LogUpdateProcessorFactory.java:179)
        at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:83)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1859)
        at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:710)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:413)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:197)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:168)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:99)
        at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:929)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:407)
        at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1002)
        at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:585)
        at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:310)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:722)

Looking forward to your response.

Thanks,
Tim

On Wed, May 21, 2014 at 8:34 PM, Erick Erickson <erickerick...@gmail.com> wrote:

> How much memory have you allocated to the JVMs? Also, what does the
> Solr log show on the machine that isn't coming up? Sounds like the
> node went down and perhaps went into recovery....
>
> And how are you indexing?
>
> Best,
> Erick
>
> On Tue, May 20, 2014 at 11:54 PM, Tim Burner <imtimbur...@gmail.com> wrote:
> > Hi Everyone,
> >
> > I have installed Solr-4.6 Cloud with external Zookeeper-3.4.5 and Tomcat-7;
> > the configuration is as mentioned below.
> >
> > Single-machine cluster setup with 3 shards and 2 replicas, deployed on 3
> > Tomcats with 3 Zookeepers.
> >
> > Everything is installed fine; I start indexing, and by the time I reach
> > some millions of documents (~1.6M) the indexing stops with
> > "#503 Service Unavailable" and the Cloud Dashboard log says:
> >
> > "ERROR DistributedUpdateProcessor ClusterState says we are the leader,
> > but locally we don't think so"
> >
> > "ERROR SolrCore org.apache.solr.common.SolrException: ClusterState says we
> > are the leader (http://host:port1/solr/recollection_shard1_replica1), but
> > locally we don't think so. Request came from
> > http://host:port2/solr/recollection_shard2_replica1/"
> >
> > "ERROR ZkController Error registering
> > SolrCore: org.apache.solr.common.SolrException: Error getting leader from zk
> > for shard shard2"
> >
> > Any suggestions/advice would be appreciated!
> >
> > Thanks!
> > Tim