I still see the same cloud startup issue with Solr 5.0.0. I created 4,000 collections from scratch and then attempted to stop/start the cloud.
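In case anyone wants to reproduce the setup, a loop against the Collections API along these lines will create collections of the same shape (1 shard x 2 replicas). This is only a sketch: the host/port and the DDDDDD-<n> names below just mirror the logs, and collection.configName should point at whichever configset has actually been uploaded to ZooKeeper.

    import java.net.HttpURLConnection;
    import java.net.URL;

    // Sketch only: create N single-shard, two-replica collections via the
    // Collections API. Host, port, names and configset are placeholders.
    public class BulkCreate {
        public static void main(String[] args) throws Exception {
            for (int i = 0; i < 4000; i++) {
                String url = "http://host:8000/solr/admin/collections"
                    + "?action=CREATE&name=DDDDDD-" + i
                    + "&numShards=1&replicationFactor=2"
                    + "&collection.configName=conf1";
                HttpURLConnection conn =
                    (HttpURLConnection) new URL(url).openConnection();
                if (conn.getResponseCode() != 200) {
                    System.err.println("CREATE failed for DDDDDD-" + i);
                }
                conn.disconnect();
            }
        }
    }

After creating the collections and restarting all nodes, each node logs warnings like the ones below.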
node1:
WARN - 2015-03-02 18:09:02.371; org.eclipse.jetty.server.handler.RequestLogHandler; !RequestLog
WARN - 2015-03-02 18:10:07.196; org.apache.solr.cloud.ZkController; Timed out waiting to see all nodes published as DOWN in our cluster state.
WARN - 2015-03-02 18:13:46.238; org.apache.solr.cloud.ZkController; Still seeing conflicting information about the leader of shard shard1 for collection DDDDDD-3219 after 30 seconds; our state says http://host:8002/solr/DDDDDD-3219_shard1_replica1/, but ZooKeeper says http://host:8000/solr/DDDDDD-3219_shard1_replica2/

node2:
WARN - 2015-03-02 18:09:01.871; org.eclipse.jetty.server.handler.RequestLogHandler; !RequestLog
WARN - 2015-03-02 18:17:04.458; org.apache.solr.common.cloud.ZkStateReader$3; ZooKeeper watch triggered, but Solr cannot talk to ZK

stop/start

WARN - 2015-03-02 18:53:12.725; org.eclipse.jetty.server.handler.RequestLogHandler; !RequestLog
WARN - 2015-03-02 18:56:30.702; org.apache.solr.cloud.ZkController; Still seeing conflicting information about the leader of shard shard1 for collection DDDDDD-3581 after 30 seconds; our state says http://host:8001/solr/DDDDDD-3581_shard1_replica2/, but ZooKeeper says http://host:8002/solr/DDDDDD-3581_shard1_replica1/

node3:
WARN - 2015-03-02 18:09:03.022; org.eclipse.jetty.server.handler.RequestLogHandler; !RequestLog
WARN - 2015-03-02 18:10:08.178; org.apache.solr.cloud.ZkController; Timed out waiting to see all nodes published as DOWN in our cluster state.
WARN - 2015-03-02 18:13:47.737; org.apache.solr.cloud.ZkController; Still seeing conflicting information about the leader of shard shard1 for collection DDDDDD-2707 after 30 seconds; our state says http://host:8002/solr/DDDDDD-2707_shard1_replica2/, but ZooKeeper says http://host:8000/solr/DDDDDD-2707_shard1_replica1/

On 27 February 2015 at 17:48, Shawn Heisey <apa...@elyograg.org> wrote:

> On 2/26/2015 11:14 PM, Damien Kamerman wrote:
> > I've run into an issue with starting my solr cloud with many collections.
> > My setup is:
> > 3 nodes (solr 4.10.3 ; 64GB RAM each ; jdk1.8.0_25) running on a single
> > server (256GB RAM).
> > 5,000 collections (1 x shard ; 2 x replica) = 10,000 cores
> > 1 x Zookeeper 3.4.6
> > Java arg -Djute.maxbuffer=67108864 added to solr and ZK.
> >
> > Then I stop all nodes, then start all nodes. All replicas are in the down
> > state, some have no leader. At times I have seen some (12 or so) leaders
> > in the active state. In the solr logs I see lots of:
> >
> > org.apache.solr.cloud.ZkController; Still seeing conflicting information
> > about the leader of shard shard1 for collection DDDDDD-4351 after 30
> > seconds; our state says http://ftea1:8001/solr/DDDDDD-4351_shard1_replica1/,
> > but ZooKeeper says http://ftea1:8000/solr/DDDDDD-4351_shard1_replica2/
>
> <snip>
>
> > I've tried staggering the starts (1min) but does not help.
> > I've reproduced with zero documents.
> > Restarts are OK up to around 3,000 cores.
> > Should this work?
>
> This is going to push SolrCloud beyond its limits. Is this just an
> exercise to see how far you can push Solr, or are you looking at setting
> up a production install with several thousand collections?
>
> In Solr 4.x, the clusterstate is one giant JSON structure containing the
> state of the entire cloud. With 5000 collections, the entire thing
> would need to be downloaded and uploaded at least 5000 times during the
> course of a successful full system startup ... and I think with
> replicationFactor set to 2, that might actually be 10000 times. The
> best-case scenario is that it would take a VERY long time, the
> worst-case scenario is that concurrency problems would lead to a
> deadlock. A deadlock might be what is happening here.
>
> In Solr 5.x, the clusterstate is broken up so there's a separate state
> structure for each collection. This setup allows for faster and safer
> multi-threading and far less data transfer. Assuming I understand the
> implications correctly, there might not be any need to increase
> jute.maxbuffer with 5.x ... although I have to assume that I might be
> wrong about that.
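(Aside: the difference between the two formats is easy to see in ZooKeeper itself. With the split format each collection gets its own /collections/<name>/state.json node alongside the old shared /clusterstate.json. Below is only a rough sketch using the plain ZooKeeper client; the zkHost and collection name are placeholders, and any chroot would need to be prepended to the paths.)

    import org.apache.zookeeper.ZooKeeper;
    import org.apache.zookeeper.data.Stat;

    // Sketch only: compare the shared clusterstate.json with one collection's
    // per-collection state.json. zkHost and collection name are placeholders.
    public class StateCheck {
        public static void main(String[] args) throws Exception {
            ZooKeeper zk = new ZooKeeper("localhost:2181", 30000, event -> {});
            String[] paths = {
                "/clusterstate.json",                // shared state (4.x-style)
                "/collections/DDDDDD-0/state.json"   // per-collection state (5.x)
            };
            for (String path : paths) {
                Stat stat = zk.exists(path, false);
                System.out.println(path + " -> "
                    + (stat == null ? "absent" : stat.getDataLength() + " bytes"));
            }
            zk.close();
        }
    }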
> I would very much recommend that you set your scenario up from scratch
> in Solr 5.0.0, to see if the new clusterstate format can eliminate the
> problem you're seeing. If it doesn't, then we can pursue it as a likely
> bug in the 5.x branch and you can file an issue in Jira.
>
> Thanks,
> Shawn
>

--
Damien Kamerman