I've also seen timeouts with zkCli.sh from Solr 8.4 when it is connected to three ZK nodes and the first one is not accessible. Solr 8.4 has ZK 3.5.5 while 7.x has ZK 3.4.x.
Jan Høydahl

> On 10 Jan 2020 at 17:44, Markus Jelsma <markus.jel...@openindex.io> wrote:
>
> Hello,
>
> I have multiple collections, one on 7.5.0 and the rest on 8.3.1. They all
> share the same ZK ensemble and have the same ZK connection string. The first
> ZK address in the connection string is not reachable (it seems to be
> firewalled); the rest are accessible.
>
> The 7.5.0 nodes do not appear to have problems with a partially accessible ZK
> ensemble. They gave a simple warning, but the cores on the nodes keep starting
> up nicely.
>
> I have trouble starting up 8.x nodes because they time out when connecting to
> ZK. The logs are filled with:
>
> 2020-01-10 16:33:33.146 WARN (qtp1620948294-21) [ ]
> o.a.s.h.a.ZookeeperStatusHandler Failed talking to zookeeper bad_node1:2181
> => org.apache.solr.common.SolrException: Failed talking to Zookeeper
> 89.188.14.28:2181
>     at org.apache.solr.handler.admin.ZookeeperStatusHandler.getZkRawResponse(ZookeeperStatusHandler.java:245)
>
> And I get this one for one of the cores on a restarted node:
>
> 2020-01-10 16:31:11.752 ERROR
> (searcherExecutor-12-thread-1-processing-n:s2.io:8983_solr
> x:documents_shard2_replica_t19 c:documents s:shard2 r:core_node20)
> [c:documents s:shard2 r:core_node20 x:documents_shard2_replica_t19]
> o.a.s.h.RequestHandlerBase java.lang.NullPointerException
>     at org.apache.solr.handler.component.SearchHandler.initComponents(SearchHandler.java:183)
>
> This one is probably preventing the core from getting properly loaded. On
> the same node, however, there is another shard of the same collection, which
> did start up normally, as did other cores on the node.
>
> Is this a known 8.x problem? I can work around it by temporarily removing the
> bad node address from the ZK connection string, but that's all.
>
> Thanks,
> Markus
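
For the workaround Markus mentions (temporarily dropping the unreachable address from the ZK connection string), something along these lines might help to find out which entries are actually reachable from a given Solr host. It is only a rough sketch and not anything that ships with Solr or ZooKeeper: the function names are made up for illustration, it assumes plain `host:port` entries (no IPv6 literals) with an optional chroot suffix, it falls back to the default client port 2181 when no port is given, and a successful TCP connect only shows that the port is open, not that the ZK server is healthy.

```python
import socket


def parse_zk_connection_string(conn):
    """Split 'h1:2181,h2:2181/chroot' into ([(host, port), ...], chroot).

    Hypothetical helper: assumes plain host[:port] entries, no IPv6 literals.
    """
    hosts_part, slash, chroot = conn.partition("/")
    chroot = slash + chroot  # empty string when no chroot is given
    servers = []
    for entry in hosts_part.split(","):
        entry = entry.strip()
        if not entry:
            continue
        host, _, port = entry.partition(":")
        # 2181 is ZooKeeper's default client port
        servers.append((host, int(port) if port else 2181))
    return servers, chroot


def is_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures
        return False


def reachable_connection_string(conn, probe=is_reachable):
    """Rebuild the connection string, keeping only entries the probe accepts."""
    servers, chroot = parse_zk_connection_string(conn)
    alive = [f"{host}:{port}" for host, port in servers if probe(host, port)]
    return ",".join(alive) + chroot
```

Calling `reachable_connection_string("bad_node1:2181,zk2:2181,zk3:2181")` on the affected host would give a candidate string to try in place of the full one (`zk2` and `zk3` are placeholder names here). The `probe` argument is there so the filtering can be tried out without touching the network.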