Hmmm, each ZK instance usually needs a couple of ports. Is it possible that you've got more than one process using one of those ports?
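One way to check whether anything is answering on those ports is a plain TCP probe run from each VM. The following is only a minimal sketch, assuming Python is available on the machine; the function names `port_is_open` and `probe` are hypothetical and not part of Solr or ZooKeeper.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection raises OSError (including timeouts and refusals) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def probe(hosts, ports, timeout=3.0):
    """Map every (host, port) pair to True/False depending on reachability."""
    return {(h, p): port_is_open(h, p, timeout) for h in hosts for p in ports}
```

For example, `probe(["10.0.0.4", "10.0.0.5", "10.0.0.6"], [2181, 2888, 3888])` run on each VM in turn would show which peers are unreachable from where. A successful connect only shows that some process is listening, not which one; on Windows, `netstat -ano` lists the owning PID for each listening port. Note also that the zkcli commands quoted in this thread target ports 2183 and 2182 while the inbound firewall rules list 2181, so the client port worth probing is whatever `clientPort` each zoo.cfg actually sets.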
By convention (I think), ZooKeeper uses "peer port + 1000" for its leader
election port (2888 and 3888 in the docs' example); both ports are set
explicitly in each server.N line of zoo.cfg. See the "Running Replicated
ZooKeeper" section of:
https://zookeeper.apache.org/doc/r3.3.3/zookeeperStarted.html

I'm not quite clear whether the "ZK2 port" and "ZK3 port" in your firewall
list are just meant to be ports of a single ZooKeeper instance on each node
or not, so I thought I'd check.

A firewall block should fail every time, not intermittently, so I'm puzzled
about that.

Best,
Erick

On Fri, Oct 2, 2015 at 1:33 AM, Adrian Liew <adrian.l...@avanade.com> wrote:
> Hi Edwin,
>
> I have followed the standards recommended by the ZooKeeper article. It
> seems to be working.
>
> Incidentally, I am facing intermittent issues whereby I am unable to
> connect to the ZooKeeper service via Solr's zkCli.bat command, even after
> setting up automatic startup of my ZooKeeper service. I have configured
> nssm (non-sucking-service-manager) to auto-start Solr with a dependency
> on ZooKeeper, so that both services are running on startup for each
> Solr VM.
>
> Here is an example of what I tried to run to connect to the ZK service:
>
> E:\solr-5.3.0\server\scripts\cloud-scripts>zkcli.bat -z 10.0.0.6:2183 -cmd list
> Exception in thread "main" org.apache.solr.common.SolrException: java.util.concurrent.TimeoutException: Could not connect to ZooKeeper 10.0.0.6:2183 within 30000 ms
>         at org.apache.solr.common.cloud.SolrZkClient.<init>(SolrZkClient.java:181)
>         at org.apache.solr.common.cloud.SolrZkClient.<init>(SolrZkClient.java:115)
>         at org.apache.solr.common.cloud.SolrZkClient.<init>(SolrZkClient.java:105)
>         at org.apache.solr.cloud.ZkCLI.main(ZkCLI.java:181)
> Caused by: java.util.concurrent.TimeoutException: Could not connect to ZooKeeper 10.0.0.6:2183 within 30000 ms
>         at org.apache.solr.common.cloud.ConnectionManager.waitForConnected(ConnectionManager.java:208)
>         at org.apache.solr.common.cloud.SolrZkClient.<init>(SolrZkClient.java:173)
>         ... 3 more
>
> Further to this, I inspected the output shown in the console window by
> zkServer.cmd:
>
> 2015-10-02 08:24:09,305 [myid:3] - WARN  [WorkerSender[myid=3]:QuorumCnxManager@382] - Cannot open channel to 2 at election address /10.0.0.5:3888
> java.net.SocketTimeoutException: connect timed out
>         at java.net.DualStackPlainSocketImpl.waitForConnect(Native Method)
>         at java.net.DualStackPlainSocketImpl.socketConnect(Unknown Source)
>         at java.net.AbstractPlainSocketImpl.doConnect(Unknown Source)
>         at java.net.AbstractPlainSocketImpl.connectToAddress(Unknown Source)
>         at java.net.AbstractPlainSocketImpl.connect(Unknown Source)
>         at java.net.PlainSocketImpl.connect(Unknown Source)
>         at java.net.SocksSocketImpl.connect(Unknown Source)
>         at java.net.Socket.connect(Unknown Source)
>         at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:368)
>         at org.apache.zookeeper.server.quorum.QuorumCnxManager.toSend(QuorumCnxManager.java:341)
>         at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.process(FastLeaderElection.java:449)
>         at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.run(FastLeaderElection.java:430)
>         at java.lang.Thread.run(Unknown Source)
> 2015-10-02 08:24:09,305 [myid:3] - INFO  [WorkerReceiver[myid=3]:FastLeaderElection@597] - Notification: 1 (message format version), 3 (n.leader), 0x700000011 (n.zxid), 0x1 (n.round), LOOKING (n.state), 3 (n.sid), 0x7 (n.peerEpoch) LOOKING (my state)
>
> I noticed the error message from zkServer.cmd: "Cannot open channel to 2
> at election address /10.0.0.5:3888".
>
> Can firewall settings be the issue here? I feel this may be a network
> issue between the individual Solr VMs. I am using a Windows Server 2012 R2
> 64-bit environment to run ZooKeeper 3.4.6 and Solr 5.3.0.
>
> Currently, I have set up the firewall on each Azure VM (Phoenix-Solr-0,
> Phoenix-Solr-1, Phoenix-Solr-2) in the Firewall Advanced Security
> Settings as below.
>
> For allowed inbound connections:
>
> Solr port 8983
> ZK1 port 2181
> ZK2 port 2888
> ZK3 port 3888
>
> Regards,
> Adrian
>
> -----Original Message-----
> From: Zheng Lin Edwin Yeo [mailto:edwinye...@gmail.com]
> Sent: Friday, October 2, 2015 11:03 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Cannot connect to a zookeeper 3.4.6 instance via zkCli.cmd
>
> Hi Adrian,
>
> What is your system setup like? By right it shouldn't be an issue if
> we use different ports.
>
> In fact, if the various ZooKeeper instances are running on a single
> machine, they have to be on different ports in order for it to work.
>
> Regards,
> Edwin
>
> On 1 October 2015 at 18:19, Adrian Liew <adrian.l...@avanade.com> wrote:
>
>> Hi all,
>>
>> The problem below was resolved by setting my server IP addresses as
>> follows in each zoo.cfg:
>>
>> server.1=10.0.0.4:2888:3888
>> server.2=10.0.0.5:2888:3888
>> server.3=10.0.0.6:2888:3888
>>
>> as opposed to the following:
>>
>> server.1=10.0.0.4:2888:3888
>> server.2=10.0.0.5:2889:3889
>> server.3=10.0.0.6:2890:3890
>>
>> I am not sure why the above can be an issue (by right it should not);
>> however, I followed the recommendations provided by the ZooKeeper
>> administration guide under RunningReplicatedZooKeeper (
>> https://zookeeper.apache.org/doc/r3.1.2/zookeeperStarted.html#sc_RunningReplicatedZooKeeper
>> )
>>
>> Given that I am testing multiple servers in a multiserver environment,
>> it will be safe to use 2888:3888 on each server rather than have
>> different ports.
>>
>> Regards,
>> Adrian
>>
>> From: Adrian Liew [mailto:adrian.l...@avanade.com]
>> Sent: Thursday, October 1, 2015 5:32 PM
>> To: solr-user@lucene.apache.org
>> Subject: Cannot connect to a zookeeper 3.4.6 instance via zkCli.cmd
>>
>> Hi there,
>>
>> Currently, I have set up an Azure virtual network to connect my
>> ZooKeeper cluster together across three Azure VMs. The VMs have
>> internal IPs of 10.0.0.4, 10.0.0.5 and 10.0.0.6. I have also set up
>> Solr 5.3.0, which runs in SolrCloud mode connected to all three
>> ZooKeepers as an external ensemble.
>>
>> I am able to connect to 10.0.0.4 and 10.0.0.6 via zkCli.cmd after
>> starting the ZooKeeper services. However, for 10.0.0.5, I keep getting
>> the below error even after starting the ZooKeeper service.
>>
>> [cid:image001.png@01D0FC6E.BDC2D990]
>>
>> I have restarted the 10.0.0.5 VM several times and am still unable to
>> connect to ZooKeeper via zkCli.cmd. I have checked zoo.cfg (making
>> sure ports, data and logs are all set correctly) and myid to ensure
>> they have the correct configurations.
>>
>> The simple command line I used to connect to ZooKeeper is, for
>> example, zkCli.cmd -server 10.0.0.5:2182.
>>
>> Any ideas?
>>
>> Best regards,
>>
>> Adrian Liew | Consultant Application Developer
>> Avanade Malaysia Sdn. Bhd. | Consulting Services
>> Direct: +(603) 2382 5668
>> +6010-2288030
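
The `server.N=host:peerPort:electionPort` entries discussed in this thread can be pulled out of a zoo.cfg and compared across nodes mechanically. The following is a rough sketch, not ZooKeeper's own parser: the function name `parse_servers` is hypothetical, and it assumes only the simple three-field form quoted above.

```python
import re

# Matches the simple form used in this thread: server.<id>=<host>:<peer>:<election>
SERVER_LINE = re.compile(r"^server\.(\d+)\s*=\s*([^:\s]+):(\d+):(\d+)")


def parse_servers(cfg_text):
    """Return {server_id: (host, peer_port, election_port)} from zoo.cfg text."""
    servers = {}
    for line in cfg_text.splitlines():
        match = SERVER_LINE.match(line.strip())
        if match:
            sid, host, peer, election = match.groups()
            servers[int(sid)] = (host, int(peer), int(election))
    return servers


# The working configuration reported in the thread.
cfg = """\
server.1=10.0.0.4:2888:3888
server.2=10.0.0.5:2888:3888
server.3=10.0.0.6:2888:3888
"""

for sid, (host, peer, election) in sorted(parse_servers(cfg).items()):
    print(sid, host, peer, election)
```

Running the same function over the zoo.cfg from each VM and comparing the resulting dictionaries is one quick way to confirm that all three nodes agree on every peer and election port.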