On 4/20/2016 10:06 PM, Zap Org wrote:
> I have 5 zookeeper and 2 solr machines and after a month or two whole
> clustre shutdown i dont know why. The logs i get in zookeeper are attached
> below. otherwise i dont get any error. All this is based on linux VM.
>
> 2016-03-11 16:50:18,159 [myid:5] - WARN [SyncThread:5:FileTxnLog@334] -
> fsync-ing the write ahead log in SyncThread:5 took 7268ms which will
> adversely effect operation latency. See the ZooKeeper troubleshooting guide
> 2016-03-11 16:50:18,161 [myid:5] - WARN [NIOServerCxn.Factory:
> 0.0.0.0/0.0.0.0:2185:NIOServerCnxn@357] - caught end of stream exception
> EndOfStreamException: Unable to read additional data from client sessionid
> 0x4535f00ee370001, likely client has closed socket
> at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
> at
> org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
> at java.lang.Thread.run(Thread.java:745)

You'll need to describe exactly what "whole cluster shutdown" means. I cannot tell from the logs, and there are very few situations I can imagine where Solr would just die.

I will need to know which version of Solr you are using. If zookeeper is separate from Solr, that version will also be needed.

The logs you have included are all WARN and INFO logs (no ERROR), and say that the zookeeper client disconnected. Assuming that this zookeeper is only used for this one SolrCloud, the zookeeper client might be Solr, an instance of CloudSolrClient, or it might be the zkcli script.

One of the later log entries said "/localhost", which suggests that this is not set up the way I would recommend for a production SolrCloud deployment. I recommend running each Solr on a separate machine using the same port number, each Zookeeper on a separate machine using the same port number, and everything using an identical zkHost string. In that setup, Zookeeper and Solr might share machines, but none of the machines will be running more than one of each kind of process. If you are running that kind of setup, you will never be using "localhost" or "127.0.0.1" for connecting to zookeeper.

There are no Solr logs included here, so if something is happening with Solr, I cannot tell what it is.

Thanks,
Shawn
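
P.S. To illustrate the zkHost point above, here is a rough Python sketch of how one identical zkHost string could be built for every node, rejecting loopback names along the way. The host names, the port 2181, the "/solr" chroot, and the helper name build_zk_host are all assumptions made up for the example; substitute whatever your ensemble actually uses.

```python
# Hypothetical host names: replace with the real ZooKeeper machines.
ZK_HOSTS = [
    "zk1.example.com",
    "zk2.example.com",
    "zk3.example.com",
    "zk4.example.com",
    "zk5.example.com",
]
ZK_PORT = 2181  # assumption: the same client port on every ZooKeeper machine

# Names that should never appear in a multi-machine zkHost string.
LOOPBACK_NAMES = {"localhost", "127.0.0.1", "::1"}


def build_zk_host(hosts, port, chroot=""):
    """Return a zkHost string such as 'a:2181,b:2181/solr'."""
    bad = [h for h in hosts if h.lower() in LOOPBACK_NAMES]
    if bad:
        raise ValueError(f"loopback address in ensemble list: {bad}")
    # The optional chroot is appended once, after the last host:port pair.
    return ",".join(f"{h}:{port}" for h in hosts) + chroot


# Every Solr node and every client would be given this same string.
zk_host = build_zk_host(ZK_HOSTS, ZK_PORT, chroot="/solr")
print(zk_host)
```

The idea is only that the string is generated in one place and handed unchanged to every Solr instance and client, so no machine ends up pointing at "localhost".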