On 4/20/2016 10:06 PM, Zap Org wrote:
> I have 5 zookeeper and 2 solr machines and after a month or two whole
> clustre shutdown i dont know why. The logs i get in zookeeper are attached
> below. otherwise i dont get any error. All this is based on linux VM.
>
> 2016-03-11 16:50:18,159 [myid:5] - WARN  [SyncThread:5:FileTxnLog@334] -
> fsync-ing the write ahead log in SyncThread:5 took 7268ms which will
> adversely effect operation latency. See the ZooKeeper troubleshooting guide
> 2016-03-11 16:50:18,161 [myid:5] - WARN  [NIOServerCxn.Factory:
> 0.0.0.0/0.0.0.0:2185:NIOServerCnxn@357] - caught end of stream exception
> EndOfStreamException: Unable to read additional data from client sessionid
> 0x4535f00ee370001, likely client has closed socket
> at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
> at
> org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
> at java.lang.Thread.run(Thread.java:745)

You'll need to further describe exactly what "whole cluster shutdown"
means.  I cannot tell from the logs, and there are very few situations I
can imagine where Solr would just die.  I will need to know which
version of Solr you are using.  If zookeeper is separate from Solr, that
version will also be needed.

The logs you have included are all WARN and INFO entries (no ERROR), and
say that the zookeeper client disconnected.  Assuming that this zookeeper
is only used for this one SolrCloud, the zookeeper client might be Solr,
an instance of CloudSolrClient, or it might be the zkcli script.

One of the later log entries said "/localhost" which suggests that this
is not set up the way I would recommend setting up a production
SolrCloud deployment.  I recommend running each Solr instance on a
separate machine with the same port number, each Zookeeper instance on a
separate machine with the same port number, and using an identical zkHost
string everywhere.  In that setup, Zookeeper and Solr might share
machines, but no machine will be running more than one of each kind of
process.  With that kind of setup, you will never use "localhost" or
"127.0.0.1" to connect to zookeeper.

There are no Solr logs included here, so if something is happening with
Solr, I cannot tell what it is.
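
To illustrate what I mean by an identical zkHost string, here is a rough
sketch.  The hostnames, the default port 2181, the "/solr" chroot, and the
helper name build_zk_host are all made up for the example and are not
taken from your setup.  It only shows the general shape
(host:port,host:port,.../chroot) and rejects loopback names:

```python
# Hypothetical helper: builds one zkHost string to share across all nodes.
# Hostnames, port, and chroot below are placeholders, not real values.
LOOPBACK = {"localhost", "127.0.0.1", "::1"}


def build_zk_host(hosts, port=2181, chroot=""):
    """Join ZooKeeper hosts into a single zkHost string.

    Raises ValueError if any host is a loopback name, since every Solr
    node should reach the ensemble through the same real hostnames.
    """
    for host in hosts:
        if host.lower() in LOOPBACK:
            raise ValueError(f"loopback address in ensemble: {host}")
    # The optional chroot is appended once, after the last host:port pair.
    return ",".join(f"{host}:{port}" for host in hosts) + chroot


if __name__ == "__main__":
    zk_host = build_zk_host(
        ["zk1.example.com", "zk2.example.com", "zk3.example.com"],
        chroot="/solr",
    )
    print(zk_host)
```

Every Solr node and every client would then be given that same string.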

Thanks,
Shawn
