Hi,

I have a SolrCloud (on HDFS) of 50 nodes and a ZK quorum of 5 nodes. The
SolrCloud nodes have difficulty talking to ZK while I am ingesting data
into the collections. At the same time I am also running queries (that
return millions of docs). The ingest job fails with the following exception:

org.apache.solr.client.solrj.impl.CloudSolrClient$RouteException: Error
from server at http://xxx/solr/collection1_shard15_replica1: Cannot talk to
ZooKeeper - Updates are disabled.

I think this is happening when the ingest job is trying to update the
clusterstate.json file while the query is reading from that file and thus
holds some kind of lock on it. Are there any factors that would cause the
"READ" to hold a lock for a long time? Is my understanding correct? I am
using the cursor approach (cursorMark) via SolrJ to get results back from Solr.

How often is ZK updated with the latest cluster state, and what parameter
governs that? Should I just increase the ZK client timeout so that Solr
keeps retrying the ZK connection for a longer period of time (right now
it is 15 seconds)?
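
For reference, the timeout I mean is the zkClientTimeout setting. Assuming
a stock solr.xml layout (my actual file may differ in the details), it
would look roughly like this, with 15000 ms being the 15 seconds mentioned
above:

```xml
<solr>
  <solrcloud>
    <!-- ZooKeeper client session timeout in milliseconds.
         15000 ms = the 15 seconds currently in use (assumed layout). -->
    <int name="zkClientTimeout">${zkClientTimeout:15000}</int>
  </solrcloud>
</solr>
```

With this form, the value can also be overridden at startup through the
zkClientTimeout system property rather than by editing the file.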

Thanks!
