On 11/18/2016 6:50 PM, Chetas Joshi wrote:
> The numFound is millions but I was also trying with rows= 1 Million. I will 
> reduce it to 500K.
>
> I am sorry. It is state.json. I am using Solr 5.5.0
>
> One of the things I am not able to understand is why my ingestion job is
> complaining about "Cannot talk to ZooKeeper - Updates are disabled."
>
> I have a spark streaming job that continuously ingests into Solr. My shards 
> are always up and running. The moment I start a query on SolrCloud it starts 
> running into this exception. However as you said ZK will only update the 
> state of the cluster when the shards go down. Then why my job is trying to 
> contact ZK when the cluster is up and why is the exception about updating ZK?

SolrCloud and SolrJ (CloudSolrClient) both maintain constant connections
to all the ZooKeeper servers they are configured to use.  If ZooKeeper
quorum is lost, SolrCloud will go read-only -- no updating is possible.
That is what is meant by "updates are disabled."

Solr and Lucene are optimized for very low row counts, typically two or
three digits.  Asking for hundreds of thousands of rows is problematic.
The cursorMark feature is designed for efficient queries when paging
deeply into results, but it assumes your rows value is relatively small,
and that you will be making many queries to get a large number of
results, each of which will be fast and won't overload the server.
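
As a rough sketch of that paging pattern (not a drop-in implementation),
here is how a cursorMark loop might look in Python using only the standard
library.  The URL, the collection name, the "id" uniqueKey field and the
helper names fetch_page/iter_docs are assumptions made up for illustration.
The parts that come from Solr itself are: start with cursorMark=*, sort on
something that includes the uniqueKey field, pass back nextCursorMark on
each request, and stop when the cursor comes back unchanged.

```python
import json
import urllib.parse
import urllib.request


def fetch_page(base_url, params):
    """Send one select request to Solr and return the parsed JSON response."""
    url = base_url + "?" + urllib.parse.urlencode(params)
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def iter_docs(base_url, query, sort, rows=500, fetch=fetch_page):
    """Yield every matching document, fetching one small page at a time.

    `sort` must include the uniqueKey field (assumed here to be "id"),
    otherwise Solr rejects the cursorMark request.
    """
    cursor = "*"  # cursorMark=* starts a new cursor
    while True:
        params = {
            "q": query,
            "sort": sort,
            "rows": rows,  # keep this small; make many fast requests
            "cursorMark": cursor,
            "wt": "json",
        }
        data = fetch(base_url, params)
        for doc in data["response"]["docs"]:
            yield doc
        next_cursor = data["nextCursorMark"]
        if next_cursor == cursor:  # unchanged cursor: no more results
            break
        cursor = next_cursor


# Hypothetical usage; the host and collection name are placeholders:
# for doc in iter_docs("http://localhost:8983/solr/mycollection/select",
#                      "*:*", "id asc", rows=500):
#     handle(doc)
```

Each request only asks for a few hundred rows, so no single query has to
build or ship a huge response, and the total number of documents you can
walk through is not limited by the rows value.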

Since it appears you are having a performance issue, here are a few things
I have written on the topic:

https://wiki.apache.org/solr/SolrPerformanceProblems

Thanks,
Shawn
