We are running a test that ingests 2B records into a single collection within
24 hours. The collection is spread across 10 Solr nodes with a replication
factor of 2.
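
To put the target in perspective, here is a rough calculation of the sustained rate it implies. This is only a sketch: it assumes documents arrive at an even rate over the full 24 hours and uses the 10k batch size mentioned further down. The function name `required_rates` is made up for illustration.

```python
# Back-of-the-envelope ingest rate for 2B records in 24 hours.
# Assumption: load is spread evenly over the whole window.
TOTAL_DOCS = 2_000_000_000
WINDOW_SECONDS = 24 * 60 * 60
BATCH_SIZE = 10_000  # batch size used for insertions


def required_rates(total_docs, window_seconds, batch_size):
    """Return (documents per second, batches per second) needed to hit the target."""
    docs_per_sec = total_docs / window_seconds
    batches_per_sec = docs_per_sec / batch_size
    return docs_per_sec, batches_per_sec


docs_per_sec, batches_per_sec = required_rates(TOTAL_DOCS, WINDOW_SECONDS, BATCH_SIZE)
print(f"{docs_per_sec:,.0f} docs/s, {batches_per_sec:.2f} batches/s")
# prints "23,148 docs/s, 2.31 batches/s"
```

At that rate a single indexing thread has roughly 430 ms per 10k batch. Once a batch takes 10 s, keeping up would need about 23 batches in flight at the same time.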

We are noticing many replicas going into recovery while indexing, which is
degrading indexing performance.
We are observing errors like:

*org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:
Error from server at http://host:8983/solr/test_shard13_replica_n50*


*Expected mime type application/octet-stream but got application/json.*
*o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: No
registered leader was found after waiting for 4000ms*

Sometimes both replicas of a shard go into recovery, and the error log shows
a ZooKeeper-related message saying that a leader cannot be elected.
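
One way to quantify how often replicas leave the active state is to poll the Collections API `CLUSTERSTATUS` action and list every replica whose state is not `active`. The following is a minimal sketch, not a definitive tool. It assumes the usual JSON layout of that response (`cluster`, then `collections`, then shards, then replicas with a `state` field). The helper names, the `core_node…` replica names and the sample payload are hypothetical, and the base URL simply reuses the `http://host:8983/solr` form from the error above.

```python
import json
from urllib.request import urlopen


def fetch_cluster_status(base_url, collection):
    """Fetch CLUSTERSTATUS for one collection (base_url like http://host:8983/solr)."""
    url = (f"{base_url}/admin/collections"
           f"?action=CLUSTERSTATUS&collection={collection}&wt=json")
    with urlopen(url, timeout=10) as resp:
        return json.load(resp)


def non_active_replicas(status, collection):
    """Return sorted (shard, replica, state) tuples for replicas that are not active."""
    shards = status["cluster"]["collections"][collection]["shards"]
    found = []
    for shard_name, shard in shards.items():
        for replica_name, replica in shard["replicas"].items():
            state = replica.get("state")
            if state != "active":
                found.append((shard_name, replica_name, state))
    return sorted(found)


# Hypothetical sample payload, trimmed to the fields the helper reads.
sample_status = {
    "cluster": {"collections": {"test": {"shards": {
        "shard13": {"replicas": {
            "core_node51": {"state": "recovering"},
            "core_node52": {"state": "active"},
        }},
        "shard14": {"replicas": {
            "core_node53": {"state": "active"},
            "core_node54": {"state": "down"},
        }},
    }}}}
}

for shard, replica, state in non_active_replicas(sample_status, "test"):
    print(shard, replica, state)
```

Against a live cluster, `non_active_replicas(fetch_cluster_status("http://host:8983/solr", "test"), "test")` could be run every few seconds during indexing and the output compared with the timestamps of the errors above.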

Also regarding indexing performance: when we left a run going overnight, by
morning the time per 10k-record batch insertion had degraded from <1000ms to
>10s.
However, we have noticed that restarting Solr on all nodes restores the
better performance. We are using Solr 7.4. What could be the issue here?
