Hi folks,

Been doing some SolrCloud testing and I've been experiencing some
problems. I'll try to be relatively brief, but feel free to ask for
additional information.

I've added about 200 million documents to a SolrCloud. The cloud
contains 3 collections, and all documents were added to all three
collections.

While indexing these documents, we noticed 486k (!!) "No registered
leader was found" errors, 482k (!!) of which referred to the same shard.
The remaining errors are more or less evenly distributed across the
other shards.
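
For anyone who wants to reproduce this kind of per-shard tally, here is a
minimal sketch. It assumes the error text has the same shape as the
stack trace header further down ("No registered leader was found ...
slice: <name>"). The function name and the regex are purely
illustrative, not part of Solr or SolrJ.

```python
import re
from collections import Counter

# Assumed message shape, copied from the stack trace in this mail:
#   "No registered leader was found after waiting for 4000ms ,
#    collection: <collection> slice: <slice>"
LEADER_ERROR = re.compile(r"No registered leader was found.*?slice:\s*(\S+)")


def count_leader_errors(log_text):
    """Return a Counter mapping slice name -> number of leader errors."""
    # Collapse all whitespace first, so a message that was wrapped over
    # several lines (as in this mail) still matches the pattern.
    flat = " ".join(log_text.split())
    return Counter(LEADER_ERROR.findall(flat))
```

Calling count_leader_errors() on the contents of a log file and printing
most_common() on the result lists the worst shards first. The sketch
reads the whole text into memory, so a very large log would have to be
processed in chunks.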

This indexing job has been running for about 5 days now, and is pretty
much IO-bound. CPU usage is ~50%. The load average, on the other hand,
has been 128 for 5 days straight, which is high, but fine: the machine
is still responsive.

Memory usage is fine. Most of it is going towards file system caches and
the like. Each Solr instance has 8GB Xmx, and is currently using about
7GB. I haven't noticed any OutOfMemoryErrors in the log files.

Monitoring shows that both Solr instances have been up throughout these
proceedings.

Now, I'm willing to accept that these Solr instances don't have enough
memory, or anything else, but I'm not seeing any of this reflected in
the log files, which I'm finding troubling.

What I do notice in the log file is the very vague "SolrException:
Service Unavailable". See below.

Could anyone shed some light on what could be causing these errors?

Thanks a bunch,

 - Bram


SolrCloud Setup:
----------------

- Version: 5.4.0
- 3 Collections
-- firstCollection : 18 shards
-- secondCollection: 36 shards
-- thirdCollection : 79 shards
- Routing: implicit
- 2 Solr Instances
-- 8GB Xmx.

Machine:
--------
- Hexacore Xeon E5-1650
- 64GB RAM
- 50TB Disk (RAID6, 10 disks)

Leader Stack Trace:
-------------------

Caused by: org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: No registered leader was found after waiting for 4000ms , collection: biweekly slice: thirdCollectionShard39
        at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:495) ~[solr-solrj-4.7.1.jar:4.7.1 1582953 - sarowe - 2014-03-29 00:43:32]
        at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:199) ~[solr-solrj-4.7.1.jar:4.7.1 1582953 - sarowe - 2014-03-29 00:43:32]
        at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:118) ~[solr-solrj-4.7.1.jar:4.7.1 1582953 - sarowe - 2014-03-29 00:43:32]
        at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:116) ~[solr-solrj-4.7.1.jar:4.7.1 1582953 - sarowe - 2014-03-29 00:43:32]
        at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:102) ~[solr-solrj-4.7.1.jar:4.7.1 1582953 - sarowe - 2014-03-29 00:43:32]


Service Unavailable Log:
------------------------


527280878 ERROR (qtp59559151-194160) [c:collectionTwo s:collectionTwoShard12 r:core_node12 x:collectionTwo_collectionTwoShard12_replica1] o.a.s.u.SolrCmdDistributor forwarding update to http://[CENSORED]:8983/solr/collectionTwo_collectionTwoShard1_replica1/ failed - retrying ... retries: 15 add{,id=000195641101} params:update.distrib=TOLEADER&distrib.from=http://[CENSORED]:6666/solr/collectionTwo_collectionTwoShard12_replica1/ rsp:503:org.apache.solr.common.SolrException: Service Unavailable


