Hi,

I have a problem with SolrCloud in a specific test case, and I would like
to know whether this is how it is supposed to work or whether there is any
way to avoid it.

I have the following scenario:

- Three machines
- Each one runs one ZooKeeper and one Solr 4.1.0 instance
- Each Solr stores 7 million documents and the index is 2 GB

The test consists of sending queries to Solr (100 concurrent queries,
continuously) and then forcing a leader failure by shutting down both
ZooKeeper and Solr.
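
A minimal sketch of this kind of load test, for anyone who wants to
reproduce it. The host name, port, collection name and query in SOLR_URL
are placeholders (assumptions, not the real setup), and the helper names
run_load and http_fetch are hypothetical:

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen

# Hypothetical endpoint: host, port and collection name are placeholders.
SOLR_URL = "http://solr1:8983/solr/collection1/select?q=*:*&wt=json"


def http_fetch(url):
    """Return True if the server answers the query with HTTP 200."""
    with urlopen(url, timeout=5) as resp:
        return resp.status == 200


def run_load(fetch, url, concurrency=100, total=1000):
    """Issue `total` queries with up to `concurrency` in flight at once
    and count how many succeeded and how many failed."""
    def one(_):
        try:
            return bool(fetch(url))
        except Exception:
            # Timeouts, refused connections and HTTP errors count as failures.
            return False

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(one, range(total)))
    ok = sum(results)
    return {"ok": ok, "failed": total - ok}


if __name__ == "__main__":
    print(run_load(http_fetch, SOLR_URL))
```

Running it in a loop while a node is shut down gives a rough count of
failed queries during the leader election and the recovery.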

When we shut down any Solr that is not the leader, there are no problems:
the other two keep responding to the queries. However, if we shut down the
leader, the following happens:

- Both remaining Solr nodes keep responding to queries until the leader
election starts.
- One of them is elected leader and the other one stops responding to
queries (I've read that it goes into recovery mode until its index is
synchronized with the leader's).
- Then, even though both indexes are the same (they were synchronized
before the leader failure), the whole index is replicated.
- While the 2 GB are being replicated from the leader to the remaining
server, the recovering server does not respond to queries. The leader
therefore has to serve all of the queries and finally crashes because it
has too many queries to answer (on top of replicating its index).

My question is: is it normal for the whole index to be replicated on a
leader change, even though the indexes of the new leader and the other
Solr should be the same? Is there any way to avoid it? Maybe I have
something configured wrong? Would upgrading Solr to 4.5.x avoid this
behavior?

Aside from this problem everything seems to work fine, but this point of
failure is too risky for us.

Thanks in advance


-- 
Alejandro Marqués Rodríguez

Paradigma Tecnológico
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42
