Sure, the routing doesn't matter to the ADDREPLICA
command; you just give it a shard ID.
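
For example (collection, shard, and node names here are just placeholders;
the node parameter is optional, and if you leave it out Solr picks a node
for you):

http://localhost:8983/solr/admin/collections?action=ADDREPLICA&collection=yourCollection&shard=shard1&node=host2:8983_solr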

I'm more worried about how the nodes got out of
sync in the first place. Are _both_ Solr nodes on a
particular machine out of sync? And what is the evidence
that they are?

You can issue something like
'.../solr/coll_shard1_replica1/select?q=*:*&distrib=false'
against each _core_ and check whether the counts
are the same.
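
For instance (host names and core names below are just examples; check the
admin UI for the actual core names on each machine):

http://host1:8983/solr/coll_shard1_replica1/select?q=*:*&distrib=false&rows=0
http://host2:8983/solr/coll_shard1_replica2/select?q=*:*&distrib=false&rows=0

If the numFound values in the two responses differ, the replicas really are
out of sync.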

But in the normal course of events, this should all be
automatic. So what do you think caused the replicas
to get out of sync in the first place? And what's the symptom?

Best,
Erick

On Thu, Jun 2, 2016 at 10:46 PM, Ilan Schwarts <ila...@gmail.com> wrote:
> In my question I confused you: there are 2 shards and 2 nodes on each
> shard, one leader and one not. When the collection was created, the number
> of shards was 2 and the replication factor was 2.
> Now the status is that shard 1 has 2 out-of-sync nodes, so they need to be
> merged/synced. Do you still suggest the same approach? Add a replica to the
> damaged shard and then delete the bad one? If the collection was created
> with composite routing, is that possible?
> On Jun 3, 2016 4:18 AM, "Erick Erickson" <erickerick...@gmail.com> wrote:
>
>> A pedantic nit... leader/replica is not much like
>> "old master/slave".
>>
>> That out of the way, here's what I'd do.
>> 1> Use ADDREPLICA to add a new replica for the shard
>>    _on the same node as the bad one_.
>> 2> Once that has recovered (green in the admin UI) and you are
>>    confident of its integrity (you can verify by running queries
>>    against this new replica and the leader with &distrib=false),
>>    use DELETEREPLICA on the "bad" core (example call below).
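>>
>>    For example (collection, shard, and replica names below are
>>    placeholders; the replica parameter is the core_node name you can
>>    see in clusterstate.json or the admin UI):
>>
>>    http://localhost:8983/solr/admin/collections?action=DELETEREPLICA&collection=yourColl&shard=shard1&replica=core_node2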
>>
>> Best,
>> Erick
>>
>> On Wed, Jun 1, 2016 at 5:54 AM, Ilan Schwarts <ila...@gmail.com> wrote:
>> > Hi,
>> > We have SolrCloud 5.2.1 in our lab:
>> > 2 shards, each shard has 2 cores/nodes, and the replication factor is 1,
>> > meaning that one node is the leader (like the old master-slave setup).
>> > (Upon collection creation: numShards=1, rp=1.)
>> >
>> > Now there is a problem in the lab: shard 1 has 2 cores, but the number of
>> > docs is different, and when a document is added to one of the cores, it
>> > does not replicate the data to the other one.
>> > If I check the cluster state.json it appears fine; it says there are 2
>> > active cores and only 1 is set as leader.
>> >
>> > What is the recovery method for a scenario like this? I don't have the
>> > logs anymore and cannot reproduce it.
>> > Is it possible to merge the 2 cores into 1, and then split that core back
>> > into 2 cores?
>> > Or maybe to force a sync, if that is possible?
>> >
>> > The other shard, shard 2, is functioning well; replication works fine, and
>> > when a document is added to 1 core, it is replicated to the other.
>> >
>> > --
>> > Ilan Schwarts
>>
