On Nov 29, 2012, at 1:26 PM, Daniel Collins <danwcoll...@gmail.com> wrote:

> Hi Mark,
> 
> I get that use case, if the non-leader dies, when it comes back it has to
> allow for recovery, that makes perfect sense.
> 
> I guess I was (naively!) assuming there was an optimized scenario if the
> leader dies, and is the first one to come back (is still therefore leader),
> there is no recovery to do as by definition no updates can have been made
> whilst the shard was inactive.
> 
> Aside: Interesting point about Solr only acking updates when they are on every
> replica, are you talking about when the records are removed from the
> transaction log?
> 
> My understanding was that the external "update" request completes as soon as
> the document has made it to the leader's transaction log (might not even
> have committed into the leader index), and the replicas then were pushed
> those updates as they became available.

No, currently it won't return until the update hits the replicas - it's sent to 
replicas in parallel.
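
As a rough illustration of that flow (a toy model, not Solr's actual code): the
leader records the update, forwards it to the replicas in parallel, and only
returns to the client once every replica has responded. The names `Replica`,
`apply` and `leader_update` are hypothetical, and the sketch assumes the list
passed in contains only live replicas.

```python
from concurrent.futures import ThreadPoolExecutor


class Replica:
    """Hypothetical stand-in for a replica; it just records what it receives."""

    def __init__(self):
        self.log = []

    def apply(self, doc):
        self.log.append(doc)
        return True  # acknowledge the update


def leader_update(doc, leader_log, replicas):
    """Toy model: record locally, forward in parallel, ack only after all replicas ack."""
    leader_log.append(doc)
    if not replicas:
        return True
    # Send the update to every replica at the same time.
    with ThreadPoolExecutor(max_workers=len(replicas)) as pool:
        futures = [pool.submit(replica.apply, doc) for replica in replicas]
        # Block until each replica has answered before returning to the client.
        results = [future.result() for future in futures]
    return all(results)
```

The point of the sketch is only the ordering: the client's request does not
complete until the parallel sends have all come back.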

> 
> If a single replica dies, the leader can still process update/add document
> requests, so it can't be waiting for replicas in that scenario?

There should be no wait if there are any nodes waiting in line to be leader - 
it should only wait when a node comes up and realizes it's the leader and no 
one else was in line to be leader.
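
Put as a condition, the rule above might look like the following sketch. The
function name and parameters are hypothetical and are not taken from the Solr
code base; it only restates when the wait applies.

```python
def should_wait_for_replicas(is_leader: bool, others_in_line: int) -> bool:
    """Toy restatement of the rule: wait only if this node came up as leader
    and no other node was queued to become leader."""
    return is_leader and others_in_line == 0
```

So a node that comes up as leader with other candidates already in line does
not wait, and neither does a node that is not the leader.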

- Mark

> 
>> On Nov 28, 2012, at 11:58 AM, Mark Miller <markrmil...@gmail.com> wrote:
>> 
>> 
>> and we don't want to lose any updates.
>> 
>> 
>> That's probably somewhat inaccurate - in this case it's more about
>> consistency - we only ack updates once they are on every replica. So it's
>> not a lost updates issue, but a consistency issue.
>> 
>> The lost updates part is more like when you stop the cluster, then you
>> start an old shard or something before starting more recent shards - you
>> don't want that thing to become the leader because the other shards were
>> not up yet.
>> 
>> - Mark
>> 
>> 
