Re: Different docs order in different replicas of the same shard

Erick Erickson Fri, 25 May 2018 07:59:24 -0700

For (1), it's not a problem. Every update goes through the leader,
where it gets a version stamp (the _version_ field). So if doc1 is
updated twice, the leader assigns each update its own version stamp.
Call the two versions doc1.1 and doc1.2. If replica X sees doc1.2
first, it indexes it. If it subsequently sees doc1.1, it'll reject it
and the caller will have to decide what to do. If the caller thinks
their copy should really be the "one true copy", it can re-submit the
doc and it'll be assigned a new version (say doc1.3), and when replica
X sees it, it'll be indexed.

If replica X sees them in order (doc1.1, then doc1.2), then the second
doc replaces the first.

The point is that you can guarantee consistency, i.e. all replicas
have the same document.
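
To make the version check concrete, here is a toy model in Python. It
is only a sketch of the idea described above, not Solr's actual code:
the Replica class, its apply() method, and the integer versions are
all made up for illustration.

```python
class Replica:
    """Toy replica: keeps the highest-versioned copy of each doc."""

    def __init__(self):
        self.docs = {}  # doc id -> (version, body)

    def apply(self, doc_id, version, body):
        """Index the update unless a newer or equal version is already held."""
        current = self.docs.get(doc_id)
        if current is not None and current[0] >= version:
            return False  # stale update, rejected
        self.docs[doc_id] = (version, body)
        return True


# Replica that sees doc1.1, then doc1.2: the second replaces the first.
in_order = Replica()
in_order.apply("doc1", 1, "first edit")
in_order.apply("doc1", 2, "second edit")

# Replica that sees doc1.2 first: the late doc1.1 is rejected.
out_of_order = Replica()
out_of_order.apply("doc1", 2, "second edit")
rejected = not out_of_order.apply("doc1", 1, "first edit")

same_content = in_order.docs == out_of_order.docs
```

Either arrival order leaves both toy replicas holding the same copy of
doc1, which is the consistency guarantee in question.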

Sorting is a different thing though, and the _same_ document can be
sorted differently depending on which replica it's on. This is for two
reasons:
1> deleted docs still contribute to scoring until they're "merged
away" as part of normal indexing, therefore the score may be slightly
different for the same doc, depending on the replica.
2> tied scores are broken by the internal Lucene document ID, and due
to merging the internal ID of two docs relative to each other may be
different on different replicas.

<1> is "just how it works"
<2> can be handled by always specifying a deterministic tiebreaker as
the last sort clause, so that docs that tie on every other sort
criterion still come back in the same order. The "id" field is a good
one to use.
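
As a rough illustration of <2>, the Python sketch below fakes two
replicas that hold the same docs under different internal doc IDs. The
hit tuples and scores are invented, and "id" is assumed to be the
collection's uniqueKey field. The last line builds the request
parameters with the tiebreaker appended to the sort.

```python
from urllib.parse import urlencode

# (internal Lucene doc ID, uniqueKey "id", score) -- invented values.
# Both replicas hold the same docs, but merging left A and B with
# different relative internal IDs.
replica_a = [(0, "A", 1.0), (1, "B", 1.0), (2, "C", 0.5)]
replica_b = [(0, "B", 1.0), (1, "A", 1.0), (2, "C", 0.5)]


def by_score_only(hits):
    # Score descending; ties fall back to the internal doc ID.
    ranked = sorted(hits, key=lambda h: (-h[2], h[0]))
    return [uid for _, uid, _ in ranked]


def by_score_then_id(hits):
    # Score descending; ties broken by the stable "id" field instead.
    ranked = sorted(hits, key=lambda h: (-h[2], h[1]))
    return [uid for _, uid, _ in ranked]


order_a = by_score_only(replica_a)      # A before B
order_b = by_score_only(replica_b)      # B before A
stable_a = by_score_then_id(replica_a)  # same order on both replicas
stable_b = by_score_then_id(replica_b)

# Query parameters with the tiebreaker as the final sort clause.
params = urlencode({"q": "*:*", "sort": "score desc, id asc"})
```

With score as the only sort, the two toy replicas disagree on the
order of A and B. With "id asc" appended, they agree.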

There's a lot more detail here:
https://medium.com/@sarkaramrit2/getting-different-results-while-issuing-a-query-multiple-times-in-solrcloud-632103096076

Best,
Erick

On Fri, May 25, 2018 at 6:28 AM, SOLR4189 <klin892...@yandex.ru> wrote:
> I use SOLR-6.5.1 and I want to start to use replicas.
>
> For it I want to understand something:
>
> 1) Can asynchronous forwarding document from leader to all replicas or some
> another reasons cause that replica A may see update X then Y, and replica B
> may see update Y then X?
> If yes, thus a particular document in replicaA might sort differently
> relative to a document from replicaB if they have the same score (in the
> same order as they were stored in the index). Is it an edge case?
>
> 2) What does it mean  Custom update chain post-processors may never be
> invoked on a recovering replica
> <https://lucene.apache.org/solr/guide/7_2/update-request-processors.html>  ,
> if all my UpdateProcessors are post-processors (i.e. are after
> DistributedUpdateProcessor)? Will all buffered update requests in recovery
> be indexed in replica without my features?
>
>
>
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
