Hello,

thank you for the detailed answer.

If a timeout between shard leader and replica can lead to a smaller rf value 
(because replication has timed out), is it possible to increase this timeout in 
the configuration?

Best Regards,
Martin Mois

Comments inline:

On Mon, Oct 12, 2015 at 1:31 PM, MOIS Martin (MORPHO)
<martin.m...@morpho.com> wrote:
> Hello,
>
> I am running Solr 5.2.1 in a cluster with 6 nodes. My collections have been
> created with replicationFactor=2, i.e. I have one replica for each shard.
> Beyond that I am using autoCommit/maxDocs=10000 and autoSoftCommits/maxDocs=1
> in order to achieve near realtime search behavior.
>
> As far as I understand from section "Write Side Fault Tolerance" in the
> documentation
> (https://cwiki.apache.org/confluence/display/solr/Read+and+Write+Side+Fault+Tolerance),
> I cannot enforce that an update gets replicated to all replicas, but I can only
> get the achieved replication factor by requesting the return value rf.
>
> My question is now, what exactly does rf=2 mean? Does it only mean that the
> replica has written the update to its transaction log? Or has the replica also
> performed the soft commit as configured with autoSoftCommits/maxDocs=1? The
> answer is important for me: if the update were only written to the transaction
> log, I could not search for it reliably, as the replica may not have added it
> to the searchable index.

rf=2 means that the update was successfully replicated to and
acknowledged by two replicas (including the leader). The rf only deals
with the durability of the update and has no relation to visibility of
the update to searchers. The auto(soft)commit settings are applied
asynchronously and do not block an update request.
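
As an illustration of reading the achieved rf, here is a minimal Python
sketch. It assumes the update is sent with the min_rf request parameter
described in the fault tolerance documentation and that the JSON response
reports rf inside responseHeader. The base URL, collection name, and both
helper names are hypothetical and only serve the example.

```python
import json
from urllib.parse import urlencode


def build_update_url(base_url, collection, min_rf):
    """Build an update URL that asks Solr to report the achieved rf.

    base_url and collection are hypothetical placeholders.
    """
    params = urlencode({"min_rf": min_rf, "wt": "json"})
    return f"{base_url}/{collection}/update?{params}"


def achieved_rf(response_body):
    """Return the rf value from a JSON update response, or None if absent.

    Assumption: rf is reported inside the responseHeader object.
    """
    header = json.loads(response_body).get("responseHeader", {})
    return header.get("rf")
```

Note that a returned rf equal to min_rf says nothing about whether the
document is already visible to searchers on either replica.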

>
> My second question is, does rf=1 mean that the update was definitely not
> successful on the replica, or could it also represent a timeout of the
> replication request from the shard leader? If it could also represent a
> timeout, then there would be a small chance that the replication was
> successful despite the timeout.

Well, rf=1 implies that the update was applied only to the leader's
index + tlog, and that the replicas either weren't available, returned
an error, or the request timed out. So yes, you are right that it can
represent a timeout, and as such there is a chance that the replication
was in fact successful despite the timeout.
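
Since a low rf cannot distinguish a failed replication from a timed-out one,
one possible client-side reaction is simply to resend the update. The Python
sketch below is only an illustration under stated assumptions: send_update
is a hypothetical callable supplied by the client that sends the document
and returns the rf reported by Solr, and resending is assumed to be safe
because the full document is sent with the same unique key, so a second copy
overwrites the first. That assumption does not hold for non-idempotent
updates such as atomic increments.

```python
import time


def send_until_replicated(send_update, doc, min_rf, max_attempts=3, delay_s=0.0):
    """Resend doc until the reported rf reaches min_rf or attempts run out.

    send_update is a hypothetical callable: it sends doc and returns the rf
    reported by Solr (an int), or None if no rf was reported.
    Returns the last rf seen, which may still be below min_rf.
    """
    rf = None
    for attempt in range(max_attempts):
        rf = send_update(doc)
        if rf is not None and rf >= min_rf:
            break
        # Wait before the next attempt, but not after the final one.
        if attempt < max_attempts - 1:
            time.sleep(delay_s)
    return rf
```

The caller still has to decide what to do when the returned value stays
below min_rf, for example logging the document id for a later re-index.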

>
> Is there a way to retrieve the replication factor for a specific document
> after the update in order to check if replication was successful in the
> meantime?
>

No, there is no way to do that.

> Thanks in advance.
>
> Best Regards,
> Martin Mois



--
Regards,
Shalin Shekhar Mangar.

