bq: It seems to me a huge wasting of resources.

How else would you guarantee consistency? Especially taking into
account Lucene's write-once segments? Master/slave sidesteps the
problem by moving entire, closed segments to the slave, but as Shawn
says, if the master goes down the slaves don't have _any_ docs from
the not-closed segments.
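
To make the trade-off concrete, here is a rough toy model in Python. It is
only a sketch and not Solr code: the names Replica, build_shard, Master and
replicate are made up for illustration, and real Solr/Lucene behavior is far
more involved. It shows that with replicationFactor=3 a document is indexed
three times (the leader is one of the three replicas), and that a
master/slave copy only ever contains closed segments.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical stand-in for one SolrCloud replica of a shard."""
    name: str
    is_leader: bool = False
    docs: list = field(default_factory=list)

    def index(self, doc):
        # Every replica indexes the document itself.
        self.docs.append(doc)


def build_shard(replication_factor):
    """Create the replicas of one shard; the leader is just one of them."""
    replicas = [Replica(f"replica{i + 1}") for i in range(replication_factor)]
    replicas[0].is_leader = True  # leader role is assigned, not an extra copy
    return replicas


def index_document(shard, doc):
    """Index doc on every replica and return how many times it was indexed."""
    count = 0
    for replica in shard:
        replica.index(doc)
        count += 1
    return count


class Master:
    """Hypothetical master node: only closed segments can be copied."""

    def __init__(self):
        self.closed_segments = []  # write-once, safe to ship to slaves
        self.open_segment = []     # docs not yet in a closed segment

    def index(self, doc):
        self.open_segment.append(doc)

    def commit(self):
        # Close the current segment so it becomes replicable.
        if self.open_segment:
            self.closed_segments.append(tuple(self.open_segment))
            self.open_segment = []


def replicate(master):
    """A slave copies whole closed segments and nothing else."""
    return list(master.closed_segments)


# SolrCloud-style: replicationFactor=3 means three replicas, leader included.
shard = build_shard(3)
times_indexed = index_document(shard, {"id": "1"})

# Master/slave-style: doc "b" sits in an open segment when the slave copies.
master = Master()
master.index({"id": "a"})
master.commit()
master.index({"id": "b"})
slave_segments = replicate(master)
```

In this model times_indexed is the replication factor, not the replication
factor plus one, and slave_segments holds doc "a" but not doc "b", which is
what would be missing if the master died at that point.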

Best,
Erick

On Mon, Jan 11, 2016 at 1:42 PM, Shawn Heisey <apa...@elyograg.org> wrote:
> On 1/11/2016 1:23 PM, Gian Maria Ricci - aka Alkampfer wrote:
>> Ok, this imply that if I have X replica of a shard, the document is indexed
>> X+1 times? one for each replica plus the leader shard? It seems to me a huge
>> wasting of resources.
>>
>> In a Master/slave scenario indexing takes places only on master node, then
>> slave replicates analyzed data.
>
> The leader *is* a replica. So if you have a replicationFactor of three,
> you have three replicas for each shard. For each shard, one of those
> replicas gets elected to be the leader. You do not have a leader and
> two replicas.
>
> The above is perhaps extremely pedantic, but understanding how SolrCloud
> works requires understanding that being temporarily assigned the leader
> role does not change how the replica works, it just adds some additional
> coordination responsibilities.
>
> To answer your question, let's assume you build an index with
> replicationFactor=3. No new replicas are added, and all machines are
> up. In that situation, each document gets indexed a total of three times.
>
> In return for this additional complexity and resource usage, you don't
> have a single point of failure for indexing. With master/slave
> replication, if your master goes down for any length of time, you must
> reconfigure all of your remaining Solr nodes to change the master.
> Chances are very good that you will experience downtime.
>
> Thanks,
> Shawn
>