Shawn, thanks for the detailed answer. I have 5 shards, each with 1 leader and
1 replica, so 10 Solr nodes in total. When I look at the admin GUI of one of
the shard leaders, I see that its replica's index is smaller (in MB) than the
leader's. I am not updating the data or indexing anything new. I expected the
leader to sync its replica to itself after some time, but nothing has changed.

2013/5/1 Shawn Heisey <s...@elyograg.org>

> On 4/30/2013 8:33 AM, Furkan KAMACI wrote:
>
>> I think that replication occurs after commit by default. A long time has
>> passed, but there is still a mismatch between the leader and the replica
>> (approximately 5 MB). I tried to pull the index from the leader, but it is
>> still the same.
>>
>
> My mail server has been down most of the day, and the Apache mail
> infrastructure hasn't noticed yet that I'm back up.  I don't have copies of
> the newest messages on this thread.  I checked the web archive to see what
> else has been said.  I'll be repeating some of what has been said before.
>
> On SolrCloud terminology: SolrCloud divides your index into one or more
> shards, each of which has a different piece of the index.  Each shard is
> made up of replicas.  One replica in each shard is designated leader. Note:
> a leader is still a replica, it is just the winner of the latest leader
> election.  Summary: shards, replicas, leader.
>
> One term that you are using is "follower" ... this is not a valid
> SolrCloud term.  It might make sense to use this term for a replica that is
> not a leader, but I have never seen it used in anything official. Any
> replica can become leader, if the conditions are just right.
>
> There are only two times that the leader replica has special significance
> - when you are indexing and when a replica starts operation, either as an
> existing replica that went down or as a new replica.
>
> In SolrCloud, replication is *NOT* used when you index new data.  The
> *ONLY* time that replication happens in SolrCloud is when a replica
> starts up, and even then it will only happen if the leader cannot figure
> out how to use its transaction log to sync the replica.
>
> SolrCloud does distributed indexing.  This means that when an update comes
> in, SolrCloud determines which shard needs that update.  If the core that
> received the request is not the leader of that shard, the request is
> forwarded to the correct leader.  That leader will index the update and
> send it to all of the replicas for that shard, each of which will index the
> update independently.
>
> Because each replica indexes independently, you can end up with different
> sizes.  The actual search results should be the same, although scoring can
> sometimes be a little bit different between replicas because deleted
> documents that exist in one replica but not another will contribute to the
> score.  SolrCloud does not attempt to keep the replicas absolutely
> identical, as long as they contain the same non-deleted documents.
>
> Thanks,
> Shawn
>
>
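
One way to check the point above (different on-disk sizes, same live
documents) is to compare numDocs and maxDoc on the leader and the replica.
The sketch below is only an illustration: it assumes each core exposes the
Luke request handler at /admin/luke with a JSON response containing an
"index" section, and the host and core names in the usage comment are
hypothetical placeholders. The helper names fetch_index_stats and
compare_replicas are made up for this example.

```python
import json
from urllib.request import urlopen


def fetch_index_stats(core_url):
    """Return the "index" section of the Luke handler response for one core.

    Assumption: core_url is the base URL of a single core and the Luke
    handler is enabled at its default /admin/luke path.
    """
    with urlopen(core_url + "/admin/luke?numTerms=0&wt=json") as resp:
        return json.load(resp)["index"]


def compare_replicas(leader, replica):
    """Compare two stats dicts that each carry numDocs and maxDoc.

    numDocs counts live documents; maxDoc also counts documents that are
    marked deleted but have not been merged away yet.
    """
    return {
        "same_live_docs": leader["numDocs"] == replica["numDocs"],
        "leader_deleted": leader["maxDoc"] - leader["numDocs"],
        "replica_deleted": replica["maxDoc"] - replica["numDocs"],
    }


# Usage (hypothetical hosts and core name, replace with your own):
# leader = fetch_index_stats("http://host1:8983/solr/collection1")
# replica = fetch_index_stats("http://host2:8983/solr/collection1")
# print(compare_replicas(leader, replica))
```

If same_live_docs is true and the deleted counts differ, the size gap is
most likely just deleted documents that one core has not merged away yet,
which matches the explanation quoted above.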
