I’m also surprised that you see a slowdown; it’s worth investigating.

Let’s take the NRT case with only a leader. I’ve seen the NRT indexing time 
increase when even a single follower was added (30-40% in this case). We 
believed that the issue was the time the leader sat waiting around for the 
follower to acknowledge receipt of the documents. Also note that these were 
very short documents.

You’d still pay that price with more than one TLOG replica. But again, I’d 
expect the two times to be roughly equivalent.

Indexing does not stop during index replication. That said, if you commit very 
frequently, you’ll be pushing lots of info around the network. Was your CPU 
running hot in the TLOG case or idling? If idling, then Solr isn’t getting fed 
fast enough. Perhaps there’s increased network traffic with the TLOG replicas 
replicating changed segments and that’s slowing down ingestion?

It’d be interesting to compare indexing to a leader-only NRT collection 
with indexing to a leader-only TLOG collection.
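
A rough sketch of how that comparison could be set up, in Python. The 
host, the collection names, and the helper names are assumptions for 
illustration; the only Solr pieces used are the Collections API CREATE 
action and its numShards, nrtReplicas, and tlogReplicas parameters. 
Nothing here talks to Solr by itself: it only builds the URLs and 
provides a timer to wrap around whatever indexing job you already run.

```python
import time
from urllib.parse import urlencode

# Assumption: a local SolrCloud node; adjust to your cluster.
SOLR_BASE = "http://localhost:8983/solr"


def create_collection_url(name, nrt=0, tlog=0, num_shards=1, base=SOLR_BASE):
    """Build a Collections API CREATE URL with explicit replica-type counts."""
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        # Both counts are passed explicitly so no default replica type sneaks in.
        "nrtReplicas": nrt,
        "tlogReplicas": tlog,
    }
    return f"{base}/admin/collections?{urlencode(params)}"


def timed(fn, *args, **kwargs):
    """Run fn and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    # Hypothetical collection names: one leader-only collection per replica type.
    print(create_collection_url("bench_nrt", nrt=1))
    print(create_collection_url("bench_tlog", tlog=1))
```

Running the same indexing job against both collections inside timed() 
should show whether the replica type alone changes the indexing time, 
before any follower is involved.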


Best,
Erick

> On Oct 25, 2019, at 8:28 AM, Dominique Bejean <dominique.bej...@eolya.fr> 
> wrote:
> 
> Shawn,
> 
> So, I understand that while a non-leader TLOG replica is copying the index
> from the leader, the leader stops indexing.
> One-shot heavy bulk indexing should be impacted much more than
> continuous light indexing.
> 
> Regards.
> 
> Dominique
> 
> 
> On Fri, Oct 25, 2019 at 13:54, Shawn Heisey <apa...@elyograg.org> wrote:
> 
>> On 10/25/2019 1:16 AM, Dominique Bejean wrote:
>>> For collection created with all replicas as NRT
>>> 
>>> * Indexing time : 22 minutes
>> 
>> <snip>
>> 
>>> For collection created with all replicas as TLOG
>>> 
>>> * Indexing time : 34 minutes
>> 
>> NRT indexes simultaneously on all replicas.  So when indexing is done on
>> one, it is also done on all the others.
>> 
>> PULL and non-leader TLOG replicas must copy the index from the leader.
>> The leader will do the indexing and the other replicas will copy the
>> completed index from the leader.  This takes time.  If the index is
>> large, it can take a LOT of time, especially if the disks or network are
>> slow.  TLOG replicas can become leader and PULL replicas cannot.
>> 
>> What I would do personally is set two replicas for each shard to TLOG
>> and all the rest to PULL.  When a TLOG replica is acting as leader, it
>> will function exactly like an NRT replica.
>> 
>>> The conclusion seems to be that by using TLOG :
>>> 
>>> * You save CPU resources on non-leader nodes at index time
>>> * The JVM Heap and GC are the same
>>> * Indexing performance is considerably worse with TLOG
>> 
>> Java works in such a way that it will always eventually allocate and use
>> the entire max heap that it is allowed.  It is not always possible to
>> determine how much heap is truly needed, though analyzing large GC logs
>> will sometimes reveal that info.
>> 
>> Non-leader replicas will probably require less heap if they are TLOG or
>> PULL.  I cannot say how much less, that will be something that has to be
>> determined.  Those replicas will also use less CPU.
>> 
>> With newer Solr versions, you can ask SolrCloud to prefer PULL replicas
>> for querying, so queries will be targeted to those replicas, unless they
>> all go down, in which case it will go to non-preferred replica types.  I
>> do not know how to do this, I only know that it is possible.
>> 
>> Thanks,
>> Shawn
>> 
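
On preferring PULL replicas for queries: as far as I know this is done 
with the shards.preference request parameter, using the replica.type 
property (available since Solr 7.4). A minimal Python sketch that only 
builds such a query URL; the host, the collection name, and the helper 
name are assumptions for illustration.

```python
from urllib.parse import urlencode


def select_url(collection, query, prefer="PULL", base="http://localhost:8983/solr"):
    """Build a /select URL asking SolrCloud to prefer one replica type."""
    params = {
        "q": query,
        # shards.preference is a routing hint; other replica types are
        # still used if no replica of the preferred type is available.
        "shards.preference": f"replica.type:{prefer}",
    }
    return f"{base}/{collection}/select?{urlencode(params)}"


if __name__ == "__main__":
    # "mycollection" is a hypothetical collection name.
    print(select_url("mycollection", "*:*"))
```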
