Hi everyone,

We have a SolrCloud cluster with 3 ZooKeeper nodes and 3 Solr nodes. There
is only 1 shard, and all replicas are PULL.
We do bulk updates: about once a day we reindex all cores (no soft
commits, only a hard commit every 15s), then issue a commit with
openSearcher=true, at which point all our indexes become available for
search.

The issue with PULL replication is that once reindexing starts on the
leader, each replica downloads the index at half the hard-commit interval
(the log shows "o.a.s.h.ReplicationHandler Poll scheduled at an interval
of 7000ms"), puts it into the proper directory, and reopens the searcher.
On the leader we see no changes during this time, because there has been
no commit with openSearcher=true yet, but the index on the PULL replicas
keeps growing.
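
For reference, here is a small sketch of the arithmetic as I understand
it. It assumes the poll interval is half the hard-commit interval,
truncated to whole seconds. That truncation is my inference from the
7000ms in the log (15s / 2 would otherwise be 7500ms), and the function
name is made up for illustration:

```python
def poll_interval_ms(hard_commit_ms: int) -> int:
    """Estimate the replica poll interval from the hard-commit interval.

    Assumption (inferred from the log line, not from the docs): the
    interval is hard_commit_ms / 2, truncated to whole seconds.
    """
    half_ms = hard_commit_ms // 2
    return (half_ms // 1000) * 1000


# Our setup: hard commit every 15s.
print(poll_interval_ms(15_000))
```

With our 15s hard commit this matches the 7000ms reported by
ReplicationHandler.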

Judging by this page
<https://lucene.apache.org/solr/guide/7_7/index-replication.html#index-replication-in-solr>
there is no setting in SolrCloud for pollInterval, or for when replication
should start on slaves. The information is rather confusing, because in
cloud mode we still use the same handlers but cannot configure them.

We changed replication from NRT to PULL because we don't need real-time
search and don't want to burn CPU on bulk updates on every machine, but
having the slaves constantly catch up with the index isn't any better...

Do you know of any way to fix this?
