There is no guarantee that sending an update to a non-leader node is slower. 
Sending to a non-leader certainly seems like a bad idea, but forwarding a 
document is fast and indexing a document is slow, so the difference might not 
even be measurable.

We’ve indexed a million docs per minute by sending all updates to the load 
balancer for the cluster, ignoring shards or leaders.
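
For illustration, here is a minimal sketch of that approach in Python, using 
only the standard library. It is not our production code. The load balancer 
URL, the collection name, and the batch size of 1000 are all hypothetical 
placeholders, and it assumes commits are handled by autoCommit on the Solr 
side rather than by the client.

```python
import json
import urllib.request
from itertools import islice

# Hypothetical load balancer address and collection name; replace with your own.
SOLR_UPDATE_URL = "http://solr-lb.example.com:8983/solr/mycollection/update"


def batches(docs, size=1000):
    """Yield lists of at most `size` documents from any iterable."""
    it = iter(docs)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def build_update_request(batch, url=SOLR_UPDATE_URL):
    """Build a POST request carrying one batch as a JSON array of documents."""
    body = json.dumps(batch).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def index_all(docs, url=SOLR_UPDATE_URL, size=1000):
    """Send every batch to the load balancer, ignoring shards and leaders.

    Whichever node receives a batch forwards each document to the leader
    of the shard it belongs to.
    """
    for batch in batches(docs, size):
        request = build_update_request(batch, url)
        with urllib.request.urlopen(request, timeout=60) as response:
            response.read()  # drain the body; urlopen raises on HTTP errors
```

Calling index_all(my_docs) with any iterable of dicts sends the whole set in 
batches. Batching matters far more for throughput than which node receives 
the request, so tune the batch size before worrying about leaders.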

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Mar 1, 2019, at 9:34 AM, Jason Gerlowski <gerlowsk...@gmail.com> wrote:
> 
> Hi Ganesh,
> 
> I'm not an expert on pysolr, but from a quick scan of their update
> code, it does look like pysolr attempts to send update requests to _a_
> leader node for a particular collection.  But that's all it does.  It
> doesn't check which shard the document(s) will belong to and try to
> pick the _correct_ leader. If your collections only have 1 shard, this
> is still pretty great.  But if your collections have multiple shards
> (and multiple leaders), then this will perform worse than SolrJ.
> 
> (This is based on what I gleaned from the code here:
> https://github.com/django-haystack/pysolr/blob/master/pysolr.py#L1268
> . Happy to be corrected by someone with more context.)
> 
> Best,
> 
> Jason
> 
> On Tue, Feb 26, 2019 at 1:50 PM Ganesh Sethuraman
> <ganeshmail...@gmail.com> wrote:
>> 
>> We are using Solr Cloud 7.2.1. Is there a leader-aware Python client (like
>> SolrJ for Java) that can send updates to the leader and is highly
>> available?
>> I see the PySolr project (https://pypi.org/project/pysolr/), but I am not
>> able to find any documentation on whether it supports leader-aware updates.
>> 
>> Regards
>> Ganesh
