Why would you want to write a load balancer when there are so many that are free and very fast?
For update traffic, there is very little benefit in sending updates directly to the shard leader. Forwarding an update to the leader is fast. Indexing is slow. So the bottleneck is always at the leader.

Before you build anything, measure. Collect a large update and send that directly to the leader. Then do the same to a non-leader shard. Compare the speed. If you are batching and indexing with multiple threads, I doubt you’ll see a meaningful difference. I commonly see 10% difference in identical load benchmarks, so the speedup has to be much larger than that to be real.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)

> On Feb 11, 2019, at 8:38 AM, Boban Acimovic <b...@it-agenten.com> wrote:
>
> I would actually like to write a load balancer itself, but I want it to be
> able to send the data as efficiently as possible. I know how to read ZK data,
> but I don’t know how can I figure out which shard is responsible upon data
> that I have in a document that I want to index.
>
>> On 11. Feb 2019, at 17:23, Walter Underwood <wun...@wunderwood.org> wrote:
>>
>> We send all updates to the load balancer, so they’ll end up on the wrong
>> shard, not on the leader, etc. Indexing speed is still limited by the CPU
>> available on each leader. I don’t think that sending the update to the right
>> leader makes any improvement in throughput.
>>
>> On the other hand, the CloudSolrClient ignores errors from Solr, which makes
>> it unacceptable for production use.
>>
>> I would stay with your current indexing client and worry about something
>> else.
>>
>> wunder
>> Walter Underwood
>> wun...@wunderwood.org
>> http://observer.wunderwood.org/ (my blog)
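
The leader versus non-leader comparison suggested above could be sketched roughly as follows. This is only an illustration, not a tested benchmark harness: `send_batch`, `time_runs`, `relative_speedup`, and `is_real_speedup` are hypothetical names, and requiring the speedup to exceed twice the roughly 10% noise floor is an assumed reading of "much larger than that".

```python
import statistics
import time
from typing import Callable, List

# Run-to-run variation quoted in the message (about 10%).
NOISE_FLOOR = 0.10


def time_runs(send_batch: Callable[[], None], runs: int = 5) -> List[float]:
    """Call `send_batch` several times and return the elapsed seconds per run.

    `send_batch` is a placeholder for whatever your existing indexing client
    does to post one large, identical update to a given node.
    """
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        send_batch()
        timings.append(time.perf_counter() - start)
    return timings


def relative_speedup(leader_secs: List[float], non_leader_secs: List[float]) -> float:
    """Fraction of time saved by going straight to the leader (median-based)."""
    leader = statistics.median(leader_secs)
    other = statistics.median(non_leader_secs)
    return (other - leader) / other


def is_real_speedup(leader_secs: List[float],
                    non_leader_secs: List[float],
                    margin: float = 2.0) -> bool:
    """Treat a speedup as real only if it clearly exceeds benchmark noise.

    The `margin` multiplier is an assumption, not a value from the thread.
    """
    return relative_speedup(leader_secs, non_leader_secs) > margin * NOISE_FLOOR
```

To use it, wrap the current indexing call in two small functions, one posting the batch to the leader and one to a non-leader, pass each to `time_runs`, and feed the two timing lists to `is_real_speedup`.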