Re: Limiting the number of queries/updates to Solr

Shawn Heisey Wed, 02 Aug 2017 20:34:32 -0700

On 8/2/2017 8:41 PM, S G wrote:
> Problem is that peak load estimates are just estimates.
> It would be nice to enforce them from Solr side such that if a rate higher 
> than that is seen at any core, the core will automatically begin to reject 
> the requests.
> Such a feature would contribute to cluster stability while making sure the 
> customer gets an exception to remind them of a slower rate.


Solr doesn't have anything like this.  This is primarily because there
is no network server code in Solr.  The networking is provided by the
servlet container.  The container in modern Solr versions is nearly
guaranteed to be Jetty.  As long as I have been using Solr, it has
shipped with a Jetty container.

https://wiki.apache.org/solr/WhyNoWar

I have no idea whether Jetty is capable of the kind of rate limiting
you're after.  If it is, it would be up to you to figure out the
configuration.

You could always put a proxy server like haproxy in front of Solr.  I'm
pretty sure that haproxy is capable of rejecting connections when the
request rate gets too high.  Other proxy servers (nginx, apache, F5
BigIP, solutions from Microsoft, Cisco, etc) are probably also capable
of this.

IMHO, intentionally causing connections to fail when a limit is exceeded
would not be a very good idea.  When the rate gets too high, the first
thing that happens is all the requests slow down.  The slowdown could be
dramatic.  As the rate continues to increase, some of the requests
probably would begin to fail.

What you're proposing would be guaranteed to cause requests to fail. 
Failing requests are even more likely than slow requests to result in
users finding a new source for whatever service they are getting from
your organization.

Your customer teams might not be able to control the request rate, as it
would probably be related to the number of users who connect to their
services.  It seems like a better option to inform a team that they have
exceeded their request estimates and that they will need to come up with
additional budget so more hardware can be deployed.  If that doesn't
happen, then their service may suffer, and it will not be your fault.

The RateLimiter class in Lucene that you mentioned is designed to limit
the I/O rate of disk or network data transfers, not a request rate.  One
of the most visible uses of this capability in Solr is the ability to
limit the transfer rate of the old-style index replication.  It is also
used in Lucene to slow down the disk I/O usage of segment merging.
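
To make the distinction concrete, here is a rough sketch of what
byte-based throttling looks like.  This is NOT Lucene's implementation;
the class name ByteRateThrottle and its methods are made up for
illustration.  The point is that the limit is expressed in megabytes per
second, so the number of requests never enters into it.

```java
/** Hypothetical illustration of byte-based throttling; not Lucene's RateLimiter. */
final class ByteRateThrottle {
    private final double bytesPerSecond;

    ByteRateThrottle(double mbPerSec) {
        this.bytesPerSecond = mbPerSec * 1024 * 1024;
    }

    /** Minimum time, in milliseconds, that moving this many bytes should take. */
    long minMillisFor(long bytes) {
        return (long) Math.ceil(bytes * 1000.0 / bytesPerSecond);
    }

    /** Sleeps just long enough to keep the transfer since startNanos under the cap. */
    void pause(long bytesSoFar, long startNanos) throws InterruptedException {
        long elapsedMillis = (System.nanoTime() - startNanos) / 1_000_000L;
        long remaining = minMillisFor(bytesSoFar) - elapsedMillis;
        if (remaining > 0) {
            Thread.sleep(remaining);
        }
    }
}
```

A caller would invoke pause() after each chunk it writes.  One large
transfer and a thousand tiny ones are throttled identically if they move
the same number of bytes.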

A custom Solr component that does what you're proposing could be built
and added to a request handler.  If you wanted to write such a
component, you could donate it to the project and try to get it included
in Solr.  Even though I believe such a feature is a bad idea, I'm sure
it would be loved by some users.
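
If someone does go down that road, the core of such a component would
probably be something like a token bucket.  Below is a minimal sketch in
plain Java, with no Solr classes involved.  The TokenBucket name, the
constructor arguments, and the idea of passing the clock value in are
all my own assumptions for illustration; nothing like this ships with
Solr.  Wiring it into a request handler and deciding what error to
return on rejection are left out.

```java
/** Hypothetical token-bucket request limiter; not part of Solr or Lucene. */
final class TokenBucket {
    private static final double NANOS_PER_SECOND = 1_000_000_000.0;

    private final double capacity;           // largest burst allowed, in requests
    private final double requestsPerSecond;  // sustained refill rate
    private double tokens;                   // requests currently available
    private long lastNanos;                  // time of the last refill

    TokenBucket(double capacity, double requestsPerSecond, long nowNanos) {
        this.capacity = capacity;
        this.requestsPerSecond = requestsPerSecond;
        this.tokens = capacity;               // start with a full bucket
        this.lastNanos = nowNanos;
    }

    /** Returns true if the request may proceed, false if it should be rejected. */
    synchronized boolean tryAcquire(long nowNanos) {
        long elapsed = nowNanos - lastNanos;
        if (elapsed > 0) {
            // Refill in proportion to elapsed time, never beyond capacity.
            double refill = (elapsed * requestsPerSecond) / NANOS_PER_SECOND;
            tokens = Math.min(capacity, tokens + refill);
            lastNanos = nowNanos;
        }
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }

    /** Convenience overload that uses the system clock. */
    boolean tryAcquire() {
        return tryAcquire(System.nanoTime());
    }
}
```

A component would keep one bucket per core (or per client) and call
tryAcquire() at the start of each request.  The capacity decides how big
a burst is tolerated before rejections begin, so it needs to be chosen
with the concerns above in mind.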

Thanks,
Shawn
