Hi,

I have opened a couple of JIRA issues: one to make HttpShardHandlerFactory and 
LBHttpSolrServer easier to extend 
(https://issues.apache.org/jira/browse/SOLR-4448), and one with an 
implementation of a backup-requesting load balancer 
(https://issues.apache.org/jira/browse/SOLR-4449).

The implementation does not attempt to cancel in-flight requests when a 
successful response is received. Instead, it returns the successful response 
immediately and lets the in-flight requests complete. That way it can detect 
'zombie' servers much as the current load balancer does, and stop sending them 
requests for a specified time.
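
To illustrate the idea of returning the first good response while letting the 
remaining requests finish so that failures can still be recorded, here is a 
minimal sketch. It is not the SOLR-4449 code: it is written in Python for 
brevity, it sends to several servers at once rather than after a delay, and 
ZombieAwareBalancer, flaky_send and zombie_period are hypothetical names 
invented for this example. It also assumes that a "zombie" is simply a server 
whose request raised an exception.

```python
import concurrent.futures as cf
import time


class ZombieAwareBalancer:
    """Hypothetical sketch: first successful response wins, nothing is cancelled."""

    def __init__(self, servers, send, zombie_period):
        self.servers = list(servers)
        self.send = send                    # callable(server) -> response
        self.zombie_period = zombie_period  # seconds to avoid a failed server
        self.zombie_until = {}              # server -> time it may be retried
        self.pool = cf.ThreadPoolExecutor(max_workers=8)

    def live_servers(self):
        now = time.monotonic()
        return [s for s in self.servers
                if self.zombie_until.get(s, float("-inf")) <= now]

    def _tracked_send(self, server):
        try:
            return self.send(server)
        except Exception:
            # This still runs after the caller already has its answer,
            # because the losing request is allowed to complete.
            self.zombie_until[server] = time.monotonic() + self.zombie_period
            raise

    def query(self, fanout=2):
        futures = [self.pool.submit(self._tracked_send, s)
                   for s in self.live_servers()[:fanout]]
        for future in cf.as_completed(futures):
            if future.exception() is None:
                return future.result()  # return at once, do not cancel the rest
        raise RuntimeError("no live server returned a response")


def flaky_send(server):
    # Stand-in for an HTTP request; "bad" simulates a server that times out.
    if server == "bad":
        time.sleep(0.05)
        raise IOError("simulated timeout")
    return "ok from " + server
```

With ZombieAwareBalancer(["bad", "good"], flaky_send, zombie_period=60), a call 
to query() returns the response from "good" straight away. Once the request to 
"bad" has failed in the background, live_servers() no longer lists it until 
the zombie period has passed.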

Phil

-----Original Message-----
From: Jeff Wartes [mailto:jwar...@whitepages.com] 
Sent: 01 February 2013 01:51
To: solr-user@lucene.apache.org
Subject: RE: Solr load balancer


For what it's worth, Google has done some pretty interesting research into 
coping with the idea that particular shards might very well be busy doing 
something else when your query comes in.

Check out this slide deck: http://research.google.com/people/jeff/latency.html
Lots of interesting ideas, but in particular, around slide 39 he talks about 
"backup requests" where you wait for something like your typical response time 
and then issue a second request to a different shard. You take whichever answer 
you get first, and cancel the other. The initial wait + cancellation means your 
extra cluster load is minimal, and you still get the benefit of reducing your 
p95+ response times if the first request was high-latency due to something 
unrelated to the query. (Say, GC.)
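
As a rough sketch of that "backup request" pattern (wait roughly a typical 
response time, then ask a second replica and take whichever answers first), 
something like the following could work. It is Python rather than Solr's Java, 
and query_with_backup, fake_send and backup_delay are hypothetical names made 
up for illustration, not anything from Solr or the slide deck. Note that the 
"cancel" here only stops requests that have not started yet, which is exactly 
the limitation discussed below.

```python
import concurrent.futures as cf
import time


def query_with_backup(replicas, send, backup_delay):
    """Ask replicas[0]; if it has not answered after backup_delay seconds,
    also ask replicas[1]. Returns (replica, response) for the first to finish.
    If the first request to finish raised, that exception propagates."""
    pool = cf.ThreadPoolExecutor(max_workers=2)
    try:
        futures = {pool.submit(send, replicas[0]): replicas[0]}
        done, _ = cf.wait(list(futures), timeout=backup_delay)
        if not done:
            # The primary is slower than usual, so issue the backup request.
            futures[pool.submit(send, replicas[1])] = replicas[1]
            done, _ = cf.wait(list(futures), return_when=cf.FIRST_COMPLETED)
        winner = next(iter(done))
        return futures[winner], winner.result()
    finally:
        # Only queued work is cancelled; a request that is already running
        # carries on and keeps using resources on the server side.
        pool.shutdown(wait=False, cancel_futures=True)


def fake_send(replica):
    # Stand-in for an HTTP query; "slow" simulates a replica in a GC pause.
    time.sleep(0.5 if replica == "slow" else 0.01)
    return "response from " + replica
```

Calling query_with_backup(["slow", "fast"], fake_send, 0.05) issues the backup 
after 50 ms and returns the answer from "fast". If the primary answers within 
backup_delay, no second request is sent at all, which is what keeps the extra 
cluster load small.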

Of course, a central principle of this approach is being able to cancel a query 
and have it stop consuming resources. I'd love to be corrected, but I don't 
think Solr allows this. You can stop waiting for a response, but even the 
timeAllowed param doesn't seem to stop resource usage after the allotted time. 
This means a few exceptionally long-running queries can take out your 
high-throughput cluster by tying up entire CPUs for long periods.

Let me know the JIRA number, I'd love to see work in this area.


-----Original Message-----
From: Phil Hoy [mailto:p...@brightsolid.com]
Sent: Tuesday, January 29, 2013 11:33 AM
To: solr-user@lucene.apache.org
Subject: RE: Solr load balancer

Hi Erick,

Thanks, I have read the blogs you cited and found them very interesting. We 
have tuned the JVM accordingly, but we still get the odd longish GC pause.

That said, we perhaps have an unusual setup: we index a lot of small documents 
on servers with SSDs and 128 GB of RAM, in a sharded setup with replicas. Our 
queries rely heavily on query filters and faceting, with minimal free-text 
style searching. For that reason we depend on the filter cache to improve 
query latency, and so we assign a large percentage of the available RAM to the 
JVM hosting Solr.

Anyhow, we are happy with the current configuration and performance profile, 
aside from the odd GC pause. As we have index replicas, it seems to me that we 
should be able to cope, hence my willingness to tweak how the load balancer 
behaves.

Thanks,
Phil



-----Original Message-----
From: Erick Erickson [mailto:erickerick...@gmail.com]
Sent: 20 January 2013 15:56
To: solr-user@lucene.apache.org
Subject: Re: Solr load balancer

Hmmm, the first thing I'd look at is why you are having long GC pauses. Here's 
a great place to start:

http://www.lucidimagination.com/blog/2011/03/27/garbage-collection-bootcamp-1-0/
and:
http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html

I've wondered about a similar approach, but by firing off the same query to 
multiple nodes in your cluster, you'll effectively be doubling (at least) the 
load on your system, perhaps leading to more memory issues in a "non-virtuous 
cycle".

FWIW,
Erick

On Fri, Jan 18, 2013 at 5:41 AM, Phil Hoy <p...@brightsolid.com> wrote:
> Hi,
>
> I would like to experiment with some custom load balancers to help with query 
> latency in the face of long GC pauses and the odd time-consuming query that 
> we need to be able to support. At the moment, setting the socket timeout via 
> the HttpShardHandlerFactory does help, but of course it cannot be set lower 
> than the duration of the most time-consuming query we are likely to receive.
>
> For example perhaps a load balancer that sends multiple queries concurrently 
> to all/some replicas and only keeps the first response might be effective. Or 
> maybe a load balancer which takes account of the frequency of timeouts would 
> be able to recognize zombies more effectively.
>
> To use alternative load balancer implementations cleanly, without having to 
> hack Solr directly, I would need to make the existing LBHttpSolrServer and 
> HttpShardHandlerFactory more amenable to extension. I could then override the 
> default load balancer using Solr's plugin mechanism.
>
> So my question is, if I made a patch to make the load balancer more 
> pluggable, is this something that would be acceptable and if so what do I do 
> next?
>
> Phil
>
> ______________________________________________________________________
> "brightsolid" is used in this email to collectively mean brightsolid online 
> innovation limited and its subsidiary companies brightsolid online publishing 
> limited and brightsolid online technology limited.
> findmypast.co.uk is a brand of brightsolid online publishing limited.
> brightsolid online innovation limited, Gateway House, Luna Place, Dundee 
> Technology Park, Dundee DD2 1TP.  Registered in Scotland No. SC274983.
> brightsolid online publishing limited, The Glebe, 6 Chapel Place, Rivington 
> Street, London EC2A 3DQ. Registered in England No. 04369607.
> brightsolid online technology limited, Gateway House, Luna Place, Dundee 
> Technology Park, Dundee DD2 1TP.  Registered in Scotland No. SC161678.
>
> Email Disclaimer
>
> This message is confidential and may contain privileged information. You 
> should not disclose its contents to any other person. If you are not the 
> intended recipient, please notify the sender named above immediately. It is 
> expressly declared that this e-mail does not constitute nor form part of a 
> contract or unilateral obligation. Opinions, conclusions and other 
> information in this message that do not relate to the official business of 
> brightsolid shall be understood as neither given nor endorsed by it.
> ______________________________________________________________________
> This email has been scanned by the brightsolid Email Security System. 
> Powered by MessageLabs
> ______________________________________________________________________
