I am using Solr/SolrJ 4.6.1 along with Apache HttpClient 4.3.2 as part of a web application that connects to the Solr server via SolrJ using CloudSolrServer. The web application is wired up with Guice, and a single instance of the CloudSolrServer class is shared by all inbound requests. All of this is running on Amazon EC2.
Basically, everything looks and runs fine for a while, but even with moderate concurrency, SolrJ starts leaving sockets open. We are handling only about 250 connections to the web app per minute, and each of these issues 3 to 7 requests to Solr. Over a 30 minute period of this type of use, we end up with many thousands of lingering sockets. I can see these when running netstat:

    tcp 0 0 ip-10-80-14-26.ec2.in:41098 ip-10-99-145-47.ec2.i:glrpc TIME_WAIT

All of them point to the same target host, which is my Solr server. There are no other pieces of infrastructure on that box, just Solr. Eventually, the web application server just dies, because no further sockets can be opened and the already opened ones are not reused.

The Solr server itself is unfazed and running like a champ. The average time per request is 0.126, as seen in the query handler stats of the Solr admin UI.

Apache HttpClient had a bunch of leakage in version 4.2.x that was cleaned up and refactored in 4.3.x, which is why I upgraded. Currently, SolrJ makes use of the old leaky 4.2 classes for establishing connections and using a connection pool.

http://www.apache.org/dist/httpcomponents/httpclient/RELEASE_NOTES-4.3.x.txt

--
Jared Rodriguez
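
P.S. For anyone who wants to track the lingering sockets over time, here is a minimal sketch that counts TIME_WAIT entries per remote host from netstat output like the line quoted above. The class name TimeWaitCounter and the method countTimeWait are hypothetical names made up for this illustration, and the parsing assumes the usual Linux netstat column layout (protocol, Recv-Q, Send-Q, local address, foreign address, state). It only measures the symptom; it does not touch SolrJ or HttpClient.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of SolrJ or HttpClient.
public class TimeWaitCounter {

    /**
     * Counts TIME_WAIT entries per remote host.
     * Assumes six whitespace-separated columns per TCP row:
     * proto, recv-q, send-q, local address, foreign address, state.
     */
    public static Map<String, Integer> countTimeWait(List<String> netstatLines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : netstatLines) {
            String[] cols = line.trim().split("\\s+");
            // Skip header lines and anything that is not a TCP connection row.
            if (cols.length < 6 || !cols[0].startsWith("tcp")) {
                continue;
            }
            if (!"TIME_WAIT".equals(cols[5])) {
                continue;
            }
            // Strip the port (or service name) after the last colon.
            String foreign = cols[4];
            int colon = foreign.lastIndexOf(':');
            String host = colon >= 0 ? foreign.substring(0, colon) : foreign;
            counts.merge(host, 1, Integer::sum);
        }
        return counts;
    }

    // Reads netstat output from standard input and prints one count per host.
    public static void main(String[] args) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(new InputStreamReader(System.in))) {
            String line;
            while ((line = in.readLine()) != null) {
                lines.add(line);
            }
        }
        for (Map.Entry<String, Integer> e : countTimeWait(lines).entrySet()) {
            System.out.println(e.getKey() + " " + e.getValue());
        }
    }
}
```

After compiling, it could be run as `netstat -t | java TimeWaitCounter`, repeated every few minutes to see whether the count toward the Solr host keeps climbing.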