On 2/2/2018 10:00 AM, mmb1234 wrote:
Client / java8 app: An AsyncHTTPClient POST-ing gzip payloads. PoolingNHttpClientConnectionManager maxtotal=10,000 and maxperroute=1000) ConnectionRequestTimeout = ConnectTimeout = SocketTimeout = 4000 (4 secs)
I have to concur with the person who identifies themselves as "S G", later in the thread.
You should absolutely be using SolrJ, not a generic Java HTTP client. SolrJ handles parsing of the response for you, providing you with Java objects that are REALLY easy to extract information from. It also handles all of the HTTP details for you, including URL encoding, etc.
It's easy enough to provide a custom HttpClient object (HttpClient is from another Apache project, and it is used by Solr/SolrJ) to the CloudSolrClient or HttpSolrClient object. That lets you change the timeouts and the limits on max concurrent connections.
I just noticed that you said the socket timeout is 4 seconds; I missed this when I read your message the first time. While Solr is known for blazing speed, 4 seconds is WAY too short for this timeout. If a client resets the TCP connection because of a socket timeout, Jetty (which Solr is probably running in) is going to report an EofException -- specifically, org.eclipse.jetty.io.EofException.
Some of the bug reports I saw on Jetty specifically mentioned EofException in connection with a high CLOSE_WAIT count.
Your socket timeout should be at least one minute. Four seconds would be fine for connection timeouts. In my own software configurations for various client software, I have been known to use one second for the connection timeout -- on a LAN, if it doesn't connect in far less than one second, something is probably wrong somewhere.
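To make the difference between the two timeouts concrete, here is a minimal sketch using only JDK sockets (not SolrJ or HttpClient). The class name TimeoutDemo, the method readWithTimeout, and the millisecond values are all made up for illustration; the delays are scaled down so the demo finishes quickly. A throwaway local server waits before answering, and the client either gives up (socket timeout shorter than the server's delay) or gets its answer (socket timeout comfortably longer).

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class TimeoutDemo {

    /**
     * Starts a throwaway loopback server that waits serverDelayMs before
     * sending one byte, then reads from it using the given socket timeout.
     * Returns true if the byte arrived, false if the connect or read timed out.
     */
    public static boolean readWithTimeout(int serverDelayMs, int socketTimeoutMs) {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Thread slowServer = new Thread(() -> {
                try (Socket s = server.accept()) {
                    Thread.sleep(serverDelayMs);      // stand-in for a slow request
                    s.getOutputStream().write(1);
                } catch (IOException | InterruptedException ignored) {
                    // The client gave up or the server socket was closed; fine for a demo.
                }
            });
            slowServer.setDaemon(true);
            slowServer.start();

            try (Socket client = new Socket()) {
                // Connection timeout: how long to wait for the TCP connection itself.
                client.connect(
                        new InetSocketAddress(server.getInetAddress(), server.getLocalPort()),
                        1000);
                // Socket timeout: how long a single read may block waiting for data.
                client.setSoTimeout(socketTimeoutMs);
                return client.getInputStream().read() == 1;
            } catch (SocketTimeoutException e) {
                return false;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Server needs 400 ms, client only waits 100 ms: the read should time out.
        System.out.println("short socket timeout ok? " + readWithTimeout(400, 100));
        // Server needs 50 ms, client waits up to 2000 ms: the read should succeed.
        System.out.println("long socket timeout ok?  " + readWithTimeout(50, 2000));
    }
}
```

The connection succeeds almost instantly in both cases, which is why a short connection timeout is harmless on a LAN. Only the socket timeout has to cover the time the server spends working on the request.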
I'm curious about the mention of gzip. The Jetty that Solr ships with doesn't have gzip compression enabled for HTTP, and I'm not aware of anything in Solr that handles gzip files. Maybe the Extracting Request Handler does, but if you've been following recommendations that are common on this list, you won't be using ERH except as a proof of concept.
Thanks, Shawn