It looks like you currently have to raise the HTTP client connection pool limits to handle that kind of load.
They are specified as top level config in solr.xml:

maxUpdateConnections
maxUpdateConnectionsPerHost

--
Mark Miller
about.me/markrmiller

On July 21, 2014 at 7:14:59 PM, Darren Lee (d...@amplience.com) wrote:

> Hi,
>
> I'm doing some benchmarking with Solr Cloud 4.9.0. I am trying to work out
> exactly how much throughput my cluster can handle.
>
> Consistently in my test I see a replica go into recovering state forever
> caused by what looks like a timeout during replication. I can understand the
> timeout and failure (I am hitting it fairly hard) but what seems odd to me is
> that when I stop the heavy load it still does not recover the next time it
> tries, it seems broken forever until I manually go in, clear the index and
> let it do a full resync.
>
> Is this normal? Am I misunderstanding something? My cluster has 4 nodes
> (2 shards, 2 replicas) (AWS m3.2xlarge). I am indexing with ~800 concurrent
> connections and a 10 sec soft commit. I consistently get this problem with a
> throughput of around 1.5 million documents per hour.
>
> Thanks all,
> Darren
>
>
> Stack Traces & Messages:
>
> [qtp779330563-627] ERROR org.apache.solr.servlet.SolrDispatchFilter – null:org.apache.http.conn.ConnectionPoolTimeoutException: Timeout waiting for connection from pool
>     at org.apache.http.impl.conn.PoolingClientConnectionManager.leaseConnection(PoolingClientConnectionManager.java:226)
>     at org.apache.http.impl.conn.PoolingClientConnectionManager$1.getConnection(PoolingClientConnectionManager.java:195)
>     at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:422)
>     at org.apache.http.impl.client.AbstractHttpClient.doExecute(AbstractHttpClient.java:863)
>     at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
>     at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:106)
>     at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:57)
>     at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer$Runner.run(ConcurrentUpdateSolrServer.java:233)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:724)
>
> Error while trying to recover. core=assets_shard2_replica1:java.util.concurrent.ExecutionException: org.apache.solr.client.solrj.SolrServerException: IOException occured when talking to server at: http://xxx.xxx.15.171:8080/solr
>     at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>     at java.util.concurrent.FutureTask.get(FutureTask.java:188)
>     at org.apache.solr.cloud.RecoveryStrategy.sendPrepRecoveryCmd(RecoveryStrategy.java:615)
>     at org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:371)
>     at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:235)
> Caused by: org.apache.solr.client.solrj.SolrServerException: IOException occured when talking to server at: http://xxx.xxx.15.171:8080/solr
>     at org.apache.solr.client.solrj.impl.HttpSolrServer.executeMethod(HttpSolrServer.java:566)
>     at org.apache.solr.client.solrj.impl.HttpSolrServer$1.call(HttpSolrServer.java:245)
>     at org.apache.solr.client.solrj.impl.HttpSolrServer$1.call(HttpSolrServer.java:241)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:744)
> Caused by: java.net.SocketException: Socket closed
>     at java.net.SocketInputStream.socketRead0(Native Method)
>     at java.net.SocketInputStream.read(SocketInputStream.java:152)
>     at java.net.SocketInputStream.read(SocketInputStream.java:122)
>     at org.apache.http.impl.io.AbstractSessionInputBuffer.fillBuffer(AbstractSessionInputBuffer.java:160)
>     at org.apache.http.impl.io.SocketInputBuffer.fillBuffer(SocketInputBuffer.java:84)
>     at org.apache.http.impl.io.AbstractSessionInputBuffer.readLine(AbstractSessionInputBuffer.java:273)
>     at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:140)
>     at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:57)
>     at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:260)
>     at org.apache.http.impl.AbstractHttpClientConnection.receiveResponseHeader(AbstractHttpClientConnection.java:283)
>     at org.apache.http.impl.conn.DefaultClientConnection.receiveResponseHeader(DefaultClientConnection.java:251)
>     at org.apache.http.impl.conn.ManagedClientConnectionImpl.receiveResponseHeader(ManagedClientConnectionImpl.java:197)
>     at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:271)
>     at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:123)
>     at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:682)
>     at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:486)
>     at org.apache.http.impl.client.AbstractHttpClient.doExecute(AbstractHttpClient.java:863)
>     at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
>     at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:106)
>     at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:57)
>     at org.apache.solr.client.solrj.impl.HttpSolrServer.executeMethod(HttpSolrServer.java:452)
>     ... 6 more
>
> 853915 [RecoveryThread] ERROR org.apache.solr.cloud.RecoveryStrategy – Recovery failed - trying again... (0) core=assets_shard2_replica1
> 853915 [RecoveryThread] ERROR org.apache.solr.cloud.RecoveryStrategy – Recovery failed - interrupted. core=assets_shard2_replica1
> 853915 [RecoveryThread] ERROR org.apache.solr.cloud.RecoveryStrategy – Recovery failed - I give up. core=assets_shard2_replica1
> 853918 [RecoveryThread] WARN org.apache.solr.cloud.RecoveryStrategy – Stopping recovery for zkNodeName=xxx.xxx.15.174:8080_solr_assets_shard2_replica1core=assets_shard2_replica1
> 853933 [Thread-382] WARN org.apache.solr.cloud.RecoveryStrategy – Stopping recovery for zkNodeName=xxx.xxx.15.174:8080_solr_assets_shard2_replica1core=assets_shard2_replica1
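
For reference, here is a minimal sketch of how the two settings named in the reply (maxUpdateConnections and maxUpdateConnectionsPerHost) might look in a new-style solr.xml. The `<int name="...">` form placed directly under `<solr>` is an assumption based on the "top level config" wording, and the numbers are illustrative placeholders rather than recommendations. Check the solr.xml documentation for your Solr version before relying on it.

```xml
<solr>
  <!-- Assumed placement: direct children of <solr> ("top level config"). -->
  <!-- Placeholder values for illustration only; tune them against your own load tests. -->
  <int name="maxUpdateConnections">20000</int>
  <int name="maxUpdateConnectionsPerHost">1000</int>

  <!-- Any existing sections (for example <solrcloud>) stay as they are. -->
</solr>
```

The per-host limit is probably the one to look at first here, since the ConnectionPoolTimeoutException in the trace is thrown while an update thread waits to lease a connection from the pool. The change would most likely need to be made on every node and only take effect after a restart.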