Hi all,

We're using Solr 4.1.0 with a 15-node Solr Cloud (configured for a 2-minute autoCommit with no searcher being built). We have a large dataset in Cassandra and use a Hadoop cluster to read over the dataset, build documents, and insert them (via CloudSolrServer). That part works as expected.

We then found that the documents did not include all the data we wanted, so we generated updates and sent them to the cloud. The first 15 to 20 tasks of the Hadoop job completed fine, but then we started getting task timeouts. Tasks would be retried and would complete, but the longer the job ran, the more tasks saw repeated timeouts (some taking 8 hours to finish). We finally killed the job after 12 or so hours of running, with only 0.70% progress through the job.
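To put that progress figure in perspective, here is a back-of-the-envelope projection. It is only a sketch: it assumes the job would have kept its average rate, which is optimistic given that the timeouts were getting worse, and it treats "12 or so hours" as exactly 12. The class and method names are made up for illustration.

```java
public class JobProjection {
    // Linear extrapolation: total time = elapsed time / fraction complete.
    // Assumes a constant rate of progress (an assumption, not something we measured).
    public static double projectedTotalHours(double elapsedHours, double percentComplete) {
        if (percentComplete <= 0.0 || percentComplete > 100.0) {
            throw new IllegalArgumentException("percentComplete must be in (0, 100]");
        }
        return elapsedHours / (percentComplete / 100.0);
    }

    public static void main(String[] args) {
        // Figures from the job above: roughly 12 hours elapsed, 0.70% complete.
        double totalHours = projectedTotalHours(12.0, 0.70);
        System.out.printf("Projected total: %.0f hours (about %.0f days)%n",
                totalHours, totalHours / 24.0);
    }
}
```

At that rate the update job would have needed on the order of 1,700 hours, roughly 71 days, which is why we gave up on it.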
Grabbing thread stack traces showed the trace I've placed at the end of this post. Basically, the request is waiting (and keeps waiting) for a response that does not show up within our 1200-second task timeout window. It sure feels like we're saturating some resource: even with the cloud relatively quiet (because every Hadoop task is tied up waiting for a response), the Solr Cloud can't seem to straighten up and fly right.

We've worked around this by clearing out the index and building the documents with all data from the start.

Are document updates particularly expensive? (I realize they are more expensive than straight inserts, but should we expect the behavior we've been seeing?)

java.lang.Thread.State: RUNNABLE
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.read(SocketInputStream.java:129)
    at org.apache.http.impl.io.AbstractSessionInputBuffer.fillBuffer(AbstractSessionInputBuffer.java:149)
    at org.apache.http.impl.io.SocketInputBuffer.fillBuffer(SocketInputBuffer.java:111)
    at org.apache.http.impl.io.AbstractSessionInputBuffer.readLine(AbstractSessionInputBuffer.java:264)
    at org.apache.http.impl.conn.DefaultResponseParser.parseHead(DefaultResponseParser.java:98)
    at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:252)
    at org.apache.http.impl.AbstractHttpClientConnection.receiveResponseHeader(AbstractHttpClientConnection.java:282)
    at org.apache.http.impl.conn.DefaultClientConnection.receiveResponseHeader(DefaultClientConnection.java:247)
    at org.apache.http.impl.conn.AbstractClientConnAdapter.receiveResponseHeader(AbstractClientConnAdapter.java:216)
    at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:298)
    at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
    at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:647)
    at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:464)
    at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:820)
    at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:754)
    at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:732)
    at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:353)
    at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:181)
    at org.apache.solr.client.solrj.impl.LBHttpSolrServer.request(LBHttpSolrServer.java:256)
    at org.apache.solr.client.solrj.impl.CloudSolrServer.request(CloudSolrServer.java:286)
    at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:117)
    at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:68)