Hi all,

We're using Solr 4.1.0 on a 15-node Solr Cloud (configured for a 2-minute
autoCommit with no new searcher being opened).  We have a large dataset in
Cassandra and use a Hadoop cluster to read over the dataset, build
documents, and insert them (via CloudSolrServer).  That part works as
expected.  We then found that we had left some of the data we wanted out of
the documents, so we generated updates and sent them to the cloud.  We
observed that the first 15 to 20 tasks of the Hadoop job would complete fine,
but then we started getting task timeouts.  Tasks would be retried and
complete, but the longer the job ran, the more tasks hit repeated timeouts
(some taking 8 hours to finish).  We finally killed the job after 12 or so
hours of running, with only 0.70% progress through the job.

Grabbing thread stack traces showed the trace I've placed at the end of
this post.  Basically, the request is waiting (and keeps waiting) for a
response that does not arrive within our 1200-second task timeout window.
It sure feels like we're saturating some resource: even with the cloud
relatively quiet (because every Hadoop task is tied up waiting for a
response), the Solr Cloud can't seem to straighten up and fly right.
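
As an aside on the mechanics: the trace at the end of this post shows the
client thread parked in a blocking socket read, waiting for the HTTP response
header.  The sketch below is not SolrJ code; it uses only java.net to show how
a socket read timeout turns that kind of open-ended wait into a
SocketTimeoutException.  The class name ReadTimeoutDemo, the method
readTimesOut, and the 500 ms value are made up for illustration.  Whether and
how a read timeout can be set on the client used here (CloudSolrServer in
4.1) is an assumption that would need checking against the SolrJ javadoc.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class ReadTimeoutDemo {

    /**
     * Connects to a local server that accepts the connection but never
     * writes anything back, then attempts a blocking read with the given
     * read timeout. Returns true if the read was cut short by a
     * SocketTimeoutException, false if the read returned normally.
     */
    public static boolean readTimesOut(int soTimeoutMillis) {
        // Port 0 asks the OS for any free port on the loopback interface.
        try (ServerSocket silentServer =
                     new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(silentServer.getInetAddress(),
                                        silentServer.getLocalPort());
             Socket serverSide = silentServer.accept()) {

            // Without this call, read() below would block indefinitely,
            // which is the state the stack trace in this post shows.
            client.setSoTimeout(soTimeoutMillis);

            try {
                client.getInputStream().read(); // server never responds
                return false;
            } catch (SocketTimeoutException e) {
                return true; // the wait was bounded by the read timeout
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        boolean timedOut = readTimesOut(500); // hypothetical 500 ms timeout
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println("timed out: " + timedOut
                + " after about " + elapsedMs + " ms");
    }
}
```

A timeout like this would not make the updates any cheaper on the Solr side;
it would only let a task fail and retry sooner instead of sitting in the read
until the Hadoop task timeout fires.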

We've worked around this by clearing out the index and building the
documents with all data from the start.

Are document updates particularly expensive?  (I realize they are more
expensive than straight inserts but should we expect the behavior we've
been seeing?)


java.lang.Thread.State: RUNNABLE
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.read(SocketInputStream.java:129)
        at org.apache.http.impl.io.AbstractSessionInputBuffer.fillBuffer(AbstractSessionInputBuffer.java:149)
        at org.apache.http.impl.io.SocketInputBuffer.fillBuffer(SocketInputBuffer.java:111)
        at org.apache.http.impl.io.AbstractSessionInputBuffer.readLine(AbstractSessionInputBuffer.java:264)
        at org.apache.http.impl.conn.DefaultResponseParser.parseHead(DefaultResponseParser.java:98)
        at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:252)
        at org.apache.http.impl.AbstractHttpClientConnection.receiveResponseHeader(AbstractHttpClientConnection.java:282)
        at org.apache.http.impl.conn.DefaultClientConnection.receiveResponseHeader(DefaultClientConnection.java:247)
        at org.apache.http.impl.conn.AbstractClientConnAdapter.receiveResponseHeader(AbstractClientConnAdapter.java:216)
        at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:298)
        at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
        at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:647)
        at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:464)
        at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:820)
        at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:754)
        at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:732)
        at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:353)
        at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:181)
        at org.apache.solr.client.solrj.impl.LBHttpSolrServer.request(LBHttpSolrServer.java:256)
        at org.apache.solr.client.solrj.impl.CloudSolrServer.request(CloudSolrServer.java:286)
        at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:117)
        at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:68)
