Re: Solr hangs on distributed updates

2014-12-16 Thread Peter Keegan
> As of 4.10, commits/optimize etc are executed in parallel. Excellent - thanks. On Tue, Dec 16, 2014 at 6:51 AM, Shalin Shekhar Mangar < shalinman...@gmail.com> wrote: > > On Tue, Dec 16, 2014 at 11:34 AM, Peter Keegan > wrote: > > > > > A distributed update is streamed to all available replicas

Re: Solr hangs on distributed updates

2014-12-16 Thread Shalin Shekhar Mangar
On Tue, Dec 16, 2014 at 11:34 AM, Peter Keegan wrote: > > > A distributed update is streamed to all available replicas in parallel. > > Hmm, that's not what I'm seeing with 4.6.1, as I tail the logs on leader > and replicas. Mark Miller comments on this last May: > > > http://mail-archives.apache.

Re: Solr hangs on distributed updates

2014-12-16 Thread Peter Keegan
> A distributed update is streamed to all available replicas in parallel. Hmm, that's not what I'm seeing with 4.6.1, as I tail the logs on leader and replicas. Mark Miller comments on this last May: http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201404.mbox/%3CetPan.534d8d6d.74b0dc51.

Re: Solr hangs on distributed updates

2014-12-15 Thread Shalin Shekhar Mangar
On Mon, Dec 15, 2014 at 8:41 PM, Peter Keegan wrote: > > If a timeout occurs, does the distributed update then go to the next > replica? > A distributed update is streamed to all available replicas in parallel. > > On Fri, Dec 12, 2014 at 3:42 PM, Shalin Shekhar Mangar < > shalinman...@gmail.co

Re: Solr hangs on distributed updates

2014-12-15 Thread Peter Keegan
If a timeout occurs, does the distributed update then go to the next replica? On Fri, Dec 12, 2014 at 3:42 PM, Shalin Shekhar Mangar < shalinman...@gmail.com> wrote: > > Sorry I should have specified. These timeouts go inside the <solrcloud> > section and apply for inter-shard update requests only. The socke

Re: Solr hangs on distributed updates

2014-12-15 Thread Peter Keegan
I added distribUpdateConnTimeout and distribUpdateSoTimeout to solr.xml and the commit did time out. (Btw, is there any way to view solr.xml in the admin console?) Also, although we do have an init.d start/stop script for Solr, the 'stop' command was not executed during shutdown because there was n

Re: Solr hangs on distributed updates

2014-12-12 Thread Peter Keegan
The AMIs are Red Hat (not Amazon's) and the instances are properly sized for the environment (t1.micro for ZK, m3.xlarge for Solr). I do plan to add hooks for a clean shutdown of Solr when the VM is shut down, but if Solr takes too long, AWS may clobber it anyway. One frustrating part of auto scali

Re: Solr hangs on distributed updates

2014-12-12 Thread Peter Keegan
> The Solr leader should stop sending requests to the stopped replica once > that replica's live node is removed from ZK (after session expiry). Fwiw, here's the Zookeeper log entry for a graceful shutdown of the Solr replica: 2014-12-12 15:04:21,304 [myid:2] - INFO [ProcessThread(sid:2 cport:81

Re: Solr hangs on distributed updates

2014-12-12 Thread Chris Hostetter
: No, I wasn't aware of these. I will give that a try. If I stop the Solr : jetty service manually, things recover fine, but the hang occurs when I : 'stop' or 'terminate' the EC2 instance. The Zookeeper leader reports a I don't know squat about AWS Auto-Scaling, (and barely anything about AWS)

Re: Solr hangs on distributed updates

2014-12-12 Thread Shalin Shekhar Mangar
Sorry, I should have specified. These timeouts go inside the <solrcloud> section and apply to inter-shard update requests only. The socket and connection timeouts inside the shardHandlerFactory section apply to inter-shard search requests. On Fri, Dec 12, 2014 at 8:38 PM, Peter Keegan wrote: > Btw, are the
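
For readers following the thread, the placement described above might look roughly like the fragment below. This is only a sketch of a Solr 4.x style solr.xml: the millisecond values for the two distribUpdate timeouts are illustrative assumptions, not recommendations from the thread, and the other settings that normally live in the solrcloud section (host, port, ZooKeeper timeout and so on) are omitted for brevity.

```xml
<!-- solr.xml (Solr 4.x style). Timeout values are illustrative assumptions, in milliseconds. -->
<solr>
  <solrcloud>
    <!-- Apply to inter-shard UPDATE requests only -->
    <int name="distribUpdateConnTimeout">${distribUpdateConnTimeout:15000}</int>
    <int name="distribUpdateSoTimeout">${distribUpdateSoTimeout:120000}</int>
  </solrcloud>

  <shardHandlerFactory name="shardHandlerFactory" class="HttpShardHandlerFactory">
    <!-- Apply to inter-shard SEARCH requests; a value of 0 is generally treated as "no timeout" -->
    <int name="socketTimeout">${socketTimeout:0}</int>
    <int name="connTimeout">${connTimeout:0}</int>
  </shardHandlerFactory>
</solr>
```

Because each value uses the ${name:default} form, it can be overridden at startup with a Java system property of the same name (for example -DdistribUpdateSoTimeout=...) without editing the file.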

Re: Solr hangs on distributed updates

2014-12-12 Thread Shalin Shekhar Mangar
Okay, that should solve the hung threads on the leader. When you stop the jetty service, it is a graceful shutdown where existing requests finish before the searcher thread pool is shut down completely. An EC2 terminate probably just kills the processes and leader threads just wait due to a lack

Re: Solr hangs on distributed updates

2014-12-12 Thread Peter Keegan
Btw, are the following timeouts still supported in solr.xml, and do they only apply to distributed search? <int name="socketTimeout">${socketTimeout:0}</int> <int name="connTimeout">${connTimeout:0}</int> Thanks, Peter On Fri, Dec 12, 2014 at 3:14 PM, Peter Keegan wrote: > No, I wasn't aware of these. I will give that a try. If I stop the S

Re: Solr hangs on distributed updates

2014-12-12 Thread Peter Keegan
No, I wasn't aware of these. I will give that a try. If I stop the Solr jetty service manually, things recover fine, but the hang occurs when I 'stop' or 'terminate' the EC2 instance. The Zookeeper leader reports a 15-sec timeout from the stopped node, and expires the session, but the Solr leader n

Re: Solr hangs on distributed updates

2014-12-12 Thread Shalin Shekhar Mangar
Do you have distribUpdateConnTimeout and distribUpdateSoTimeout set to reasonable values in your solr.xml? These are the timeouts used for inter-shard update requests. On Fri, Dec 12, 2014 at 2:20 PM, Peter Keegan wrote: > We are running SolrCloud in AWS and using their auto scaling groups to sp

Solr hangs on distributed updates

2014-12-12 Thread Peter Keegan
We are running SolrCloud in AWS and using their auto scaling groups to spin up new Solr replicas when CPU utilization exceeds a threshold for a period of time. All is well until the replicas are terminated when CPU utilization falls below another threshold. What happens is that index updates sent t