Re: Help with Solr 1.3 lockups?

Stephen Weiss Thu, 15 Jan 2009 14:13:45 -0800

I've been wondering about this one myself - most of the services we have installed work this way, if they crash out for whatever reason they restart automatically (Apache, MySQL, even the OS itself). Failures are detected and corrected by the load balancers and also in some cases by the machine itself (like with kernel panics). But not SOLR, and I'm not quite sure what to do to get it there. We use Jetty but it's the same story. It's not like it fails out all that often, but when it does it will still respond to HTTP requests (because Jetty itself is still working), which makes it a lot harder to detect a failure... I've tried writing something for nagios but the problem is that most responses solr would give to a request vary depending on index updates, so it's not like I can just take a checksum and compare it - and even then, it would only really alert us to the problem, we'd still have to go in and restart everything (personally I don't enjoy restarting servers from my blackberry nearly as much as I should).

I'd have to come up with something that can intelligently interpret the response and decide if the server's still working properly or not, and the processing time on that alone might make it too inefficient to run every few seconds, but at least with that we'd be able to tell the cluster "don't send anything to this server for now". Is there some really obvious way to track if a particular servlet is still running properly (in either Tomcat or Jetty, because if Tomcat has this I'd switch) and restart the container if it's not?
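
For what it's worth, here is a rough sketch of the kind of check I have in mind, in Python. Instead of checksumming the body, it only looks at the parts of the response that should not change with index updates: that it parses as JSON, that responseHeader.status is 0, and that numFound is a number. A short timeout treats a hung request as a failure. The URL, the match-all query and the function names are all assumptions for illustration, not anything tested against a real install, and the restart step is left out because it depends entirely on how the container is managed.

```python
import http.client
import json
import urllib.request

# Assumed URL: host, port and path depend on the local install.
SOLR_QUERY_URL = "http://localhost:8983/solr/select?q=*:*&rows=0&wt=json"


def looks_healthy(body):
    """Return True if body is a well-formed, successful Solr JSON response.

    Only index-independent fields are checked, so the result does not
    change when documents are added or removed.
    """
    try:
        data = json.loads(body)
        return (
            data["responseHeader"]["status"] == 0
            and isinstance(data["response"]["numFound"], int)
        )
    except (ValueError, KeyError, TypeError):
        # Not JSON, or JSON without the expected structure.
        return False


def check_solr(url=SOLR_QUERY_URL, timeout=5.0):
    """Fetch url and report health; timeouts and HTTP errors count as failures."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return looks_healthy(resp.read())
    except (OSError, http.client.HTTPException):
        return False


def nagios_exit_code(healthy):
    """Map the result to the Nagios plugin convention: 0 = OK, 2 = CRITICAL."""
    return 0 if healthy else 2
```

A Nagios wrapper would call sys.exit(nagios_exit_code(check_solr())), and the same check could drive whatever takes the server out of the cluster. With rows=0 the response stays tiny, so parsing it every few seconds should be cheap.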


Thanks!!

--
Steve

On Jan 15, 2009, at 1:57 PM, Jerome L Quinn wrote:

An even bigger problem is the fact that once Solr is wedged, it stays that way until a human notices and restarts things. The tomcat stays running
and there's no automatic detection that will either restart Solr, or
restart the Tomcat container.

Any suggestions on either front?

Thanks,
Jerry Quinn
