Re: Node down, but not out

2013-07-24 Thread jimtronic
Well, it seems to work. I wonder what the best way to test this would be? How can I remove a node from a cluster but still have it be up and running? Jim On Wed, Jul 24, 2013 at 12:10 PM, Jim Musil wrote: > Wow! Awesome. Give me a bit to try to plug this into my environment. > > The other way I

Re: Node down, but not out

2013-07-24 Thread jimtronic
Wow! Awesome. Give me a bit to try to plug this into my environment. The other way I was going to attempt this was to use the health check file option for the ping request handler. I would have to write a separate process in Python or something that would ping ZooKeeper for active nodes and if the
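A minimal sketch of the health-check-file idea described above, in Python. It assumes the ping request handler is configured with a health check file and reports the node as unavailable when that file is missing. The function name `sync_healthcheck_file`, the file name, and the `node_is_active` flag are hypothetical. How the flag gets computed (for example by asking ZooKeeper) is deliberately left out.

```python
import os
import tempfile


def sync_healthcheck_file(path, node_is_active):
    """Create the health check file when the node is active, remove it otherwise.

    Returns True if the file exists after the call, False if it does not.
    `node_is_active` is supplied by the caller; this sketch does not talk to
    ZooKeeper itself.
    """
    if node_is_active:
        # Only the presence of the file matters, not its content.
        with open(path, "a"):
            pass
        return True
    try:
        os.remove(path)
    except FileNotFoundError:
        pass  # already removed, nothing to do
    return False


if __name__ == "__main__":
    # Hypothetical location; a real setup would use the path configured
    # for the ping request handler.
    demo_path = os.path.join(tempfile.mkdtemp(), "server-enabled.txt")
    print(sync_healthcheck_file(demo_path, True), os.path.exists(demo_path))
    print(sync_healthcheck_file(demo_path, False), os.path.exists(demo_path))
```

A separate process could call this on a timer, so a load balancer that polls the ping handler stops routing to a node soon after it leaves the active state.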

Re: Node down, but not out

2013-07-24 Thread Timothy Potter
Hi Jim, Based on our discussion, I cooked up this solution for my book Solr in Action and would appreciate you looking it over to see if it meets your needs. The basic idea is to extend Solr's built-in PingRequestHandler to verify a replica is connected to ZooKeeper and is in the "active" state. T

Re: Node down, but not out

2013-07-23 Thread jimtronic
I think the best bet here would be a ping-like handler that simply returns the state of only this box in the cluster, something like /admin/state, which would return "down", "active", "leader", or "recovering". I'm not really sure where to begin, however. Any ideas? jim On Mon, Jul 22, 2013 at 12:5
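As a rough illustration of what such a state check might compute, here is a Python sketch that reduces one replica entry to a single status string. The field names (`node_name`, `state`, `leader`) follow the layout clusterstate.json is assumed to have in this Solr generation. The function `node_state` is hypothetical, not an existing Solr API.

```python
def node_state(replica, live_nodes):
    """Reduce one replica entry to "down", "leader", or its recorded state.

    `replica` is a dict with the assumed keys "node_name", "state" and,
    for leaders, "leader" set to the string "true". `live_nodes` is the
    collection of node names currently registered as live.
    """
    # A node that is not registered as live is treated as down,
    # whatever state was last recorded for its replica.
    if replica.get("node_name") not in live_nodes:
        return "down"
    state = replica.get("state", "down")
    if state == "active" and replica.get("leader") == "true":
        return "leader"
    return state


if __name__ == "__main__":
    live = {"host1:8983_solr"}
    replica = {"node_name": "host1:8983_solr", "state": "active", "leader": "true"}
    print(node_state(replica, live))
```

Checking the live-node list first is a design choice: the recorded state of a replica can lag behind reality when a process dies without updating it.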

Re: Node down, but not out

2013-07-22 Thread Timothy Potter
There is, but I couldn't get it to work in my environment on Jetty, see: http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201306.mbox/%3CCAJt9Wnib+p_woYODtrSPhF==v8Vx==mDBd_qH=x_knbw-bn...@mail.gmail.com%3E Let me know if you have any better luck. I had to resort to something hacky but wa

Re: Node down, but not out

2013-07-22 Thread jimtronic
I'm not sure why it went down exactly -- I restarted the process and lost the logs. (d'oh!) An OOM seems likely, however. Is there a setting for killing the process when Solr encounters an OOM? Thanks! Jim

Re: Node down, but not out

2013-07-22 Thread Timothy Potter
Why was it down? E.g., did it OOM? If so, the recommended approach is to kill the process on OOM rather than leave it in the cluster in a zombie state. I ask because I had similar issues when my nodes OOM'd. That said, you can get the /clusterstate.json which contains ZooKeeper's status of a node using a request lik
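Once the contents of /clusterstate.json are in hand, picking out one node's replicas might look like the Python sketch below. The JSON layout (collection, then "shards", then "replicas", each replica carrying "node_name" and "state") is an assumption about the format in this Solr generation. The sample document and the name `replicas_on_node` are made up for illustration, and fetching the JSON is not shown.

```python
import json


def replicas_on_node(cluster_state, node_name):
    """Return (collection, shard, state) for every replica hosted on node_name."""
    found = []
    for collection, coll_info in cluster_state.items():
        for shard, shard_info in coll_info.get("shards", {}).items():
            for replica in shard_info.get("replicas", {}).values():
                if replica.get("node_name") == node_name:
                    found.append((collection, shard, replica.get("state")))
    return found


# Made-up sample in the assumed layout, not output from a real cluster.
SAMPLE = json.loads("""
{
  "collection1": {
    "shards": {
      "shard1": {
        "replicas": {
          "core_node1": {"node_name": "host1:8983_solr", "state": "active", "leader": "true"},
          "core_node2": {"node_name": "host2:8983_solr", "state": "down"}
        }
      }
    }
  }
}
""")

if __name__ == "__main__":
    for row in replicas_on_node(SAMPLE, "host2:8983_solr"):
        print(row)
```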