Thanks a ton, Shalin. Now I have a very clear view of the state change mechanism. It will certainly help me stabilize my cluster issues. Thanks a lot.
Gopal

On Tue, Apr 28, 2015 at 8:16 PM, Shalin Shekhar Mangar <shalinman...@gmail.com> wrote:

> Comments inline:
>
> On Tue, Apr 28, 2015 at 3:00 PM, Gopal Jee <zgo...@gmail.com> wrote:
>
> > I am trying to understand the role of the overseer and the SolrCloud
> > state change mechanism. I tried finding resources on the web, but
> > without much luck. Can someone point me to some relevant doc or
> > explain? A few doubts I have:
> >
> > 1. The doc says the overseer updates clusterstate.json when a new node
> > joins. How does the overseer node know when a new node joins? Even the
> > overseer is one independent node.
>
> When a new node is loaded, it 'publishes' a 'state' message, for each
> local core that it loads, to the overseer queue. This message contains
> the node's base_url, state, core_name etc. This is how the overseer
> knows about a node.
>
> > 2. There is an overseer queue znode in ZooKeeper. Do all Solr servers
> > update their state in the overseer queue? What type of events are
> > published to the queue? Is this queue maintained inside ZooKeeper?
>
> This queue is maintained inside ZooKeeper. See
> http://zookeeper.apache.org/doc/trunk/recipes.html#sc_recipes_Queues
>
> All Solr servers publish a message to this queue when they change state.
> See the Overseer.processMessage method for the kind of messages
> supported at
> https://github.com/apache/lucene-solr/blob/trunk/solr/core/src/java/org/apache/solr/cloud/Overseer.java#L330
>
> > 3. When a node goes down and loses its connection with ZooKeeper, does
> > ZooKeeper update its state in clusterstate.json, or does it let the
> > overseer know about the lost connection and let it update
> > clusterstate.json?
>
> If a node shuts down gracefully then yes, it publishes a 'down' state
> message to the overseer queue and the overseer updates the cluster
> state.
>
> If a node is killed or crashes or loses its connection with ZK for
> whatever reason, then the ZooKeeper server waits until the ZK session
> expiry timeout to remove the node's corresponding entry from
> /live_nodes automatically.
>
> > 4. The docs say that when a node is in the down state, it cannot
> > cater to read or write requests. I tried issuing a get request on one
> > node which is showing as down in the Solr cloud panel, and I did get a
> > response (with all relevant documents). How is this happening? When a
> > node goes to the down state, does it not block its request handlers or
> > notify them not to cater to any get requests?
>
> When a replica goes into 'down' state then the other SolrCloud nodes as
> well as SolrJ clients will not route requests to that replica. Also, if
> the 'down' replica gets a request then it will forward the request to
> an 'active' replica automatically.
>
> No, it doesn't actively block requests if in 'down' state (because it
> doesn't need to).
>
> > Thanks in advance for helping me understand the SolrCloud internal
> > state change mechanism.
>
> --
> Regards,
> Shalin Shekhar Mangar.
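
For anyone reading this thread later, the state flow described in the answers can be sketched as a small in-memory toy model in Python. This is only an illustration and not Solr or ZooKeeper code: the class ToyCluster, its methods (join, publish, overseer_drain, session_expired, routable), the host URLs and the core names are all hypothetical. The routing rule (a replica is used only if its state is 'active' and its node is in /live_nodes) is an assumption of this sketch.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ToyCluster:
    """Hypothetical in-memory stand-in for the ZooKeeper data in this thread."""
    queue: deque = field(default_factory=deque)         # stands in for the overseer queue
    cluster_state: dict = field(default_factory=dict)   # stands in for clusterstate.json
    live_nodes: set = field(default_factory=set)        # stands in for /live_nodes

    def join(self, base_url):
        """A node connects and gets an entry under /live_nodes."""
        self.live_nodes.add(base_url)

    def publish(self, base_url, core_name, state):
        """A node publishes one 'state' message per local core to the queue."""
        self.queue.append(
            {"base_url": base_url, "core_name": core_name, "state": state}
        )

    def overseer_drain(self):
        """The overseer consumes messages in FIFO order and records each state."""
        while self.queue:
            msg = self.queue.popleft()
            self.cluster_state[(msg["base_url"], msg["core_name"])] = msg["state"]

    def session_expired(self, base_url):
        """Crash or lost ZK connection: only the /live_nodes entry goes away."""
        self.live_nodes.discard(base_url)

    def routable(self):
        """Assumed routing rule: state is 'active' and the node is still live."""
        return sorted(
            key for key, state in self.cluster_state.items()
            if state == "active" and key[0] in self.live_nodes
        )


# Hypothetical hosts and cores, for illustration only.
host1 = "http://host1:8983/solr"
host2 = "http://host2:8983/solr"
host3 = "http://host3:8983/solr"

cluster = ToyCluster()
for host, core in [(host1, "core1"), (host2, "core2"), (host3, "core3")]:
    cluster.join(host)
    cluster.publish(host, core, "active")
cluster.overseer_drain()

# host1 shuts down gracefully: it publishes 'down' and the overseer records it.
cluster.publish(host1, "core1", "down")
cluster.overseer_drain()

# host2 crashes: no message is published, its /live_nodes entry just expires.
cluster.session_expired(host2)

print(cluster.routable())
```

In this toy run, core2 still shows as 'active' in the recorded state after the crash, which is why the sketch also checks live_nodes before treating a replica as usable.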