Comments inline:

On Tue, Apr 28, 2015 at 3:00 PM, Gopal Jee <zgo...@gmail.com> wrote:

> I am trying to understand the role of overseer and solrCloud stateChange
> mechanism. I tried finding resources on web, but with not much luck.
> Can someone point me to some relevant doc or explain. Few doubts i have:
> 1. In doc, it says overseer updates clusterstate.json when a new node
> joins. How does overseer node knows when a new node joins. Even overseer is
> one independent node.
>

When a new node starts, it publishes a 'state' message to the overseer queue
for each local core that it loads. This message contains the node's base_url,
state, core_name etc. This is how the overseer learns about a node.

> 2. There is an overseer queue znode in zookeeper. Do all solr servers
> update its state in overseer queue? what type of events are published to
> the queue? is this queue maintained inside zookeeper?
>

This queue is maintained inside ZooKeeper. See
http://zookeeper.apache.org/doc/trunk/recipes.html#sc_recipes_Queues

All Solr servers publish a message to this queue when they change state. For
the kinds of messages supported, see the Overseer.processMessage method at
https://github.com/apache/lucene-solr/blob/trunk/solr/core/src/java/org/apache/solr/cloud/Overseer.java#L330

> 3. when a node goes down and lose connection with zookeeper, does
> zookeeper updates its state in clusterstate.json or it lets overseer know
> about lost connection and let it update clusterstate.json?
>

If a node shuts down gracefully, it publishes a 'down' state message to the
overseer queue and the overseer updates the cluster state. If a node is
killed, crashes, or loses its connection with ZK for whatever reason, then the
ZooKeeper server waits until the ZK session expiry timeout and then removes
the node's corresponding entry from /live_nodes automatically.

> 4. in docs it says that when a node is in down state, it can not cater to
> read or write request. I have tried issuing a get request on one node which
> is showing down in solr cloud panel and i did get response ( with all
> relevant documents). How is this happening? When a node goes to down state,
> does it not block its request handlers or notify it not to cater to any get
> requests?
>

When a replica goes into the 'down' state, the other SolrCloud nodes as well
as SolrJ clients will not route requests to that replica. Also, if the 'down'
replica does get a request, it forwards the request to an 'active' replica
automatically. So no, it doesn't actively block requests while in the 'down'
state (because it doesn't need to).

>
> Thanks in advance for helping me understand solrCloud internal state change
> mechanism.
>
> --
>

--
Regards,
Shalin Shekhar Mangar.
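
P.S. The flow described above can be sketched as a toy in-memory model, in
case that helps. This is not Solr's code: the class ToyOverseer and all of its
method names are made up for illustration, and the in-memory queue, map and
set merely stand in for the overseer queue znode, clusterstate.json and
/live_nodes. The routing check (state is 'active' and the node is still
listed in /live_nodes) is an assumption about how a client would combine the
two pieces of information.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

/** Toy model of the state-change flow; all names here are hypothetical. */
class ToyOverseer {
    /** Stand-in for a 'state' message published by one core. */
    static final class StateMessage {
        final String nodeName;
        final String coreName;
        final String state;

        StateMessage(String nodeName, String coreName, String state) {
            this.nodeName = nodeName;
            this.coreName = coreName;
            this.state = state;
        }
    }

    // Stand-in for the overseer queue znode.
    private final Queue<StateMessage> queue = new ArrayDeque<>();
    // Stand-in for clusterstate.json: core name -> last published state.
    private final Map<String, String> clusterState = new HashMap<>();
    // Which node hosts each core, taken from the published messages.
    private final Map<String, String> coreToNode = new HashMap<>();
    // Stand-in for the entries under /live_nodes.
    private final Set<String> liveNodes = new HashSet<>();

    /** A node connects to ZooKeeper and gets its /live_nodes entry. */
    public void nodeStarted(String nodeName) {
        liveNodes.add(nodeName);
    }

    /** A node publishes a state message for one of its cores. */
    public void publish(String nodeName, String coreName, String state) {
        queue.add(new StateMessage(nodeName, coreName, state));
    }

    /** The overseer drains the queue and updates the cluster state. */
    public void processQueue() {
        StateMessage m;
        while ((m = queue.poll()) != null) {
            clusterState.put(m.coreName, m.state);
            coreToNode.put(m.coreName, m.nodeName);
        }
    }

    /** Crash or lost connection: only the /live_nodes entry disappears. */
    public void sessionExpired(String nodeName) {
        liveNodes.remove(nodeName);
    }

    /** Assumed routing rule: state is 'active' AND the node is still live. */
    public boolean isRoutable(String coreName) {
        return "active".equals(clusterState.get(coreName))
                && liveNodes.contains(coreToNode.get(coreName));
    }

    public static void main(String[] args) {
        ToyOverseer overseer = new ToyOverseer();
        overseer.nodeStarted("node1");
        overseer.publish("node1", "core1", "active");
        overseer.processQueue();
        System.out.println(overseer.isRoutable("core1")); // prints "true"

        // Simulate a crash: no 'down' message is published.
        overseer.sessionExpired("node1");
        System.out.println(overseer.isRoutable("core1")); // prints "false"
    }
}
```

In this sketch a graceful shutdown would be publish(node, core, "down")
followed by processQueue(), whereas a crash only goes through
sessionExpired(), which is why the model keeps the published state and the
live-node set as two separate pieces of data.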