[ https://issues.apache.org/jira/browse/SOLR-14927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17270728#comment-17270728 ]
Ilan Ginzburg commented on SOLR-14927:
--------------------------------------

As I'm working on the child work item for distributing the cluster state updates, I realize that some changes to the Collection API might be required earlier than I hoped. See [comment on SOLR-14928|https://issues.apache.org/jira/browse/SOLR-14928?focusedCommentId=17270726&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17270726].

> Remove Overseer
> ---------------
>
> Key: SOLR-14927
> URL: https://issues.apache.org/jira/browse/SOLR-14927
> Project: Solr
> Issue Type: Improvement
> Security Level: Public (Default Security Level. Issues are Public)
> Components: SolrCloud
> Reporter: Ilan Ginzburg
> Assignee: Ilan Ginzburg
> Priority: Major
> Labels: cluster, collection-api, overseer, solrcloud, zookeeper
>
> This Jira is intended to capture sub-Jiras on the path to removing the Overseer
> component from SolrCloud and moving to all nodes being able to do the work
> currently done by Overseer.
> See detailed description in [this
> doc|https://docs.google.com/document/d/1u4QHsIHuIxlglIW6hekYlXGNOP0HjLGVX5N6inkj6Ok/].
> Copying (edited) from the above doc:
> The motivations for removing Overseer include:
> * Single-threaded state change is slow and doesn’t scale,
> * Communication between cluster nodes and the Overseer uses ZooKeeper as a
> queueing mechanism, which is not a good idea,
> * Nodes talking to Overseer (then Overseer talking to itself) via ZooKeeper
> is inefficient and adds latency,
> * Collection API scalability is poor, because not only does a single node
> process commands for all Collections, but it also depends on the
> single-threaded state change queue consumption,
> * The code supporting Overseer in SolrCloud is complex (election, queue
> management, recovery etc).
> The general idea is that there’s already a central point in the SolrCloud
> cluster and it’s ZooKeeper.
> It might not be necessary to have a second
> central point (Overseer) because nodes can interact directly with ZooKeeper
> and synchronize more efficiently by optimistic locking using “conditional
> updates” (a.k.a. compare-and-swap or CAS).
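
For readers unfamiliar with the "conditional updates" idea in the description, here is a minimal sketch of optimistic locking by compare-and-swap. It is an illustration only, not Solr code: `VersionedStore`, `BadVersionError`, `cas_update` and `add_replica` are hypothetical names, and the in-memory store merely imitates the way a ZooKeeper write can carry an expected version and be rejected when that version is stale.

```python
import threading


class BadVersionError(Exception):
    """Raised when the caller's expected version is stale (hypothetical name)."""


class VersionedStore:
    """Hypothetical in-memory stand-in for one versioned node: data plus a version."""

    def __init__(self, data):
        self._lock = threading.Lock()
        self._data = data
        self._version = 0

    def get(self):
        # Return the current data together with the version it was read at.
        with self._lock:
            return self._data, self._version

    def set(self, data, expected_version):
        # Conditional update: succeeds only if nobody wrote since the read.
        with self._lock:
            if expected_version != self._version:
                raise BadVersionError(expected_version)
            self._data = data
            self._version += 1
            return self._version


def cas_update(store, transform, max_retries=100):
    """Read, transform, write back; re-read and retry if another writer won."""
    for _ in range(max_retries):
        data, version = store.get()
        try:
            return store.set(transform(data), version)
        except BadVersionError:
            continue
    raise RuntimeError("too much contention")


def add_replica(state, name):
    # Build a new state object instead of mutating the one that was read.
    return {**state, "replicas": state["replicas"] + [name]}


# Eight concurrent writers update the same state without a central coordinator.
store = VersionedStore({"replicas": []})
threads = [
    threading.Thread(
        target=cas_update,
        args=(store, lambda s, n=f"node{i}": add_replica(s, n)),
    )
    for i in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
final_state, final_version = store.get()
```

No writer holds a lock across its read and write. A writer that loses the race re-reads the state and tries again, so every update is applied exactly once without being funnelled through a single queue consumer.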