[ https://issues.apache.org/jira/browse/SOLR-14928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17270952#comment-17270952 ]

Ilan Ginzburg commented on SOLR-14928:
--------------------------------------

Actually I take my last comment back, in the sense that with distributed 
cluster state updates the situation is no different from {{Overseer}}-based 
cluster state updates. Consider the case where {{ZkController}} (running on any 
node) changes a core (replica) state, or even its presence in the cluster 
state, due to one of the calls above.

With {{Overseer}}-based cluster state changes, the {{ZkController}} request 
goes to the Overseer, where {{ClusterStateUpdater}} makes the required change 
and updates the state it tracks. But the state that {{ClusterStateUpdater}} 
uses is *NOT* the state seen and acted upon by the Collection API commands, so 
a command subsequently executing on the Overseer might not see the cluster 
change if the watches haven't fired yet to update the state used by the 
Collection API (i.e. {{ZkStateReader.getClusterState()}}).
For the record, the state used by {{ClusterStateUpdater}} is initially a copy 
of {{ZkStateReader.getClusterState()}}, but it diverges after the first write; 
from then on that copy is used as a write-through cache by the 
{{ClusterStateUpdater}}.
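
To make the two views concrete, here is a minimal, self-contained sketch (toy 
maps only, none of this is Solr code; {{updaterView}} and {{apiView}} are 
made-up stand-ins for the updater's write-through cache and for 
{{ZkStateReader.getClusterState()}}) of how a reader can observe stale state 
until the watch fires:

{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class StaleViewSketch {

  // Stand-in for the ClusterStateUpdater's private copy: updated synchronously on every write.
  static final Map<String, String> updaterView = new ConcurrentHashMap<>();

  // Stand-in for ZkStateReader.getClusterState(): only refreshed when the "watch" fires later.
  static final Map<String, String> apiView = new ConcurrentHashMap<>();

  public static void main(String[] args) throws InterruptedException {
    updaterView.put("core_node1", "DOWN");
    apiView.put("core_node1", "DOWN");

    // The updater applies a state change: its own copy is current immediately.
    updaterView.put("core_node1", "ACTIVE");

    // A Collection API command reading its view right now still sees the old value.
    System.out.println("updater view: " + updaterView.get("core_node1")); // ACTIVE
    System.out.println("API view:     " + apiView.get("core_node1"));     // DOWN (stale)

    // Only once the watch fires is the API-side view refreshed.
    Thread watch = new Thread(() -> apiView.put("core_node1", "ACTIVE"));
    watch.start();
    watch.join();
    System.out.println("API view after watch: " + apiView.get("core_node1")); // ACTIVE
  }
}
{code}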

With distributed cluster state updates, the {{ZkController}} state change is 
executed locally, on the node where the {{ZkController}} is running. A 
Collection API call running at the same time (on the {{Overseer}}, given that 
at this stage the Collection API is not yet distributed) might not see the 
update until the watches on its cluster state fire.
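
For reference, the distributed update path described in this issue boils down 
to an optimistic-locking loop on the {{state.json}} znode. A rough sketch 
(assuming a connected {{org.apache.zookeeper.ZooKeeper}} client and a 
hypothetical {{applyChange}} that rewrites the JSON bytes; this is not the 
actual Solr code) would look like:

{code:java}
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.Stat;

public class StateJsonCasSketch {

  /** Read, modify and conditionally write state.json until the version check succeeds. */
  static void updateStateJson(ZooKeeper zk, String path) throws KeeperException, InterruptedException {
    while (true) {
      Stat stat = new Stat();
      byte[] current = zk.getData(path, false, stat);  // read data and current znode version
      byte[] updated = applyChange(current);           // hypothetical: apply the replica/state change
      try {
        zk.setData(path, updated, stat.getVersion());  // CAS: fails if someone wrote in between
        return;
      } catch (KeeperException.BadVersionException e) {
        // Another node updated state.json concurrently: retry from a fresh read.
      }
    }
  }

  static byte[] applyChange(byte[] stateJson) {
    return stateJson; // placeholder for the actual JSON manipulation
  }
}
{code}

The successful {{setData}} is what triggers the watches that eventually refresh 
{{ZkStateReader.getClusterState()}} on the other nodes, including the Overseer.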

*This is the exact same behavior with or without distributed cluster state 
changes.* There is a race window where stale data can be observed, and it is 
unchanged when distributing the cluster state updates. The only difference is 
whether the write that triggers the watch refreshing the cluster state used by 
the Collection API comes from the same node ({{Overseer}}-based cluster state 
updates) or from another node (distributed cluster state updates).

Note that when the Collection API command execution is distributed as well 
(future Jira), that race window will be closed.

> Remove Overseer ClusterStateUpdater
> -----------------------------------
>
>                 Key: SOLR-14928
>                 URL: https://issues.apache.org/jira/browse/SOLR-14928
>             Project: Solr
>          Issue Type: Sub-task
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: SolrCloud
>            Reporter: Ilan Ginzburg
>            Assignee: Ilan Ginzburg
>            Priority: Major
>              Labels: cluster, collection-api, overseer
>
> Remove the Overseer {{ClusterStateUpdater}} thread and associated Zookeeper 
> queue at {{<_chroot_>/overseer/queue}}.
> Change cluster state updates so that each (Collection API) command execution 
> does the update directly in Zookeeper using optimistic locking (Compare and 
> Swap on the {{state.json}} Zookeeper files).
> Following this change, cluster state updates would still happen only from 
> the Overseer node (that's where Collection API commands execute), but the 
> code will be ready for distribution once such commands can be executed by 
> any node (other work done in the context of parent task SOLR-14927).
> See the [Cluster State 
> Updater|https://docs.google.com/document/d/1u4QHsIHuIxlglIW6hekYlXGNOP0HjLGVX5N6inkj6Ok/edit#heading=h.ymtfm3p518c]
>  section in the Removing Overseer doc.


