This is an automated email from the ASF dual-hosted git repository.

vatamane pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/couchdb.git

commit b6ebcd107e0e2496ed7834600b25cb476e88dc78
Author: Nick Vatamaniuc <[email protected]>
AuthorDate: Thu May 29 15:28:54 2025 -0400

    Improve mem3 supervisor

    The answer to the "Order important?" question in the comments is "yes" for
    most of the workers. So fix their order and use the `rest_for_one` [1]
    strategy.

     * `mem3_events` gen_event should be started before all the others

     * `mem3_nodes` gen_server is needed so everyone can query `mem3:nodes()`
       from it

     * `mem3_sync_nodes` needs to run before `mem3_sync` and
       `mem3_sync_event_listener` so they can both call `mem3_sync_nodes:add/1`

     * `mem3_distribution` force connects nodes from `mem3:nodes()`, so start it
       before `mem3_sync` since `mem3_sync:initial_sync/0` expects the connected
       nodes to be there when calling `mem3_sync_nodes:add(nodes())`

     * `mem3_sync_event_listener` has to start after `mem3_sync` so it can call
       `mem3_sync:push/2`

     * `mem3_seeds` and `mem3_reshard_sup` can wait till the end as they will
       spawn background work that can go on for a while: seeding system dbs
       from other nodes or running resharding jobs.

    [1] https://www.erlang.org/doc/system/sup_princ.html#rest_for_one
---
 src/mem3/src/mem3_sup.erl | 30 +++++++++++++++++++++++++-----
 1 file changed, 25 insertions(+), 5 deletions(-)

diff --git a/src/mem3/src/mem3_sup.erl b/src/mem3/src/mem3_sup.erl
index 862ef6b50..96e8ac394 100644
--- a/src/mem3/src/mem3_sup.erl
+++ b/src/mem3/src/mem3_sup.erl
@@ -18,19 +18,39 @@ start_link() ->
     supervisor:start_link({local, ?MODULE}, ?MODULE, []).
 
 init(_Args) ->
+    % Some startup order constraints based on call dependencies:
+    %
+    % * mem3_events gen_event should be started before all the others
+    %
+    % * mem3_nodes gen_server is needed so everyone can call mem3:nodes()
+    %
+    % * mem3_sync_nodes needs to run before mem3_sync and
+    %   mem3_sync_event_listener, so they can both can call
+    %   mem3_sync_nodes:add/1
+    %
+    % * mem3_distribution force connects nodes from mem3:nodes(), so start it
+    %   before mem3_sync since mem3_sync:initial_sync/0 expects the connected
+    %   nodes to be there when calling mem3_sync_nodes:add(nodes())
+    %
+    % * mem3_sync_event_listener has to start after mem3_sync, so it can call
+    %   mem3_sync:push/2
+    %
+    % * mem3_seeds and mem3_reshard_sup can wait till the end, as they will
+    %   spawn background work that can go on for a while: seeding system dbs
+    %   from other nodes running resharding jobs
+    %
     Children = [
         child(mem3_events),
         child(mem3_nodes),
-        child(mem3_distribution),
-        child(mem3_seeds),
-        % Order important?
+        child(mem3_shards),
         child(mem3_sync_nodes),
+        child(mem3_distribution),
         child(mem3_sync),
-        child(mem3_shards),
         child(mem3_sync_event_listener),
+        child(mem3_seeds),
         child(mem3_reshard_sup)
     ],
-    {ok, {{one_for_one, 10, 1}, couch_epi:register_service(mem3_epi, Children)}}.
+    {ok, {{rest_for_one, 10, 1}, couch_epi:register_service(mem3_epi, Children)}}.
 
 child(mem3_events) ->
     MFA = {gen_event, start_link, [{local, mem3_events}]},
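
Illustration (not part of the commit): the reason child order matters under `rest_for_one` is that when one child terminates, the supervisor also terminates and restarts every child listed after it, while children listed before it are left alone. The sketch below is a minimal standalone demo of that behavior. The module name `rest_for_one_demo`, the child ids `a`, `b`, `c`, and all helper functions are hypothetical and do not exist in CouchDB; only the `{rest_for_one, 10, 1}` flags are taken from the diff above.

```erlang
-module(rest_for_one_demo).
-behaviour(supervisor).

-export([run/0, start_link/0, init/1, start_worker/0]).

start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

init(_Args) ->
    % Hypothetical children; they are started in list order: a, b, c
    Children = [child(a), child(b), child(c)],
    {ok, {{rest_for_one, 10, 1}, Children}}.

child(Id) ->
    #{id => Id, start => {?MODULE, start_worker, []}}.

% A trivial worker: a linked process that idles until told to stop
start_worker() ->
    {ok, spawn_link(fun worker_loop/0)}.

worker_loop() ->
    receive
        stop -> ok
    end.

% Kill the middle child and report, per child id, whether its pid changed
run() ->
    {ok, Sup} = start_link(),
    Before = pids(Sup),
    OldB = maps:get(b, Before),
    exit(OldB, kill),
    ok = wait_for_restart(Sup, b, OldB),
    After = pids(Sup),
    Result = [{Id, maps:get(Id, Before) =/= maps:get(Id, After)} || Id <- [a, b, c]],
    ok = stop(Sup),
    Result.

% Map of child id to its current pid (or restarting | undefined)
pids(Sup) ->
    maps:from_list([{Id, Pid} || {Id, Pid, _Type, _Mods} <- supervisor:which_children(Sup)]).

% Poll the supervisor until the named child is running under a new pid
wait_for_restart(Sup, Id, OldPid) ->
    case maps:get(Id, pids(Sup)) of
        Pid when is_pid(Pid), Pid =/= OldPid ->
            ok;
        _ ->
            timer:sleep(10),
            wait_for_restart(Sup, Id, OldPid)
    end.

% Shut the supervisor down without taking the calling process with it
stop(Sup) ->
    Ref = erlang:monitor(process, Sup),
    unlink(Sup),
    exit(Sup, shutdown),
    receive
        {'DOWN', Ref, process, Sup, _Reason} -> ok
    end.
```

Calling `rest_for_one_demo:run()` should show `a` keeping its pid while `b` and `c` get new ones. Mapped onto the new child list in the diff, a crash of `mem3_sync_nodes` would restart everything listed after it, but not `mem3_events`, `mem3_nodes` or `mem3_shards`.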
