Copilot commented on code in PR #13066:
URL: https://github.com/apache/apisix/pull/13066#discussion_r2901653091
##########
apisix/discovery/consul/init.lua:
##########
@@ -149,14 +185,30 @@ local function read_dump_services()
return
end
- all_services = entity.services
- log.info("load dump file into memory success")
+ for k, v in pairs(entity.services) do
+ local content, json_err = core.json.encode(v)
+ if content then
+ consul_dict:set(k, content)
Review Comment:
`read_dump_services()` writes into `ngx.shared.consul` but ignores the
return values from `consul_dict:set()`. If the shared dict is full (or an item
is too large), the set can fail or be forcible, silently dropping discovery
data and potentially causing 503s. Please capture `ok/err/forcible` and
log/handle failures consistently with `update_all_services()`.
```suggestion
local ok, err, forcible = consul_dict:set(k, content)
if not ok then
log.error("failed to set dump service into shared dict, key:
", k,
", error: ", err)
elseif forcible then
log.warn("set dump service into shared dict forcibly, key:
", k,
", some items may have been removed")
end
```
##########
apisix/discovery/consul/init.lua:
##########
@@ -611,35 +655,32 @@ end
function _M.init_worker()
local consul_conf = local_conf.discovery.consul
+ dump_params = consul_conf.dump
- if consul_conf.dump then
- local dump = consul_conf.dump
- dump_params = dump
-
- if dump.load_on_init then
- read_dump_services()
- end
- end
-
- events = require("apisix.events")
- events_list = events:event_list(
- "discovery_consul_update_all_services",
- "updating"
- )
-
- if 0 ~= ngx_worker_id() then
- events:register(discovery_consul_callback, events_list._source,
events_list.updating)
- return
- end
-
- log.notice("consul_conf: ", json_delay_encode(consul_conf, true))
default_weight = consul_conf.weight
sort_type = consul_conf.sort_type
-- set default service, used when the server node cannot be found
if consul_conf.default_service then
default_service = consul_conf.default_service
default_service.weight = default_weight
end
+
+ if process.type() ~= "privileged agent" then
+ return
+ end
+
+ -- flush stale data that may persist across reloads,
+ -- since consul_services is re-initialized empty
+ consul_dict:flush_all()
Review Comment:
`consul_dict:flush_all()` in privileged agent init_worker clears the shared
dict for *all* workers (including old workers during a graceful reload). This
can cause in-flight requests to suddenly lose discovery data and return 503s
until the next Consul fetch repopulates the dict, and `flush_all()` also
doesn't immediately free memory (can contribute to forcible evictions).
Consider avoiding a global flush on reload (e.g., generation/versioned keys and
swap) or at minimum only clearing expired entries / explicitly freeing memory
after flushing so existing workers aren't disrupted.
```suggestion
-- flush expired data that may persist across reloads,
-- since consul_services is re-initialized empty
consul_dict:flush_expired()
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]