Copilot commented on code in PR #13084:
URL: https://github.com/apache/apisix/pull/13084#discussion_r2915641166
##########
apisix/plugins/ai-proxy-multi.lua:
##########
@@ -409,6 +480,33 @@ end
function _M.access(conf, ctx)
+ if conf.allow_client_model_preference then
+ local body, err = core.request.get_body()
+ if body then
+ local request_body, decode_err = core.json.decode(body)
+ if request_body and not decode_err and request_body.models then
+ ctx.client_model_preference = match_client_models(
+ conf.instances, request_body.models)
+ core.log.info("client model preference: ",
+ core.json.delay_encode(ctx.client_model_preference))
+ request_body.models = nil
+ ngx.req.set_body_data(core.json.encode(request_body))
+ end
+ end
+ end
Review Comment:
The `models` field is only stripped when `allow_client_model_preference` is
enabled. This contradicts the PR description/docs (and the new tests) which
state that `models` is always stripped before forwarding upstream. As-is,
requests sent with `models` while the feature is disabled will forward an extra
`models` field to upstream providers.
Suggestion: always remove `models` from the JSON request body when present,
but only apply client-driven reordering when `allow_client_model_preference` is
true (i.e., strip unconditionally; reorder conditionally).
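A minimal sketch of that restructuring (illustrative only — it reuses the helper and field names from the diff above, not the final code):
```lua
-- Strip `models` whenever it is present; apply client-driven
-- reordering only behind the feature flag.
local body = core.request.get_body()
if body then
    local request_body, decode_err = core.json.decode(body)
    if request_body and not decode_err and request_body.models ~= nil then
        if conf.allow_client_model_preference then
            ctx.client_model_preference = match_client_models(
                conf.instances, request_body.models)
        end
        request_body.models = nil -- strip unconditionally
        ngx.req.set_body_data(core.json.encode(request_body))
    end
end
```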
##########
apisix/plugins/ai-proxy-multi.lua:
##########
@@ -299,6 +300,76 @@ local function get_instance_conf(instances, name)
end
+local function match_client_models(instances, models)
+ local ordered_names = {}
+ local matched = {}
+
+ for _, model_pref in ipairs(models) do
+ local target_model, target_provider
+ if type(model_pref) == "string" then
Review Comment:
`match_client_models()` assumes `models` is an array and iterates it with
`ipairs(models)`. If a client sends a non-array JSON value (e.g.
`"models":"gpt-4"`), `ipairs` will raise an error and the request will 500.
Suggestion: validate `request_body.models` is a table/array before calling
`match_client_models` (and/or harden `match_client_models` to return the
default ordering when `type(models) ~= "table"`). Also consider skipping
elements that are not a string or object.
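One possible shape for that hardening (a sketch: the default-order fallback mirrors the loop already at the end of `match_client_models`, and the element-type filter is illustrative):
```lua
-- Guard: a non-table `models` value (e.g. "models":"gpt-4") falls
-- back to the configured instance order instead of erroring in ipairs().
if type(models) ~= "table" then
    local ordered_names = {}
    for _, instance in ipairs(instances) do
        core.table.insert(ordered_names, instance.name)
    end
    return ordered_names
end

for _, model_pref in ipairs(models) do
    local t = type(model_pref)
    if t == "string" or t == "table" then
        -- existing string/object matching logic goes here;
        -- numbers, booleans and null entries are silently skipped
    end
end
```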
##########
apisix/plugins/ai-proxy-multi.lua:
##########
@@ -430,6 +528,25 @@ end
local function retry_on_error(ctx, conf, code)
+ if ctx.client_model_preference then
+ ctx.client_model_tried = ctx.client_model_tried or {}
+ ctx.client_model_tried[ctx.picked_ai_instance_name] = true
+ if (code == 429 and fallback_strategy_has(conf.fallback_strategy, "http_429")) or
+ (code >= 500 and code < 600 and
+ fallback_strategy_has(conf.fallback_strategy, "http_5xx")) then
+ local name, ai_instance, err = pick_preferred_instance(ctx, conf)
+ if err then
+ core.log.error("all preferred instances failed: ", err)
+ return 502
+ end
+ ctx.balancer_ip = name
+ ctx.picked_ai_instance_name = name
+ ctx.picked_ai_instance = ai_instance
+ return
+ end
+ return code
+ end
Review Comment:
The new client-preference retry path in `retry_on_error()` (falling back
through `ctx.client_model_preference` on 429/5xx) isn’t covered by the added
tests. Current test cases validate selection/reordering and stripping, but not
that a 429/5xx from the preferred instance causes the plugin to retry the next
preferred instance (and that it stops retrying on non-matching status codes).
Suggestion: add a test that makes the first preferred instance return
429/5xx and asserts the response comes from the next preferred instance
(similar to existing fallback tests in `ai-proxy-multi.balancer.t`).
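A rough outline of such a test block (a sketch only — section names follow the Test::Nginx conventions used in `ai-proxy-multi.balancer.t`; the mock-provider setup and exact assertions would need to match that harness):
```
=== TEST: client model preference falls back to next preferred instance on 5xx
--- config
    # route with ai-proxy-multi, allow_client_model_preference = true and
    # fallback_strategy including "http_5xx"; the mock upstream for the first
    # preferred instance returns 500, the second returns 200
--- request
POST /anything
{"models":["gpt-4","gpt-3.5-turbo"],"messages":[{"role":"user","content":"hi"}]}
--- error_code: 200
```
A sibling case sending a non-retryable status (e.g. 400) from the first instance should assert the error is returned as-is, covering the "stops retrying on non-matching status codes" branch.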
##########
apisix/plugins/ai-proxy-multi.lua:
##########
@@ -299,6 +300,76 @@ local function get_instance_conf(instances, name)
end
+local function match_client_models(instances, models)
+ local ordered_names = {}
+ local matched = {}
+
+ for _, model_pref in ipairs(models) do
+ local target_model, target_provider
+ if type(model_pref) == "string" then
+ target_model = model_pref
+ elseif type(model_pref) == "table" then
+ target_model = model_pref.model
+ target_provider = model_pref.provider
+ end
+
+ for _, instance in ipairs(instances) do
+ if not matched[instance.name] then
+ local inst_model = instance.options and instance.options.model
+ local matches = false
+ if target_provider then
+ matches = (instance.provider == target_provider
+ and inst_model == target_model)
+ else
+ matches = (inst_model == target_model)
+ end
+ if matches then
+ matched[instance.name] = true
+ core.table.insert(ordered_names, instance.name)
+ break
+ end
+ end
+ end
+ end
+
+ for _, instance in ipairs(instances) do
+ if not matched[instance.name] then
+ core.table.insert(ordered_names, instance.name)
+ end
+ end
+
Review Comment:
The fallback order appended for instances not listed by the client is based
on the raw `conf.instances` array order. However, the server-driven picker uses
priority-based ordering (see `priority_balancer` sorting by
`instance.priority`). If `conf.instances` is not already sorted, this can cause
lower-priority instances to be tried before higher-priority ones when the
client omits them.
Suggestion: when appending unmatched instances, order them consistently with
the server-side priority behavior (e.g. sort unmatched by `instance.priority`
descending, and keep stable order within the same priority).
```suggestion
local unmatched = {}
for idx, instance in ipairs(instances) do
if not matched[instance.name] then
core.table.insert(unmatched, {
name = instance.name,
priority = instance.priority or 0,
index = idx,
})
end
end
table.sort(unmatched, function(a, b)
if a.priority == b.priority then
return a.index < b.index
end
return a.priority > b.priority
end)
for _, item in ipairs(unmatched) do
core.table.insert(ordered_names, item.name)
end
```
##########
apisix/plugins/ai-proxy-multi.lua:
##########
@@ -409,6 +480,33 @@ end
function _M.access(conf, ctx)
+ if conf.allow_client_model_preference then
+ local body, err = core.request.get_body()
+ if body then
+ local request_body, decode_err = core.json.decode(body)
+ if request_body and not decode_err and request_body.models then
+ ctx.client_model_preference = match_client_models(
+ conf.instances, request_body.models)
+ core.log.info("client model preference: ",
+ core.json.delay_encode(ctx.client_model_preference))
+ request_body.models = nil
+ ngx.req.set_body_data(core.json.encode(request_body))
+ end
+ end
+ end
+
+ if ctx.client_model_preference then
+ local name, ai_instance, err = pick_preferred_instance(ctx, conf)
+ if err then
+ return 503, err
+ end
+ ctx.picked_ai_instance_name = name
+ ctx.picked_ai_instance = ai_instance
+ ctx.balancer_ip = name
+ ctx.bypass_nginx_upstream = true
+ return
Review Comment:
When client preference is used, instance selection bypasses `pick_target()`
and therefore bypasses the existing active health-check filtering done via
`fetch_health_instances()`/`healthcheck_manager`. This means a client can force
the plugin to try an instance that health checks currently mark as down,
changing behavior compared to the server-driven picker.
Suggestion: apply the same health-check availability filtering to the
preferred list (e.g. build an allowlist of healthy instance names before
iterating, or reuse the existing picker/checker status logic) so client-driven
ordering doesn't ignore configured health checks.
```suggestion
-- apply health-check availability filtering to the preferred list
local healthy_instances, health_err = fetch_health_instances(conf, ctx)
local use_client_preference = true
if not healthy_instances then
    core.log.warn("failed to fetch healthy instances for client model preference: ",
                  health_err, ", falling back to balancer selection")
    use_client_preference = false
else
    local healthy_set = {}
    for _, inst in ipairs(healthy_instances) do
        if inst.name then
            healthy_set[inst.name] = true
        end
    end
    local filtered_preference = {}
    for _, pref in ipairs(ctx.client_model_preference) do
        if pref.instance_name and healthy_set[pref.instance_name] then
            filtered_preference[#filtered_preference + 1] = pref
        end
    end
    if #filtered_preference == 0 then
        core.log.warn("no healthy instances match client model preference; ",
                      "falling back to balancer selection")
        use_client_preference = false
    else
        ctx.client_model_preference = filtered_preference
    end
end
if use_client_preference then
    local name, ai_instance, err = pick_preferred_instance(ctx, conf)
    if err then
        return 503, err
    end
    ctx.picked_ai_instance_name = name
    ctx.picked_ai_instance = ai_instance
    ctx.balancer_ip = name
    ctx.bypass_nginx_upstream = true
    return
end
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]