kjprice opened a new pull request, #13084:
URL: https://github.com/apache/apisix/pull/13084
## Description
When `allow_client_model_preference` is enabled in the plugin config,
clients can include a `models` array in the request body to specify their
preferred model/instance ordering. This lets multiple teams that share a single
gateway express different model preferences without requiring separate
routes.
### Changes
**Schema** (`apisix/plugins/ai-proxy/schema.lua`):
- Added `allow_client_model_preference` boolean field (default: `false`) to
`ai_proxy_multi_schema`
**Plugin logic** (`apisix/plugins/ai-proxy-multi.lua`):
- `match_client_models()` — matches client `models` entries against
configured instances by model name and optionally provider
- `pick_preferred_instance()` — sequential picker that respects client
ordering with rate-limiting awareness
- Modified `access()` to read request body, extract `models`, reorder
instances, and strip `models` before forwarding
- Modified `retry_on_error()` to fall back through client-preferred order on
HTTP 429/5xx
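
The fallback behavior described for `retry_on_error()` can be pictured with a minimal Python sketch. This is an illustration only, not the plugin's Lua code: `call_with_fallback`, `is_retryable`, and the `send` callback are hypothetical names, and the sketch assumes that each instance is tried at most once, in the client-preferred order.

```python
from typing import Callable, Optional, Sequence, Tuple


def is_retryable(status: int) -> bool:
    """Treat HTTP 429 and any 5xx status as a reason to try the next instance."""
    return status == 429 or 500 <= status <= 599


def call_with_fallback(
    ordered_instances: Sequence[str],
    send: Callable[[str], int],
) -> Tuple[Optional[str], Optional[int]]:
    """Try each instance in order and stop at the first non-retryable status.

    `send` is a hypothetical callback that forwards the request to the named
    instance and returns the HTTP status code it answered with.
    """
    last_status: Optional[int] = None
    for name in ordered_instances:
        status = send(name)
        if not is_retryable(status):
            return name, status
        last_status = status
    # Every instance answered with 429/5xx (or the list was empty).
    return None, last_status
```

If every instance returns a retryable status, the sketch reports the last status seen; what the plugin returns to the client in that case is not stated in this description.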
**Request body `models` field supports:**
- String shorthand: `["gpt-4", "deepseek-chat"]`
- Object form: `[{"provider": "openai", "model": "gpt-4"}]`
- Mixed: both in the same array
**Behavior:**
- Unrecognized model entries are silently ignored
- Instances not in the client's list are appended in original priority order
- The `models` field is always stripped before forwarding upstream
- When disabled (the default), the `models` field is ignored, so the change is
fully backward compatible
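
The matching and reordering rules above can be sketched in a few lines of Python. This is a hedged illustration rather than the plugin's Lua implementation: `matches` and `apply_client_preference` are hypothetical names, instances are flattened to hypothetical `name`/`provider`/`model` dicts, and the sketch assumes that a disabled flag leaves the request body untouched.

```python
from typing import Any, Dict, List, Tuple

Instance = Dict[str, str]  # hypothetical flattened shape: name, provider, model


def matches(entry: Any, instance: Instance) -> bool:
    """String entries match on model name; object entries may also pin a provider."""
    if isinstance(entry, str):
        return entry == instance["model"]
    if isinstance(entry, dict):
        if entry.get("model") != instance["model"]:
            return False
        provider = entry.get("provider")
        return provider is None or provider == instance["provider"]
    return False  # any other entry type is ignored


def apply_client_preference(
    body: Dict[str, Any], instances: List[Instance], allow: bool
) -> Tuple[Dict[str, Any], List[Instance]]:
    """Return (body to forward, instances in the order they should be tried)."""
    if not allow:
        # Assumption: with the feature off, nothing about the request changes.
        return body, list(instances)

    # Strip `models` so it is never forwarded upstream.
    forwarded = {k: v for k, v in body.items() if k != "models"}
    models = body.get("models")
    if not isinstance(models, list):
        return forwarded, list(instances)

    picked: List[int] = []  # indexes into `instances`, in client order
    for entry in models:
        for i, inst in enumerate(instances):
            if i not in picked and matches(entry, inst):
                picked.append(i)

    # Instances the client did not name keep their original priority order.
    rest = [i for i in range(len(instances)) if i not in picked]
    return forwarded, [instances[i] for i in picked + rest]
```

In this sketch a string entry such as `"gpt-4"` selects every configured instance serving that model, in configured order; whether the plugin does the same for duplicate model names is an assumption.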
**Docs** (`docs/en/latest/plugins/ai-proxy-multi.md`):
- Added `allow_client_model_preference` to attributes table
- Added `models` to request format table
- Added "Client-Driven Model Selection" example section
**Tests** (`t/plugin/ai-proxy-multi.client-model-preference.t`):
- Schema validation (default false, explicit true)
- String shorthand model preference
- Object form model preference
- Fallback to server priority without `models` field
- Unrecognized models ignored
- `models` field stripped from forwarded request
- Feature disabled when `allow_client_model_preference` is false
Resolves #13083