Baoyuantop commented on issue #13083:
URL: https://github.com/apache/apisix/issues/13083#issuecomment-4039295012

   Thank you for the detailed proposal. I think its direction is right, but a 
few technical details in your PR need to be confirmed.
   
   1. Health check bypassed: the client-preference path matches instances 
directly and skips health-check filtering. If the instance serving the 
user-specified model is already down, the request is still sent to it and 
fails with a 5xx error.
   
   2. No input validation for the `models` field: if the client sends a 
non-array value (such as a string or number), `match_client_models()` fails 
and the client gets a 500 error. A gateway should not fail with an internal 
error because of a malformed request.
   
   3. Appending unmatched instances breaks server priority: after matching, 
the remaining instances are appended without preserving their original 
priority order. The priority configured by the administrator is therefore 
ignored for any request that goes through client matching.
   
   4. The `models` field is removed from the request body only when the 
feature is enabled: it should be removed unconditionally. Otherwise this 
non-standard field is passed through to the upstream LLM, which could lead to 
unpredictable behavior.
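
   To make the four points concrete, here is a rough, language-neutral sketch 
in Python. It is not the plugin's Lua code. The helper names 
(`parse_client_models`, `order_instances`, `prepare_request`), the instance 
shape (`name`, `model`), the `is_healthy` callback, and the `feature_enabled` 
flag are all hypothetical. It also assumes `instances` is already sorted by the 
administrator's priority, and it ignores a malformed `models` value rather 
than rejecting the request, which is only one possible choice.

```python
def parse_client_models(body):
    """Return the client's model preferences, or [] when absent or malformed."""
    models = body.get("models")
    if not isinstance(models, list):
        # String, number, object, or missing: ignore it instead of raising.
        return []
    # Keep only string entries so later lookups cannot fail on odd types.
    return [m for m in models if isinstance(m, str)]


def order_instances(instances, client_models, is_healthy):
    """Order healthy instances: client-preferred first, then the rest.

    `instances` is assumed to be sorted by the administrator's priority.
    """
    # Point 1: filter on health before any client matching happens.
    healthy = [inst for inst in instances if is_healthy(inst)]

    # Position of each model in the client's list (first occurrence wins).
    rank = {}
    for position, model in enumerate(client_models):
        rank.setdefault(model, position)

    # sorted() is stable, so instances sharing a model keep the admin order.
    matched = sorted(
        (inst for inst in healthy if inst["model"] in rank),
        key=lambda inst: rank[inst["model"]],
    )
    # Point 3: unmatched instances keep their configured relative order.
    unmatched = [inst for inst in healthy if inst["model"] not in rank]
    return matched + unmatched


def prepare_request(body, instances, is_healthy, feature_enabled):
    """Return (body to forward upstream, ordered list of candidate instances)."""
    # Point 2: validation happens inside parse_client_models.
    prefs = parse_client_models(body) if feature_enabled else []
    # Point 4: strip `models` whether or not the feature is enabled.
    upstream_body = {k: v for k, v in body.items() if k != "models"}
    return upstream_body, order_instances(instances, prefs, is_healthy)
```

   With this shape, a request whose `models` value is a string simply falls 
back to the server-side order, and a preferred instance that is down never 
becomes a candidate.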


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
