vvvvvant opened a new issue, #12432:
URL: https://github.com/apache/apisix/issues/12432
### Description
I ran a test in which I added an unreachable node to an upstream that has
active health checks configured. The node was still reported as "healthy",
even though I can guarantee it is unreachable. In the forwarding logs for this
route, some requests sent to this node timed out or failed with a server error
(502) before being retried on a working node (200).
I then started a simple HTTP program locally (and confirmed it was reachable).
After sending a request, I saw no active health check requests arriving from
APISIX. When I stopped one of the processes, the APISIX log appeared to mark
that upstream node as unhealthy, but requests were still routed to it and only
reached the healthy node after the attempt on the stopped one had failed.
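One way to confirm whether APISIX is sending active probes is to have the backend log every request it receives. The following is a minimal sketch of such a backend, standing in for the "simple HTTP program" mentioned above. The handler class, `make_server`, the `HITS` counter and the response bodies are all hypothetical and not taken from the original setup.

```python
import sys
from collections import Counter
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical per-path request counter, so probe traffic is easy to spot.
HITS = Counter()


class ProbeLoggingHandler(BaseHTTPRequestHandler):
    """Answers every GET with 200 and logs who asked for what."""

    def do_GET(self):
        HITS[self.path] += 1
        # "/health" matches the http_path in the upstream config;
        # the response bodies themselves are arbitrary placeholders.
        body = b"ok" if self.path == "/health" else b"pong"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        # Include the User-Agent so health probes can be told apart
        # from ordinary client requests such as curl.
        headers = getattr(self, "headers", None)
        agent = headers.get("User-Agent", "-") if headers is not None else "-"
        sys.stderr.write("%s %s ua=%s %s\n" % (
            self.log_date_time_string(),
            self.client_address[0],
            agent,
            fmt % args,
        ))


def make_server(port):
    """Build a server on the given port (0 picks a free port)."""
    return HTTPServer(("0.0.0.0", port), ProbeLoggingHandler)
```

Running `make_server(8001).serve_forever()` and a second instance on 8002 would reproduce the two nodes. If active checks are running, `GET /health` lines should keep appearing even when no client traffic is sent.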
This is my upstream:
{
  "nodes": [
    {
      "host": "192.168.8.25",
      "port": 8001,
      "weight": 1
    },
    {
      "host": "192.168.8.25",
      "port": 8002,
      "weight": 1
    }
  ],
  "timeout": {
    "connect": 6,
    "send": 600,
    "read": 600
  },
  "type": "roundrobin",
  "checks": {
    "active": {
      "concurrency": 10,
      "healthy": {
        "http_statuses": [
          200,
          302,
          404
        ],
        "interval": 2,
        "successes": 1
      },
      "http_path": "/health",
      "timeout": 3,
      "type": "http",
      "unhealthy": {
        "http_failures": 2,
        "http_statuses": [
          429,
          500,
          501,
          502,
          503,
          504,
          505,
          499
        ],
        "interval": 2,
        "tcp_failures": 2,
        "timeouts": 2
      }
    }
  },
  "scheme": "http",
  "pass_host": "pass",
  "name": "test-health",
  "keepalive_pool": {
    "idle_timeout": 60,
    "requests": 1000,
    "size": 320
  }
}
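To make the expected behaviour explicit, here is a simplified model of how the `tcp_failures` and `successes` thresholds in this config are meant to flip a node's state. It is only an illustration of the documented threshold semantics under stated assumptions, not the actual lua-resty-healthcheck implementation. `NodeHealth` and its method names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class NodeHealth:
    """Toy state machine for one upstream node (illustrative only)."""

    tcp_failures_threshold: int = 2   # checks.active.unhealthy.tcp_failures
    successes_threshold: int = 1      # checks.active.healthy.successes
    healthy: bool = True
    tcp_failures: int = 0
    successes: int = 0

    def report_tcp_failure(self):
        # A failure resets the success streak and bumps the failure counter.
        self.successes = 0
        self.tcp_failures += 1
        if self.tcp_failures >= self.tcp_failures_threshold:
            self.healthy = False

    def report_success(self):
        # A success resets the failure streak and bumps the success counter.
        self.tcp_failures = 0
        self.successes += 1
        if self.successes >= self.successes_threshold:
            self.healthy = True


node = NodeHealth()
node.report_tcp_failure()   # 1/2: still healthy
node.report_tcp_failure()   # 2/2: expected to become unhealthy
print(node.healthy)
```

Under this model, a node whose failure counter has reached 2/2 should no longer be selected, which is the behaviour I expected to see.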
Route:
{
  "uri": "/*",
  "name": "rotuer-4-test-healthy-new",
  "desc": "Temporary test of active health checks",
  "methods": [
    "GET",
    "POST",
    "PUT",
    "DELETE",
    "PATCH",
    "HEAD",
    "OPTIONS",
    "CONNECT",
    "TRACE"
  ],
  "host": "healthcheck.com",
  "upstream_id": "575513659149648574",
  "enable_websocket": true,
  "status": 1
}
Logs:
192.168.8.25 - - [15/Jul/2025:17:41:11 +0800] healthcheck.com "GET /ping
HTTP/1.1" 200 18 0.004 "-" "curl/7.68.0" 192.168.8.25:8002 200 0.002
"http://healthcheck.com"
2025/07/15 17:41:17 [error] 2599#2599: *16525027 connect() failed (111:
Connection refused) while connecting to upstream, client: 192.168.8.25, server:
_, request: "GET /ping HTTP/1.1", upstream: "http://192.168.8.25:8002/ping",
host: "healthcheck.com"
2025/07/15 17:41:17 [warn] 2599#2599: *16525027 [lua] healthcheck.lua:1383:
log(): [healthcheck] (upstream#/apisix/upstreams/575513659149648574) unhealthy
TCP increment (2/2) for '(192.168.8.25:8002)' while connecting to upstream,
client: 192.168.8.25, server: _, request: "GET /ping HTTP/1.1", upstream:
"http://192.168.8.25:8002/ping", host: "healthcheck.com"
192.168.8.25 - - [15/Jul/2025:17:41:17 +0800] healthcheck.com "GET /ping
HTTP/1.1" 200 18 0.009 "-" "curl/7.68.0" 192.168.8.25:8002, 192.168.8.25:8001
502, 200 0.001, 0.004 "http://healthcheck.com"
2025/07/15 17:41:22 [error] 2327#2327: *16525984 connect() failed (111:
Connection refused) while connecting to upstream, client: 192.168.8.25, server:
_, request: "GET /ping HTTP/1.1", upstream: "http://192.168.8.25:8002/ping",
host: "healthcheck.com"
2025/07/15 17:41:22 [warn] 2327#2327: *16525984 [lua] healthcheck.lua:1383:
log(): [healthcheck] (upstream#/apisix/upstreams/575513659149648574) unhealthy
TCP increment (3/2) for '(192.168.8.25:8002)' while connecting to upstream,
client: 192.168.8.25, server: _, request: "GET /ping HTTP/1.1", upstream:
"http://192.168.8.25:8002/ping", host: "healthcheck.com"
192.168.8.25 - - [15/Jul/2025:17:41:22 +0800] healthcheck.com "GET /ping
HTTP/1.1" 200 18 0.006 "-" "curl/7.68.0" 192.168.8.25:8002, 192.168.8.25:8001
502, 200 0.001, 0.002 "http://healthcheck.com"
2025/07/15 17:41:27 [error] 2355#2355: *16527055 connect() failed (111:
Connection refused) while connecting to upstream, client: 192.168.8.25, server:
_, request: "GET /ping HTTP/1.1", upstream: "http://192.168.8.25:8002/ping",
host: "healthcheck.com"
2025/07/15 17:41:27 [warn] 2355#2355: *16527055 [lua] healthcheck.lua:1383:
log(): [healthcheck] (upstream#/apisix/upstreams/575513659149648574) unhealthy
TCP increment (4/2) for '(192.168.8.25:8002)' while connecting to upstream,
client: 192.168.8.25, server: _, request: "GET /ping HTTP/1.1", upstream:
"http://192.168.8.25:8002/ping", host: "healthcheck.com"
192.168.8.25 - - [15/Jul/2025:17:41:27 +0800] healthcheck.com "GET /ping
HTTP/1.1" 200 18 0.005 "-" "curl/7.68.0" 192.168.8.25:8002, 192.168.8.25:8001
502, 200 0.001, 0.002 "http://healthcheck.com"
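The counter values in these warnings (2/2, 3/2, 4/2) can be pulled out mechanically to show that the failure count keeps growing past its threshold. A small sketch follows. The regular expression is written against the log lines quoted above only, and `find_increments` and `over_threshold` are hypothetical helper names.

```python
import re

# Matches e.g.: unhealthy TCP increment (3/2) for '(192.168.8.25:8002)'
# The part before "(" inside the quotes is a hostname and may be empty.
INCREMENT_RE = re.compile(
    r"unhealthy (?P<kind>\w+) increment "
    r"\((?P<count>\d+)/(?P<limit>\d+)\) "
    r"for '[^(']*\((?P<addr>[^)]*)\)'"
)


def find_increments(log_text):
    """Return (kind, count, limit, addr) for each counter warning."""
    # The mail wraps long log lines, so collapse all whitespace first.
    flat = " ".join(log_text.split())
    return [
        (m["kind"], int(m["count"]), int(m["limit"]), m["addr"])
        for m in INCREMENT_RE.finditer(flat)
    ]


def over_threshold(increments):
    """Keep only entries whose counter has reached or passed its limit."""
    return [entry for entry in increments if entry[1] >= entry[2]]
```

Applied to the log excerpt above, every warning for 192.168.8.25:8002 is at or over the limit of 2, yet the access log still shows that node being tried first.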
The health check status nevertheless reports all nodes as healthy:
{"name":"/apisix/upstreams/575513659149648574","nodes":[{"counter":{"success":0,"tcp_failure":0,"timeout_failure":0,"http_failure":0},"port":8001,"status":"healthy","hostname":"192.168.8.25","ip":"192.168.8.25"},{"counter":{"success":0,"tcp_failure":0,"timeout_failure":0,"http_failure":0},"port":8002,"status":"healthy","hostname":"192.168.8.25","ip":"192.168.8.25"}],"type":"http"}
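For comparing this status output against the error log, the payload can be reduced to one row per node. This is a rough sketch that assumes only the field names visible in the JSON above. `summarize_nodes` is a hypothetical helper, and how the payload is fetched is left out.

```python
import json


def summarize_nodes(payload):
    """Return (address, status, total_failures) for each node."""
    doc = json.loads(payload)
    rows = []
    for node in doc.get("nodes", []):
        counter = node.get("counter", {})
        # Sum tcp_failure, timeout_failure and http_failure; skip "success".
        failures = sum(v for k, v in counter.items() if k.endswith("failure"))
        rows.append(("%s:%s" % (node["ip"], node["port"]), node["status"], failures))
    return rows
```

For the output above, both nodes come back as `healthy` with a failure total of 0, which does not match the `unhealthy TCP increment` warnings logged for port 8002.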
### Environment
- APISIX version (run `apisix version`): v3.9.1
- Operating system (run `uname -a`): CentOS 7.9
- OpenResty / Nginx version (run `openresty -V` or `nginx -V`): 1.25.3.1
- etcd version, if relevant (run `curl
http://127.0.0.1:9090/v1/server_info`): 3.4.13
- APISIX Dashboard version, if relevant: 3.0.1