Hello!

On Tue, Sep 18, 2018 at 06:02:46AM -0400, domleb wrote:

> While running a load test that injects 10k TPS across 3 Nginx instances, we
> are seeing spikes of errors where Nginx returns HTTP 502 and logs the
> message 'no live upstreams while connecting to upstream'.  There are no
> other errors logged, e.g. connection errors.
>
> Also, we have a single upstream virtual IP (we use iptables to balance load
> across the backend) and according to the docs the upstream should never be
> marked as down in this case:
>
> 'If there is only a single server in a group, max_fails, fail_timeout and
> slow_start parameters are ignored, and such a server will never be
> considered unavailable'
>
> Testing locally with our config confirms this and I cannot reproduce the 'no
> live upstreams while connecting to upstream' message when simulating
> connection and read errors with a single upstream.
>
> To debug I tried enabling debug logs but under load that degraded
> performance too much.  I also traced the worker process with strace and
> didn't find any socket or other errors during the 502 spike.
>
> I was able to create this issue on Nginx 1.12.2 and 1.15.3.
>
> So given that we don't see any source error and we have a single upstream,
> I'm interested to know what other scenarios could result in a 502 with the
> log message 'no live upstreams while connecting to upstream'?

Could you please show the upstream configuration you are using?

With a single server in the upstream block, the "no live upstreams"
error may happen if:

- the server is marked "down" in the configuration, or

- the server reached the max_conns limit.

Also note that "a single server" does not apply to cases when
there is a single hostname which resolves to multiple IP
addresses (this defines multiple servers at once).

--
Maxim Dounin
http://mdounin.ru/
_______________________________________________
nginx mailing list
nginx@nginx.org
http://mailman.nginx.org/mailman/listinfo/nginx
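
For illustration only, the three cases mentioned above might look roughly like the following upstream blocks. This is a sketch, not the poster's actual configuration (which was not shown): the upstream names, the address 10.0.0.10:8080, the hostname backend.example.internal and the max_conns value are all made-up placeholders.

```nginx
# Case 1 (hypothetical): the only server is explicitly marked "down",
# so there is never a live server to pick.
upstream backend_down {
    server 10.0.0.10:8080 down;
}

# Case 2 (hypothetical): the only server has a connection cap; once
# 100 connections are active, further requests find no live upstream.
upstream backend_capped {
    server 10.0.0.10:8080 max_conns=100;
}

# Case 3 (hypothetical): one "server" line, but if the name resolves to
# several addresses at configuration load, each address is a separate
# server, so max_fails and fail_timeout are no longer ignored.
upstream backend_multi {
    server backend.example.internal:8080;
}
```

Comparing the real upstream block against these patterns (and checking what the hostname, if any, resolves to) should help narrow down which case applies.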