Hi,

We have observed that Tomcat doesn't gracefully close keep-alive connections during shutdown. Tomcat waits for requests that are already in progress to complete, but once those are done, it closes all connections immediately, irrespective of any configured keepAliveTimeout. This causes problems for some HTTP clients, especially in Kubernetes-like environments when scaling down pods. In that situation, shutdown can only be graceful if a client whose connection is unexpectedly closed retries the request on a fresh connection, and not all clients do this.
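For reference, the keepAliveTimeout mentioned above is the connector attribute that controls how long an idle keep-alive connection is kept open. A minimal embedded-Tomcat sketch that sets it (the port and the concrete values are just examples; the same attributes can also be set on the Connector element in server.xml):

import org.apache.catalina.LifecycleException;
import org.apache.catalina.connector.Connector;
import org.apache.catalina.startup.Tomcat;

public class KeepAliveTimeoutExample {
    public static void main(String[] args) throws LifecycleException {
        Tomcat tomcat = new Tomcat();
        tomcat.setPort(8080); // example port

        // keepAliveTimeout: how long (in ms) an idle keep-alive connection is
        // kept open before Tomcat closes it. maxKeepAliveRequests caps the
        // number of requests per connection. Values below are examples only.
        Connector connector = tomcat.getConnector();
        connector.setProperty("keepAliveTimeout", "60000");
        connector.setProperty("maxKeepAliveRequests", "100");

        // Web application / context setup omitted for brevity.
        tomcat.start();
        tomcat.getServer().await();
    }
}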
I would think that an entirely graceful shutdown sequence, in the presence of keep-alive connections, would work like the following:

1) Server receives the shutdown request.
2) Server immediately stops accepting new connections (already happens).
3) Server completes all requests already in progress (already happens).
4) New behavior: If new requests come in on already established keep-alive connections, those are processed, but "Connection: close" is returned so the client knows the connection can no longer be used. At most one more request is therefore processed on each existing connection (a rough sketch of the idea is appended at the end of this mail).
5) New behavior: When all keep-alive connections are gone, shutdown proceeds. If connections are still open once the keepAliveTimeout has passed, no requests can have been received on them during the shutdown period (otherwise they would have been closed in #4). And since Tomcat returned the keep-alive timeout value to the client when the connection was set up, the client should know that the connection is no longer usable. From this point on it is therefore safe for Tomcat to close the remaining connections.
6) The rest of the server shutdown continues.

Br,
M. Thiim

---

Background: The current behavior is problematic in e.g. a Kubernetes environment because there is no way to drain traffic when scaling down pods. While Kubernetes will immediately stop forwarding new connections to the terminating pod, it can't do anything about already established connections, including keep-alive connections. Those can be partially handled by defining a preStop delay, so that Kubernetes waits a certain amount of time between receiving the stop request (and stopping new connections) and actually shutting down the application. However, even this doesn't solve the problem: even if the preStop delay is configured to be longer than the keepAliveTimeout, there can still be open connections, because the keepAliveTimeout only expires if the connection is idle for the whole period. A client can therefore avoid the timeout by simply continuing to use the connection (as will happen in a system with constant traffic). So when the preStop delay ends, there can still be many open connections, and once Tomcat receives the shutdown signal it will currently just close them. This causes problems for many different HTTP client implementations. Some can be fixed by configuring those HTTP clients (e.g. the maximum lifetime of a keep-alive connection), but that requires fixing every HTTP client that might call the service (ingress proxies, pod-to-pod communication, etc.), and it seems the problem is better addressed through enforcement on the server.
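To make #4 a bit more concrete, here is a rough, application-level sketch of the intended behavior as a servlet filter. The class name, the startDraining() hook and the jakarta.servlet package (Tomcat 10+) are my own assumptions for illustration; a proper fix would of course live in Tomcat's connector/protocol handling, where the keep-alive decision is actually made.

import java.io.IOException;
import java.util.concurrent.atomic.AtomicBoolean;

import jakarta.servlet.Filter;
import jakarta.servlet.FilterChain;
import jakarta.servlet.ServletException;
import jakarta.servlet.ServletRequest;
import jakarta.servlet.ServletResponse;
import jakarta.servlet.http.HttpServletResponse;

// Sketch of proposed step 4 at the servlet-filter level: once draining has
// started, requests on established keep-alive connections are still served,
// but the response tells the client not to reuse the connection.
public class DrainingFilter implements Filter {

    // Flipped by whatever receives the shutdown / preStop signal (assumed hook).
    private static final AtomicBoolean DRAINING = new AtomicBoolean(false);

    public static void startDraining() {
        DRAINING.set(true);
    }

    @Override
    public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain)
            throws IOException, ServletException {
        if (DRAINING.get() && response instanceof HttpServletResponse) {
            // Signal that this connection should not carry further requests.
            ((HttpServletResponse) response).setHeader("Connection", "close");
        }
        chain.doFilter(request, response);
    }
}

In a Kubernetes setup, startDraining() could for example be triggered from the preStop hook. I haven't verified that every connector configuration honors an application-set "Connection: close" header, so this is only meant to illustrate the client-visible behavior proposed above, not to serve as a complete workaround.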