Hi Abe,
Thank you for your questions! Replies below.
Why OverloadedException is insufficient
--------
There are two different situations here, and they call for different client
behavior:
* Desired client behavior on a *real* OverloadedException:
Retry the request on the next node (optionally with some temporary
backoff for that node), because the overload may be transient and the node
is still healthy.
* Desired client behavior on a node that is intentionally shutting down:
Stop sending *new* requests on that connection, allow in-flight requests
to complete if possible, and begin reconnecting with backoff.
Today, OverloadedException doesn’t let drivers consistently implement the
second behavior. Drivers typically treat it as a per-request failure: they
retry that request on another node, but the pool can still send subsequent
requests to the same node/connection. If the node is actually in the
process of going away, those subsequent requests can still hit that
connection and end up as client-visible timeouts.
Therefore we need GRACEFUL_DISCONNECT as a deterministic, connection-local
drain signal: it lets the driver immediately transition that specific
connection into draining mode, avoiding failures/retries/timeouts in the
first place.
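To make that concrete, here is a minimal sketch of the per-connection state
a driver could keep once it receives GRACEFUL_DISCONNECT. It is plain Java
with made-up names (DrainableConnection is not a real driver class); the
point it illustrates is that the drain decision happens per connection,
before a request is ever written to the socket, rather than after a failure
comes back:

import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch only: hypothetical class, not part of the driver.
final class DrainableConnection {
  private final AtomicBoolean draining = new AtomicBoolean(false);
  private final AtomicInteger inFlight = new AtomicInteger(0);

  // Called by the pool before routing a request to this connection.
  boolean tryAcquire() {
    if (draining.get()) {
      return false; // the pool must pick another connection
    }
    inFlight.incrementAndGet();
    return true;
  }

  // Called when the response (or failure) for an in-flight request arrives.
  void release() {
    if (inFlight.decrementAndGet() == 0 && draining.get()) {
      closeAndScheduleReconnect();
    }
  }

  // Called when GRACEFUL_DISCONNECT arrives on this connection.
  void onGracefulDisconnect() {
    draining.set(true); // no new requests; in-flight ones may finish
    if (inFlight.get() == 0) {
      closeAndScheduleReconnect();
    }
  }

  private void closeAndScheduleReconnect() {
    // close the socket and hand the node back to the reconnection policy
    // (idempotent in a real implementation)
  }
}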
Appendix [1] includes a Simulacron example illustrating this: the server
first responds with OverloadedException and then goes away; the client's
subsequent requests sent on the same connection time out.
Server handling of requests after GRACEFUL_DISCONNECT
--------
I think the server must assume it can still receive some requests right
after emitting the event (e.g. requests already in flight on the wire / in
buffers, or the event being delayed). I don’t see a robust way for the
server to perfectly classify “sent before vs after the client observed the
event” at the protocol level.
The safest behavior I can think of is:
* stop accepting new connections immediately, preventing the race where a
newly established connection never receives GRACEFUL_DISCONNECT at all,
* emit GRACEFUL_DISCONNECT,
* keep processing requests that arrive on already-established subscribed
connections during a bounded grace/drain window,
* then force-close anything that hasn’t drained by the deadline.
That minimizes mid-flight interruption (and late retries deep into the
client latency budget), while still giving operators a hard upper bound on
how long shutdown can be deferred.
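As a rough sketch of that ordering (the Listener and ClientConnection
interfaces below are stand-ins for illustration, not the actual
native-transport classes):

import java.time.Duration;
import java.util.List;

// Sketch only: hypothetical types, not the real server implementation.
final class GracefulShutdown {

  interface Listener {
    void stopAccepting();
  }

  interface ClientConnection {
    boolean subscribedToEvents();
    void sendGracefulDisconnect();
    boolean hasInFlightRequests();
    void forceClose();
  }

  static void run(Listener listener, List<ClientConnection> connections, Duration drainWindow)
      throws InterruptedException {
    // 1. Stop accepting new connections first, so no connection can be
    //    created after the event is emitted and miss it entirely.
    listener.stopAccepting();

    // 2. Emit GRACEFUL_DISCONNECT to every connection that registered for
    //    protocol events.
    for (ClientConnection c : connections) {
      if (c.subscribedToEvents()) {
        c.sendGracefulDisconnect();
      }
    }

    // 3. Keep serving requests already on the wire or in buffers during a
    //    bounded drain window.
    long deadline = System.nanoTime() + drainWindow.toNanos();
    while (System.nanoTime() < deadline
        && connections.stream().anyMatch(ClientConnection::hasInFlightRequests)) {
      Thread.sleep(100);
    }

    // 4. Hard upper bound: force-close whatever has not drained in time.
    connections.forEach(ClientConnection::forceClose);
  }
}

The bounded window in step 3 is what gives operators the hard upper bound
mentioned above.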
Thanks again for the feedback!
Regards,
Jane
Appendix [1] Simulacron example
--------
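(Imports and the JUnit rule wiring are omitted for brevity; the snippet
assumes the driver's Simulacron test infrastructure plus AssertJ's
catchThrowable/assertThat helpers.)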
private static final SimulacronRule SIMULACRON_RULE =
    new SimulacronRule(ClusterSpec.builder().withNodes(1));

private static final SessionRule<CqlSession> SESSION_RULE =
    SessionRule.builder(SIMULACRON_RULE).build();

@Test
public void should_timeout_when_node_returns_overloaded_then_delays() {
  // This test reproduces the case where a server shutting down throws an
  // OverloadedException and the driver still sends subsequent requests to
  // that node, which end up timing out.
  String query = "SELECT * FROM test";

  // First request: the node returns OverloadedException (simulating a
  // server that is shutting down).
  SIMULACRON_RULE.cluster().prime(when(query).then(overloaded("Overloaded")));

  SimpleStatement st =
      SimpleStatement.builder(query)
          .setTimeout(Duration.ofSeconds(2))
          .setConsistencyLevel(DefaultConsistencyLevel.ONE)
          .build();

  // Execute the first request - should fail with OverloadedException.
  Throwable firstError = catchThrowable(() -> SESSION_RULE.session().execute(st));
  assertThat(firstError).isInstanceOf(OverloadedException.class);

  // Clear the first prime and prime no response (the node stays silent).
  SIMULACRON_RULE.cluster().clearPrimes(true);
  SIMULACRON_RULE.cluster().prime(when(query).then(noResult()));

  // Execute the second request - it is still sent to the same node and
  // times out.
  Throwable secondError = catchThrowable(() -> SESSION_RULE.session().execute(st));
  assertThat(secondError)
      .isInstanceOf(DriverTimeoutException.class)
      .hasMessageContaining("Query timed out after PT2S");
}
On Fri, Jan 16, 2026 at 12:28 PM Abe Ratnofsky <[email protected]> wrote:
> Thanks for the CEP Jane!
>
> Could you elaborate on why the current OverloadedException behavior is
> insufficient in 5.x (CASSANDRA-19534)? If a server instance can’t accept a
> new request, it signals that immediately to the client for retry against
> another host, then eventually the client should receive a NODE_DOWN event
> on the control connection. This does mean that clients need to have queries
> marked as idempotent in order for them to retry, and even non-idempotent
> queries are safe to retry when they fail in this way. In the past, we’ve
> discussed a new exception hierarchy that allows the server to indicate
> whether a non-idempotent query would be safe to retry.
>
> If we can address that drawback of OverloadedException for non-idempotent
> queries, are there any other drawbacks of the current approach?
>
> In your proposal, the server will still need to handle new requests issued
> from clients after GRACEFUL_DISCONNECT is sent, particularly if the EVENT
> is delayed or dropped. If those requests are going to get processed, we’ll
> either have to continue deferring shutdown, or interrupt them and trigger
> retries later into the client’s latency budget.