> > A small question from my side: I see that the underlying assumption is that Sidecar is able to query Cassandra instances before bouncing/recognizing the bounce. What if it could not communicate with the Cassandra instance (e.g., binary protocol disabled, C* process experiencing issues, or C* process starting as part of a new DC)?
>
> This would fall under scenario #2 in the Error Handling <https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-53%3A+Cassandra+Rolling+Restarts+via+Sidecar#CEP53:CassandraRollingRestartsviaSidecar-ErrorHandling> section of the CEP. If a Sidecar instance can’t communicate with Cassandra, after a configurable timeout and amount of retries, the Sidecar instance should mark the job as failed.
Returning an error is indeed appropriate for communication failures, but it means operators would need to find alternative ways to bounce nodes and could not rely on these APIs consistently. A hard dependency on Cassandra instances would also create challenges for startup scenarios where operators need to start C* instances for the first time, such as creating a brand new cluster, replacing an existing node, or starting nodes in a new datacenter.

If the responsibility for updating Cassandra tables were delegated to a plugin interface, with a default implementation that communicates bounces through those tables, others could implement their own solutions more freely (potentially bypassing C* table updates for bounce communication altogether). I understand that this CEP focuses on restart operations (stop + start), but I believe an interface and plugin pattern could address a broader range of operational challenges.
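To make that more concrete, here is a rough sketch of what such an extension point could look like. Every name below is hypothetical and for illustration only; it is not an existing Sidecar API, just the shape of the idea:

import java.util.Map;
import java.util.UUID;

// Hypothetical extension point for how a Sidecar instance records and observes
// bounce state. The default implementation could keep the behaviour proposed in
// CEP-53 (persisting state in the sidecar_internal tables), while alternative
// implementations could use an external store and avoid talking to the local
// Cassandra process at all, e.g. for first-time starts or new-DC bootstraps.
public interface BounceStateStore
{
    enum NodeOperationStatus { PENDING, BOUNCING, SUCCEEDED, FAILED }

    /** Record that the given node is about to be bounced as part of a job. */
    void markBouncing(UUID jobId, UUID nodeId);

    /** Record the outcome of the bounce for the given node. */
    void markComplete(UUID jobId, UUID nodeId, NodeOperationStatus status);

    /** Per-node state for a job, which peer Sidecars can consult before proceeding. */
    Map<UUID, NodeOperationStatus> nodeStates(UUID jobId);
}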
On Mon, Sep 8, 2025 at 10:22 PM Andrés Beck-Ruiz <[email protected]> wrote:

> Our original thinking was that we could store node UUIDs but return IP addresses so that operators can better identify Cassandra nodes, but I recognize that it could also cause confusion as the address is not a persistent node identifier. I agree that the benefit of unifying the API and schema by using node UUIDs exclusively outweighs the cost of API ergonomics. I can update the CEP to reflect this unless there are differing opinions.
>
> Best,
> Andrés
>
> On Mon, Sep 8, 2025 at 5:03 AM Sam Tunnicliffe <[email protected]> wrote:
>
>> Hi Andrés, thanks for this comprehensive CEP.
>>
>> I have a query about the representation of nodes in the various tables and responses.
>>
>> In the sidecar_internal.cluster_ops_node_state table, "We store the node UUID instead of IP address to guarantee that the correct Cassandra nodes are restarted in case any node moves to another host.". However, in the main sidecar_internal.cluster_ops table the nodes participating in the operation are represented as a list of IP addresses. Likewise, in the sample HTTP responses nodes always appear to be identified by their address, not ID.
>>
>> It's true that operators are more accustomed to dealing with addresses than IDs but it's equally the case that the address is not a persistent node identifier, as noted in this CEP. For that reason, in C* trunk the emphasis is shifting to lean more on node IDs, so I feel it would be a retrograde step to introduce new APIs which include only addresses. Could the schema and API responses in this CEP be unified in some way, either by using IDs exclusively or by extending the node representation to something that can incorporate both an ID and address?
>>
>> Among other things, Accord relies on persistent node IDs for correctness and a unique and persistent identifier is now assigned to every node as a prerequisite for it joining the cluster. This ID is a simple integer and is encoded into the node's Host ID, which is the UUID available in various system tables, gossip state and nodetool commands. The initial thinking behind encoding in the Host ID was to maintain compatibility with existing tooling but at some point we will start to expose the ID directly in more places. Right now there is a vtable which shows the IDs directly, system_views.cluster_metadata_directory.
>>
>> Thanks,
>> Sam
>>
>> > On 8 Sep 2025, at 02:36, Andrés Beck-Ruiz <[email protected]> wrote:
>> >
>> > Hello all,
>> >
>> > Thanks for the feedback. I agree with the suggestions that operation state storage should be pluggable, with an initial implementation leveraging Cassandra as we have proposed. I have made edits to the Distributed Restart Design section in the CEP to reflect this.
>> >
>> > > As for the API, I think the question that needs to be answered is if it is worthwhile to have a distinction between single-node operations and cluster-wide operations. For example, if I wanted to restart a single node using the API proposed in CEP-53, I could submit a restart job with a single node in the “nodes” list. This provides API simplicity at the cost of ergonomics. It also means that all inter-sidecar communication would go through the proposed cluster_ops_node_state table. Personally, I think these are acceptable tradeoffs to provide a unified API for operations that is simpler for a user or operator to use and learn.
>> >
>> > I agree that we should provide a unified API that does not distinguish between single-node and cluster-wide operations. I think the benefit of API simplicity from a development and client perspective outweighs the cost of ergonomics.
>> >
>> > > A small question from my side: I see that the underlying assumption is that Sidecar is able to query Cassandra instances before bouncing/recognizing the bounce. What if it could not communicate with the Cassandra instance (e.g., binary protocol disabled, C* process experiencing issues, or C* process starting as part of a new DC)?
>> >
>> > This would fall under scenario #2 in the Error Handling section of the CEP. If a Sidecar instance can’t communicate with Cassandra, after a configurable timeout and amount of retries, the Sidecar instance should mark the job as failed.
>> >
>> > > 1. Have we considered introducing the concept of a datacenter alongside cluster? I imagine there will be cases where a user wants to perform a rolling restart on a single datacenter rather than across the entire cluster.
>> >
>> > I think this could be added in the future, but for this initial implementation an operator would submit the nodes that are part of a datacenter to restart that datacenter. I prefer providing a unified API that can handle single-node and cluster (or datacenter) wide operations over separate APIs which might be easier to use in isolation but complicate development and discoverability.
>> >
>> > > 2. Do we see this framework extending to other cluster- or datacenter-wide operations, such as scale-up/scale-down operations, or backups/restores, or nodetool rebuilds run as part of adding a new datacenter?
>> >
>> > Yes, our goal with this design is that it is extensible for future operations, as well as currently supported operations (such as node decommissions) that already exist in Sidecar. In the initial Cassandra storage implementation, all inter-sidecar communication and operation tracking could occur in the proposed cluster_ops_node_state table.
>> >
>> > > The design seems focused on cluster/availability signals (ring stable, peers up), which is a great start, but doesn’t mention pluggable workload signals like: 1) compaction load (nodetool compactionstats) 2) netstats activity (nodetool netstats) 3) hints backlog / streaming pending flushes or memtable pressure.
>> > > Since restarting during heavy compaction/hints can add risk, are these kinds of workload-aware checks in scope for the MVP, or considered future work?
>> >
>> > I agree that the health check should be pluggable as well; this was also proposed in CEP-1. For the first iteration of rolling restarts, we are thinking of providing a health check implementation that checks for all other Cassandra peers being up, and future work can add more robust health checks.
>> >
>> > Best,
>> > Andrés
>> >
>> > On Fri, Aug 29, 2025 at 3:56 PM Andrés Beck-Ruiz <[email protected]> wrote:
>> > Hello everyone,
>> >
>> > We would like to propose CEP 53: Cassandra Rolling Restarts via Sidecar (https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-53%3A+Cassandra+Rolling+Restarts+via+Sidecar)
>> >
>> > This CEP builds off of CEP-1 and proposes a design for safe, efficient, and operator friendly rolling restarts on Cassandra clusters, as well as an extensible approach for persisting future cluster-wide operations in Cassandra Sidecar. We hope to leverage this infrastructure in the future to implement upgrade automation.
>> >
>> > We welcome all feedback and discussion. Thank you in advance for your time and consideration of this proposal!
>> >
>> > Best,
>> > Andrés and Paulo
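One more illustrative sketch, this time for the pluggable health check mentioned in the quoted thread. Again, the names are hypothetical and this is only the shape of the idea, not a proposed API: the first implementation would be the "all other peers up" check, and workload-aware signals could be added as further implementations later.

import java.util.UUID;
import java.util.concurrent.CompletableFuture;

// Hypothetical sketch of a pluggable pre-bounce health check. The initial
// implementation could simply verify that all other Cassandra peers are up;
// later implementations could also consult workload signals such as
// compaction load, hints backlog or streaming activity before allowing the
// next node in the job to be bounced.
public interface RestartHealthCheck
{
    /** @return a future completing with true when it is considered safe to bounce the given node */
    CompletableFuture<Boolean> isSafeToProceed(UUID jobId, UUID nodeId);
}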
