A distinction that resonated with me:

• Control Plane = Sidecar
• Data Plane = DB

I think that's *directionally* true, but there's no objective definition of what qualifies as one plane or the other.
On top of that, the sidecar is in a unique position where it supports functionality across multiple versions of C*, so if you're looking to implement something with a unified interface that may differ in implementation across multiple versions of C* (say, if you're running a large fleet w/different versions in it), there's pressure there driving certain functionality into the sidecar.

On Thu, Oct 9, 2025, at 1:42 PM, Isaac Reath wrote:
> I don't have too much of an insight on CQL as a whole, but I can offer my
> views on Sidecar & the CQL Management API.
>
> In terms of a rubric for what belongs in Sidecar, I think taking inspiration
> from CEP-1, it should be functionality needed to manage a Cassandra cluster.
> My perspective on how this fits in with the CQL Management API (and authors
> of CEP-38 please correct me if I'm wrong), is that CEP-38 is looking to offer
> an alternative to JMX for single-node management operations, whereas Sidecar
> is focused on holistic cluster-level operations.
>
> Using rebuild as an example from CEP-38: a user can invoke the CQL Management
> API to run a rebuild for a single node, but would rely on Sidecar to rebuild
> an entire datacenter, with Sidecar in turn calling the CQL Management API on
> individual nodes. Similarly, a user could use the CQL Management API to
> update the configurations which are able to be changed without a restart
> (similar to how nodetool setconcurrency does today), but Sidecar would
> provide a single interface to update all configurations, including those
> which require restarts. Additionally, Sidecar will support operations which
> may not involve the CQL Management API at all, such as live instance
> migration as laid out in CEP-40.
>
> Happy to hear other perspectives on this.
> Isaac
>
> On Wed, Oct 8, 2025 at 3:02 PM Joel Shepherd wrote:
>> To clarify, since I was pinged directly about this ...
>>
>> It's not my intent to impede any of the efforts listed below and I
>> apologize if it sounded otherwise.
>>
>> I am deeply curious and interested in the eventual scope/charter of CQL,
>> CQL Admin APIs, and Sidecar. A little overlap is probably unavoidable
>> but IMO it would be detrimental to the project overall to not have a
>> clear scope for each area. If those scopes have already been defined,
>> I'd love pointers to decisions so I can get it straight in my head. If
>> they haven't and the community is comfortable with that, okay too. If
>> they haven't and anyone else is a little squirmy about that, what's the
>> right way to drive a conversation?
>>
>> Thanks -- Joel.
>>
>> On 10/7/2025 4:57 PM, Joel Shepherd wrote:
>> >
>> > Thanks for the clarifications on CEP-38, Maxim: I actually got some
>> > insights from your comments below that had slipped by me while reading
>> > the CEP.
>> >
>> > I want to fork the thread a bit, so breaking this off from the CEP-38
>> > DISCUSS thread.
>> >
>> > If I can back away a bit and squint ... It seems to me that there are
>> > three initiatives floating around at the moment that could make
>> > Cassandra more awesome and manageable, or make it confusing and complex.
>> >
>> > 1) Patrick McFadin's proposal (as presented at CoC) to align CQL
>> > syntax/semantics closely with PostgreSQL's. I haven't heard anyone
>> > strongly object, but have heard several expressions of surprise. Maybe
>> > something is already in the works, but I'd love to see and discuss a
>> > proposal for this, so there's consensus that it's a good idea and (if
>> > needed) guidelines on how to evolve CQL in that direction.
>> >
>> > 2) CQL management API (CEP-38): As mentioned in the CEP, it'll take
>> > some time to implement all the functionality that could be in scope of
>> > this CEP.
>> > I wonder if it'd be beneficial to have some kind of rubric
>> > or guidelines for deciding what kind of things make sense to manage
>> > via CQL, and what don't. For example, skimming through the PostgreSQL
>> > management commands, many of them look like they could be thin
>> > wrappers over SQL executed against "private" tables and views in the
>> > database. I don't know that that is how they are implemented, but many
>> > of the commands are ultimately just setting a value, or reading and
>> > returning values that could potentially be managed in tables/views of
>> > some sort. (E.g., like Cassandra virtual tables). That seems to fit
>> > pretty neatly with preserving SQL as a declarative, data independent
>> > language for data access, with limited side-effects. Is that a useful
>> > filter for determining what kinds of things can be managed via CQL
>> > management, and which should be handled elsewhere? E.g., is a
>> > filesystem operation like nodetool scrub a good candidate for CQL
>> > management or not? (I'd vote not: interested in what others think.)
>> >
>> > 3) Cassandra Sidecar: Like the CQL management API, I wonder if it'd be
>> > beneficial to have a rubric for deciding what kinds of things make
>> > sense to go into Sidecar. The recent discussion about CEP-55
>> > (generated role names) landed on implementing the functionality both
>> > as a CQL statement and as a Sidecar API. There's also activity around
>> > using Sidecar for rolling restarts, backup and restore, etc.: control
>> > plane activities that are largely orthogonal to interacting with the
>> > data. Should operations that are primarily generating or manipulating
>> > data be available via Sidecar to give folks the option of invoking
>> > them via CQL or HTTP/REST, or would Sidecar benefit from having a more
>> > narrowly scoped charter (e.g. data-agnostic control plane operations only)?
>> >
>> > I think all of these tools -- CQL, CQL Management API and Sidecar --
>> > will be more robust, easier to use, and easier to maintain if we have
>> > a consistent way of deciding where a given feature should live, and a
>> > minimal number of choices for accessing the feature. Orthogonal
>> > controls. Since Sidecar and CQL Management API are pretty new, it's a
>> > good time to clarify their charter to ensure they evolve well
>> > together. And to get consensus on the long-term direction for CQL.
>> >
>> > Let me know if I can help -- Joel.
>> >
>> >
>> > On 10/7/2025 12:22 PM, Maxim Muzafarov wrote:
>> >> Hello Folks,
>> >>
>> >>
>> >> First of all, thank you for your comments. Your feedback motivates me
>> >> to implement these changes and refine the final result to the highest
>> >> standard. To keep the vote thread clean, I'm addressing your questions
>> >> in the discussion thread.
>> >>
>> >> The vote is here:
>> >> https://lists.apache.org/thread/zmgvo2ty5nqvlz1xccsls2kcrgnbjh5v
>> >>
>> >>
>> >> = The idea: =
>> >>
>> >> First, let me focus on the general idea, and then I will answer your
>> >> questions in more detail.
>> >>
>> >> The main focus is on introducing a new API (CQL) to invoke the same
>> >> node management commands. While this has an indirect effect on tooling
>> >> (cqlsh, nodetool), the tooling itself is not the main focus. The scope
>> >> (or Phase 1) of the initial changes is narrowed down to the API
>> >> only, to ensure the PR remains reviewable.
>> >>
>> >> This implies the following:
>> >> - the nodetool commands and the way they are implemented won't change
>> >> - the nodetool commands will be accessible via CQL; their
>> >>   implementation (and execution locality) will not change
>> >> - this change introduces ONLY a new way of how management commands
>> >>   will be invoked
>> >> - this change is not about the tooling (cqlsh, nodetool); it will,
>> >>   however, help them evolve
>> >> - these changes are being introduced as an experimental API with a
>> >>   feature flag, disabled by default
>> >>
>> >>
>> >> = The answers: =
>> >>
>> >>> how will the new CQL API behave if the user does not specify a hostname?
>> >> The changes only affect the API part; improvements to the tooling will
>> >> follow later. The command is executed on the node that the client is
>> >> connected to.
>> >> Note also that the port differs from 9042 (default) as a new
>> >> management port will be introduced. See examples here [1].
>> >>
>> >> cqlsh 10.20.88.164 11211 -u myusername -p mypassword
>> >> nodetool -h 10.20.88.164 -p 8081 -u myusername -pw mypassword
>> >>
>> >> If a host is not specified, the cli tool will attempt to connect to
>> >> localhost. I suppose.
>> >>
>> >>
>> >>> My understanding is that commands like nodetool bootstrap typically run
>> >>> on a single node.
>> >> This is correct; however, as I don't control the implementation of the
>> >> command, it may actually involve communication with other nodes. This
>> >> is actually not part of this CEP. I'm only reusing the commands we
>> >> already have.
>> >>
>> >>
>> >>> Will we continue requiring users to specify a hostname/port explicitly,
>> >>> or will the CQL API be responsible for orchestrating the command safely
>> >>> across the entire cluster or datacenter?
>> >> It seems that you are confusing the API with the tooling. The tooling
>> >> (cqlsh, nodetool) will continue to work as it does now.
>> >> I am only adding a new way in which commands can be invoked: CQL.
>> >> Orchestration, however, is the subject of other projects. Cassandra
>> >> Sidecar?
>> >>
>> >>
>> >>> It might, however, be worth verifying that the proposed CQL syntax
>> >>> aligns with PostgreSQL conventions, and adjusting it if needed for
>> >>> cross-compatibility.
>> >> It's news to me that we're targeting PostgreSQL as the main
>> >> reference and drifting towards invoking management operations the
>> >> same way. I'm inclined to agree that the syntax should probably be
>> >> similar, more or less, however.
>> >>
>> >> We are introducing a new CQL syntax in a minimal and isolated manner.
>> >> CEP-38 defines a small set of management-oriented CQL statements
>> >> (EXECUTE COMMAND / DESCRIBE COMMAND) that can be used to match all
>> >> existing nodetool commands at once, introducing further aliases as an
>> >> option. This eliminates the need to introduce a new antlr grammar for
>> >> each management operation.
>> >>
>> >> The command execution syntax is the main thing that users interact
>> >> with in this CEP, but I'm taking a more relaxed approach to it for the
>> >> following reasons:
>> >> - this is the tip of the iceberg; the unification of the JMX, CQL and
>> >>   a possible REST API for Cassandra is the priority;
>> >> - the feature will be in an experimental state in the major release,
>> >>   and we need to collect real feedback from users and their deployments;
>> >> - aliasing will be used for some important commands like
>> >>   compaction and bootstrap.
>> >>
>> >> Taking all of the above into account, I still think it's important to
>> >> reach an agreement, or at least to avoid objections.
>> >> So, I've checked the PostgreSQL and SQL standards to identify areas of
>> >> alignment. The latter I think is relatively easy to support as
>> >> aliases.
>> >>
>> >>
>> >> The syntax proposed in the CEP:
>> >>
>> >> EXECUTE COMMAND forcecompact WITH keyspace=distributed_test_keyspace
>> >> AND table=tbl AND keys=["k4", "k2", "k7"];
>> >>
>> >> Other Cassandra-style options that I had previously considered:
>> >>
>> >> 1. EXECUTE COMMAND forcecompact (keyspace=distributed_test_keyspace,
>> >> table=tbl, keys=["k4", "k2", "k7"]);
>> >> 2. EXECUTE COMMAND forcecompact WITH ARGS {"keyspace":
>> >> "distributed_test_keyspace", "table": "tbl", "keys":["k4", "k2",
>> >> "k7"]};
>> >>
>> >> With the postgresql context [2] it could look like:
>> >>
>> >> COMPACT (keys=["k4", "k2", "k7"]) distributed_test_keyspace.tbl;
>> >>
>> >> The SQL-standard [3][4] procedural approach:
>> >>
>> >> CALL system_mgmt.forcecompact(
>> >>   keyspace => 'distributed_test_keyspace',
>> >>   table => 'tbl',
>> >>   keys => ['k4','k2','k7'],
>> >>   options => { "parallel": 2, "verbose": true }
>> >> );
>> >>
>> >>
>> >> Please let me know if you have any questions, or if you would like us
>> >> to arrange a call to discuss all the details.
>> >>
>> >>
>> >> [1]https://www.instaclustr.com/support/documentation/cassandra/using-cassandra/connect-to-cassandra-with-cqlsh/
>> >> [2]https://www.postgresql.org/docs/current/sql-vacuum.html
>> >> [3]https://en.wikipedia.org/wiki/Stored_procedure?utm_source=chatgpt.com#Implementation
>> >> [4]https://www.postgresql.org/docs/9.3/functions-admin.html
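
For anyone who wants to experiment with the `EXECUTE COMMAND ... WITH k=v AND ...` form quoted above, here is a minimal sketch of assembling such a statement on the client side. It only builds a string; it does not talk to Cassandra. The helper names (`build_execute_command`, `render_value`) are hypothetical, and the rendering rules (bare identifiers for plain values, JSON-style lists) are an assumption inferred from the single forcecompact example in the thread. The feature is described as experimental, so the real grammar and quoting rules may well differ.

```python
import json
import re

# Assumption: command names, option names and bare values are simple
# identifiers, as in the forcecompact example quoted in the thread.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def render_value(value):
    """Render one option value in the style of the quoted example."""
    if isinstance(value, (list, tuple)):
        # Lists appear as ["k4", "k2", "k7"] in the example; JSON output matches.
        return json.dumps(list(value))
    if isinstance(value, str) and _IDENT.match(value):
        return value
    # Anything else is rejected rather than guessed at.
    raise ValueError(f"unsupported option value: {value!r}")


def build_execute_command(command, **options):
    """Build an EXECUTE COMMAND statement string (hypothetical helper)."""
    if not _IDENT.match(command):
        raise ValueError(f"invalid command name: {command!r}")
    parts = []
    for name, value in options.items():
        if not _IDENT.match(name):
            raise ValueError(f"invalid option name: {name!r}")
        parts.append(f"{name}={render_value(value)}")
    statement = f"EXECUTE COMMAND {command}"
    if parts:
        statement += " WITH " + " AND ".join(parts)
    return statement + ";"


if __name__ == "__main__":
    print(build_execute_command(
        "forcecompact",
        keyspace="distributed_test_keyspace",
        table="tbl",
        keys=["k4", "k2", "k7"],
    ))
```

Rejecting anything that is not a plain identifier or a list is deliberately conservative: the thread does not spell out escaping rules, so the sketch refuses unknown shapes instead of inventing them.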
