On 9/17/2025 1:21 AM, Štefan Miklošovič wrote:
On Wed, Sep 17, 2025 at 2:17 AM Joel Shepherd <[email protected]> wrote:
Could I make a suggestion? Well, I will make a suggestion :-) ,
but if it's not useful then feel free to ignore it.
Could we talk a bit about how users/operators would work with the
CREATE ROLE features you're proposing?
Somewhat related to that ... is there any need for role
"stability" across clusters: e.g. I want to create a role that can
access existing tables but not create/drop tables or keyspaces,
and for my own sanity I want that role to have the same name on
every cluster I operate. Do I have to implement a custom role name
generator to do that, or is that common enough functionality that
it should be supportable by the tooling I'm using to manage my
clusters?
I do not think we have such a requirement for "stability". If you had
this requirement, you would not use the feature we are discussing
here; you would create the roles manually. I also do not think that having
the same name everywhere is a good idea in general. A username is
security sensitive as well.
We can agree to disagree on this. :-) I generally don't think names
should be considered especially sensitive but am really looking at this
more from how end-users are going to work with the capability.
The use-case as I understand it is that there are organizations
that have or are going to create large numbers of clusters (say >
3), and they would appreciate some automation around creating role
names and credentials for all those clusters. The proposal is to
extend the CREATE ROLE statement to enable the database to
generate those names and credentials automatically, including
persisting them in the database itself.
One thing I'm wondering about is what kind of tooling those
organizations are likely to be using for creating/managing all
those clusters. Are they going to be scripting, or are they going
to be using some third-party tooling like Terraform,
CloudFormation, Puppet, etc.? If they're using tooling like that,
which is going to be a more natural fit: making role/password
generation available through CQL, or through Sidecar APIs, or ...
? I don't have an opinion at the moment so that's not a rhetorical
question. I'd actually like to reason through what's going to work
best for the folks who actually have to manage tons of clusters
all day long.
I do not see why we should have a ton of logic / functionality outside
of Cassandra for doing basic things. I think that Cassandra is
notorious for its "do it yourself" approach and I think _that_
is the primary impediment to broader adoption, not whether we dare to
introduce CREATE GENERATED ROLE or not. The focus on usability is
completely missing. For a lot of the things you want, you need
"tooling" that you then have to take care of, and so on. People are
sick of it. They just want to do the thing in the most efficient and
time-saving manner.
This isn't an either-or question. I'm not posing "CREATE GENERATED ROLE"
vs infra-as-code (IAC) support. I'm poking at the best way for the two
to work together. Because I think/hope that most people who run large
clusters and/or a lot of clusters (or really a lot of instances of any
kind of service) use some flavor of IAC. There is a lot more than
Cassandra to manage: there are the hosts, disk in some form, networking,
OS, config, keys, schema, etc. If I already have a tool to manage all
the infra, it'd be nice for Cassandra to play nicely with that tooling
so I can do my basic cluster setup via automation as well. That
doesn't exclude me from putting down my IAC tool and continuing on to do
Cassandra configuration in Cassandra if I wish ... but in my mind having
to jump between tools (including cqlsh) to configure different aspects
of all the things involved in standing up my cluster is not a usability
improvement ... especially if I have to do it a lot.
So I'm trying to shed some light on the Sidecar and/or CQL debate by
asking how people are going to be using this functionality "at scale"
(where efficient and time-saving may look very different from ad hoc use)
and if there's any benefit to API access via Sidecar vs access via CQL.
(TBH, I'm actually leaning towards your CQL proposal because I think the
attack surface is actually smaller than it is with letting Sidecar
execute CQL on the API caller's behalf.)
Thanks -- Joel.
When I was introduced to this community for the first time, around
2015-16, I remember somebody on the mailing list
complaining that "repair should be automatic", "that should be
provided", "this should be natively in". People have been seeing this for
years. It took just 9 years to finally introduce automatic repairs. Thank god
for the repair people finally doing that. They are worth their weight in
gold. But the response back then was "well, if you need it you need
to write it yourself, there is no one-size-fits-all, you need to
take care of that yourself". Just imagine that. This was a
genuinely meant response. How are we going to make this popular if
everything beyond the trivial is left to the end user to figure out? Who
in their right mind is going to put up with that? People just want to turn
on the thing and not think too much about it anymore.
I don't have strong opinions on CQL vs Sidecar, but I think one
way to frame the debate is to look at which will work best with
the tooling that people already use to manage large numbers of
clusters.
Thanks -- Joel.
On 9/16/2025 3:15 PM, Štefan Miklošovič wrote:
Oh crap, what feedback! If nothing else, this is a lesson to
everybody: if you are tired of waiting or impatient and want to move
quickly, the surest way to get fast feedback is to
propose your ideas and then boldly proclaim you are going to do
something. The universe will mysteriously take care of finding
somebody who will reject it. Because people are not always
interested in agreeing. A lot of the time, they take action only
when they disagree and are confronted with it. So don't be afraid to
take some flak as soon as possible!
On Tue, Sep 16, 2025 at 9:05 PM Patrick McFadin
<[email protected]> wrote:
Thanks Mick, I'm just digging into this more after a long
week of travel.
Generally, I'm -1 for adding more custom syntax. Another
concern of mine is adding control plane actions in DDL. I
understand the usefulness of a feature like this in ops. It's
a great idea. Here is my counter proposal:
- Leave the CQL as is and keep "CREATE ROLE" etc as is, and
avoid making changes to core Cassandra.
Why should we keep it "as is"? Genuinely asking. Why? Where is
this need for conserving stuff coming from? Is this what we are
doing here? Adding as little as possible? I think we are stifling
innovation unnecessarily. There was the same discussion about
constraints and CHECK NOT NULL / NOT NULL where we were trying to
follow "the Holy Postgres Grail". I just don't get it. Are we not
obsessed with that at this point? Literally nobody cares if there
will be CREATE GENERATED ROLE. Nobody. Cares. So I do not take
this point of yours as valid without some strong backing from
your side.
- Move the generation & policy to the sidecar project. A
sidecar endpoint will generate the role name/password, enforce
prefix/suffix/length requirements, ensure uniqueness, and
then return the role and password (or a secret handle) to the
caller.
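To make the bullet above concrete, here is a minimal sketch of what "generate the role name/password, enforce prefix/suffix/length requirements, ensure uniqueness" could look like. It is an illustration only: the function name `generate_role`, its parameters, and the `existing` set standing in for a lookup of current roles are all hypothetical and are not taken from the CEP, from Cassandra, or from Sidecar.

```python
import secrets
import string

# Characters used for the random part of the role name (an assumption;
# a real policy would define its own allowed alphabet).
ALPHABET = string.ascii_lowercase + string.digits


def generate_role(existing, prefix="", suffix="", length=16,
                  password_bytes=32, max_attempts=100):
    """Return (role_name, password) for a name not present in `existing`.

    `length` is the total name length, including prefix and suffix.
    """
    body_len = length - len(prefix) - len(suffix)
    if body_len <= 0:
        raise ValueError("length is too short for the given prefix/suffix")

    for _ in range(max_attempts):
        body = "".join(secrets.choice(ALPHABET) for _ in range(body_len))
        name = prefix + body + suffix
        if name not in existing:  # uniqueness check
            # token_urlsafe takes a number of random bytes, not characters
            return name, secrets.token_urlsafe(password_bytes)

    raise RuntimeError("could not find an unused role name")
```

Whether this logic lives behind a Sidecar endpoint or behind a CQL statement, the policy inputs (prefix, suffix, length) and the uniqueness check are the same; the debate in this thread is about where it runs, not what it does.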
Well, the problem I see with putting this in Sidecar is that this
would only be possible to do via HTTP(S). Not everybody is
interested in it. Hardly. Zero interest. Sidecar is 0.2.0 at this
point. I think that realistically speaking I am not far from the
truth at all if I say that there is practically nobody who is
using 0.2.0 in production. 0.2.0. I do not count exceptions such as
early adopters or Analytics.
Putting this in Sidecar almost guarantees nobody is going to use
this particular functionality. People have their own control
planes, their own way of generating this stuff and they are not
going to deploy Sidecar just because they want to delegate this
task to it. Come on. I think that it would, paradoxically, create
more problems for them. Not fewer. So again, I do not take this
point as something which is solving anything. This will have 0
users when put in Sidecar. I think it would be better if we just
flat out refused this instead of putting it in Sidecar. It is
even worse imho.
Another problem with Sidecar I see is that the current
implementation is pluggable. How do you want to make this
pluggable in Sidecar? Pluggable how? People might have their own
opinion on how role names should be generated. That is why you
can just code your own generator / validator, put it on the class
path and be done with it. How are you supposed to "patch
Sidecar"? You create a custom implementation, then you put it on
the class path of Sidecar? Is this even supported? I think that
you have proposed it in good faith but I don't think that
would fly.
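For readers unfamiliar with the "code your own generator, put it on the class path" pattern, here is a minimal sketch of the idea. Cassandra is Java and would resolve a configured class name on the JVM class path; the Python below only mirrors that pattern with `importlib`. The names `RoleNameGenerator`, `UuidRoleNameGenerator` and `load_generator` are hypothetical and are not the interfaces from the actual patch.

```python
import importlib
import uuid
from abc import ABC, abstractmethod


class RoleNameGenerator(ABC):
    """Hypothetical plug-in contract: anything that can produce a role name."""

    @abstractmethod
    def generate(self) -> str:
        ...


class UuidRoleNameGenerator(RoleNameGenerator):
    """A default implementation an operator could replace with their own."""

    def generate(self) -> str:
        return "role_" + uuid.uuid4().hex


def load_generator(class_path: str) -> RoleNameGenerator:
    """Instantiate a generator from a dotted 'module.ClassName' string,
    the way a configuration value would name a custom implementation."""
    module_name, _, class_name = class_path.rpartition(".")
    if not module_name:
        raise ValueError("expected 'module.ClassName', got %r" % class_path)
    module = importlib.import_module(module_name)
    cls = getattr(module, class_name)
    # Reject anything that does not implement the plug-in contract.
    if not (isinstance(cls, type) and issubclass(cls, RoleNameGenerator)):
        raise TypeError("%s is not a RoleNameGenerator" % class_path)
    return cls()
```

The point of the pattern is that the operator changes one configuration value and drops in one class; nothing in the host process is patched.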
Why?
- End users will have it faster since it will work with any
version of Cassandra supporting the CREATE syntax. (No having
to backport either)
- Keeps control plane actions optional and separated. Not an
attack surface inside core Cassandra
Thirdly, what _attack surface_? I think you are pretty aware of
the fact that this feature is by default turned off. If you have
an organisation deploying hundreds of clusters and for each they
have to figure out some role name for a user which is going to
use it, how is this going to be abused concretely? There are
dedicated accounts for CQL management, creation of a role is tied
to some workflow etc. What is attacked exactly and how? Concrete
examples please.
Dinesh had the concern that "what if we just have a script which
will generate roles repeatedly nonstop?" How is this different
from having a script which generates the roles itself, instead of
Cassandra, and then executes the statements? What's the difference
really? If you want to abuse it you will. There is no protection
against that unless we put some rate limiting in front of it -
which I have no problem addressing in follow-up work, as
already explained.
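As an illustration of what "some rate limiting in front of it" might mean, here is a minimal token-bucket sketch. The thread does not specify any rate-limiting design, so this is a generic technique, not the proposed follow-up work; the class name `TokenBucket` and its parameters are hypothetical.

```python
import time


class TokenBucket:
    """Allow bursts of up to `capacity` operations, refilled at
    `rate_per_sec` tokens per second."""

    def __init__(self, rate_per_sec, capacity, clock=time.monotonic):
        self.rate = rate_per_sec
        self.capacity = capacity
        self.tokens = float(capacity)  # start with a full bucket
        self.clock = clock             # injectable for testing
        self.last = clock()

    def try_acquire(self):
        """Return True if one operation (e.g. one role creation) may proceed."""
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A limiter like this would throttle a runaway script regardless of whether the names are generated by Cassandra or by the script itself, which is the equivalence being argued above.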
- We keep the syntax of CQL more generic and less one-off.
I don't think this is relevant, really. I think we should abandon
this mindset. At this point, to put it bluntly, I suspect that
CQL must have "hurt you" somehow :)
Regards
- k8s/Cloud native friendly with separation of control
plane/data plane.
Patrick
On Tue, Sep 16, 2025 at 7:31 AM Mick <[email protected]> wrote:
> I think enough time has passed for everybody to participate
in the discussion so I would just move on and start the
voting thread soon.
Can we give CEP discussions longer than ~one week, please?
Folk are easily away/offline for a whole week. Take for
example many who were at Community over Code and may
still be catching up on their inbox, thinking dev@ is a
less urgent folder.
I haven't looked at how fast the other CEP discuss threads
have turned around; I apologise if I'm singling only one
out. My concern applies generally.